Garak: Open-Source AI Security Scanner for LLMs

Summary
– Garak is a free, open-source tool for probing LLM weaknesses such as hallucinations, prompt injection, jailbreaks, and toxic outputs.
– It works with a wide range of models and platforms, including Hugging Face, Replicate, OpenAI API, LiteLLM, and REST-accessible systems.
– Garak generates multiple logs: a main garak.log for debugging, a JSONL report per run with probing details, and a hit log for vulnerabilities.
– The tool helps developers identify model failures and improve safety by running various tests.
– Garak is available for free on GitHub.
When deploying large language models in production environments, ensuring their security and reliability becomes a top priority. These advanced systems, while powerful, can sometimes produce unexpected outputs, disclose sensitive information, or respond to malicious prompts in ways that compromise system integrity. Garak emerges as a vital open-source security scanner, purpose-built to identify such vulnerabilities through systematic probing and analysis.
This freely available tool conducts a battery of tests designed to uncover weaknesses including prompt injection attacks, model jailbreaks, factual inaccuracies, and toxic content generation. By simulating adversarial interactions, it provides developers with clear insights into where a model may fail, enabling them to strengthen defenses and improve overall robustness.
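Garak installs as a Python package, and these tests are organized into named probes that can be enumerated before a run. Below is a minimal sketch, assuming the pip package name garak and the --list_probes flag shown in the project README (verify against your installed version):

    import subprocess

    # Enumerate garak's available probe modules (e.g. dan, encoding, ...).
    subprocess.run(["python", "-m", "garak", "--list_probes"], check=True)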
Garak supports a broad spectrum of models and deployment platforms. It works with generative models from the Hugging Face Hub, text generation models on Replicate, and API-based models from OpenAI, covering both chat and completion endpoints. The tool also integrates with LiteLLM and any system accessible via REST API. Additionally, it handles GGUF-format models, such as those running on llama.cpp version 1046 or later, making it adaptable to many popular LLM implementations.
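To make those platform options concrete, here is a minimal sketch of launching a scan against a Hugging Face Hub model from Python. The flag names follow the project README; gpt2 and the encoding probe are illustrative choices, not recommendations:

    import subprocess

    # Probe a Hugging Face Hub model with garak's encoding-injection tests.
    # --model_type selects the generator family (huggingface, openai,
    # replicate, rest, ggml, ...); --model_name identifies the model.
    subprocess.run(
        ["python", "-m", "garak",
         "--model_type", "huggingface",
         "--model_name", "gpt2",
         "--probes", "encoding"],
        check=True,
    )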
During operation, Garak produces detailed diagnostic logs to assist with debugging and analysis. A primary log file named garak.log captures debugging information from both the core tool and its plugins, and persists across sessions. Each scanning run also generates a dedicated JSONL report containing a record of every probe attempt. This report file is created at the start of a run and finalized upon successful completion, with entries updated in real time as results are processed and evaluated; a status attribute within each entry indicates the current stage of the probing attempt. Garak also maintains a separate hit log that documents each instance where a probe successfully exposed a vulnerability.
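Because the report is line-delimited JSON, it is straightforward to post-process. A minimal sketch, assuming the entry_type and status fields of a typical garak report (the exact field values are version-dependent, and status code 2 marking a fully evaluated attempt is an assumption):

    import json
    import sys

    # Usage: python summarize_report.py garak.<run-id>.report.jsonl
    attempts = completed = 0
    with open(sys.argv[1]) as report:
        for line in report:
            entry = json.loads(line)
            if entry.get("entry_type") == "attempt":  # one record per probe attempt
                attempts += 1
                if entry.get("status") == 2:  # assumption: 2 = fully evaluated
                    completed += 1
    print(f"{completed}/{attempts} probe attempts evaluated")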
Available at no cost on GitHub, Garak provides an accessible and powerful solution for teams committed to building safer AI systems.
(Source: Help Net Security)