Betterleaks: Open-Source Secrets Scanner for Enhanced Security

Summary
– Betterleaks is a new tool by Gitleaks’ original author, Zach Rice, designed to scan for leaked credentials like API keys and passwords in various sources.
– It was created because Rice no longer has full control over Gitleaks, and it functions as a direct, compatible replacement using the same configurations.
– Its key technical improvement is the “Token Efficiency” method, which uses byte pair encoding to filter false positives more effectively than the entropy method used by Gitleaks.
– The tool offers parallelized scanning, support for encoded secrets, and multiple output formats, and it is built in pure Go for easier deployment.
– Future updates are planned, including LLM-assisted classification, auto-revocation of detected credentials, and output controls tailored for AI coding agents.
Secrets scanning is now a fundamental security practice for development teams, and Gitleaks has long been a popular choice for this task. The creator of that original tool has introduced a new, open-source project named Betterleaks. This new scanner is engineered to detect exposed credentials, API keys, tokens, and passwords within git repositories, directories, or data piped through standard input.
The project is led by Zach Rice, who initially developed Gitleaks roughly eight years ago. Rice currently holds the position of Head of Secrets Scanning at Aikido Security. He decided to launch this new initiative after losing full administrative control over the original Gitleaks repository and its branding. Betterleaks is designed as a direct, compatible replacement, so users can switch over using their existing command-line flags and configuration files without any changes.
A major technical advancement in Betterleaks is its method for filtering potential secrets. Traditional scanners, including Gitleaks, often use Shannon entropy to flag strings that appear random. Betterleaks employs a novel technique called Token Efficiency, which is based on byte pair encoding (BPE) tokenization. This approach analyzes how effectively a BPE tokenizer can compress a given string. Natural language compresses efficiently into longer tokens, resulting in high token efficiency. In contrast, secrets and random strings compress poorly, breaking into many short tokens and showing low efficiency. This metric helps the tool filter out false positives. According to Rice, testing against the CredData dataset showed Token Efficiency achieved a 98.6 percent recall rate, significantly outperforming the 70.4 percent recall of entropy-based methods.
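To make the contrast between the two metrics concrete, here is a minimal Python sketch. It is not Betterleaks' implementation: the tiny `TOY_VOCAB` and the greedy longest-match `toy_tokenize` function are hypothetical stand-ins for a trained BPE tokenizer, and "efficiency" is assumed here to mean average characters per token.

```python
import math
from collections import Counter

# Hypothetical toy vocabulary standing in for a trained BPE vocabulary.
# A real tokenizer learns tens of thousands of merges from a large corpus.
TOY_VOCAB = {"the", "pass", "word", "config", "example",
             "user", "name", "key", "token", "_"}


def shannon_entropy(s):
    """Bits per character, computed from the character frequency distribution."""
    if not s:
        return 0.0
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def toy_tokenize(s):
    """Greedy longest-match against TOY_VOCAB, falling back to single characters.

    This approximates the effect of BPE (familiar text yields long tokens)
    but is not the BPE merge algorithm itself.
    """
    tokens = []
    max_len = max(len(v) for v in TOY_VOCAB)
    i = 0
    while i < len(s):
        for size in range(min(max_len, len(s) - i), 0, -1):
            piece = s[i:i + size]
            if size == 1 or piece in TOY_VOCAB:
                tokens.append(piece)
                i += size
                break
    return tokens


def token_efficiency(s):
    """Average characters per token; higher means the string compresses well."""
    tokens = toy_tokenize(s)
    return len(s) / len(tokens) if tokens else 0.0
```

With this toy vocabulary, an identifier such as `user_name_example` splits into five tokens (3.4 characters per token), while a random-looking string such as `x9Kq2ZrT7mLw` falls back to one character per token. A filter built on this idea would treat the low-efficiency string as the more likely secret.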
The validation logic within Betterleaks is written using the Common Expression Language (CEL). This provides rule authors with programmatic control to define precisely what constitutes a verified secret. The tool also natively handles secrets that have been encoded two or three times and supports parallelized git scanning to improve performance. It is written in pure Go without CGO dependencies, eliminating the need for the Hyperscan library and simplifying deployment across various environments. Additional features include the ability to scan archive files, even nested ones, and output findings in multiple formats like JSON, CSV, JUnit, SARIF, and custom templates.
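The idea of catching secrets hidden under several layers of encoding can be sketched as a recursive decode-and-rescan loop. The Python below is an illustration under stated assumptions, not how Betterleaks does it: it handles only base64, and `SECRET_PATTERN`, `B64_CANDIDATE`, and `find_secrets` are hypothetical names chosen for this example.

```python
import base64
import binascii
import re

# Hypothetical rule for illustration only; real detection rules are more specific.
SECRET_PATTERN = re.compile(r"AKIA[0-9A-Z]{16}")
# Runs of base64-alphabet characters long enough to be worth decoding.
B64_CANDIDATE = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def find_secrets(text, max_depth=3):
    """Return (secret, depth) pairs; depth counts the base64 layers peeled off."""
    found = []
    _scan(text, 0, max_depth, found)
    return found


def _scan(text, depth, max_depth, found):
    # First look for secrets in the text as it stands at this layer.
    for match in SECRET_PATTERN.findall(text):
        found.append((match, depth))
    if depth >= max_depth:
        return
    # Then try to decode each base64-looking run and rescan the result.
    for candidate in B64_CANDIDATE.findall(text):
        try:
            decoded = base64.b64decode(candidate, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError, ValueError):
            continue  # not valid base64, or not text once decoded
        _scan(decoded, depth + 1, max_depth, found)
```

For example, if the string `key=AKIAIOSFODNN7EXAMPLE` (the placeholder key used in AWS documentation) is base64-encoded twice and embedded in a file, `find_secrets` reports it at depth 2. The `max_depth` cap bounds the extra work per candidate.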
Looking ahead, the project’s roadmap outlines several planned features for a future v2 release. These include LLM-assisted classification, where anonymized candidate secrets could be analyzed by a local or remote language model for additional context. The team also plans auto-revocation support for services that offer credential revocation APIs, and permissions mapping to illustrate the exact access level a compromised secret would grant.
The tool’s design also considers modern development workflows, including those involving AI coding assistants. It offers flag-based output control, allowing AI agents in platforms like Claude Code or Cursor to execute it as a subprocess and parse its results without consuming excessive tokens. Betterleaks is freely available from its GitHub repository.
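As a rough illustration of why compact output matters to an agent, the Python sketch below condenses a JSON report into one short line per finding. It assumes the report is a JSON array whose entries use Gitleaks-style field names (`RuleID`, `File`, `StartLine`); whether Betterleaks emits exactly these fields is an assumption, and `summarize_findings` is a hypothetical helper, not part of the tool.

```python
import json


def summarize_findings(report_json, max_items=20):
    """Condense a JSON findings report into one short line per finding.

    Assumes Gitleaks-style field names; missing fields are shown as '?'.
    The secret value itself is deliberately left out of the summary.
    """
    findings = json.loads(report_json)
    lines = []
    for finding in findings[:max_items]:
        rule = finding.get("RuleID", "?")
        path = finding.get("File", "?")
        line_no = finding.get("StartLine", "?")
        lines.append(f"{rule} {path}:{line_no}")
    remaining = len(findings) - max_items
    if remaining > 0:
        lines.append(f"... and {remaining} more")
    return "\n".join(lines)
```

An agent could feed this summary back into its context instead of the full report, and open individual findings only when it needs the detail.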
(Source: Help Net Security)