tartufo: searches through git repositories for secrets, digging deep into commit history and branches

tartufo

tartufo searches through git repositories for secrets, digging deep into commit history and branches. This is effective at finding secrets that were accidentally committed. tartufo can also be used by git pre-commit scripts to screen changes for secrets before they are committed to the repository.

This tool goes through the entire commit history of each branch and checks each diff from each commit for secrets, using both regular expressions and entropy. For entropy checks, tartufo evaluates the Shannon entropy, over both the base64 character set and the hexadecimal character set, of every string in each diff that is longer than 20 characters and composed entirely of characters from those sets. Whenever such a high-entropy string is detected, it is printed to the screen.

 


Features

Modes of Operation

While tartufo started its life with one primary mode of operation, scanning the history of a git repository, it has grown over time to have a number of additional uses and modes of operation. These are all invoked via different sub-commands of tartufo.

Git Repository History Scan

This is the “classic” use case for tartufo: scanning the history of a git repository. There are two ways to invoke this functionality, depending on whether you are scanning a repository which you have already cloned locally, or one on a remote system.

Pre-commit Hook

This mode of operation instructs tartufo to scan staged, uncommitted changes in a local repository. This is the flip side of the primary mode of operation: instead of checking for secrets you have already checked in, it helps prevent you from committing new secrets!

When running this sub-command, the caller’s current working directory is assumed to be somewhere within the local clone’s tree and the repository root is determined automatically.
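As a rough illustration of how a repository root can be determined from a working directory somewhere inside the clone, here is a minimal sketch. The function name find_repo_root is hypothetical, and walking upward until a .git entry is found is an assumption made for illustration; tartufo's own lookup may work differently.

```python
from pathlib import Path
from typing import Optional


def find_repo_root(start: Path) -> Optional[Path]:
    """Walk upward from `start` until a directory containing `.git` is found.

    Hypothetical helper for illustration only; not tartufo's implementation.
    """
    current = start.resolve()
    # Check the starting directory first, then each of its ancestors.
    for candidate in [current, *current.parents]:
        if (candidate / ".git").exists():
            return candidate
    # No enclosing repository was found.
    return None


if __name__ == "__main__":
    print(find_repo_root(Path.cwd()))
```

Checking for the existence of .git, rather than requiring it to be a directory, also covers layouts where .git is a file pointing elsewhere, as with git worktrees.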

Scan Types

tartufo offers multiple types of scans to run while looking through its target for secrets. Each scan type can be enabled or disabled independently.

Regex Checking

tartufo can scan for a pre-built list of known signatures for things such as SSH keys, EC2 credentials, etc. These scans are activated by use of the --regex flag on the command line. They will be reported with an issue type of Regular Expression Match, and the issue detail will be the name of the regular expression which was matched.
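To make the idea of signature matching concrete, the following is a minimal sketch of a regex scan that reports the name of the matching rule as the issue detail. The two patterns, the EXAMPLE_RULES name, and the scan_text function are assumptions chosen for illustration; they are not tartufo's shipped rule set or its internal API.

```python
import re
from typing import Dict, List, Tuple

# Illustrative rules only (assumed for this sketch, not tartufo's defaults).
EXAMPLE_RULES: Dict[str, "re.Pattern[str]"] = {
    "RSA private key": re.compile(r"-----BEGIN RSA PRIVATE KEY-----"),
    "AWS access key ID": re.compile(r"AKIA[0-9A-Z]{16}"),
}


def scan_text(
    text: str, rules: Dict[str, "re.Pattern[str]"] = EXAMPLE_RULES
) -> List[Tuple[str, str, str]]:
    """Return (issue type, rule name, matching line) for every rule hit."""
    issues = []
    for line in text.splitlines():
        for name, pattern in rules.items():
            if pattern.search(line):
                # The issue detail is the name of the rule that matched.
                issues.append(("Regular Expression Match", name, line))
    return issues


if __name__ == "__main__":
    sample = "aws_key = AKIA" + "A" * 16 + "\nnothing to see here"
    for issue in scan_text(sample):
        print(issue)
```

Keeping rules as a name-to-pattern mapping means a custom rule set can be passed in place of the defaults without changing the scanning loop.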

Customizing

Additional rules can be specified as described in the Rule Patterns section of the Configuration document.

Regexes that are highly specific to your situation, such as subdomain enumeration or S3 bucket detection, can be added this way.

If you would like to deactivate the default regex rules, using only your custom rule set, you can use the --no-default-regexes flag.

Feel free to also contribute high-signal regexes upstream that you think will benefit the community. Rules for things like Azure keys, Twilio keys, and Google Compute keys are welcome, provided a high-signal regex can be constructed.

tartufo’s base rule set can be found in the file data/default_regexes.json.

High Entropy Checking

tartufo calculates the Shannon entropy of strings found in each commit's diff, looking for strings which appear to be generated from a stochastic source. In short, it looks for pieces of data which look random, as these are likely to be things such as cryptographic keys. These scans are activated with the --entropy command line flag.
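The entropy check can be sketched as follows: extract runs longer than 20 characters drawn from the base64 or hexadecimal character sets, compute the Shannon entropy of each run, and flag those above a threshold. This is a minimal sketch rather than tartufo's actual code. The function names are hypothetical, and the thresholds (4.5 for base64, 3.0 for hexadecimal) are assumed values chosen for illustration.

```python
import math
import re
from typing import List

BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/="
HEX_CHARS = "0123456789abcdefABCDEF"
MIN_LENGTH = 20  # only strings longer than this are considered


def shannon_entropy(data: str, charset: str) -> float:
    """Shannon entropy of `data` in bits per character, over `charset`."""
    if not data:
        return 0.0
    entropy = 0.0
    for ch in set(charset):
        p = data.count(ch) / len(data)
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy


def candidate_strings(line: str, charset: str, min_length: int = MIN_LENGTH) -> List[str]:
    """Runs of more than `min_length` characters drawn only from `charset`."""
    pattern = "[" + re.escape(charset) + "]{" + str(min_length + 1) + ",}"
    return re.findall(pattern, line)


def find_high_entropy(
    line: str, b64_threshold: float = 4.5, hex_threshold: float = 3.0
) -> List[str]:
    """Flag candidate strings whose entropy exceeds the (assumed) thresholds."""
    findings = []
    for charset, threshold in ((BASE64_CHARS, b64_threshold), (HEX_CHARS, hex_threshold)):
        for candidate in candidate_strings(line, charset):
            if shannon_entropy(candidate, charset) > threshold:
                findings.append(candidate)
    return findings


if __name__ == "__main__":
    print(find_high_entropy("hash 0123456789abcdef0123456789abcdef"))
```

A string of 16 distinct hexadecimal digits, each equally frequent, has an entropy of exactly 4 bits per character, which is the maximum for that character set. A string that repeats a single character has an entropy of 0.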

Scan Limiting (Exclusions)

By its very nature, and especially when it comes to high entropy scans, tartufo can produce a number of false positives. These might be links containing git commit hashes, tokens or passwords used in tests, or any number of other things. There needs to be a way to tell tartufo to ignore such items and not report them as issues, so we provide multiple methods for excluding them.


© Copyright 2019-2023, GoDaddy.com, LLC.