Secrets Scanning and Pre-Commit Hooks: Preventing Credential Leaks

Hardcoded credentials in source code are a leading cause of data breaches. AWS keys, database passwords, API tokens — once they hit a git commit, they're in version history forever, even if you delete them in a later commit. Secrets scanning tools and pre-commit hooks catch these before they're committed. This guide covers the tools, the setup, and how to respond when you find leaked secrets.

Key Takeaways

Git history is permanent. A secret in a commit is findable even after you delete the file. Rotate the credential immediately — don't just remove it from code.

Pre-commit hooks are the first line of defense. They catch secrets before they enter history at all. Set them up for every developer on the team, not just in CI.

CI scanning catches what pre-commit hooks miss. Not everyone installs hooks. Scan on every push as a backstop.

False positive rate matters. Tools with too many false positives get ignored or disabled. Tune your baseline before enforcing.

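GitHub Secret Scanning is the fastest option to turn on. If your code is on GitHub, enable it now: it is free for public repositories, needs no tooling on developer machines, and catches secrets within minutes of a push. Private repositories need GitHub Advanced Security.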

Why Secrets Leak

Despite security awareness, credentials end up in git regularly:

  • Developer pastes a working config into source for debugging, forgets to remove it
  • Test files commit real credentials instead of fixtures
  • .env files are committed when .gitignore isn't set up
  • Credentials appear in log output that gets committed
  • Copy-paste from a working example that included real keys

The consequences are severe. GitHub actively monitors public repos for known credential formats and notifies providers — AWS, Stripe, Google — who may revoke the key. But for private repos, credentials can sit exposed for years.


Pre-Commit Hooks

Pre-commit hooks run before git commit completes. If the hook exits non-zero, the commit is aborted. They're the earliest possible intervention.
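The exit-code contract can be sketched with a toy check written in Python. This is an illustration only, not how any of the tools below are implemented. It assumes the staged file names arrive as command-line arguments (which is how the pre-commit framework invokes hooks), and the names `find_hits` and `main` are made up for this example.

```python
"""Toy pre-commit check: report files that contain an AWS-style access key ID."""
import re
import sys
from pathlib import Path

# Shape of an AWS access key ID: "AKIA" followed by 16 uppercase letters or digits.
AWS_KEY_ID = re.compile(r"AKIA[0-9A-Z]{16}")


def find_hits(paths):
    """Return (path, line_number) for every line that matches the pattern."""
    hits = []
    for path in paths:
        try:
            text = Path(path).read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # deleted or unreadable file: nothing to scan
        for lineno, line in enumerate(text.splitlines(), start=1):
            if AWS_KEY_ID.search(line):
                hits.append((path, lineno))
    return hits


def main(argv):
    """Print findings to stderr and return the exit code the hook should use."""
    hits = find_hits(argv)
    for path, lineno in hits:
        print(f"{path}:{lineno}: possible AWS access key ID", file=sys.stderr)
    return 1 if hits else 0  # a non-zero exit status makes git abort the commit

# As a hook script, the file would end with: sys.exit(main(sys.argv[1:]))
```

A single regex like this misses most secret types, which is why the dedicated scanners covered below are the practical choice.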

Using the pre-commit Framework

The pre-commit framework manages hooks across tools with a single config file:

pip install pre-commit

Create .pre-commit-config.yaml in your repo root:

repos:
  - repo: https://github.com/Yelp/detect-secrets
    rev: v1.4.0
    hooks:
      - id: detect-secrets
        args: ['--baseline', '.secrets.baseline']

  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.0
    hooks:
      - id: gitleaks

  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: detect-private-key
      - id: check-added-large-files
        args: ['--maxkb=500']

Each developer then installs the git hook in their own clone:

pre-commit install

Now every git commit runs these checks. Team members without the framework installed bypass the hook — which is why CI enforcement is also necessary.
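Because the hook lives in each clone, an onboarding or setup script can check whether it is present. The following is a minimal sketch under two assumptions: hooks live in the default `.git/hooks` directory (no custom `core.hooksPath`, no worktrees), and the generated hook file mentions "pre-commit" somewhere in its text. The function name `hook_installed` is hypothetical.

```python
from pathlib import Path


def hook_installed(repo_root="."):
    """Return True if .git/hooks/pre-commit exists and looks framework-generated."""
    hook = Path(repo_root) / ".git" / "hooks" / "pre-commit"
    if not hook.is_file():
        return False
    # Assumption of this sketch: the generated hook mentions "pre-commit"
    # in its text, so a hand-written hook without that word is not counted.
    return "pre-commit" in hook.read_text(encoding="utf-8", errors="ignore")
```

A check like this only tells a developer that setup was skipped. It cannot enforce anything, so the CI scan remains the backstop.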


Tool Comparison

detect-secrets

detect-secrets (by Yelp) uses heuristic pattern matching and entropy analysis. Its key feature is a baseline file — a snapshot of known-acceptable false positives that you can commit and maintain.

pip install detect-secrets

# Create initial baseline (scan existing codebase, accept current state)
detect-secrets scan > .secrets.baseline

# Audit the baseline: review each finding and label it real or false positive
detect-secrets audit .secrets.baseline

# Re-scan and update the baseline in place (existing audit labels are kept).
# Blocking on new secrets is the job of the pre-commit hook (detect-secrets-hook).
detect-secrets scan --baseline .secrets.baseline

The baseline approach is pragmatic: on an existing codebase, the first scan finds hundreds of false positives. Rather than blocking work, you mark them as acceptable and only block new secrets.
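The idea behind a baseline can be sketched in a few lines of Python. This is a simplified model, not the actual `.secrets.baseline` format: the `file` and `fingerprint` keys and the `new_findings` function are assumptions made for illustration.

```python
def new_findings(current, baseline):
    """Return findings present in the current scan but absent from the baseline."""
    accepted = {(f["file"], f["fingerprint"]) for f in baseline}
    return [f for f in current if (f["file"], f["fingerprint"]) not in accepted]


# Findings that were reviewed and accepted when the baseline was created.
baseline = [
    {"file": "tests/data.json", "fingerprint": "a1f3"},
    {"file": "docs/example.md", "fingerprint": "9c2e"},
]

# A later scan sees the same two findings plus one new one.
current = baseline + [{"file": "src/settings.py", "fingerprint": "77b0"}]

# Only the finding in src/settings.py would block the commit.
blocking = new_findings(current, baseline)
```

Keying on a fingerprint rather than a line number means that moving an accepted string within a file does not make it look new.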

What it catches:

  • AWS access keys (pattern: AKIA[0-9A-Z]{16})
  • Private keys (RSA, DSA, EC)
  • Basic auth in URLs
  • High-entropy strings that look like secrets
  • Hardcoded passwords near keywords like password =, secret =, token =

False positive rate: Medium — entropy-based detection flags random-looking strings in test data and generated output.
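To make the entropy heuristic concrete, here is a minimal sketch of Shannon entropy over the characters of a string. The 4.0 threshold and the `looks_random` helper are assumptions chosen for this example, not detect-secrets defaults.

```python
import math
from collections import Counter


def shannon_entropy(s):
    """Shannon entropy of a string, in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def looks_random(s, threshold=4.0):
    """Flag strings whose per-character entropy exceeds a hypothetical threshold."""
    return shannon_entropy(s) > threshold


print(shannon_entropy("password"))                  # repeated letters lower the score
print(shannon_entropy("4eC39HqLyjWDarjtT7en6cOW"))  # key-like string scores higher
```

The same property explains the false positives: hashes, UUIDs, and generated test data also score high without being credentials.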

Gitleaks

Gitleaks scans git history (not just the working tree) for known secret patterns. It uses a regex-based ruleset covering 150+ secret types.

# Install
brew install gitleaks
# or
go install github.com/gitleaks/gitleaks/v8@latest

# Scan current directory (staged files)
gitleaks protect --staged

# Scan entire git history
gitleaks detect

# Scan a specific commit range
gitleaks detect --log-opts="main..HEAD"

CI integration:

# .github/workflows/gitleaks.yml
name: Gitleaks
on: [push, pull_request]
jobs:
  gitleaks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # Full history for gitleaks
      - uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

What it catches:

  • 150+ service-specific patterns (AWS, GCP, Azure, Stripe, GitHub, Slack, Twilio, etc.)
  • Custom rules via config file
  • Generic patterns (PEM keys, high-entropy strings in certain contexts)

False positive rate: Lower than detect-secrets — patterns are service-specific, so random strings don't trigger rules unless they match known formats.

Custom rules:

# .gitleaks.toml
[extend]
useDefault = true

[[rules]]
id = "internal-api-key"
description = "Internal API key format"
regex = '''HELP-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}'''
severity = "CRITICAL"
tags = ["helpmetest", "api-key"]

[allowlist]
description = "Allowlist for known false positives"
regexes = ['''HELP-00000000-0000-0000-0000-000000000000''']  # test fixture
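Before relying on a custom rule, it is worth trying the pattern against sample strings. The sketch below does that with Python's `re` module. Gitleaks uses Go's regexp engine, so this is only a quick sanity check, and it is valid here because the pattern uses syntax both engines share. The `flagged` helper and the sample keys are made up for illustration.

```python
import re

# Same pattern as the internal-api-key rule above.
RULE = re.compile(r"HELP-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}")

# Same pattern as the allowlist entry for the test fixture.
ALLOWLIST = [re.compile(r"HELP-00000000-0000-0000-0000-000000000000")]


def flagged(line):
    """Return rule matches in a line that are not covered by the allowlist."""
    matches = [m.group(0) for m in RULE.finditer(line)]
    return [s for s in matches if not any(a.search(s) for a in ALLOWLIST)]


real_looking = 'API_KEY = "HELP-1a2b3c4d-5e6f-7a8b-9c0d-1e2f3a4b5c6d"'
fixture = 'API_KEY = "HELP-00000000-0000-0000-0000-000000000000"'
uppercase = 'API_KEY = "HELP-1A2B3C4D-5E6F-7A8B-9C0D-1E2F3A4B5C6D"'

print(flagged(real_looking))  # one match
print(flagged(fixture))       # empty: suppressed by the allowlist
print(flagged(uppercase))     # empty: the rule only accepts lowercase hex
```

The last case shows why sample-testing helps: if keys can be issued with uppercase hex, the rule as written would miss them.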

TruffleHog

TruffleHog specializes in scanning git history, including GitHub repos via the GitHub API. It is well suited to retrospective audits because it can surface secrets that were committed years ago.

# Install (the trufflehog CLI used below is a Go binary; the truffleHog3 package on PyPI is a different project)
brew install trufflehog

# Scan a local repo, reporting only verified (live) credentials
trufflehog git file://. --only-verified

# Scan a GitHub repo (via API)
trufflehog github --repo https://github.com/yourorg/yourrepo

# Same local scan, but exit non-zero when verified credentials are found
trufflehog git file://. --only-verified --fail

--only-verified is key: TruffleHog can test discovered credentials against their respective APIs. An AWS key that's been rotated and is no longer valid won't trigger a failure — reducing noise significantly.

CI integration:

- name: TruffleHog Scan
  uses: trufflesecurity/trufflehog@main
  with:
    path: ./
    base: ${{ github.event.repository.default_branch }}
    head: HEAD
    extra_args: --only-verified

GitHub Secret Scanning

If your repository is on GitHub, Secret Scanning is the zero-configuration option. It's enabled by default for public repos and available for private repos with GitHub Advanced Security.

Go to Settings → Code security → Secret scanning and enable it. GitHub scans every push for 200+ known secret patterns and alerts the repository owner immediately.

Push protection blocks the push if a secret is detected — before it enters history:

Settings → Code security → Secret scanning → Push protection → Enable

With push protection enabled, developers who try to push a commit containing a known secret see:

remote: error: GH013: Repository rule violations found for refs/heads/main.
remote: - GITHUB PUSH PROTECTION
remote:   —————————————————————————————————————————
remote:   rule: Your push has been blocked because it contains a secret.
remote:   secret type: AWS Access Key
remote:   ...

They can either remove the secret or submit a bypass request with justification.


Setting Up a Complete Secrets Pipeline

1. Developer Workstations (Pre-Commit)

# Install pre-commit
pip install pre-commit

# Project .pre-commit-config.yaml
cat > .pre-commit-config.yaml << 'EOF'
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.0
    hooks:
      - id: gitleaks

  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: detect-private-key
      - id: check-added-large-files
        args: ['--maxkb=500']
EOF

# Install the git hook in this clone (each developer runs this once)
pre-commit install

2. CI Pipeline (Every Push)

# .github/workflows/secrets-scan.yml
name: Secrets Scanning
on:
  push:
  pull_request:

jobs:
  gitleaks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

  detect-secrets:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install detect-secrets
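      - run: |
          # detect-secrets-hook exits non-zero when it finds secrets missing from the baseline
          # (plain "detect-secrets scan --baseline" only updates the baseline file)
          git ls-files -z | xargs -0 detect-secrets-hook --baseline .secrets.baseline
          detect-secrets audit .secrets.baseline --report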

3. Periodic Full History Scan

Run monthly or quarterly to catch anything that slipped through:

# .github/workflows/history-scan.yml
name: Full History Scan
on:
  schedule:
    - cron: '0 4 1 * *'  # First of every month at 04:00 UTC

jobs:
  trufflehog:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: trufflesecurity/trufflehog@main
        with:
          path: ./
          base: ''  # Scan all history
          head: HEAD
          extra_args: --only-verified

Handling False Positives

Every secrets scanner generates false positives. Managing them correctly avoids alert fatigue.

Inline Suppression (gitleaks)

# The gitleaks:allow comment must sit on the same line as the flagged value
EXAMPLE_API_KEY = "sk_test_4eC39HqLyjWDarjtT7en6cOW"  # gitleaks:allow (test key from the docs)

detect-secrets Baseline

The .secrets.baseline file records accepted false positives:

# After reviewing a finding as a false positive
detect-secrets audit .secrets.baseline
# Mark the finding as false positive in the interactive audit

# Commit the updated baseline
git add .secrets.baseline
git commit -m "chore: update secrets baseline, mark test keys as false positive"

gitleaks Allowlist

# .gitleaks.toml
[allowlist]
commits = [
  "a123b456",  # Old commit with test fixture that's not a real secret
]
paths = [
  '''test/fixtures/.*''',
  '''docs/examples/.*'''
]
regexes = [
  '''STRIPE_TEST_KEY=sk_test_''',  # Always a test key if starts with sk_test_
]
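Path allowlists are easy to make broader than intended. The sketch below assumes the path patterns are applied as unanchored regex searches against the repository-relative path. That is an assumption about matching behavior worth confirming against your gitleaks version, and the `is_allowlisted` helper is hypothetical.

```python
import re


def is_allowlisted(path, patterns):
    """Return True if any pattern matches anywhere in the path."""
    return any(re.search(p, path) for p in patterns)


# Patterns as written in the allowlist above.
LOOSE = [r"test/fixtures/.*", r"docs/examples/.*"]

# Anchored variants that only match from the repository root.
ANCHORED = [r"^test/fixtures/.*", r"^docs/examples/.*"]

print(is_allowlisted("test/fixtures/key.pem", LOOSE))            # True
print(is_allowlisted("vendor/test/fixtures/key.pem", LOOSE))     # True
print(is_allowlisted("vendor/test/fixtures/key.pem", ANCHORED))  # False
```

Under that assumption, anchoring with `^` keeps a nested directory such as `vendor/test/fixtures/` from being silently exempted.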

Responding to a Leaked Secret

If a secret is found in your history — treat it as compromised, regardless of whether you've "removed" it.

Immediate steps:

  1. Rotate the credential first — before anything else. Assume it's been exploited.
  2. Revoke the old credential in the issuing service (AWS IAM, Stripe dashboard, etc.)
  3. Check access logs for the credential — look for unauthorized usage in the window between commit and rotation
  4. Rewrite git history (optional, for private repos where you control all clones):
# Using git-filter-repo (preferred over filter-branch); run it in a fresh clone
pip install git-filter-repo
git filter-repo --path-glob '**/*.env' --invert-paths
# or replace a specific string
git filter-repo --replace-text <(echo "sk_live_COMPROMISED_KEY==>REMOVED")
# filter-repo removes the origin remote; re-add it before force-pushing
git push --force

Note: Force-pushing rewrites history. All collaborators must re-clone. This is often impractical in busy repositories — prioritize rotating the credential over cleaning history.

  5. Add a detection rule for the secret's format to your scanner config (for example, a custom gitleaks rule) to prevent recurrence
  6. Document the incident: when it happened, what was exposed, the rotation timestamp, and the log review findings
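For the log review and incident write-up, it helps to know which commits introduced or removed the leaked string. A minimal sketch using git's pickaxe search (`git log -S`) follows. The function names are made up for this example, and it assumes the credential has already been rotated, since the string is passed on the command line where other local processes could see it.

```python
import subprocess


def pickaxe_command(secret):
    """Build a git command listing commits that add or remove the given string."""
    return ["git", "log", "--all", "--oneline", f"-S{secret}"]


def commits_touching(secret, repo="."):
    """Return one line per commit, on any branch, that adds or removes the string."""
    result = subprocess.run(
        pickaxe_command(secret),
        cwd=repo,
        capture_output=True,
        text=True,
        check=True,  # raise if git fails, e.g. when repo is not a git repository
    )
    return result.stdout.splitlines()
```

An empty result after a history rewrite suggests the string is gone from the local clone. It says nothing about forks, other clones, or caches that may still hold the old commits.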

Testing with HelpMeTest

HelpMeTest can verify that your secrets scanning setup is working end-to-end:

*** Settings ***
Library    OperatingSystem

*** Test Cases ***
Pre-Commit Hook Is Installed And Runs Gitleaks
    # The pre-commit framework writes a generated shim to .git/hooks/pre-commit;
    # gitleaks itself is declared in .pre-commit-config.yaml, so check both
    ${hook}=  Get File  .git/hooks/pre-commit
    Should Contain  ${hook}  pre-commit
    ${config}=  Get File  .pre-commit-config.yaml
    Should Contain  ${config}  gitleaks

CI Secrets Scan Workflow Exists
    File Should Exist  .github/workflows/secrets-scan.yml
    ${content}=  Get File  .github/workflows/secrets-scan.yml
    Should Contain  ${content}  gitleaks

Gitleaks Config Has Internal Key Pattern
    File Should Exist  .gitleaks.toml
    ${content}=  Get File  .gitleaks.toml
    Should Contain  ${content}  internal-api-key

Tool Selection Guide

Requirement                                 Recommended Tool
Zero-configuration start                    GitHub Secret Scanning
Developer pre-commit protection             gitleaks (pre-commit hook)
Full history scan                           TruffleHog (--only-verified)
Existing codebase with false positives      detect-secrets (baseline)
Custom credential format detection          gitleaks (custom rules)
Verified active credential detection        TruffleHog (--only-verified)

Conclusion

Secrets scanning is one of the highest-ROI security investments you can make. The tools are free, setup takes an hour, and the alternative — a compromised AWS key — costs days of incident response and potentially millions in unauthorized infrastructure charges.

Start with GitHub Secret Scanning (zero effort for GitHub repos). Add gitleaks as a pre-commit hook. Run TruffleHog on your full history now to find anything already committed. Then enforce in CI so new secrets can't enter the repository even when pre-commit hooks aren't installed.

The goal is to make committing a secret harder than not committing it — and for developers to see the block before the credential ever leaves their machine.
