SAST Tools Compared: Semgrep vs SonarQube vs CodeQL

Static Application Security Testing (SAST) tools scan your source code for vulnerabilities without executing it. Semgrep, SonarQube, and CodeQL are three of the most widely used, and they differ in strengths, pricing models, and CI integration. This guide compares them head-to-head so you can pick the right one for your team.

Key Takeaways

Semgrep is the fastest to add to CI. Write a custom rule in minutes. Open-source rules cover OWASP Top 10 out of the box.

SonarQube is best for broad code quality. It tracks security, bugs, code smells, and coverage in one dashboard — ideal for teams that want a single quality gate.

CodeQL is deepest for semantic analysis. GitHub-native, understands data flow across functions and files. Best for catching subtle injection flaws that pattern matchers miss.

None of them replace pen testing. SAST finds what the code says, not what the running system allows. Always combine with DAST and manual review.

Start with Semgrep if you're greenfield. Zero infrastructure, instant CI integration, and the community ruleset covers the most common mistakes.

What Is SAST?

Static Application Security Testing analyzes source code, bytecode, or binaries for security vulnerabilities — without running the application. It works by parsing your code into an abstract syntax tree (AST) or control-flow graph, then applying rules that look for dangerous patterns.
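To make the "parse, then apply rules" idea concrete, here is a minimal sketch using Python's standard `ast` module. It is not how any of the three tools is implemented; the `DANGEROUS_CALLS` set and the sample handler are assumptions made up for illustration.

```python
from __future__ import annotations

import ast

# Hypothetical rule set for this sketch: call names treated as dangerous.
DANGEROUS_CALLS = {"eval", "exec", "pickle.loads", "os.system"}


def dotted_name(node: ast.AST) -> str | None:
    """Return 'a.b.c' for Name/Attribute chains, else None."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def find_dangerous_calls(source: str) -> list[tuple[int, str]]:
    """Parse source into an AST and report (line, call name) per match."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in DANGEROUS_CALLS:
                findings.append((node.lineno, name))
    return sorted(findings)


# Made-up code under analysis; it is parsed, never executed.
SAMPLE = '''
import os, pickle

def handler(request):
    os.system("ls " + request.args["dir"])
    return pickle.loads(request.data)
'''

if __name__ == "__main__":
    for line, name in find_dangerous_calls(SAMPLE):
        print(f"line {line}: call to {name}()")
```

The sample is only parsed, never run, which is the defining property of static analysis. Real tools add language-aware matching, data-flow tracking, and far larger rule sets on top of this basic loop.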

SAST excels at catching:

  • SQL and command injection sinks
  • Hardcoded credentials
  • Insecure cryptography choices
  • Path traversal vulnerabilities
  • Insecure deserialization patterns

SAST does not catch:

  • Business logic flaws
  • Authentication bypass via runtime configuration
  • Infrastructure misconfigurations
  • Vulnerabilities introduced by third-party services

Semgrep

Semgrep is a fast, open-source static analysis tool built around pattern matching. Rules are written in YAML and look like the code they search for — making custom rules accessible to developers, not just security specialists.

What Semgrep Catches

Semgrep's community registry (semgrep.dev/r) contains thousands of rules across languages. Out of the box it catches:

  • SQL injection via string concatenation
  • Use of eval() on untrusted input
  • JWT verification disabled (verify=False)
  • Hardcoded secrets (API keys, passwords in strings)
  • Dangerous deserialization (pickle.loads, yaml.load)
  • SSRF sinks (user-controlled URLs passed to HTTP clients)

CI Integration

# .github/workflows/semgrep.yml
name: Semgrep
on: [push, pull_request]
jobs:
  semgrep:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: semgrep/semgrep-action@v1
        with:
          config: >-
            p/owasp-top-ten
            p/python
            p/javascript
        env:
          SEMGREP_APP_TOKEN: ${{ secrets.SEMGREP_APP_TOKEN }}

For open-source projects you can skip the token and just run locally:

pip install semgrep
semgrep --config p/owasp-top-ten ./src

Writing Custom Rules

Semgrep's rule format matches the shape of the code you're searching for. Here's a rule that catches SQL queries built with f-strings:

rules:
  - id: sql-injection-fstring
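    # One pattern is enough. The trailing `...` matches zero or more extra
    # arguments, so the rule still fires when bind parameters are passed
    # alongside an interpolated query.
    pattern: |
      cursor.execute(f"...{$VAR}...", ...)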
    message: "Possible SQL injection via f-string. Use parameterized queries."
    languages: [python]
    severity: ERROR
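For context, here is a small sketch of the kind of call the rule above is meant to flag, next to the parameterized form its message recommends. It uses the standard-library `sqlite3` module with an in-memory database; the table, the rows, and the function names are assumptions for illustration.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with two users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "admin"), ("bob", "user")],
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    cursor = conn.cursor()
    # Flagged shape: the value is interpolated into the SQL text itself.
    cursor.execute(f"SELECT name FROM users WHERE name = '{name}'")
    return cursor.fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    cursor = conn.cursor()
    # Parameterized: the value travels separately from the SQL text.
    cursor.execute("SELECT name FROM users WHERE name = ?", (name,))
    return cursor.fetchall()


if __name__ == "__main__":
    conn = make_db()
    payload = "' OR '1'='1"
    print("unsafe rows:", len(find_user_unsafe(conn, payload)))
    print("safe rows:", len(find_user_safe(conn, payload)))
```

With the injection payload, the interpolated query matches every row, while the parameterized query treats the payload as a literal name and matches none.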

Pricing

  • Open-source (OSS): Free, CLI only, community rules
  • Team: $40/dev/month — includes Semgrep Cloud Platform, findings triage, PR comments
  • Enterprise: Custom pricing — SSO, policies, compliance reports

Limitations

Semgrep is primarily pattern-based. The open-source engine analyzes one function at a time and does not track data flow across function boundaries. A value that enters as user input on line 10, gets passed through three helper functions, and reaches a SQL sink on line 200 may not be flagged unless the rule explicitly accounts for the intermediate steps.
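The sketch below illustrates that gap with a toy call-site check written against Python's `ast` module. It is a stand-in for a purely syntactic rule, not Semgrep's actual matching engine, and the application code in `APP_SOURCE` is hypothetical.

```python
import ast

# Hypothetical application code: user input reaches execute() via helpers.
APP_SOURCE = '''
def normalize(value):
    return value.strip()

def build_query(name):
    return f"SELECT * FROM users WHERE name = '{name}'"

def lookup(cursor, request):
    name = normalize(request.args["name"])
    query = build_query(name)
    cursor.execute(query)
'''


def execute_calls(source: str):
    """Yield every `<something>.execute(...)` call node in the source."""
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "execute"
        ):
            yield node


def fstring_at_call_site(call: ast.Call) -> bool:
    """True only if the first argument is literally an f-string."""
    return bool(call.args) and isinstance(call.args[0], ast.JoinedStr)


calls = list(execute_calls(APP_SOURCE))
flagged = [c for c in calls if fstring_at_call_site(c)]
print(f"{len(calls)} execute() call(s), {len(flagged)} flagged")
```

The query is still built with an f-string, but inside `build_query`, so a check that only looks at the `execute()` call site sees a plain variable and reports nothing.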


SonarQube

SonarQube is a widely adopted platform for continuous code quality. It combines security analysis with code smell detection, test coverage tracking, and technical debt estimation, giving engineering managers a single dashboard for security, reliability, and maintainability.

What SonarQube Catches

SonarQube's security engine covers:

  • Injection flaws (SQL, XPath, LDAP, OS commands)
  • Cross-site scripting (XSS)
  • Sensitive data exposure (logging of passwords, PII)
  • Insecure random number generation
  • Weak cryptographic algorithms (MD5, SHA-1)
  • XXE vulnerabilities
  • Open redirects

SonarQube also reports:

  • Bugs (null dereference, resource leaks)
  • Code smells (duplication, complexity)
  • Test coverage gaps

CI Integration

SonarQube uses a scanner that runs in your CI and posts results back to the server:

# .github/workflows/sonar.yml
name: SonarQube Analysis
on: [push]
jobs:
  sonar:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: SonarQube Scan
        # Pin a release tag instead of @master so the action cannot change under you.
        uses: SonarSource/sonarqube-scan-action@v4
        env:
          SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
          SONAR_HOST_URL: ${{ secrets.SONAR_HOST_URL }}

Configure project settings in sonar-project.properties:

sonar.projectKey=my-project
sonar.sources=src
sonar.tests=tests
sonar.python.coverage.reportPaths=coverage.xml

Quality Gates

SonarQube enforces a "Quality Gate" — a policy that blocks PRs if new code introduces security issues above a severity threshold. The default gate blocks merges if:

  • New code has any Security Hotspot that hasn't been reviewed
  • Coverage on new code drops below 80%
  • Reliability or maintainability rating drops below A
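The gate logic described above can be sketched as a small function. This is an illustration of the policy only, not SonarQube's API: the `NewCodeMetrics` fields and the `gate_failures` helper are hypothetical names, and the thresholds are the ones listed in this section.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class NewCodeMetrics:
    """Hypothetical summary of metrics measured on new code only."""
    unreviewed_hotspots: int
    coverage_percent: float
    reliability_rating: str       # "A" (best) through "E" (worst)
    maintainability_rating: str   # "A" (best) through "E" (worst)


def gate_failures(m: NewCodeMetrics, min_coverage: float = 80.0) -> list[str]:
    """Return the list of failed conditions; an empty list means the gate passes."""
    failures = []
    if m.unreviewed_hotspots > 0:
        failures.append(f"{m.unreviewed_hotspots} unreviewed security hotspot(s)")
    if m.coverage_percent < min_coverage:
        failures.append(f"coverage {m.coverage_percent:.1f}% < {min_coverage:.1f}%")
    if m.reliability_rating != "A":
        failures.append(f"reliability rating {m.reliability_rating}")
    if m.maintainability_rating != "A":
        failures.append(f"maintainability rating {m.maintainability_rating}")
    return failures


if __name__ == "__main__":
    pr = NewCodeMetrics(1, 72.5, "A", "B")
    for reason in gate_failures(pr):
        print("gate failed:", reason)
```

Returning the reasons rather than a bare boolean mirrors how a gate is most useful in a PR: the author sees which condition to fix.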

Pricing

  • Community Edition: Free, self-hosted, limited languages
  • Developer Edition: ~$150/year for 100k lines — branch analysis, PR decoration
  • Enterprise Edition: ~$20k+/year — portfolio management, advanced security engine
  • SonarCloud: SaaS version, free for public repos, $10/dev/month for private

Limitations

SonarQube requires running your own server (Community, Developer, or Enterprise) or using SonarCloud, which is paid for private repos. Setup time is non-trivial. Its security rules cast a wider net than CodeQL's data-flow queries, so expect more false positives on complex codebases.


CodeQL

CodeQL is GitHub's semantic code analysis engine. Unlike pattern-matching tools, CodeQL compiles your code into a queryable database and lets you query it in QL, a declarative language with SQL-like syntax. This enables data-flow analysis across the entire codebase: a tainted value can be tracked from its source to a dangerous sink regardless of how many function calls it passes through.

What CodeQL Catches

CodeQL's default security queries cover:

  • SQL injection (cross-function data-flow analysis)
  • Command injection
  • Path traversal
  • SSRF with source-to-sink tracking
  • Reflected and stored XSS
  • XML injection
  • Prototype pollution (JavaScript)
  • Log injection

Because CodeQL understands data flow, it catches vulnerabilities that Semgrep misses when the taint path spans multiple files or goes through framework internals.
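As a rough mental model of source-to-sink tracking, the sketch below treats data flow as a graph and searches it for paths from a source to a sink. It is a toy, not CodeQL's algorithm or data model: the node names, the edge list, and the `taint_paths` helper are all assumptions for illustration.

```python
from collections import deque

# Hypothetical data-flow edges: "a value at KEY can flow to each listed node".
FLOW_EDGES = {
    "request.args['name']": ["normalize.value"],
    "normalize.value": ["normalize.return"],
    "normalize.return": ["build_query.name"],
    "build_query.name": ["build_query.return"],
    "build_query.return": ["cursor.execute.arg0"],
    "config.DB_TABLE": ["build_query.table"],
}
SOURCES = {"request.args['name']"}
SINKS = {"cursor.execute.arg0"}


def taint_paths(edges, sources, sinks, sanitizers=frozenset()):
    """Breadth-first search from each source; return one path per reached sink."""
    paths = []
    for source in sorted(sources):
        queue = deque([[source]])
        seen = {source}
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node in sinks:
                paths.append(path)
                continue
            for nxt in edges.get(node, []):
                # A sanitizer node cuts the flow: taint does not pass through it.
                if nxt in seen or nxt in sanitizers:
                    continue
                seen.add(nxt)
                queue.append(path + [nxt])
    return paths


if __name__ == "__main__":
    for path in taint_paths(FLOW_EDGES, SOURCES, SINKS):
        print(" -> ".join(path))
```

The number of hops between source and sink does not matter to the search, which is why this style of analysis can follow a value through several helper functions. Marking a node as a sanitizer removes the path, which is the same idea behind teaching a query about project-specific sanitizers.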

CI Integration (GitHub Actions)

CodeQL ships as a GitHub Action and needs very little configuration for supported languages:

# .github/workflows/codeql.yml
name: CodeQL
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
  schedule:
    - cron: '30 2 * * 1'

jobs:
  analyze:
    runs-on: ubuntu-latest
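    permissions:
      security-events: write   # upload results to code scanning
      contents: read           # check out the repository
      actions: read            # needed for workflows in private repositories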
    strategy:
      matrix:
        language: [javascript, python]
    steps:
      - uses: actions/checkout@v4
      - uses: github/codeql-action/init@v3
        with:
          languages: ${{ matrix.language }}
      - uses: github/codeql-action/autobuild@v3
      - uses: github/codeql-action/analyze@v3

Results appear in GitHub's Security → Code scanning alerts tab.

Writing Custom Queries

CodeQL queries use QL, a logic-based language:

/**
 * @name SQL injection from user input
 * @kind path-problem
 * @id custom/sql-injection
 */
import python
import semmle.python.security.dataflow.SqlInjectionQuery
import SqlInjectionFlow::PathGraph

from SqlInjectionFlow::PathNode source, SqlInjectionFlow::PathNode sink
where SqlInjectionFlow::flowPath(source, sink)
select sink.getNode(), source, sink,
  "SQL injection from $@.", source.getNode(), "user-controlled input"

Custom queries can extend built-in taint tracking to cover project-specific sanitizers and sources.

Pricing

  • Free for public repos: Full CodeQL analysis at no cost
  • GitHub Advanced Security: Required for private repos. Sold as an add-on to GitHub Enterprise at ~$49/active committer/month
  • GitHub Enterprise Cloud: Advanced Security is licensed on top of the plan

Limitations

CodeQL is built around GitHub. The CLI can run on other CI systems and on self-hosted runners, but analyzing private code requires a GitHub Advanced Security license. Analysis is slow on large codebases: a 500k LOC project can take 20-30 minutes per scan. Language coverage is also uneven: JavaScript, Python, Java, and C/C++ have mature queries, while others are more limited.


Head-to-Head Comparison

Feature                  Semgrep           SonarQube        CodeQL
Analysis type            Pattern matching  Pattern + taint  Data-flow (taint)
Setup time               Minutes           Hours            Minutes (GitHub)
False positive rate      Medium            High             Low
Custom rules             Easy (YAML)       Medium (Java)    Hard (QL)
Cross-function tracking  No                Partial          Yes
CI integration           Any               Any              GitHub-native
Self-hosted option       Yes               Yes              Limited
Free tier                Yes (OSS)         Yes (Community)  Yes (public repos)
Language coverage        Wide              Wide             Narrower

Which Tool Should You Use?

Use Semgrep if:

  • You want results in your CI today, not next week
  • You need to write custom rules for project-specific patterns
  • You're working with multiple languages including less common ones
  • Your team doesn't want to maintain SAST infrastructure

Use SonarQube if:

  • You need code quality (bugs, smells, coverage) alongside security
  • Your organization requires a quality gate enforced in PRs
  • You want a single dashboard for engineering leadership
  • You're in an enterprise context with compliance requirements

Use CodeQL if:

  • Your code lives on GitHub and you have Advanced Security
  • You're dealing with complex data flows where pattern matching fails
  • You want the lowest false-positive rate for injection flaws
  • You're willing to invest in custom QL queries for deep analysis

Combining Tools

In practice, the highest-ROI approach is to combine at least two:

  1. Semgrep as your first line of defense — fast, runs on every push, catches obvious mistakes
  2. CodeQL (if on GitHub) or SonarQube for deeper weekly scans

This gives you fast feedback on PRs from Semgrep while CodeQL handles the hard-to-find data-flow vulnerabilities that pattern matching misses.
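When two tools run side by side, it helps to view their findings together. Both Semgrep and CodeQL can emit SARIF, so one option is to group results by location. The sketch below assumes SARIF 2.1.0 reports already loaded as dictionaries (for example with `json.load`); the helper names and the sample reports are hypothetical, and grouping on file and line is a heuristic, since tools may report the same flaw on different lines.

```python
def load_findings(sarif: dict) -> list:
    """Flatten a SARIF report into simple finding dictionaries."""
    findings = []
    for run in sarif.get("runs", []):
        tool = run["tool"]["driver"]["name"]
        for result in run.get("results", []):
            locations = result.get("locations") or [{}]
            loc = locations[0].get("physicalLocation", {})
            findings.append({
                "tool": tool,
                "rule": result.get("ruleId", "unknown"),
                "file": loc.get("artifactLocation", {}).get("uri", "?"),
                "line": loc.get("region", {}).get("startLine", 0),
            })
    return findings


def merge_by_location(*reports: dict) -> dict:
    """Group findings from several reports by (file, line)."""
    merged = {}
    for report in reports:
        for finding in load_findings(report):
            key = (finding["file"], finding["line"])
            merged.setdefault(key, []).append(finding)
    return merged


def sample_report(tool: str, rule: str, uri: str, line: int) -> dict:
    """Build a one-result SARIF-shaped dictionary (made-up data for the demo)."""
    return {"runs": [{
        "tool": {"driver": {"name": tool}},
        "results": [{
            "ruleId": rule,
            "locations": [{"physicalLocation": {
                "artifactLocation": {"uri": uri},
                "region": {"startLine": line},
            }}],
        }],
    }]}


if __name__ == "__main__":
    merged = merge_by_location(
        sample_report("Semgrep", "sql-injection-fstring", "src/db.py", 42),
        sample_report("CodeQL", "py/sql-injection", "src/db.py", 42),
    )
    for (path, line), items in merged.items():
        tools = ", ".join(sorted(item["tool"] for item in items))
        print(f"{path}:{line} reported by {tools}")
```

Locations reported by both tools are a reasonable place to start triage, and locations reported by only one show what each tool adds.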

Running Security Tests with HelpMeTest

SAST finds what the code says — but you also need to verify how the running application behaves. HelpMeTest lets you write end-to-end security smoke tests in plain English that verify your security controls at runtime:

*** Test Cases ***
SQL Injection Returns 400
    Go To  https://app.example.com/search
    Input Text  id=query  ' OR '1'='1
    Click Button  id=search-btn
    Page Should Not Contain  admin
    Response Status Code Should Be  400

Authentication Header Required
    Go To  https://api.example.com/users
    Response Status Code Should Be  401

These run on every deploy, giving you a continuous sanity check that your injection defenses are active in production — complementing what Semgrep and CodeQL find in the code.

Conclusion

All three tools are production-ready and actively maintained. The choice comes down to your infrastructure preferences and the depth of analysis you need. Start with Semgrep for fast, zero-infrastructure wins. Add CodeQL if you're on GitHub and want data-flow analysis. Add SonarQube if your team needs a unified code quality dashboard.

The most important thing: pick one and run it consistently. A SAST tool that runs on every PR — even an imperfect one — catches far more vulnerabilities than the perfect tool that runs quarterly.
