SAST Tools Compared: Semgrep vs SonarQube vs CodeQL
Static Application Security Testing (SAST) tools scan your source code for vulnerabilities without executing it. Semgrep, SonarQube, and CodeQL are three of the most widely used, each with different strengths, pricing models, and CI integration stories. This guide compares them head-to-head so you can pick the right one for your team.
Key Takeaways
Semgrep is the fastest to add to CI. Write a custom rule in minutes. Open-source rules cover OWASP Top 10 out of the box.
SonarQube is best for broad code quality. It tracks security, bugs, code smells, and coverage in one dashboard — ideal for teams that want a single quality gate.
CodeQL is deepest for semantic analysis. GitHub-native, understands data flow across functions and files. Best for catching subtle injection flaws that pattern matchers miss.
None of them replaces pen testing. SAST finds what the code says, not what the running system allows. Always combine with DAST and manual review.
Start with Semgrep if you're greenfield. Zero infrastructure, instant CI integration, and the community ruleset covers the most common mistakes.
What Is SAST?
Static Application Security Testing analyzes source code, bytecode, or binaries for security vulnerabilities — without running the application. It works by parsing your code into an abstract syntax tree (AST) or control-flow graph, then applying rules that look for dangerous patterns.
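To make the parse-then-match idea concrete, here is a minimal sketch using Python's standard `ast` module. It is not how any of the three tools is implemented; the function name `find_fstring_execute` and the sample snippet are assumptions made up for illustration. The "rule" is simply: a call to some `.execute(...)` method whose first argument is an f-string.

```python
import ast


def find_fstring_execute(source: str) -> list:
    """Return line numbers of .execute(...) calls whose first argument is an f-string."""
    tree = ast.parse(source)  # parse only; the scanned code is never executed
    hits = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "execute"
            and node.args
            and isinstance(node.args[0], ast.JoinedStr)  # JoinedStr is the AST node for f-strings
        ):
            hits.append(node.lineno)
    return sorted(hits)


# Illustrative input: line 1 interpolates a value, line 2 uses a parameterized query.
SAMPLE = '''\
cursor.execute(f"SELECT * FROM users WHERE id = {user_id}")
cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
'''

print(find_fstring_execute(SAMPLE))  # [1]
```

Real SAST engines layer much more on top of this (type information, control flow, taint tracking), but the core loop of walking a tree and testing node shapes is the same.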
SAST excels at catching:
- SQL and command injection sinks
- Hardcoded credentials
- Insecure cryptography choices
- Path traversal vulnerabilities
- Insecure deserialization patterns
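As an illustration of the first item, the sketch below shows an injection sink next to its parameterized fix, using the standard-library `sqlite3` module. The table layout and the function names `find_user_unsafe` and `find_user_safe` are assumptions invented for this example.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database with two illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "admin"), ("bob", "user")],
    )
    return conn


def find_user_unsafe(conn, name):
    # Injection sink: user input is concatenated into the SQL text.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized query: the driver treats `name` strictly as data.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()


conn = make_db()
payload = "x' OR '1'='1"
print(len(find_user_unsafe(conn, payload)))  # 2: the injected condition matches every row
print(find_user_safe(conn, payload))         # []
```

The unsafe version is the kind of code shape a SAST rule can recognize without ever running the query.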
SAST does not catch:
- Business logic flaws
- Authentication bypass via runtime configuration
- Infrastructure misconfigurations
- Vulnerabilities introduced by third-party services
Semgrep
Semgrep is a fast, open-source static analysis tool built around pattern matching. Rules are written in YAML and look like the code they search for — making custom rules accessible to developers, not just security specialists.
What Semgrep Catches
Semgrep's community registry (semgrep.dev/r) contains thousands of rules across languages. Out of the box it catches:
- SQL injection via string concatenation
- Use of eval() on untrusted input
- JWT verification disabled (verify=False)
- Hardcoded secrets (API keys, passwords in strings)
- Dangerous deserialization (pickle.loads, yaml.load)
- SSRF sinks (user-controlled URLs passed to HTTP clients)
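For a sense of what one of these findings looks like in application code, here is a small sketch of the eval()-on-untrusted-input case and a safer alternative from the standard library. The function names are hypothetical, and `ast.literal_eval` is only a suitable replacement when the input is expected to be a plain Python literal.

```python
import ast


def parse_setting_unsafe(text: str):
    # The shape a rule for eval() on untrusted input is meant to flag:
    # any expression in `text` would be executed.
    return eval(text)


def parse_setting_safe(text: str):
    # ast.literal_eval accepts only Python literals (numbers, strings,
    # lists, dicts, and so on) and raises ValueError for anything else.
    return ast.literal_eval(text)


print(parse_setting_safe("[1, 2, 3]"))  # [1, 2, 3]

try:
    parse_setting_safe("__import__('os').getcwd()")
except ValueError:
    print("rejected non-literal input")
```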
CI Integration
# .github/workflows/semgrep.yml
name: Semgrep
on: [push, pull_request]
jobs:
  semgrep:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: semgrep/semgrep-action@v1
        with:
          config: >-
            p/owasp-top-ten
            p/python
            p/javascript
        env:
          SEMGREP_APP_TOKEN: ${{ secrets.SEMGREP_APP_TOKEN }}

For open-source projects you can skip the token and just run locally:

pip install semgrep
semgrep --config p/owasp-top-ten ./src

Writing Custom Rules
Semgrep's rule format matches the shape of the code you're searching for. Here's a rule that catches SQL queries built with f-strings:
rules:
  - id: sql-injection-fstring
    # The trailing `...` matches zero or more extra arguments, so this also
    # flags calls that pass parameters alongside an interpolated f-string.
    pattern: cursor.execute(f"...{$VAR}...", ...)
    message: "Possible SQL injection via f-string. Use parameterized queries."
    languages: [python]
    severity: ERROR

Pricing
- Open-source (OSS): Free, CLI only, community rules
- Team: $40/dev/month — includes Semgrep Cloud Platform, findings triage, PR comments
- Enterprise: Custom pricing — SSO, policies, compliance reports
Limitations
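The open-source Semgrep engine is primarily pattern-based. Its taint mode tracks data within a single function, but it doesn't follow data flow across function boundaries; cross-function and cross-file analysis belongs to Semgrep's paid engine. A value that enters as user input on line 10, gets passed through three helper functions, and reaches a SQL sink on line 200 may not be flagged unless the rule explicitly accounts for the intermediate steps.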
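The sketch below shows the kind of multi-hop flow described above, using the standard-library `sqlite3` module. All function names are hypothetical. The point is that the call to `execute` receives a plain variable, so a rule that looks for string formatting inside the call itself has nothing to match, even though the query is still injectable.

```python
import sqlite3


def normalize(value: str) -> str:
    return value.strip()


def build_filter(column: str, value: str) -> str:
    # The unsafe formatting happens here, far from both the input and the sink.
    return f"{column} = '{value}'"


def build_query(name: str) -> str:
    return "SELECT name FROM users WHERE " + build_filter("name", normalize(name))


def lookup(conn: sqlite3.Connection, name: str):
    query = build_query(name)
    # The sink sees only a variable; the tainted value arrived via three helpers.
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])

rows = lookup(conn, " x' OR '1'='1 ")
print(len(rows))  # 2: the injected condition matched every row
```

Catching this statically requires either a rule written against `build_filter` itself or an engine that follows the value from `lookup`'s parameter to the sink.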
SonarQube
SonarQube is one of the most widely deployed platforms for continuous code quality. It combines security analysis with code smell detection, test coverage tracking, and technical debt estimation, giving engineering managers a single dashboard for security, reliability, and maintainability.
What SonarQube Catches
SonarQube's security engine covers:
- Injection flaws (SQL, XPath, LDAP, OS commands)
- Cross-site scripting (XSS)
- Sensitive data exposure (logging of passwords, PII)
- Insecure random number generation
- Weak cryptographic algorithms (MD5, SHA-1)
- XXE vulnerabilities
- Open redirects
SonarQube also reports:
- Bugs (null dereference, resource leaks)
- Code smells (duplication, complexity)
- Test coverage gaps
CI Integration
SonarQube uses a scanner that runs in your CI and posts results back to the server:
# .github/workflows/sonar.yml
name: SonarQube Analysis
on: [push]
jobs:
  sonar:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: SonarQube Scan
        uses: sonarsource/sonarqube-scan-action@master
        env:
          SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
          SONAR_HOST_URL: ${{ secrets.SONAR_HOST_URL }}

Configure project settings in sonar-project.properties:

sonar.projectKey=my-project
sonar.sources=src
sonar.tests=tests
sonar.python.coverage.reportPaths=coverage.xml

Quality Gates
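SonarQube enforces a "Quality Gate": a set of pass/fail conditions evaluated on every analysis. The gate reports a status, and merges are blocked when your CI or branch protection requires that status to pass. The built-in "Sonar way" gate focuses on new code and, depending on the SonarQube version, fails if:
- New code has any Security Hotspot that hasn't been reviewed
- Coverage on new code drops below 80%
- Duplicated lines on new code exceed 3%
- Reliability, security, or maintainability rating on new code drops below A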
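A quality gate is essentially a list of threshold checks. The sketch below models the hotspot, coverage, and rating conditions in plain Python to show the logic; it does not call SonarQube, and the `NewCodeMetrics` class and `failed_conditions` function are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class NewCodeMetrics:
    """Hypothetical container for metrics measured on new code only."""
    hotspots_reviewed_pct: float
    coverage_pct: float
    ratings: dict = field(default_factory=dict)  # e.g. {"reliability": "A"}


def failed_conditions(m: NewCodeMetrics, min_coverage: float = 80.0) -> list:
    """Return the failed conditions; an empty list means the gate passes."""
    failures = []
    if m.hotspots_reviewed_pct < 100.0:
        failures.append("security hotspots not fully reviewed")
    if m.coverage_pct < min_coverage:
        failures.append("coverage below threshold")
    for name, rating in sorted(m.ratings.items()):
        if rating != "A":
            failures.append(f"{name} rating is {rating}")
    return failures


good = NewCodeMetrics(100.0, 85.0, {"reliability": "A", "maintainability": "A"})
bad = NewCodeMetrics(90.0, 72.5, {"reliability": "A", "maintainability": "B"})

print(failed_conditions(good))  # []
print(failed_conditions(bad))
```

Because every condition is evaluated on new code only, a legacy codebase can adopt the gate without first fixing its entire backlog.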
Pricing
- Community Edition: Free, self-hosted, limited languages
- Developer Edition: ~$150/year for 100k lines — branch analysis, PR decoration
- Enterprise Edition: ~$20k+/year — portfolio management, advanced security engine
- SonarCloud: SaaS version, free for public repos, $10/dev/month for private
Limitations
SonarQube requires running your own server (Community, Developer, or Enterprise Edition) or using SonarCloud, which is paid for private repos. Setup time is non-trivial. Its security rules cast a wider net than CodeQL's data-flow queries, so expect more false positives on complex codebases.
CodeQL
CodeQL is GitHub's semantic code analysis engine. Unlike pattern-matching tools, CodeQL compiles your code into a queryable database and lets you write SQL-like queries over it. This enables data-flow analysis across the entire codebase — tracking a tainted value from its source to a dangerous sink regardless of how many function calls it passes through.
What CodeQL Catches
CodeQL's default security queries cover:
- SQL injection (cross-function data-flow analysis)
- Command injection
- Path traversal
- SSRF with source-to-sink tracking
- Reflected and stored XSS
- XML injection
- Prototype pollution (JavaScript)
- Log injection
Because CodeQL understands data flow, it catches vulnerabilities that Semgrep misses when the taint path spans multiple files or goes through framework internals.
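Conceptually, source-to-sink tracking is a reachability question over a graph of data-flow steps. The sketch below is a deliberately simplified model of that idea, not CodeQL's actual algorithm: the `FLOWS` graph, its node labels, and the `find_taint_path` function are all assumptions made up for illustration.

```python
from collections import deque
from typing import Dict, List, Optional


def find_taint_path(
    flows: Dict[str, List[str]], source: str, sink: str
) -> Optional[List[str]]:
    """Breadth-first search for a data-flow path from `source` to `sink`."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == sink:
            return path
        for nxt in flows.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Hypothetical data-flow edges: "value at A can flow to B".
FLOWS = {
    "request.args['q']": ["views.search:q"],
    "views.search:q": ["helpers.normalize:value"],
    "helpers.normalize:value": ["db.build_query:term"],
    "db.build_query:term": ["cursor.execute:arg0"],
    "config.PAGE_SIZE": ["db.build_query:limit"],
}

path = find_taint_path(FLOWS, "request.args['q']", "cursor.execute:arg0")
print(" -> ".join(path))
print(find_taint_path(FLOWS, "config.PAGE_SIZE", "cursor.execute:arg0"))  # None
```

The number of hops does not matter to the search, which is why this style of analysis can report a flow that crosses several functions or files while a single-expression pattern cannot.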
CI Integration (GitHub Actions)
CodeQL ships as a GitHub Action. For supported languages, a short workflow file is all the configuration it needs:
# .github/workflows/codeql.yml
name: CodeQL
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
  schedule:
    - cron: '30 2 * * 1'
jobs:
  analyze:
    runs-on: ubuntu-latest
    permissions:
      security-events: write
    strategy:
      matrix:
        language: [javascript, python]
    steps:
      - uses: actions/checkout@v4
      - uses: github/codeql-action/init@v3
        with:
          languages: ${{ matrix.language }}
      - uses: github/codeql-action/autobuild@v3
      - uses: github/codeql-action/analyze@v3

Results appear in GitHub's Security → Code scanning alerts tab.
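Code scanning results are exchanged in SARIF, a JSON-based format, so they can also be summarized outside the GitHub UI. The sketch below counts findings per rule from a SARIF document. The sample data is fabricated for illustration and is not real CodeQL output, and `count_by_rule` is a hypothetical helper name.

```python
import json
from collections import Counter


def count_by_rule(sarif_text: str) -> Counter:
    """Count results per ruleId across all runs in a SARIF document."""
    sarif = json.loads(sarif_text)
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("ruleId", "<unknown>")] += 1
    return counts


# Illustrative SARIF 2.1.0 skeleton with made-up findings.
SAMPLE_SARIF = json.dumps({
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "CodeQL"}},
        "results": [
            {"ruleId": "py/sql-injection", "message": {"text": "example finding"}},
            {"ruleId": "py/sql-injection", "message": {"text": "example finding"}},
            {"ruleId": "py/path-injection", "message": {"text": "example finding"}},
        ],
    }],
})

for rule, n in sorted(count_by_rule(SAMPLE_SARIF).items()):
    print(rule, n)
```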
Writing Custom Queries
CodeQL queries use QL, a logic-based language:
/**
 * @kind path-problem
 */
import python
import semmle.python.security.dataflow.SqlInjectionQuery
import SqlInjectionFlow::PathGraph

from SqlInjectionFlow::PathNode source, SqlInjectionFlow::PathNode sink
where SqlInjectionFlow::flowPath(source, sink)
select sink.getNode(), source, sink,
  "SQL injection from $@.", source.getNode(), "user-controlled input"

Custom queries can extend built-in taint tracking to cover project-specific sanitizers and sources.
Pricing
- Free for public repos: Full CodeQL analysis at no cost
- GitHub Advanced Security: Required for private repos. It is a paid add-on to GitHub Enterprise, billed at ~$49/active committer/month
- GitHub Enterprise Cloud: Advanced Security is licensed on top of the Enterprise plan rather than bundled with it
Limitations
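CodeQL is built around GitHub: results are designed to land in GitHub code scanning. The CodeQL CLI can run in other CI systems or on self-hosted runners, but its license restricts use on private code to organizations with GitHub Advanced Security. Analysis is slow on large codebases: a 500k LOC project can take 20-30 minutes per scan. It doesn't cover all languages equally: JavaScript, Python, Java, and C/C++ have mature queries; others are limited.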
Head-to-Head Comparison
| Feature | Semgrep | SonarQube | CodeQL |
|---|---|---|---|
| Analysis type | Pattern matching | Pattern + taint | Data-flow (taint) |
| Setup time | Minutes | Hours | Minutes (GitHub) |
| False positive rate | Medium | High | Low |
| Custom rules | Easy (YAML) | Medium (Java) | Hard (QL) |
| Cross-function tracking | No | Partial | Yes |
| CI integration | Any | Any | GitHub-native |
| Self-hosted option | Yes | Yes | Limited |
| Free tier | Yes (OSS) | Yes (Community) | Yes (public repos) |
| Language coverage | Wide | Wide | Narrower |
Which Tool Should You Use?
Use Semgrep if:
- You want results in your CI today, not next week
- You need to write custom rules for project-specific patterns
- You're working with multiple languages including less common ones
- Your team doesn't want to maintain SAST infrastructure
Use SonarQube if:
- You need code quality (bugs, smells, coverage) alongside security
- Your organization requires a quality gate enforced in PRs
- You want a single dashboard for engineering leadership
- You're in an enterprise context with compliance requirements
Use CodeQL if:
- Your code lives on GitHub and you have Advanced Security
- You're dealing with complex data flows where pattern matching fails
- You want the lowest false-positive rate for injection flaws
- You're willing to invest in custom QL queries for deep analysis
Combining Tools
In practice, the highest-ROI approach is to combine at least two:
- Semgrep as your first line of defense — fast, runs on every push, catches obvious mistakes
- CodeQL (if on GitHub) or SonarQube for deeper weekly scans
This gives you fast feedback on PRs from Semgrep while the deeper scan handles the hard-to-find data-flow vulnerabilities that pattern matching misses.
Running Security Tests with HelpMeTest
SAST finds what the code says — but you also need to verify how the running application behaves. HelpMeTest lets you write end-to-end security smoke tests in plain English that verify your security controls at runtime:
*** Test Cases ***
SQL Injection Returns 400
    Go To    https://app.example.com/search
    Input Text    id=query    ' OR '1'='1
    Click Button    id=search-btn
    Page Should Not Contain    admin
    Response Status Code Should Be    400

Authentication Header Required
    Go To    https://api.example.com/users
    Response Status Code Should Be    401

These run on every deploy, giving you a continuous sanity check that your injection defenses are active in production, complementing what Semgrep and CodeQL find in the code.
Conclusion
All three tools are production-ready and actively maintained. The choice comes down to your infrastructure preferences and the depth of analysis you need. Start with Semgrep for fast, zero-infrastructure wins. Add CodeQL if you're on GitHub and want data-flow analysis. Add SonarQube if your team needs a unified code quality dashboard.
The most important thing: pick one and run it consistently. A SAST tool that runs on every PR — even an imperfect one — catches far more vulnerabilities than the perfect tool that runs quarterly.