Testing Go CLI Applications with Cobra and testscript

HelpMeTest

17 May 2026 — 7 min read

CLI tools are deceptively hard to test. They interact through stdin and stdout, depend on environment variables and working directories, return exit codes instead of values, and often spawn subprocesses. Most teams settle for a few manual smoke tests and call it done. When the CLI breaks in a way that only surfaces on a specific flag combination or with a specific environment variable set, the bug ships.

This guide covers the full testing stack for Go CLIs built with Cobra: unit testing individual commands, capturing output, integration testing with os/exec, and the testscript package that turns human-readable script files into hermetic integration tests.

Why CLI Testing is Different

Unit testing a library function is clean: pass inputs, check outputs, mock dependencies. CLI testing involves:

stdin/stdout/stderr — output goes to file descriptors, not return values
Exit codes — non-zero exit communicates errors, but panics also produce non-zero exits
Environment variables — $HOME, $PATH, $CONFIG_DIR affect behavior globally
Working directory — relative paths resolve against os.Getwd(), which changes across tests
Subprocesses — your CLI may call git, docker, or other tools
Configuration files — many CLIs read config from ~/.config/tool/config.yaml

Testing any one of these in isolation is fine. Testing them together, in combinations, while keeping tests hermetic and parallel-safe, is the hard part.

Project Structure

A testable Cobra CLI has clean separation between the command definition and the business logic:

myapp/
  cmd/
    root.go        # Root command setup
    deploy.go      # deploy subcommand
    version.go     # version subcommand
  internal/
    deployer/
      deployer.go  # Business logic, not cobra-specific
  main.go
  cmd/root_test.go
  cmd/deploy_test.go
  testdata/
    scripts/       # testscript .txtar files

The deployer package handles actual work. The cmd/ package handles CLI parsing and delegates to deployer. This means you can test the business logic independently of the CLI, and test the CLI independently of the actual deployment logic.

Unit Testing Cobra Commands

Cobra commands support redirecting output via SetOut, SetErr, and setting args via SetArgs. This lets you test command behavior without running a subprocess:

// cmd/version.go
package cmd

import (
    "fmt"
    "github.com/spf13/cobra"
)

var Version = "dev"

func newVersionCmd() *cobra.Command {
    return &cobra.Command{
        Use:   "version",
        Short: "Print the version",
        RunE: func(cmd *cobra.Command, args []string) error {
            fmt.Fprintln(cmd.OutOrStdout(), Version)
            return nil
        },
    }
}

The test:

// cmd/version_test.go
package cmd

import (
    "bytes"
    "testing"
)

func TestVersionCommand(t *testing.T) {
    Version = "1.2.3" // Set for test

    var buf bytes.Buffer
    cmd := newVersionCmd()
    cmd.SetOut(&buf)

    cmd.SetArgs([]string{})
    err := cmd.Execute()
    if err != nil {
        t.Fatalf("unexpected error: %v", err)
    }

    got := buf.String()
    want := "1.2.3\n"
    if got != want {
        t.Errorf("version output: got %q, want %q", got, want)
    }
}

For commands with flags, set them via SetArgs:

func TestDeployCommand(t *testing.T) {
    tests := []struct {
        name    string
        args    []string
        wantOut string
        wantErr bool
    }{
        {
            name:    "deploy to staging",
            args:    []string{"--env", "staging", "--version", "1.2.3"},
            wantOut: "Deploying 1.2.3 to staging",
            wantErr: false,
        },
        {
            name:    "fails with missing version",
            args:    []string{"--env", "staging"},
            wantErr: true,
        },
        {
            name:    "fails with invalid env",
            args:    []string{"--env", "production_typo", "--version", "1.2.3"},
            wantErr: true,
        },
    }

    for _, tc := range tests {
        t.Run(tc.name, func(t *testing.T) {
            var outBuf, errBuf bytes.Buffer

            // Create a fresh deployer mock
            mockDeployer := &MockDeployer{}
            cmd := newDeployCmd(mockDeployer)
            cmd.SetOut(&outBuf)
            cmd.SetErr(&errBuf)
            cmd.SetArgs(tc.args)

            err := cmd.Execute()

            if tc.wantErr && err == nil {
                t.Error("expected error, got nil")
            }
            if !tc.wantErr && err != nil {
                t.Errorf("unexpected error: %v", err)
            }
            if tc.wantOut != "" && !bytes.Contains(outBuf.Bytes(), []byte(tc.wantOut)) {
                t.Errorf("output %q does not contain %q", outBuf.String(), tc.wantOut)
            }
        })
    }
}

Mocking Dependencies

Dependency injection makes Cobra commands testable. Instead of calling os.MkdirAll directly, accept an interface:

// internal/deployer/deployer.go
package deployer

type FileSystem interface {
    MkdirAll(path string, perm os.FileMode) error
    WriteFile(name string, data []byte, perm os.FileMode) error
    ReadFile(name string) ([]byte, error)
}

type Deployer struct {
    fs FileSystem
}

func New(fs FileSystem) *Deployer {
    return &Deployer{fs: fs}
}

The test uses a fake filesystem:

// internal/deployer/fake_fs_test.go
type fakeFS struct {
    files map[string][]byte
    dirs  map[string]bool
    mu    sync.Mutex
}

func newFakeFS() *fakeFS {
    return &fakeFS{
        files: make(map[string][]byte),
        dirs:  make(map[string]bool),
    }
}

func (f *fakeFS) MkdirAll(path string, perm os.FileMode) error {
    f.mu.Lock()
    defer f.mu.Unlock()
    f.dirs[path] = true
    return nil
}

func (f *fakeFS) WriteFile(name string, data []byte, perm os.FileMode) error {
    f.mu.Lock()
    defer f.mu.Unlock()
    f.files[name] = data
    return nil
}

func (f *fakeFS) ReadFile(name string) ([]byte, error) {
    f.mu.Lock()
    defer f.mu.Unlock()
    data, ok := f.files[name]
    if !ok {
        return nil, os.ErrNotExist
    }
    return data, nil
}

Integration Testing with os/exec

For true end-to-end testing, build the binary and run it as a subprocess:

// cmd/integration_test.go
package cmd_test

import (
    "bytes"
    "os"
    "os/exec"
    "path/filepath"
    "testing"
)

var binaryPath string

func TestMain(m *testing.M) {
    // Build the binary once for all integration tests
    tmpDir, err := os.MkdirTemp("", "myapp-test-*")
    if err != nil {
        panic(err)
    }
    defer os.RemoveAll(tmpDir)

    binaryPath = filepath.Join(tmpDir, "myapp")
    build := exec.Command("go", "build", "-o", binaryPath, "./...")
    build.Dir = "../" // repo root
    if out, err := build.CombinedOutput(); err != nil {
        panic(string(out))
    }

    os.Exit(m.Run())
}

func runCLI(t *testing.T, args ...string) (stdout, stderr string, exitCode int) {
    t.Helper()
    var outBuf, errBuf bytes.Buffer

    cmd := exec.Command(binaryPath, args...)
    cmd.Stdout = &outBuf
    cmd.Stderr = &errBuf

    err := cmd.Run()
    exitCode = 0
    if exitErr, ok := err.(*exec.ExitError); ok {
        exitCode = exitErr.ExitCode()
    } else if err != nil {
        t.Fatalf("unexpected error running CLI: %v", err)
    }

    return outBuf.String(), errBuf.String(), exitCode
}

func TestVersionIntegration(t *testing.T) {
    stdout, _, exitCode := runCLI(t, "version")
    if exitCode != 0 {
        t.Errorf("expected exit code 0, got %d", exitCode)
    }
    if stdout == "" {
        t.Error("expected version output, got empty string")
    }
}

func TestHelpFlag(t *testing.T) {
    stdout, _, exitCode := runCLI(t, "--help")
    if exitCode != 0 {
        t.Errorf("expected exit code 0, got %d", exitCode)
    }
    if !bytes.Contains([]byte(stdout), []byte("Usage:")) {
        t.Errorf("help output does not contain 'Usage:', got: %s", stdout)
    }
}

testscript: The Best Way to Integration Test CLIs

The testscript package (from golang.org/x/tools/txtar and rsc.io/script/scripttest) provides a scripting language designed exactly for testing CLI tools. Tests are .txtar files containing script commands and file contents. They are hermetic (each runs in its own temp directory), readable (plain text), and maintainable (no test framework boilerplate).

Install:

go get github.com/rogpeppe/go-internal/testscript@latest

A test script looks like this:

# testdata/scripts/deploy.txtar

# Setup: create a config file
mkdir -p config
cp config.yaml config/config.yaml

# Test: successful deploy to staging
exec myapp deploy --env staging --version 1.2.3
stdout 'Deploying 1.2.3 to staging'
! stderr .

# Test: fails with unknown environment
! exec myapp deploy --env unknown --version 1.2.3
stderr 'Unknown environment'

-- config/config.yaml --
endpoint: https://staging.example.com
timeout: 30s

The Go test that runs it:

// cmd/testscript_test.go
package cmd_test

import (
    "os"
    "testing"

    "github.com/rogpeppe/go-internal/testscript"
)

func TestScript(t *testing.T) {
    testscript.Run(t, testscript.Params{
        Dir: "testdata/scripts",
        Cmds: map[string]func(*testscript.TestScript, bool, []string){
            // Register 'myapp' as a command that runs our binary
        },
        Setup: func(env *testscript.Env) error {
            // Make our binary available as 'myapp' in scripts
            env.Setenv("PATH", binaryDir+":"+os.Getenv("PATH"))
            return nil
        },
    })
}

The testscript language has commands for:

exec cmd args — run a command
! exec cmd args — run a command, expect failure
stdout pattern — assert stdout matches regex
stderr pattern — assert stderr matches regex
! stdout pattern — assert stdout does NOT match
cmp file1 file2 — compare file contents
exists file — assert file exists
mkdir dir — create directory
cp src dst — copy file
env NAME=value — set environment variable

Golden File Testing

Golden file testing is the pattern of storing expected output in files and comparing against them. It is useful for commands that produce complex or multi-line output:

func TestDeployOutputGolden(t *testing.T) {
    stdout, _, exitCode := runCLI(t, "deploy", "--env", "staging", "--version", "1.2.3")
    if exitCode != 0 {
        t.Fatalf("unexpected exit code: %d", exitCode)
    }

    goldenPath := filepath.Join("testdata", "golden", "deploy-staging.txt")

    if os.Getenv("UPDATE_GOLDEN") == "1" {
        // Update the golden file
        err := os.WriteFile(goldenPath, []byte(stdout), 0644)
        if err != nil {
            t.Fatalf("writing golden file: %v", err)
        }
        t.Logf("Updated golden file: %s", goldenPath)
        return
    }

    expected, err := os.ReadFile(goldenPath)
    if err != nil {
        t.Fatalf("reading golden file: %v", err)
    }

    if string(expected) != stdout {
        t.Errorf("output mismatch:\nwant:\n%s\ngot:\n%s", expected, stdout)
    }
}

Update golden files with:

UPDATE_GOLDEN=1 go test ./cmd/... -run TestDeployOutputGolden

Commit the updated golden files. The diff in your PR shows exactly what changed in the output, making reviews straightforward.

Table-Driven Tests for Flag Combinations

Table-driven tests cover flag combinations efficiently:

func TestDeployFlags(t *testing.T) {
    cases := []struct {
        name     string
        args     []string
        wantCode int
        wantOut  string
        wantErr  string
    }{
        {"staging deploy", []string{"deploy", "--env", "staging", "--version", "1.0.0"}, 0, "Deploying", ""},
        {"production deploy", []string{"deploy", "--env", "production", "--version", "2.0.0"}, 0, "Deploying", ""},
        {"missing env flag", []string{"deploy", "--version", "1.0.0"}, 1, "", "required flag"},
        {"missing version flag", []string{"deploy", "--env", "staging"}, 1, "", "required flag"},
        {"invalid env", []string{"deploy", "--env", "qa", "--version", "1.0.0"}, 2, "", "Unknown environment"},
        {"invalid version format", []string{"deploy", "--env", "staging", "--version", "latest"}, 3, "", "Invalid version"},
        {"dry run flag", []string{"deploy", "--env", "staging", "--version", "1.0.0", "--dry-run"}, 0, "DRY RUN", ""},
    }

    for _, tc := range cases {
        t.Run(tc.name, func(t *testing.T) {
            t.Parallel()
            stdout, stderr, exitCode := runCLI(t, tc.args...)

            if exitCode != tc.wantCode {
                t.Errorf("exit code: got %d, want %d\nstdout: %s\nstderr: %s",
                    exitCode, tc.wantCode, stdout, stderr)
            }
            if tc.wantOut != "" && !bytes.Contains([]byte(stdout), []byte(tc.wantOut)) {
                t.Errorf("stdout %q does not contain %q", stdout, tc.wantOut)
            }
            if tc.wantErr != "" && !bytes.Contains([]byte(stderr), []byte(tc.wantErr)) {
                t.Errorf("stderr %q does not contain %q", stderr, tc.wantErr)
            }
        })
    }
}

CI Integration

# .github/workflows/test.yml
name: Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-go@v5
        with:
          go-version: "1.22"
          cache: true

      - name: Run unit tests
        run: go test ./... -race -count=1

      - name: Run integration tests
        run: go test ./cmd/... -run TestScript -v

      - name: Run with coverage
        run: go test ./... -coverprofile=coverage.out -covermode=atomic

      - name: Upload coverage
        uses: codecov/codecov-action@v4
        with:
          file: coverage.out

The -race flag detects data races. The -count=1 flag prevents Go's test caching (important for integration tests that depend on external state).

A well-tested CLI is one where every documented flag combination has a test, every error path has a test, and go test ./... gives you confidence before every release. testscript in particular deserves wider adoption — its .txtar format makes test intent clear to everyone on the team, not just the Go developers.

HelpMeTest adds production monitoring and AI-powered test generation for Go services — start free at helpmetest.com

Testing Go CLI Applications with Cobra and testscript

HelpMeTest

Why CLI Testing is Different

Project Structure

Unit Testing Cobra Commands

Mocking Dependencies

Integration Testing with os/exec

testscript: The Best Way to Integration Test CLIs

Golden File Testing

Table-Driven Tests for Flag Combinations

CI Integration

Read more

Blue-Green Deployment Testing: Smoke Tests, Traffic Cutover & Rollback Validation

Kong API Gateway Testing Guide: kong-pongo, Rate Limiting & Plugin Tests

Chaos Mesh Testing Guide: PodChaos, NetworkChaos & Kubernetes Fault Injection

AWS API Gateway Testing Guide: Lambda Authorizers, Mock Integrations & CDK Tests