Testing Kustomize Configurations: conftest, Datree, and Schema Validation
Kustomize makes Kubernetes manifest management tractable. It does not make it safe. Overlays can silently remove resource limits, merge in insecure defaults, or produce structurally invalid YAML that only breaks at apply time. Testing Kustomize configurations before kubectl apply means catching those failures in CI—not during an incident.
The Testing Stack
Four tools cover different failure modes:
| Tool | Catches |
|---|---|
| kubeconform | Schema violations, unknown fields, API version drift |
| conftest (OPA) | Policy violations, security anti-patterns |
| Datree | Kubernetes best-practice rules, misconfigs |
| diff tests | Unintended overlay changes |
Use all four. They don't overlap much.
Schema Validation with kubeconform
kubeconform validates rendered manifests against Kubernetes JSON schemas. It's faster and more accurate than the deprecated kubeval.
Install and run:
brew install kubeconform
# Render kustomize build, pipe to kubeconform
kustomize build overlays/production | kubeconform \
  -strict \
  -ignore-missing-schemas \
  -schema-location default \
  -schema-location 'https://raw.githubusercontent.com/datreeio/CRDs-catalog/main/{{.Group}}/{{.ResourceKind}}_{{.ResourceAPIVersion}}.json' \
  -summary \
  -output tap
The CRDs-catalog URL handles custom resources (cert-manager, Argo, Flux, etc.) that aren't in the core Kubernetes schemas. Without it, kubeconform has no schema for those resources: it skips them when -ignore-missing-schemas is set and fails on them otherwise.
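When a custom resource is skipped unexpectedly, it helps to see which URL the template resolves to. A minimal sketch, assuming kubeconform's documented behavior that {{.ResourceKind}} is the lowercased kind; crd_schema_url is a hypothetical helper written for illustration, not part of kubeconform:

```shell
#!/bin/bash
# Hypothetical helper (not part of kubeconform): expands the CRDs-catalog
# template used above for a single resource.
crd_schema_url() {
  local group="$1" kind="$2" version="$3"
  local kind_lower
  # Assumption: {{.ResourceKind}} is the kind in lowercase
  kind_lower=$(printf '%s' "$kind" | tr '[:upper:]' '[:lower:]')
  printf 'https://raw.githubusercontent.com/datreeio/CRDs-catalog/main/%s/%s_%s.json\n' \
    "$group" "$kind_lower" "$version"
}

# Example: a Certificate with apiVersion cert-manager.io/v1
crd_schema_url cert-manager.io Certificate v1
```

Requesting the printed URL with curl -fsI shows whether the catalog actually has a schema for that group, kind, and version.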
For all overlays in a monorepo:
#!/bin/bash
# scripts/validate-schemas.sh
# pipefail makes the pipeline fail when either kustomize or kubeconform fails,
# so a broken render is not reported as a passing validation
set -o pipefail
FAILED=0
for overlay in overlays/*/; do
  echo "Validating $overlay..."
  if ! kustomize build "$overlay" | kubeconform \
      -strict \
      -ignore-missing-schemas \
      -schema-location default \
      -schema-location 'https://raw.githubusercontent.com/datreeio/CRDs-catalog/main/{{.Group}}/{{.ResourceKind}}_{{.ResourceAPIVersion}}.json' \
      -summary 2>&1; then
    echo "FAILED: $overlay"
    FAILED=1
  fi
done
exit $FAILED
Common failures this catches:
- apiVersion: apps/v1beta1 (removed in Kubernetes 1.16)
- Missing required fields like selector in Deployments
- Type mismatches (string where integer expected)
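One gap in the script above: with -ignore-missing-schemas, a resource whose schema cannot be found is counted as skipped rather than failed, and the exit code stays zero. A minimal sketch for surfacing that, assuming kubeconform's text summary line contains "Skipped: N". The sample line is hand-written for illustration, not captured output, and skipped_count is a hypothetical helper name:

```shell
#!/bin/bash
# Hypothetical helper: extracts the Skipped count from kubeconform's text summary.
# Assumption: the summary line contains "Skipped: <number>".
skipped_count() {
  sed -n 's/.*Skipped: \([0-9][0-9]*\).*/\1/p' | tail -n 1
}

# Hand-written sample summary line (illustrative only)
sample='Summary: 12 resources found parsing stdin - Valid: 10, Invalid: 0, Errors: 0, Skipped: 2'
skipped=$(printf '%s\n' "$sample" | skipped_count)

if [ "${skipped:-0}" -gt 0 ]; then
  echo "WARNING: $skipped resource(s) had no schema and were not validated"
fi
```

In a real run the function would read the output of kustomize build piped through kubeconform with -summary in place of the sample string. Whether a non-zero count should fail the build or only warn is a judgment call.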
Policy Testing with conftest
conftest evaluates Rego policies against any structured data. For Kubernetes, you pipe rendered manifests and check them against security and operational policies.
Project structure:
policy/
  kubernetes/
    deny_latest_tag.rego
    require_limits.rego
    require_labels.rego
    no_privileged.rego
  lib/
    kubernetes.rego
A policy library to avoid repetition:
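# policy/lib/kubernetes.rego
package lib.kubernetes

is_deployment {
    input.kind == "Deployment"
}

# True when the given container is one of the pod template's containers.
# The argument is compared with ==, because := cannot reassign a function argument.
is_container(container) {
    container == input.spec.template.spec.containers[_]
}

name := input.metadata.name
namespace := input.metadata.namespace
Actual policy files: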
# policy/kubernetes/require_limits.rego
package main
import future.keywords.every

deny[msg] {
    input.kind == "Deployment"
    container := input.spec.template.spec.containers[_]
    not container.resources.limits.memory
    msg := sprintf(
        "Deployment '%s': container '%s' is missing memory limit",
        [input.metadata.name, container.name]
    )
}

deny[msg] {
    input.kind == "Deployment"
    container := input.spec.template.spec.containers[_]
    not container.resources.limits.cpu
    msg := sprintf(
        "Deployment '%s': container '%s' is missing CPU limit",
        [input.metadata.name, container.name]
    )
}

warn[msg] {
    input.kind == "Deployment"
    container := input.spec.template.spec.containers[_]
    not container.resources.requests.memory
    msg := sprintf(
        "Deployment '%s': container '%s' has no memory request",
        [input.metadata.name, container.name]
    )
}
# policy/kubernetes/no_privileged.rego
package main

deny[msg] {
    input.kind == "Deployment"
    container := input.spec.template.spec.containers[_]
    container.securityContext.privileged == true
    msg := sprintf(
        "Deployment '%s': container '%s' runs as privileged",
        [input.metadata.name, container.name]
    )
}

deny[msg] {
    input.kind == "Deployment"
    input.spec.template.spec.hostNetwork == true
    msg := sprintf(
        "Deployment '%s' uses hostNetwork",
        [input.metadata.name]
    )
}
# policy/kubernetes/require_labels.rego
package main

required_labels := {"app", "version", "component"}

deny[msg] {
    input.kind == "Deployment"
    existing := {label | input.metadata.labels[label]}
    missing := required_labels - existing
    count(missing) > 0
    msg := sprintf(
        "Deployment '%s' is missing labels: %v",
        [input.metadata.name, missing]
    )
}
Run against all overlays:
# Test a single overlay
kustomize build overlays/staging | conftest test - \
  --policy policy/kubernetes \
  --all-namespaces
# Test with specific output format for CI
kustomize build overlays/production | conftest test - \
  --policy policy/kubernetes \
  --output tap \
  --all-namespaces
Test the policies themselves with conftest's verify command:
# policy/kubernetes/require_limits_test.rego
package main

test_deny_missing_memory_limit {
    deny[_] with input as {
        "kind": "Deployment",
        "metadata": {"name": "test-app"},
        "spec": {
            "template": {
                "spec": {
                    "containers": [{
                        "name": "app",
                        "resources": {"limits": {"cpu": "100m"}}
                    }]
                }
            }
        }
    }
}

test_allow_with_limits {
    count(deny) == 0 with input as {
        "kind": "Deployment",
        "metadata": {"name": "test-app"},
        "spec": {
            "template": {
                "spec": {
                    "containers": [{
                        "name": "app",
                        "resources": {
                            "limits": {"cpu": "100m", "memory": "128Mi"},
                            "requests": {"cpu": "50m", "memory": "64Mi"}
                        }
                    }]
                }
            }
        }
    }
}
conftest verify --policy policy/kubernetes
Datree Validation
Datree provides 100+ built-in rules covering Kubernetes best practices without writing Rego. It's faster to start with than maintaining custom policies.
# Install
brew tap datreeio/datree
brew install datree
# Run against rendered manifests
kustomize build overlays/staging | datree test -
Custom rules in .datree/policies.yaml:
# .datree/policies.yaml
apiVersion: v1
policies:
  - name: production-policy
    isDefault: true
    rules:
      - identifier: CONTAINERS_MISSING_MEMORY_REQUEST_KEY
        messageOnFailure: "Container must have memory request"
      - identifier: DEPLOYMENT_MISSING_LABEL_ENV_VALUE
        messageOnFailure: "Deployment must have 'env' label"
      - identifier: CONTAINERS_MISSING_READINESS_PROBE_KEY
        messageOnFailure: "Container must have readinessProbe"
      - identifier: CONTAINERS_MISSING_LIVENESS_PROBE_KEY
        messageOnFailure: "Container must have livenessProbe"
Diff Testing Between Environments
The most dangerous Kustomize mistakes are silent: an overlay that was supposed to only change replica count also removes a sidecar, or changes a ConfigMap in an unexpected way. Diff tests catch this.
The pattern: render all overlays, commit the output to a rendered/ directory, and fail CI if the diff is unexpected.
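The "fail CI" half of that pattern can be a check that the committed rendered/ files still match a fresh render. A minimal sketch under stated assumptions: check_rendered is a hypothetical function name, and the render command is passed in as a parameter so the function can be exercised without kustomize installed (in real use it would be "kustomize build"):

```shell
#!/bin/bash
# check_rendered RENDER_CMD OVERLAYS_DIR RENDERED_DIR
# Re-renders every overlay into a temp dir and compares the result with the
# committed copy. Returns non-zero if any render fails or any file differs.
check_rendered() {
  local render_cmd="$1" overlays_dir="$2" rendered_dir="$3"
  local tmp overlay name status=0
  tmp=$(mktemp -d)
  for overlay in "$overlays_dir"/*/; do
    name=$(basename "$overlay")
    # render_cmd is intentionally unquoted so "kustomize build" splits into words
    if ! $render_cmd "$overlay" > "$tmp/$name.yaml"; then
      echo "RENDER FAILED: $name"
      status=1
      continue
    fi
    # diff exits non-zero when the files differ or the committed file is missing
    if ! diff -u "$rendered_dir/$name.yaml" "$tmp/$name.yaml"; then
      echo "STALE: $rendered_dir/$name.yaml does not match a fresh render"
      status=1
    fi
  done
  rm -rf "$tmp"
  return "$status"
}
```

Called as check_rendered "kustomize build" overlays rendered, it exits non-zero when someone changed an overlay without re-running the render script, which forces every manifest change to show up in the PR diff.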
#!/bin/bash
# scripts/render-all.sh
# Stop on the first failed render instead of committing an empty file
set -euo pipefail
mkdir -p rendered
for overlay in overlays/*/; do
  name=$(basename "$overlay")
  kustomize build "$overlay" > "rendered/${name}.yaml"
done
In CI, compare rendered output against what's in the PR:
# .github/workflows/kustomize-diff.yml
name: Kustomize Diff
on:
  pull_request:
    paths:
      - 'base/**'
      - 'overlays/**'
      - 'components/**'
jobs:
  diff:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Install tools
        run: |
          # Homebrew is preinstalled on ubuntu-latest but not on PATH by default
          eval "$(/home/linuxbrew/.linuxbrew/bin/brew shellenv)"
          brew install kustomize kubeconform conftest
          # Expose the brew-installed binaries to later steps
          echo "/home/linuxbrew/.linuxbrew/bin" >> "$GITHUB_PATH"
      - name: Render current branch
        run: bash scripts/render-all.sh
      - name: Validate schemas
        run: bash scripts/validate-schemas.sh
      - name: Run conftest policies
        run: |
          for f in rendered/*.yaml; do
            conftest test "$f" --policy policy/kubernetes --output tap
          done
      - name: Show diff against main
        run: |
          git fetch origin main
          git diff origin/main -- rendered/
For a more targeted approach, use dyff instead of raw git diff. It understands YAML structure and produces human-readable, Kubernetes-aware diffs:
brew install dyff
# Compare staging overlay between branches
git show origin/main:rendered/staging.yaml > /tmp/staging-main.yaml
dyff between /tmp/staging-main.yaml rendered/staging.yaml
GitHub Actions Full Pipeline
# .github/workflows/kustomize-ci.yml
name: Kustomize CI
on:
  pull_request:
    paths: ['base/**', 'overlays/**', 'components/**', 'policy/**']
jobs:
  validate:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        overlay: [development, staging, production]
    steps:
      - uses: actions/checkout@v4
      - name: Setup kustomize
        uses: imranismail/setup-kustomize@v2
      - name: Render overlay
        run: kustomize build overlays/${{ matrix.overlay }} > /tmp/rendered.yaml
      - name: kubeconform
        run: |
          curl -L https://github.com/yannh/kubeconform/releases/latest/download/kubeconform-linux-amd64.tar.gz | tar xz
          ./kubeconform -strict -ignore-missing-schemas \
            -schema-location default \
            -schema-location 'https://raw.githubusercontent.com/datreeio/CRDs-catalog/main/{{.Group}}/{{.ResourceKind}}_{{.ResourceAPIVersion}}.json' \
            -summary /tmp/rendered.yaml
      - name: conftest
        run: |
          # conftest release assets carry the version in the file name,
          # so resolve the latest tag before downloading
          VERSION=$(curl -fsSL https://api.github.com/repos/open-policy-agent/conftest/releases/latest | grep '"tag_name":' | sed -E 's/.*"v([^"]+)".*/\1/')
          curl -fsSL "https://github.com/open-policy-agent/conftest/releases/download/v${VERSION}/conftest_${VERSION}_Linux_x86_64.tar.gz" | tar xz
          ./conftest test /tmp/rendered.yaml --policy policy/kubernetes --output tap
      - name: Upload rendered manifest
        uses: actions/upload-artifact@v4
        with:
          name: rendered-${{ matrix.overlay }}
          path: /tmp/rendered.yaml
Schema validation catches structural errors, conftest catches policy violations, Datree catches operational best-practice violations, and diff tests catch unexpected changes. Running all four takes under 30 seconds and prevents the class of Kubernetes misconfigurations that cause silent misbehavior in production.
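For local runs it is convenient to execute every check and report all failures at once instead of stopping at the first. A minimal sketch, assuming a hypothetical run_step wrapper; the demo steps use true and false so the block runs without any of the tools installed:

```shell
#!/bin/bash
# Hypothetical wrapper: runs one check, records its name if it fails,
# and lets the script continue so a single run reports every problem.
FAILED_STEPS=""

run_step() {
  local name="$1"
  shift
  echo "==> $name"
  if ! "$@"; then
    FAILED_STEPS="$FAILED_STEPS $name"
  fi
}

# Demo with stand-in commands. In real use each step would be one of the
# commands from this article, for example:
#   run_step kubeconform kubeconform -strict -summary /tmp/rendered.yaml
#   run_step conftest conftest test /tmp/rendered.yaml --policy policy/kubernetes
run_step demo-pass true
run_step demo-fail false

echo "Failed steps:$FAILED_STEPS"
```

A real script would end by exiting non-zero when FAILED_STEPS is not empty, so the same wrapper works as a pre-push hook.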