Object Storage Testing Best Practices: Mocking vs Real Buckets, CORS, CDN Invalidation, and Encryption

HelpMeTest

18 May 2026 — 7 min read

Object storage tests fall into three tiers: unit tests with mocks (fast, free, no network), integration tests against local emulators (slower, but real behavior), and end-to-end tests against real cloud buckets (slowest, costs money, but catches IAM and CDN issues nothing else will). The key is knowing which tier each test belongs in and keeping them separated.

Key Takeaways

Mocks are for testing your code, not the cloud service. If the test is about whether your application handles a 403 correctly, mock it. If it's about whether your CORS configuration allows your frontend domain, you need a real bucket or a faithful emulator.

CORS policy testing must run against real endpoints. The browser enforces CORS via preflight requests — you can't verify a CORS policy works without making actual HTTP requests to a server that evaluates it.

CDN cache invalidation is notoriously hard to test. The only reliable approach is testing the invalidation API call itself, then verifying with a GET that returns a cache-miss header — never sleep and hope.

Server-side encryption testing verifies metadata, not the encrypted content. S3 decrypts SSE objects transparently on read, so you never see the ciphertext; instead, you test that the ServerSideEncryption field in the response matches what you configured.

Cost-aware test design matters. A test that creates 1000 objects in a real S3 bucket in CI will cost you money every day. Parameterize bucket names, tag test resources, and clean them up in teardown — every time.

Teams who write good unit tests for their business logic but skip integration tests for their storage configuration get burned. The CORS policy you wrote in Terraform looks fine but your frontend still gets blocked. The CDN invalidation you call in your deploy script doesn't actually clear the right paths. The SSE-KMS key you specified in your bucket policy doesn't apply to uploads from a particular role. None of these are catchable with mocks.

Here's how to test each layer correctly.

The Three-Tier Test Strategy

Before writing any test, classify it:

Tier	Tool	Speed	Cost	What it catches
Unit	moto / mocks	< 1s	$0	Application logic errors
Integration	Azurite / minio / LocalStack	5–30s	$0	SDK usage, protocol behavior
E2E	Real AWS/Azure/GCS bucket	30s–2min	Low but real	IAM, CDN, CORS, SSE-KMS

The unit tier runs on every commit. Integration runs on every PR. E2E runs on merge to main or nightly.

# conftest.py — separate markers for each tier
import pytest

def pytest_configure(config):
    config.addinivalue_line("markers", "unit: fast, mocked tests")
    config.addinivalue_line("markers", "integration: requires local emulator (minio/azurite)")
    config.addinivalue_line("markers", "e2e: requires real cloud bucket, runs nightly")

# Run only unit tests in CI on every push:
# pytest -m unit
# Run integration + unit on PR:
# pytest -m "unit or integration"
# Run everything nightly:
# pytest  (no -m filter)
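
One way to wire this schedule into CI is a small helper that maps the trigger to a `-m` expression. This is only a sketch: the trigger names (`push`, `pull_request`, `nightly`) and the functions `marker_expression` and `pytest_args` are assumptions for illustration, not part of pytest or any CI system.

```python
from typing import List

# Hypothetical trigger names; adapt them to whatever your CI system reports.
TIER_BY_TRIGGER = {
    "push": "unit",
    "pull_request": "unit or integration",
    "nightly": "",  # empty expression: no -m filter, run everything
}


def marker_expression(trigger: str) -> str:
    """Return the pytest -m expression for a CI trigger, defaulting to unit only."""
    return TIER_BY_TRIGGER.get(trigger, "unit")


def pytest_args(trigger: str) -> List[str]:
    """Build the pytest argument list for a CI trigger."""
    expression = marker_expression(trigger)
    return ["-m", expression] if expression else []
```

A CI entry point could then call `pytest.main(pytest_args(trigger))`. Defaulting unknown triggers to the unit tier keeps an unrecognized event from accidentally running tests that cost money.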

Testing CORS Policies

CORS is enforced by the browser, which means it must be tested with real HTTP requests that include Origin headers — mocks won't help here. LocalStack Pro or a real bucket is required.

import pytest
import requests
import json
import boto3

# This test requires a real S3 bucket or LocalStack Pro
@pytest.mark.e2e
def test_cors_allows_frontend_origin(s3_client, real_bucket, frontend_origin="https://app.example.com"):
    """Test that the CORS policy allows requests from the frontend domain."""
    
    # Set the CORS configuration
    cors_config = {
        "CORSRules": [
            {
                "AllowedOrigins": [frontend_origin],
                "AllowedMethods": ["GET", "PUT", "POST", "DELETE"],
                "AllowedHeaders": ["*"],
                "ExposeHeaders": ["ETag", "x-amz-request-id"],
                "MaxAgeSeconds": 3600
            }
        ]
    }
    
    s3_client.put_bucket_cors(
        Bucket=real_bucket,
        CORSConfiguration=cors_config
    )
    
    # Get the bucket's endpoint URL
    bucket_url = f"https://{real_bucket}.s3.amazonaws.com/test-object.txt"
    
    # Simulate a browser preflight request
    preflight_response = requests.options(
        bucket_url,
        headers={
            "Origin": frontend_origin,
            "Access-Control-Request-Method": "GET",
            "Access-Control-Request-Headers": "content-type"
        }
    )
    
    assert preflight_response.status_code == 200
    assert preflight_response.headers.get("Access-Control-Allow-Origin") == frontend_origin
    assert "GET" in preflight_response.headers.get("Access-Control-Allow-Methods", "")

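@pytest.mark.e2e
def test_cors_blocks_unknown_origin(s3_client, real_bucket):
    """Verify that unlisted origins are blocked.

    Relies on the CORS rule applied in test_cors_allows_frontend_origin;
    on a bucket with no CORS configuration this check passes vacuously.
    """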
    
    bucket_url = f"https://{real_bucket}.s3.amazonaws.com/test-object.txt"
    
    preflight_response = requests.options(
        bucket_url,
        headers={
            "Origin": "https://evil-attacker.com",
            "Access-Control-Request-Method": "GET"
        }
    )
    
    # S3 returns 403 or omits CORS headers for disallowed origins
    allowed_origin = preflight_response.headers.get("Access-Control-Allow-Origin", "")
    assert "evil-attacker.com" not in allowed_origin
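
For intuition about what these two preflight tests exercise, the rule matching can be approximated locally. The sketch below follows the matching S3 documents for CORS rules (first matching rule wins, at most one `*` wildcard per origin or header pattern, exact method match). It is an approximation under those assumptions, not S3's implementation, and it does not replace the real-endpoint tests; `find_matching_rule` is a hypothetical helper name.

```python
from typing import Iterable, Optional


def _wildcard_match(pattern: str, value: str) -> bool:
    """Match a pattern containing at most one '*' wildcard against a value."""
    if "*" not in pattern:
        return pattern == value
    prefix, suffix = pattern.split("*", 1)
    return (
        len(value) >= len(prefix) + len(suffix)
        and value.startswith(prefix)
        and value.endswith(suffix)
    )


def find_matching_rule(
    cors_config: dict,
    origin: str,
    method: str,
    request_headers: Iterable[str] = (),
) -> Optional[dict]:
    """Return the first CORS rule that would allow this preflight, or None."""
    for rule in cors_config.get("CORSRules", []):
        origin_ok = any(
            _wildcard_match(p, origin) for p in rule.get("AllowedOrigins", [])
        )
        method_ok = method in rule.get("AllowedMethods", [])
        # Header names are case-insensitive, so compare in lowercase.
        allowed_headers = [h.lower() for h in rule.get("AllowedHeaders", [])]
        headers_ok = all(
            any(_wildcard_match(p, h.lower()) for p in allowed_headers)
            for h in request_headers
        )
        if origin_ok and method_ok and headers_ok:
            return rule
    return None
```

A check like this is cheap enough for the unit tier and can catch an obviously wrong rule before the E2E run, but only the real preflight proves that the deployed bucket behaves the same way.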

For unit tests of CORS configuration generation (e.g., your IaC code), test the JSON structure:

@pytest.mark.unit
def test_cors_config_generator_produces_correct_structure():
    """Unit test for code that generates CORS config JSON."""
    from myapp.storage_config import build_cors_config
    
    config = build_cors_config(
        allowed_origins=["https://app.example.com", "https://staging.example.com"],
        allowed_methods=["GET", "PUT"],
        max_age_seconds=3600
    )
    
    assert len(config["CORSRules"]) == 1
    rule = config["CORSRules"][0]
    assert "https://app.example.com" in rule["AllowedOrigins"]
    assert "GET" in rule["AllowedMethods"]
    assert "PUT" in rule["AllowedMethods"]
    assert rule["MaxAgeSeconds"] == 3600
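
The article does not show `build_cors_config`, so here is a minimal sketch of what such a generator might look like, written so the test above would pass against it. The signature, the `allowed_headers` parameter, and the wildcard guard are all assumptions for illustration; only the output shape (the `CORSRules` structure accepted by `put_bucket_cors`) comes from the S3 API.

```python
from typing import List, Optional


def build_cors_config(
    allowed_origins: List[str],
    allowed_methods: List[str],
    max_age_seconds: int = 3600,
    allowed_headers: Optional[List[str]] = None,
) -> dict:
    """Build a single-rule CORS configuration in the shape put_bucket_cors expects."""
    if not allowed_origins:
        raise ValueError("at least one allowed origin is required")
    if "*" in allowed_origins:
        # Hypothetical guard: refuse a wide-open policy unless added deliberately.
        raise ValueError("wildcard origin '*' is not permitted by this generator")
    return {
        "CORSRules": [
            {
                "AllowedOrigins": list(allowed_origins),
                "AllowedMethods": [m.upper() for m in allowed_methods],
                "AllowedHeaders": list(allowed_headers or ["*"]),
                "MaxAgeSeconds": max_age_seconds,
            }
        ]
    }
```

Keeping the generator a pure function is what makes the unit tier fast: the same dictionary can be asserted on directly and later passed unchanged to the real client in an E2E test.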

Testing CDN Cache Invalidation

CDN invalidation is one of the hardest things to test reliably. Sleep-and-hope approaches ("wait 30 seconds then check") are unreliable and slow. The right approach: test the invalidation API call itself, then verify the response headers on a subsequent request show a cache miss.

import boto3
import pytest
import time

@pytest.mark.e2e
def test_cloudfront_invalidation_is_submitted(cloudfront_client, distribution_id):
    """Test that invalidation is created and enters the invalidation queue."""
    
    response = cloudfront_client.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch={
            "Paths": {
                "Quantity": 2,
                "Items": ["/images/*", "/assets/bundle.js"]
            },
            "CallerReference": f"test-{int(time.time())}"
        }
    )
    
    invalidation = response["Invalidation"]
    
    assert invalidation["Status"] in ("InProgress", "Completed")
    assert len(invalidation["InvalidationBatch"]["Paths"]["Items"]) == 2

@pytest.mark.e2e
def test_invalidation_completes_within_timeout(cloudfront_client, distribution_id):
    """Test that an invalidation completes within a reasonable time."""
    
    create_response = cloudfront_client.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch={
            "Paths": {"Quantity": 1, "Items": ["/*"]},
            "CallerReference": f"test-wait-{int(time.time())}"
        }
    )
    
    invalidation_id = create_response["Invalidation"]["Id"]
    
    # Poll until complete or timeout
    deadline = time.time() + 300  # 5 minute timeout
    while time.time() < deadline:
        status_response = cloudfront_client.get_invalidation(
            DistributionId=distribution_id,
            Id=invalidation_id
        )
        status = status_response["Invalidation"]["Status"]
        if status == "Completed":
            break
        time.sleep(10)
    
    assert status == "Completed", f"Invalidation did not complete within 5 minutes, status: {status}"
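
The cache-miss check mentioned at the start of this section can be sketched as a small header helper. CloudFront reports its cache result in the `X-Cache` response header (for example `Miss from cloudfront` or `Hit from cloudfront`); the function names here and the `distribution_domain` fixture in the usage comment are assumptions for illustration.

```python
from typing import Mapping


def x_cache_status(headers: Mapping[str, str]) -> str:
    """Return the first word of the X-Cache header in lowercase, or '' if absent."""
    for name, value in headers.items():
        if name.lower() == "x-cache":
            parts = value.split()
            return parts[0].lower() if parts else ""
    return ""


def served_from_origin(headers: Mapping[str, str]) -> bool:
    """True when the edge reports that it went back to the origin (a cache miss)."""
    return x_cache_status(headers) == "miss"


# Usage inside an E2E test, once the invalidation status is "Completed"
# (distribution_domain is a hypothetical fixture):
#
#   response = requests.get(f"https://{distribution_domain}/assets/bundle.js", timeout=10)
#   assert served_from_origin(response.headers)
```

Treat a failure here with some care: if another client requests the same path between the invalidation and the check, the edge may already have re-cached the object and report a hit. Using a path that only the test requests reduces that risk.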

For unit tests of the code that calls invalidation, mock the boto3 call:

from unittest.mock import MagicMock, patch
import pytest

@pytest.mark.unit
def test_deploy_triggers_cdn_invalidation():
    """Unit test: verify deploy code calls invalidation with correct paths."""
    from myapp.deploy import DeployService
    
    mock_cloudfront = MagicMock()
    mock_cloudfront.create_invalidation.return_value = {
        "Invalidation": {"Id": "ABCDEF", "Status": "InProgress"}
    }
    
    service = DeployService(cloudfront_client=mock_cloudfront, distribution_id="E123")
    service.deploy(version="2.0.0")
    
    mock_cloudfront.create_invalidation.assert_called_once()
    call_args = mock_cloudfront.create_invalidation.call_args
    paths = call_args.kwargs["InvalidationBatch"]["Paths"]["Items"]
    
    # "/*" itself ends with "*", so one wildcard check covers both cases
    assert any(p.endswith("*") for p in paths)
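
`DeployService` is not shown in the article. A minimal sketch of a class that this unit test could exercise might look like the following; the constructor arguments mirror the test, while the `paths` parameter, the omitted upload step, and the returned invalidation ID are assumptions.

```python
import uuid
from typing import Sequence


class DeployService:
    """Hypothetical deploy service that invalidates CDN paths after a release."""

    def __init__(self, cloudfront_client, distribution_id: str,
                 paths: Sequence[str] = ("/*",)):
        self._cloudfront = cloudfront_client
        self._distribution_id = distribution_id
        self._paths = list(paths)

    def deploy(self, version: str) -> str:
        """Upload step omitted; create an invalidation and return its ID."""
        response = self._cloudfront.create_invalidation(
            DistributionId=self._distribution_id,
            InvalidationBatch={
                "Paths": {"Quantity": len(self._paths), "Items": self._paths},
                # CallerReference must be unique per request; a UUID avoids
                # collisions between deploys started in the same second.
                "CallerReference": f"deploy-{version}-{uuid.uuid4().hex}",
            },
        )
        return response["Invalidation"]["Id"]
```

Injecting the client through the constructor is the design choice that makes the mock-based test possible: the unit tier passes a `MagicMock`, and the E2E tier passes a real `boto3.client("cloudfront")`.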

Testing Server-Side Encryption

SSE testing verifies that objects are encrypted with the expected algorithm and key — you can't inspect the encrypted content itself, but you can inspect the metadata AWS returns.

from moto import mock_aws
import boto3
import pytest

@pytest.mark.unit
@mock_aws
def test_sse_s3_encryption_applied():
    """Test that SSE-S3 (AES-256) encryption is applied on upload."""
    s3 = boto3.client("s3", region_name="us-east-1")
    s3.create_bucket(Bucket="encrypted-bucket")
    
    # Set default encryption on the bucket
    s3.put_bucket_encryption(
        Bucket="encrypted-bucket",
        ServerSideEncryptionConfiguration={
            "Rules": [
                {
                    "ApplyServerSideEncryptionByDefault": {
                        "SSEAlgorithm": "AES256"
                    },
                    "BucketKeyEnabled": False
                }
            ]
        }
    )
    
    # Upload an object
    s3.put_object(Bucket="encrypted-bucket", Key="sensitive/data.json", Body=b'{"ssn": "123-45-6789"}')
    
    # Verify the encryption header
    head = s3.head_object(Bucket="encrypted-bucket", Key="sensitive/data.json")
    assert head.get("ServerSideEncryption") == "AES256"

@pytest.mark.unit
@mock_aws
def test_sse_kms_encryption_with_specific_key():
    """Test SSE-KMS with a specific KMS key ID."""
    s3 = boto3.client("s3", region_name="us-east-1")
    s3.create_bucket(Bucket="kms-bucket")
    
    kms_key_id = "arn:aws:kms:us-east-1:123456789012:key/abcd1234-ef56-7890-ab12-cd34ef567890"
    
    s3.put_object(
        Bucket="kms-bucket",
        Key="secret/payload.json",
        Body=b"encrypted payload",
        ServerSideEncryption="aws:kms",
        SSEKMSKeyId=kms_key_id
    )
    
    head = s3.head_object(Bucket="kms-bucket", Key="secret/payload.json")
    
    assert head.get("ServerSideEncryption") == "aws:kms"
    assert head.get("SSEKMSKeyId") == kms_key_id

@pytest.mark.unit
@mock_aws
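def test_encryption_deny_policy_round_trips():
    """Verify the deny-unencrypted-uploads policy is stored on the bucket.

    This only checks that the policy round-trips. It does not attempt an
    unencrypted upload, so enforcement still needs an E2E test on real S3.
    """
    import json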
    
    s3 = boto3.client("s3", region_name="us-east-1")
    s3.create_bucket(Bucket="policy-enforced-bucket")
    
    # Policy that denies PutObject without SSE
    deny_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnencryptedObjectUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": "arn:aws:s3:::policy-enforced-bucket/*",
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "AES256"
                    }
                }
            }
        ]
    }
    
    s3.put_bucket_policy(
        Bucket="policy-enforced-bucket",
        Policy=json.dumps(deny_policy)
    )
    
    # Verify the policy is saved correctly
    policy_response = s3.get_bucket_policy(Bucket="policy-enforced-bucket")
    saved_policy = json.loads(policy_response["Policy"])
    
    assert saved_policy["Statement"][0]["Effect"] == "Deny"
    assert "s3:x-amz-server-side-encryption" in str(saved_policy)
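
The metadata checks in these tests can be factored into one helper that works on any `head_object` response, whether it comes from moto or from a real bucket in the E2E tier. This is a sketch; `encryption_mismatches` is a hypothetical name, and it only inspects the `ServerSideEncryption` and `SSEKMSKeyId` fields used above.

```python
from typing import Mapping, Optional


def encryption_mismatches(
    head: Mapping[str, object],
    expected_algorithm: str,
    expected_kms_key_id: Optional[str] = None,
) -> list:
    """Compare head_object metadata with the expected SSE settings.

    Returns a list of human-readable mismatches; an empty list means a match.
    """
    problems = []
    actual_algorithm = head.get("ServerSideEncryption")
    if actual_algorithm != expected_algorithm:
        problems.append(
            f"ServerSideEncryption is {actual_algorithm!r}, expected {expected_algorithm!r}"
        )
    if expected_kms_key_id is not None:
        actual_key = head.get("SSEKMSKeyId")
        if actual_key != expected_kms_key_id:
            problems.append(
                f"SSEKMSKeyId is {actual_key!r}, expected {expected_kms_key_id!r}"
            )
    return problems
```

In a test this reads as `assert encryption_mismatches(head, "aws:kms", kms_key_id) == []`, and the returned messages show up in the failure output. Real S3 generally reports the key as a full ARN, so it is safest to pass the ARN as the expected value.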

Cost-Aware Test Design

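Real buckets cost money. A poorly designed test suite running 50 times a day can cost more than you expect. The fixture below applies three rules: give each run a uniquely named bucket, tag it for cost allocation, and delete it in teardown.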

import boto3
import uuid
import pytest

@pytest.fixture(scope="session")
def e2e_bucket():
    """Create a uniquely named test bucket and clean it up after all E2E tests."""
    s3 = boto3.client("s3", region_name="us-east-1")
    
    # Unique name prevents collisions between parallel CI runs
    bucket_name = f"hmt-test-{uuid.uuid4().hex[:12]}"
    
    # us-east-1 is the default region and rejects an explicit LocationConstraint;
    # pass CreateBucketConfiguration only when creating in another region.
    s3.create_bucket(Bucket=bucket_name)
    
    # Tag for cost allocation and cleanup
    s3.put_bucket_tagging(
        Bucket=bucket_name,
        Tagging={
            "TagSet": [
                {"Key": "Environment", "Value": "test"},
                {"Key": "ManagedBy", "Value": "pytest"},
                {"Key": "AutoDelete", "Value": "true"}
            ]
        }
    )
    
    yield bucket_name
    
    # Cleanup: delete all objects then the bucket
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name):
        objects = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if objects:
            s3.delete_objects(Bucket=bucket_name, Delete={"Objects": objects})
    
    s3.delete_bucket(Bucket=bucket_name)

# Helper (for example, run from a scheduled Lambda) that finds test buckets
# abandoned because CI failed before teardown
def find_abandoned_test_buckets(s3_client, max_age_hours=24):
    """Find test buckets that were not cleaned up."""
    from datetime import datetime, timezone, timedelta
    
    cutoff = datetime.now(timezone.utc) - timedelta(hours=max_age_hours)
    abandoned = []
    
    response = s3_client.list_buckets()
    for bucket in response["Buckets"]:
        if bucket["Name"].startswith("hmt-test-") and bucket["CreationDate"] < cutoff:
            abandoned.append(bucket["Name"])
    
    return abandoned

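The teardown runs even if tests fail, because pytest executes the code after yield once fixture setup has reached the yield. It does not run if the CI process is killed or if setup fails partway (for example, after create_bucket succeeds but tagging raises), which is why the abandoned-bucket check above is still needed. The unique bucket name per test session prevents concurrent CI runs from interfering with each other.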
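
`find_abandoned_test_buckets` only reports names; the deletion step could be sketched as below. The function names are hypothetical, and the code assumes unversioned buckets like the one the fixture creates (a versioned bucket would also need its object versions removed).

```python
def empty_and_delete_bucket(s3_client, bucket_name: str) -> int:
    """Delete every object in a bucket, then the bucket; return objects deleted.

    Assumes an unversioned bucket, like the one the fixture creates.
    """
    deleted = 0
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name):
        objects = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if objects:
            s3_client.delete_objects(Bucket=bucket_name, Delete={"Objects": objects})
            deleted += len(objects)
    s3_client.delete_bucket(Bucket=bucket_name)
    return deleted


def sweep_test_buckets(s3_client, bucket_names, prefix: str = "hmt-test-") -> list:
    """Delete the given buckets, refusing any name without the test prefix."""
    removed = []
    for name in bucket_names:
        if not name.startswith(prefix):
            continue  # safety net: never touch a bucket that is not a test bucket
        empty_and_delete_bucket(s3_client, name)
        removed.append(name)
    return removed
```

A scheduled job could call `sweep_test_buckets(s3, find_abandoned_test_buckets(s3))`. The prefix check is deliberately repeated in the sweep so that a bug in the caller cannot delete a production bucket.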

Bucket Policy Testing

Unit tests for bucket policies verify the policy JSON you generate, not AWS's IAM evaluation engine (testing that requires real AWS):

import json
from moto import mock_aws
import boto3
import pytest

@pytest.mark.unit
@mock_aws
def test_bucket_policy_restricts_access_to_vpc():
    """Verify the bucket policy generator produces a VPC-condition policy."""
    from myapp.policy_generator import generate_vpc_restricted_policy
    
    vpc_id = "vpc-0123456789abcdef0"
    bucket_name = "production-assets"
    
    policy = generate_vpc_restricted_policy(bucket_name=bucket_name, vpc_id=vpc_id)
    
    s3 = boto3.client("s3", region_name="us-east-1")
    s3.create_bucket(Bucket=bucket_name)
    s3.put_bucket_policy(Bucket=bucket_name, Policy=json.dumps(policy))
    
    # Read back and verify
    saved = json.loads(s3.get_bucket_policy(Bucket=bucket_name)["Policy"])
    
    # Find the VPC condition statement
    vpc_statement = next(
        (s for s in saved["Statement"] if "aws:SourceVpc" in str(s.get("Condition", {}))),
        None
    )
    
    assert vpc_statement is not None
    assert vpc_id in str(vpc_statement["Condition"])
    assert vpc_statement["Effect"] == "Deny"
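
`generate_vpc_restricted_policy` is not shown in the article. A minimal sketch that would satisfy the test above might look like this; the statement ID, the `s3:*` action scope, and the two-resource layout are assumptions, while `aws:SourceVpc` is the real IAM condition key the test searches for.

```python
def generate_vpc_restricted_policy(bucket_name: str, vpc_id: str) -> dict:
    """Build a bucket policy that denies S3 actions from outside one VPC."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAccessFromOutsideVpc",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
                # aws:SourceVpc is only populated for requests that arrive
                # through a VPC endpoint.
                "Condition": {"StringNotEquals": {"aws:SourceVpc": vpc_id}},
            }
        ],
    }
```

A blanket deny like this also blocks console and CI access from outside the VPC, so whether the policy really behaves as intended is exactly the kind of question that belongs in the E2E tier.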

The pattern throughout all these tests is the same: unit tests verify your code's output structure with mocks, integration tests verify the SDK behaves correctly against a local server, and E2E tests verify the cloud service enforces your configuration. Match the test tier to what you're actually testing.
