ScyllaDB Testing Guide: Cassandra Driver Compatibility, Shard-per-Core Testing & Performance Regression

ScyllaDB exposes Cassandra-compatible APIs on top of an engine rewritten around the Seastar framework and designed for higher throughput on the same hardware. Testing ScyllaDB applications means validating both Cassandra compatibility and ScyllaDB-specific behaviors such as shard-per-core data distribution. This guide covers both angles.

ScyllaDB Testing Landscape

ScyllaDB is a drop-in replacement for Cassandra at the API level, so most Cassandra tests run against ScyllaDB unchanged. ScyllaDB also has behaviors of its own:

  • Shard-per-core architecture: Each CPU core owns specific token ranges; multi-shard queries have higher latency
  • ScyllaDB-specific extensions: Tablets, workload prioritization, CDC (Change Data Capture)
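  • Higher write throughput: ScyllaDB often sustains several times more ops/sec than Cassandra on the same hardware (vendor benchmarks cite up to 10x), so measure your own workload rather than assuming a ratio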
  • Different compaction strategies: TWCS (TimeWindowCompactionStrategy) and LCS behavior differ subtly

Setting Up ScyllaDB with Testcontainers

ScyllaDB works with Testcontainers' generic container, which is what this guide uses (recent Testcontainers releases also include a dedicated ScyllaDB module):

@Testcontainers
class ScyllaDBIntegrationTest {

    @Container
    static GenericContainer<?> scylla = new GenericContainer<>(
        DockerImageName.parse("scylladb/scylla:6.0")
    )
    .withExposedPorts(9042)
    .withCommand("--smp 2 --memory 512M --developer-mode 1")
    .waitingFor(
        Wait.forLogMessage(".*Starting listening for CQL clients.*", 1)
            .withStartupTimeout(Duration.ofMinutes(3))
    );

    static CqlSession session;

    @BeforeAll
    static void connect() throws InterruptedException {
        // ScyllaDB needs a moment after the log message
        Thread.sleep(2000);

        session = CqlSession.builder()
            .addContactPoint(new InetSocketAddress(
                scylla.getHost(),
                scylla.getMappedPort(9042)
            ))
            .withLocalDatacenter("datacenter1")
            .build();

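        // Create keyspace. Tablets are disabled here on the assumption that
        // ScyllaDB 6.0 enables them by default for new keyspaces and that
        // tablet-backed keyspaces in 6.0 reject counters, LWT, and CDC.
        session.execute(
            "CREATE KEYSPACE IF NOT EXISTS test_ks " +
            "WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 1} " +
            "AND tablets = {'enabled': false}"
        );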
        session.execute("USE test_ks");
    }

    @AfterAll
    static void disconnect() {
        if (session != null) session.close();
    }

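    @BeforeEach
    void cleanup() {
        // TRUNCATE fails if the table is missing, so create test_ks.events
        // in connect() (or in your schema setup) before relying on this hook.
        session.execute("TRUNCATE test_ks.events");
    }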
}

Cassandra Driver Compatibility Tests

ScyllaDB's CQL compatibility means existing Cassandra tests should pass unchanged. This test class verifies the compatibility contract:

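class CassandraCompatibilityTest {

    // Assumes the shared static `session` from ScyllaDBIntegrationTest is visible here,
    // for example by nesting these tests in that class or extending it.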

    @Test
    void shouldSupportCassandraDataTypes() throws Exception {
        session.execute("""
            CREATE TABLE IF NOT EXISTS test_ks.type_coverage (
                id UUID PRIMARY KEY,
                text_col TEXT,
                int_col INT,
                bigint_col BIGINT,
                float_col FLOAT,
                double_col DOUBLE,
                boolean_col BOOLEAN,
                timestamp_col TIMESTAMP,
                blob_col BLOB,
                list_col LIST<TEXT>,
                map_col MAP<TEXT, INT>,
                set_col SET<TEXT>
            )
        """);

        UUID id = UUID.randomUUID();
        Instant now = Instant.now();
        ByteBuffer blob = ByteBuffer.wrap("test-bytes".getBytes(StandardCharsets.UTF_8));

        session.execute(SimpleStatement.builder(
            "INSERT INTO test_ks.type_coverage " +
            "(id, text_col, int_col, bigint_col, float_col, double_col, " +
            "boolean_col, timestamp_col, blob_col, list_col, map_col, set_col) " +
            "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)"
        )
        .addPositionalValues(
            id, "hello", 42, 9999999999L, 3.14f, 2.718281828,
            true, now, blob,
            List.of("a", "b", "c"),
            Map.of("key1", 1, "key2", 2),
            Set.of("x", "y")
        )
        .build());

        Row row = session.execute(
            "SELECT * FROM test_ks.type_coverage WHERE id = ?", id
        ).one();

        assertNotNull(row);
        assertEquals("hello", row.getString("text_col"));
        assertEquals(42, row.getInt("int_col"));
        assertEquals(9999999999L, row.getLong("bigint_col"));
        assertEquals(3.14f, row.getFloat("float_col"), 0.001f);
        assertTrue(row.getBoolean("boolean_col"));
        assertThat(row.getInstant("timestamp_col")).isCloseTo(now, within(1, SECONDS));
        assertEquals(3, row.getList("list_col", String.class).size());
        assertEquals(2, row.getMap("map_col", String.class, Integer.class).size());
        assertEquals(2, row.getSet("set_col", String.class).size());
    }

    @Test
    void shouldSupportLightweightTransactions() {
        session.execute("""
            CREATE TABLE IF NOT EXISTS test_ks.counters (
                key TEXT PRIMARY KEY,
                value INT
            )
        """);

        // CAS operation
        ResultSet result = session.execute(
            "INSERT INTO test_ks.counters (key, value) VALUES ('lock-1', 1) IF NOT EXISTS"
        );
        assertTrue(result.wasApplied());

        // Second insert should fail (already exists)
        ResultSet secondAttempt = session.execute(
            "INSERT INTO test_ks.counters (key, value) VALUES ('lock-1', 2) IF NOT EXISTS"
        );
        assertFalse(secondAttempt.wasApplied());

        // Value should still be 1
        Row row = session.execute(
            "SELECT value FROM test_ks.counters WHERE key = 'lock-1'"
        ).one();
        assertEquals(1, row.getInt("value"));
    }

    @Test
    void shouldSupportCounterTables() {
        session.execute("""
            CREATE TABLE IF NOT EXISTS test_ks.page_views (
                page_id TEXT PRIMARY KEY,
                view_count COUNTER
            )
        """);

        // Increment counter
        for (int i = 0; i < 5; i++) {
            session.execute(
                "UPDATE test_ks.page_views SET view_count = view_count + 1 WHERE page_id = 'home'"
            );
        }

        Row row = session.execute(
            "SELECT view_count FROM test_ks.page_views WHERE page_id = 'home'"
        ).one();

        assertEquals(5L, row.getLong("view_count"));
    }
}

Shard-per-Core Testing

ScyllaDB's shard-per-core architecture routes requests to specific CPU cores based on the partition key hash. Understanding this helps write performance-aware tests.
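
To make the routing concrete, here is a minimal sketch of the token-to-shard mapping as ScyllaDB's CQL protocol-extension notes describe it for token-range (vnode) tables: bias the signed Murmur3 token into unsigned space, shift out the ignored most-significant bits, then scale by the shard count. The class and method names are illustrative, and the `ignoreMsbBits` value passed in is an assumption; a shard-aware driver reads the real shard count and bit setting from the server when it connects, so application code never needs to do this itself.

```java
public final class ShardCalculator {

    private ShardCalculator() {}

    /**
     * Maps a Murmur3 partition token to a shard index in [0, shardCount).
     * ignoreMsbBits is assumed to match the value the server advertises.
     */
    public static int shardOf(long token, int shardCount, int ignoreMsbBits) {
        // Shift the signed token range so that its bit pattern starts at unsigned zero.
        long biased = token + Long.MIN_VALUE;
        // Drop the most significant bits the server is configured to ignore.
        biased <<= ignoreMsbBits;
        // We need the high 64 bits of the unsigned 128-bit product biased * shardCount.
        // Math.multiplyHigh treats biased as signed, so add shardCount back when the
        // top bit is set to obtain the unsigned result.
        long high = Math.multiplyHigh(biased, (long) shardCount);
        if (biased < 0) {
            high += shardCount;
        }
        return (int) high;
    }

    public static void main(String[] args) {
        int shards = 2; // matches --smp 2 in the container setup
        for (long token : new long[] {Long.MIN_VALUE, -1L, 0L, Long.MAX_VALUE}) {
            System.out.printf("token %d -> shard %d%n", token, shardOf(token, shards, 0));
        }
    }
}
```

With two shards and no ignored bits, the lower half of the token ring lands on shard 0 and the upper half on shard 1. A test can use a helper like this to pick partition keys that deliberately hit one shard or spread across all of them.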

Verifying Shard Awareness in the Driver

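ScyllaDB's fork of the Cassandra Java driver adds shard-aware routing. The routing is transparent to application code, so the test below is a read-your-writes smoke test across many partitions rather than proof that a given request reached a given shard: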

@Test
void shouldRouteToCorrectShardForPartitionKey() {
    // Smoke test: writes and reads across many partitions succeed.
    // With the ScyllaDB driver fork, requests go directly to the owning shard;
    // this test does not assert which shard served a request.

    session.execute("""
        CREATE TABLE IF NOT EXISTS test_ks.shard_test (
            partition_key TEXT,
            cluster_key INT,
            value TEXT,
            PRIMARY KEY (partition_key, cluster_key)
        )
    """);

    // Insert 100 different partition keys, reading each one back right after the write
    PreparedStatement insert = session.prepare(
        "INSERT INTO test_ks.shard_test (partition_key, cluster_key, value) VALUES (?, ?, ?)"
    );

    for (int i = 0; i < 100; i++) {
        String key = "partition-" + i;
        session.execute(insert.bind(key, i, "value-" + i));

        // Retrieve and verify immediately (shard-aware routing should be transparent)
        Row row = session.execute(
            "SELECT value FROM test_ks.shard_test WHERE partition_key = ? AND cluster_key = ?",
            key, i
        ).one();

        assertNotNull(row, "Row for partition " + key + " should be immediately readable");
        assertEquals("value-" + i, row.getString("value"));
    }
}

Testing Tablet-Based Distribution (ScyllaDB 6.0+)

@Test
void shouldCreateTableWithTabletConfiguration() {
    // Tablets are ScyllaDB's unit of distribution (replacing vnodes).
    // In ScyllaDB 6.0 the tablets option is set on the keyspace, not on the table.
    session.execute("""
        CREATE KEYSPACE IF NOT EXISTS tablet_ks
        WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 1}
        AND tablets = {'initial': 8}
    """);
    session.execute("""
        CREATE TABLE IF NOT EXISTS tablet_ks.tablet_test (
            id UUID PRIMARY KEY,
            data TEXT
        )
    """);

    // Verify the table is usable
    UUID testId = UUID.randomUUID();
    session.execute("INSERT INTO tablet_ks.tablet_test (id, data) VALUES (?, ?)",
        testId, "test-data");

    Row row = session.execute(
        "SELECT data FROM tablet_ks.tablet_test WHERE id = ?", testId
    ).one();

    assertNotNull(row);
    assertEquals("test-data", row.getString("data"));
}

Performance Regression Testing

ScyllaDB's value proposition is throughput, so regression tests should quantify it.

Write Throughput Baseline

@Test
void shouldMeetWriteThroughputSLA() throws Exception {
    session.execute("""
        CREATE TABLE IF NOT EXISTS test_ks.perf_events (
            partition_id INT,
            event_id TIMEUUID,
            payload TEXT,
            PRIMARY KEY (partition_id, event_id)
        ) WITH CLUSTERING ORDER BY (event_id DESC)
    """);

    PreparedStatement insert = session.prepare(
        "INSERT INTO test_ks.perf_events (partition_id, event_id, payload) VALUES (?, now(), ?)"
    );

    int threadCount = 4;
    int opsPerThread = 2500;  // 10,000 total ops
    CountDownLatch latch = new CountDownLatch(threadCount);
    AtomicLong totalOps = new AtomicLong(0);
    AtomicLong errors = new AtomicLong(0);

    long start = System.currentTimeMillis();

    for (int t = 0; t < threadCount; t++) {
        final int threadId = t;
        Thread.ofVirtual().start(() -> {
            try {
                for (int i = 0; i < opsPerThread; i++) {
                    session.execute(insert.bind(
                        i % 100,  // Distribute across 100 partitions
                        "payload-" + threadId + "-" + i
                    ));
                    totalOps.incrementAndGet();
                }
            } catch (Exception e) {
                errors.incrementAndGet();
            } finally {
                latch.countDown();
            }
        });
    }

    assertTrue(latch.await(60, TimeUnit.SECONDS), "Workers should finish within 60 seconds");
    long elapsed = System.currentTimeMillis() - start;

    assertEquals(0, errors.get(), "No errors expected during throughput test");
    assertEquals(threadCount * opsPerThread, totalOps.get());

    double opsPerSecond = (totalOps.get() * 1000.0) / elapsed;
    System.out.printf("Write throughput: %.0f ops/sec%n", opsPerSecond);

    // ScyllaDB in dev mode should still handle >1000 ops/sec on a container
    assertThat(opsPerSecond)
        .as("Write throughput should exceed 1000 ops/sec in dev mode")
        .isGreaterThan(1000.0);
}
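
The measurement loop above can be factored into a small reusable harness. The sketch below is one possible shape, not part of any driver or test library: `ThroughputProbe`, `Result`, and `run` are hypothetical names, and the workload is passed in as a plain `Runnable` so the harness has no database dependency. It counts failures per operation rather than per thread and treats a timeout as an error instead of silently reporting partial results.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public final class ThroughputProbe {

    /** Outcome of one run; failures are counted per operation, not per thread. */
    public record Result(long succeeded, long failed, double opsPerSecond) {}

    private ThroughputProbe() {}

    public static Result run(int threads, int opsPerThread, Runnable op, long timeoutSeconds) {
        AtomicLong succeeded = new AtomicLong();
        AtomicLong failed = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();

        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < opsPerThread; i++) {
                    try {
                        op.run();
                        succeeded.incrementAndGet();
                    } catch (RuntimeException e) {
                        failed.incrementAndGet(); // keep going so one error does not hide the rest
                    }
                }
            });
        }

        pool.shutdown();
        try {
            if (!pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
                pool.shutdownNow();
                throw new IllegalStateException("workers did not finish within " + timeoutSeconds + "s");
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }

        long elapsedNanos = Math.max(1, System.nanoTime() - start);
        double opsPerSecond = succeeded.get() * 1_000_000_000.0 / elapsedNanos;
        return new Result(succeeded.get(), failed.get(), opsPerSecond);
    }

    public static void main(String[] args) {
        // Stand-in workload; a real test would pass something like
        // () -> session.execute(insert.bind(...)) instead.
        Result r = run(4, 2500, () -> Math.sqrt(42.0), 60);
        System.out.printf("%d ok, %d failed, %.0f ops/sec%n",
            r.succeeded(), r.failed(), r.opsPerSecond());
    }
}
```

Keeping the harness free of driver types means the same code can time writes, reads, or a mixed workload, and the assertions on `failed()` and `opsPerSecond()` stay in the test method where the thresholds are visible.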

Read Latency Baseline

@Test
void shouldMeetReadLatencySLA() throws Exception {
    session.execute("""
        CREATE TABLE IF NOT EXISTS test_ks.perf_read (
            id UUID PRIMARY KEY,
            value TEXT
        )
    """);

    // Seed data
    UUID[] ids = new UUID[1000];
    PreparedStatement insert = session.prepare(
        "INSERT INTO test_ks.perf_read (id, value) VALUES (?, ?)"
    );
    for (int i = 0; i < ids.length; i++) {
        ids[i] = UUID.randomUUID();
        session.execute(insert.bind(ids[i], "value-" + i));
    }

    PreparedStatement select = session.prepare(
        "SELECT value FROM test_ks.perf_read WHERE id = ?"
    );

    // Measure p99 read latency
    long[] latencies = new long[ids.length];
    Random random = new Random();

    for (int i = 0; i < ids.length; i++) {
        long start = System.nanoTime();
        Row row = session.execute(select.bind(ids[random.nextInt(ids.length)])).one();
        latencies[i] = (System.nanoTime() - start) / 1_000_000; // Convert to ms
        assertNotNull(row);
    }

    Arrays.sort(latencies);
    long p50 = latencies[(int) (latencies.length * 0.50)];
    long p99 = latencies[(int) (latencies.length * 0.99)];

    System.out.printf("Read latency - p50: %dms, p99: %dms%n", p50, p99);

    assertThat(p50).as("P50 read latency should be under 10ms").isLessThan(10L);
    assertThat(p99).as("P99 read latency should be under 50ms").isLessThan(50L);
}
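
The percentile lookup is easy to get subtly wrong at the edges (empty input, p = 100, mutating the caller's array). A minimal sketch of a nearest-rank percentile helper follows; `LatencyPercentiles` and `percentile` are hypothetical names, and nearest-rank is one of several common percentile definitions, chosen here because it always returns an observed sample. Feeding it nanosecond samples instead of whole milliseconds also avoids rounding sub-millisecond reads down to zero.

```java
import java.util.Arrays;

public final class LatencyPercentiles {

    private LatencyPercentiles() {}

    /**
     * Nearest-rank percentile: the smallest sample such that at least
     * p percent of all samples are less than or equal to it.
     */
    public static long percentile(long[] samples, double p) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0.0 || p > 100.0) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = samples.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        // Multiply before dividing so whole-number percentiles stay exact.
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        long[] latenciesNanos = {900_000, 1_200_000, 800_000, 15_000_000, 1_000_000};
        System.out.printf("p50=%dns p99=%dns%n",
            percentile(latenciesNanos, 50), percentile(latenciesNanos, 99));
    }
}
```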

Regression Test Against Baseline

@Test
void shouldNotRegressFromPerformanceBaseline() throws Exception {
    // Load baseline from config (set after initial profiling)
    PerformanceBaseline baseline = PerformanceBaseline.load("scylladb-baseline.json");

    // Run standard benchmark
    PerformanceMeasurement current = runStandardBenchmark(session);

    // Assert no regression beyond threshold
    double writeRegressionPct = (baseline.writeOpsPerSec() - current.writeOpsPerSec())
        / baseline.writeOpsPerSec() * 100;

    assertThat(writeRegressionPct)
        .as("Write throughput should not regress more than 10%%")
        .isLessThan(10.0);

    double readRegressionPct = (current.readP99Ms() - baseline.readP99Ms())
        / baseline.readP99Ms() * 100;

    assertThat(readRegressionPct)
        .as("Read P99 latency should not regress more than 20%%")
        .isLessThan(20.0);
}
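
The test above relies on a `PerformanceBaseline` type and a `runStandardBenchmark` helper that the guide does not define. As a rough sketch of the arithmetic only, the class below models both baseline and current runs with one hypothetical `Measurement` record (its field names mirror the accessors used in the test) and guards against a zero baseline, which would otherwise produce NaN or infinity and let the assertion pass or fail for the wrong reason. Loading the baseline from JSON is left out because it depends on the JSON library your project uses.

```java
public final class RegressionCheck {

    /** Hypothetical measurement shape; doubles avoid integer division in the percentages. */
    public record Measurement(double writeOpsPerSec, double readP99Ms) {}

    private RegressionCheck() {}

    /** Percent drop in write throughput relative to the baseline; negative means faster. */
    public static double throughputRegressionPct(Measurement baseline, Measurement current) {
        requirePositive(baseline.writeOpsPerSec(), "baseline writeOpsPerSec");
        return (baseline.writeOpsPerSec() - current.writeOpsPerSec()) * 100.0
            / baseline.writeOpsPerSec();
    }

    /** Percent increase in read p99 latency relative to the baseline; negative means faster. */
    public static double latencyRegressionPct(Measurement baseline, Measurement current) {
        requirePositive(baseline.readP99Ms(), "baseline readP99Ms");
        return (current.readP99Ms() - baseline.readP99Ms()) * 100.0
            / baseline.readP99Ms();
    }

    private static void requirePositive(double value, String name) {
        if (!(value > 0.0)) { // also rejects NaN
            throw new IllegalArgumentException(name + " must be positive, got " + value);
        }
    }

    public static void main(String[] args) {
        Measurement baseline = new Measurement(10_000, 10);
        Measurement current = new Measurement(9_000, 12);
        System.out.printf("write regression: %.1f%%, read p99 regression: %.1f%%%n",
            throughputRegressionPct(baseline, current),
            latencyRegressionPct(baseline, current));
    }
}
```

Baselines recorded on one machine rarely transfer to another, so it is usually safer to capture the baseline on the same CI runner class that executes the regression test.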

Testing ScyllaDB CDC (Change Data Capture)

@Test
void shouldCaptureChangesViaCDC() throws Exception {
    session.execute("""
        CREATE TABLE IF NOT EXISTS test_ks.orders (
            order_id UUID PRIMARY KEY,
            status TEXT,
            amount DECIMAL
        ) WITH cdc = {'enabled': true}
    """);

    UUID orderId = UUID.randomUUID();

    // Insert
    session.execute(
        "INSERT INTO test_ks.orders (order_id, status, amount) VALUES (?, 'pending', 99.99)",
        orderId
    );

    // Update
    session.execute(
        "UPDATE test_ks.orders SET status = 'shipped' WHERE order_id = ?",
        orderId
    );

    // The CDC log table is created automatically by ScyllaDB.
    // CQL has no subqueries, so scan the log table directly; a full scan is fine at test scale.
    List<Row> cdcLog = session.execute(
        "SELECT * FROM test_ks.orders_scylla_cdc_log"
    ).all();

    // At minimum verify CDC table is accessible and has records
    // (Full CDC consumer testing requires a dedicated consumer like Kafka connector)
    assertFalse(cdcLog.isEmpty(), "CDC log should have captured changes");
}

Failure Mode Testing

@Test
void shouldHandleTimeoutGracefully() {
    // Configure a very short request timeout; try-with-resources closes the
    // session even when the assertion fails
    try (CqlSession shortTimeoutSession = CqlSession.builder()
            .addContactPoint(new InetSocketAddress(scylla.getHost(), scylla.getMappedPort(9042)))
            .withLocalDatacenter("datacenter1")
            .withConfigLoader(DriverConfigLoader.programmaticBuilder()
                .withDuration(DefaultDriverOption.REQUEST_TIMEOUT, Duration.ofMillis(1))
                .build()
            )
            .build()) {

        // A 1 ms budget is normally shorter than a round trip to the container,
        // but this assertion can still be flaky on a very fast local setup.
        assertThrows(DriverTimeoutException.class, () ->
            shortTimeoutSession.execute("SELECT * FROM system.peers")
        );
    }
}

@Test
void shouldRetryOnTransientFailures() {
    // Smoke test: a session configured with the driver's DefaultRetryPolicy still
    // serves normal queries. No failures are injected, so this does not exercise
    // the retry path itself.
    try (CqlSession retrySession = CqlSession.builder()
            .addContactPoint(new InetSocketAddress(scylla.getHost(), scylla.getMappedPort(9042)))
            .withLocalDatacenter("datacenter1")
            .withConfigLoader(DriverConfigLoader.programmaticBuilder()
                .withString(DefaultDriverOption.RETRY_POLICY_CLASS,
                    "DefaultRetryPolicy")
                .build()
            )
            .build()) {

        // Normal operations should succeed with the retry policy in place
        Row row = retrySession.execute("SELECT release_version FROM system.local").one();
        assertNotNull(row);
        assertNotNull(row.getString("release_version"));
    }
}

ScyllaDB with HelpMeTest

For applications backed by ScyllaDB—real-time leaderboards, IoT data pipelines, time-series dashboards—HelpMeTest validates the user-visible behavior end-to-end:

*** Test Cases ***
Leaderboard Displays Top 10 In Real Time
    As    LoggedInUser
    Go To    https://app.example.com/leaderboard
    Wait Until Element Is Visible    [data-testid=leaderboard-table]    timeout=5s
    Element Count Should Equal    [data-testid=leaderboard-row]    10
    First Row Score Should Be Highest
    Submit New Score    9999
    Wait Until Element Contains    [data-testid=leaderboard-row]:first-child    9999

This catches ScyllaDB-specific issues like consistency-level mismatches causing stale reads in the UI, or partition key design choices causing hot spots that manifest as slow page loads.

Common Pitfalls

1. Not using --developer-mode 1 in containers. Without developer mode, ScyllaDB validates hardware capabilities and may refuse to start on under-resourced CI containers.

2. Testing without the ScyllaDB driver. The standard Cassandra driver works but doesn't implement shard-aware routing. ScyllaDB's fork of the Java driver sends each request directly to the owning shard, which avoids an extra cross-shard hop on the server.

3. Assuming Cassandra and ScyllaDB are identical under load. API compatibility doesn't mean behavior under load is identical. A test that passes on Cassandra at 1000 ops/sec may expose ScyllaDB-specific tuning needs (or vice versa).

4. Not testing token-aware routing. ScyllaDB's benefit over Cassandra is maximized when the driver sends requests directly to the replica that owns the data. Test that your driver config enables token-aware load balancing.

Summary

ScyllaDB testing is layered across compatibility and performance:

  • CQL compatibility tests: Verify Cassandra driver compatibility for all data types, LWTs, and counter operations
  • Shard-awareness tests: Confirm driver routing optimization is active
  • Throughput tests: Quantify write ops/sec and establish baselines
  • Latency tests: Measure p50 and p99 read latency under realistic load
  • Regression tests: Alert when throughput drops or latency increases
  • Failure mode tests: Verify retry policies and timeout handling

The combination of Cassandra compatibility testing and ScyllaDB-specific performance validation gives you confidence at both the API and infrastructure levels.
