Elasticsearch Testing Guide: Testcontainers, Index Testing, and Query Validation

Elasticsearch is notoriously difficult to test. Its distributed nature, complex query DSL, and stateful index management make traditional unit testing insufficient. This guide covers practical strategies for testing Elasticsearch applications—from spinning up real instances with Testcontainers to validating index mappings and query correctness.

Why Elasticsearch Testing Is Hard

Unlike relational databases with ACID guarantees, Elasticsearch search is near real-time: a document becomes searchable only after the next refresh (by default about once per second), not immediately after indexing. The type of an existing mapped field cannot be changed in place; changing it means reindexing into a new index. Queries that look correct can still return unexpected results because of analyzers, tokenization, and relevance scoring.

Testing approaches that only mock the Elasticsearch client give you false confidence. You need a real instance.

Setting Up Testcontainers for Elasticsearch

Testcontainers provides an official Elasticsearch module:

<!-- Maven -->
<dependency>
    <groupId>org.testcontainers</groupId>
    <artifactId>elasticsearch</artifactId>
    <scope>test</scope>
</dependency>

@Testcontainers
class ElasticsearchIntegrationTest {

    @Container
    static ElasticsearchContainer elasticsearch =
        new ElasticsearchContainer(
            DockerImageName.parse("docker.elastic.co/elasticsearch/elasticsearch:8.11.0")
        )
        .withEnv("ES_JAVA_OPTS", "-Xms512m -Xmx512m")
        .withEnv("xpack.security.enabled", "false");

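    // Low-level REST client; kept in a field so it can be closed after the tests
    static RestClient restClient;
    // Java API client, used by every lambda-style request in this guide
    static ElasticsearchClient client;

    @BeforeAll
    static void setup() {
        restClient = RestClient.builder(
            HttpHost.create(elasticsearch.getHttpHostAddress())
        ).build();
        client = new ElasticsearchClient(
            new RestClientTransport(restClient, new JacksonJsonpMapper()));
    }

    @AfterAll
    static void tearDown() throws IOException {
        restClient.close();
    }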
}

For Elasticsearch 8.x with security enabled, configure SSL:

@Container
static ElasticsearchContainer elasticsearch =
    new ElasticsearchContainer(
        DockerImageName.parse("docker.elastic.co/elasticsearch/elasticsearch:8.11.0")
    )
    .withPassword("test-password");

static ElasticsearchClient client;

@BeforeAll
static void setup() throws Exception {
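    // Trust every certificate: acceptable only for a throwaway test container
    SSLContext sslContext = SSLContexts.custom()
        .loadTrustMaterial(null, (chains, authType) -> true)
        .build();

    // "elastic" is the built-in superuser; the password matches withPassword(...) above
    BasicCredentialsProvider credentialsProvider = new BasicCredentialsProvider();
    credentialsProvider.setCredentials(AuthScope.ANY,
        new UsernamePasswordCredentials("elastic", "test-password"));

    RestClient restClient = RestClient.builder(
        HttpHost.create("https://" + elasticsearch.getHttpHostAddress())
    )
    .setHttpClientConfigCallback(builder ->
        builder.setSSLContext(sslContext)
               .setDefaultCredentialsProvider(credentialsProvider)
    )
    .build();

    client = new ElasticsearchClient(new RestClientTransport(restClient, new JacksonJsonpMapper()));
}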

Index Testing: Mappings and Settings

Validating Index Creation

@Test
void shouldCreateIndexWithCorrectMapping() throws IOException {
    String indexName = "products-test-" + UUID.randomUUID();

    // Create index with explicit mapping
    CreateIndexRequest request = CreateIndexRequest.of(i -> i
        .index(indexName)
        .mappings(m -> m
            .properties("id", p -> p.keyword(k -> k))
            .properties("name", p -> p.text(t -> t
                .analyzer("standard")
                .fields("keyword", f -> f.keyword(k -> k.ignoreAbove(256)))
            ))
            .properties("price", p -> p.double_(d -> d))
            .properties("createdAt", p -> p.date(d -> d
                .format("strict_date_optional_time||epoch_millis")
            ))
        )
        .settings(s -> s
            .numberOfShards("1")
            .numberOfReplicas("0")
            .refreshInterval(ri -> ri.time("1s"))
        )
    );

    CreateIndexResponse response = client.indices().create(request);
    assertTrue(response.acknowledged());

    // Verify mapping was applied
    GetMappingResponse mapping = client.indices()
        .getMapping(r -> r.index(indexName));

    TypeMapping indexMapping = mapping.result().get(indexName).mappings();
    assertNotNull(indexMapping.properties().get("name"));
    assertEquals("text", indexMapping.properties().get("name")._kind().jsonValue());

    // Cleanup
    client.indices().delete(r -> r.index(indexName));
}
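
The test above deletes its index on the last line, so a failing assertion would skip the cleanup and leave the index behind. One way to guard against that is a small try-with-resources wrapper. The sketch below is a hypothetical helper, not part of Testcontainers or the Elasticsearch client: the TemporaryIndex class and its delete callback are names introduced here for illustration. In a real test the callback would wrap client.indices().delete(...) and translate its IOException into an unchecked exception.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.function.Consumer;

/** Hypothetical helper: owns a uniquely named index and deletes it on close(). */
final class TemporaryIndex implements AutoCloseable {
    private final String name;
    private final Consumer<String> deleter;

    TemporaryIndex(String prefix, Consumer<String> deleter) {
        // Random suffix keeps tests from sharing an index
        this.name = prefix + "-" + UUID.randomUUID();
        this.deleter = deleter;
    }

    String name() {
        return name;
    }

    @Override
    public void close() {
        // Runs even when the body of the try block throws
        deleter.accept(name);
    }
}

final class TemporaryIndexDemo {
    public static void main(String[] args) {
        // Stand-in for the real delete call: just records which index was deleted
        List<String> deleted = new ArrayList<>();
        try (TemporaryIndex index = new TemporaryIndex("products-test", deleted::add)) {
            throw new IllegalStateException("simulated failing assertion on " + index.name());
        } catch (IllegalStateException expected) {
            // close() has already run by the time the exception is caught
        }
        System.out.println("indices deleted: " + deleted.size());
    }
}
```

A JUnit @AfterEach method that deletes the index achieves the same effect; the wrapper is mainly convenient when each test method creates its own index.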

Testing Mapping Conflicts

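One common production issue is a mapping conflict: indexing a document whose field value cannot be coerced to the mapped type. Elasticsearch rejects such a document with a parsing error. The strict dynamic setting used below additionally rejects fields that are not declared in the mapping: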

@Test
void shouldRejectDocumentWithMappingConflict() throws IOException {
    String indexName = "strict-mapping-" + UUID.randomUUID();

    // Create index with strict mapping
    client.indices().create(r -> r
        .index(indexName)
        .mappings(m -> m
            .dynamic(DynamicMapping.Strict)
            .properties("count", p -> p.integer(i -> i))
        )
    );

    // Try indexing with wrong type
    Map<String, Object> doc = new HashMap<>();
    doc.put("count", "not-a-number");  // String instead of integer

    assertThrows(ElasticsearchException.class, () ->
        client.index(r -> r
            .index(indexName)
            .document(doc)
        )
    );

    client.indices().delete(r -> r.index(indexName));
}

Query Validation Testing

@Test
void shouldFindDocumentsByFullTextSearch() throws IOException {
    String indexName = "articles-" + UUID.randomUUID();

    client.indices().create(r -> r
        .index(indexName)
        .settings(s -> s.numberOfShards("1").numberOfReplicas("0"))
    );

    // Index test documents
    indexArticle(indexName, "1", "Elasticsearch Query DSL Guide",
        "Learn how to write complex queries with Elasticsearch");
    indexArticle(indexName, "2", "Redis Caching Strategies",
        "Improve application performance with Redis");
    indexArticle(indexName, "3", "Elasticsearch Performance Tuning",
        "Optimize your Elasticsearch cluster for production");

    // Force refresh so documents are searchable
    client.indices().refresh(r -> r.index(indexName));

    // Search for Elasticsearch-related articles
    SearchResponse<Map> response = client.search(r -> r
        .index(indexName)
        .query(q -> q
            .multiMatch(m -> m
                .query("elasticsearch")
                .fields("title^2", "body")  // boost title matches
            )
        ),
        Map.class
    );

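    assertEquals(2, response.hits().total().value());
    // Both Elasticsearch articles should match and the Redis article should not.
    // Their relative order depends on BM25 field-length normalization, so
    // assert on the set of IDs rather than on which hit comes first.
    Set<String> ids = response.hits().hits().stream()
        .map(h -> h.id())
        .collect(Collectors.toSet());
    assertEquals(Set.of("1", "3"), ids);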

    client.indices().delete(r -> r.index(indexName));
}

private void indexArticle(String index, String id, String title, String body) throws IOException {
    Map<String, Object> doc = new HashMap<>();
    doc.put("title", title);
    doc.put("body", body);
    client.index(r -> r.index(index).id(id).document(doc));
}

Testing Aggregations

@Test
void shouldAggregateProductsByCategory() throws IOException {
    String indexName = "products-agg-" + UUID.randomUUID();

    client.indices().create(r -> r
        .index(indexName)
        .settings(s -> s.numberOfShards("1").numberOfReplicas("0"))
        .mappings(m -> m
            .properties("category", p -> p.keyword(k -> k))
            .properties("price", p -> p.double_(d -> d))
        )
    );

    // Index test products
    List<Map<String, Object>> products = List.of(
        Map.of("category", "electronics", "price", 299.99),
        Map.of("category", "electronics", "price", 149.99),
        Map.of("category", "books", "price", 19.99),
        Map.of("category", "books", "price", 24.99),
        Map.of("category", "electronics", "price", 89.99)
    );

    for (int i = 0; i < products.size(); i++) {
        int finalI = i;
        client.index(r -> r.index(indexName).id(String.valueOf(finalI)).document(products.get(finalI)));
    }

    client.indices().refresh(r -> r.index(indexName));

    // Aggregate by category with average price
    SearchResponse<Map> response = client.search(r -> r
        .index(indexName)
        .size(0)  // No hits needed, only aggregations
        .aggregations("by_category", a -> a
            .terms(t -> t.field("category"))
            .aggregations("avg_price", aa -> aa
                .avg(avg -> avg.field("price"))
            )
        ),
        Map.class
    );

    StringTermsAggregate categories = response.aggregations()
        .get("by_category").sterms();

    assertEquals(2, categories.buckets().array().size());

    // Find electronics bucket
    StringTermsBucket electronics = categories.buckets().array().stream()
        .filter(b -> b.key().stringValue().equals("electronics"))
        .findFirst()
        .orElseThrow();

    assertEquals(3, electronics.docCount());

    double avgPrice = electronics.aggregations().get("avg_price").avg().value();
    assertEquals(179.99, avgPrice, 0.01);

    client.indices().delete(r -> r.index(indexName));
}
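
The expected average of 179.99 in the test above is hardcoded, which makes it easy to get out of sync when the fixture data changes. A minimal sketch of deriving the expected value from the same fixture list instead is shown below; the ExpectedAverage class and its averagePrice method are hypothetical names introduced for illustration.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical helper: computes the expected aggregation result from the test fixtures. */
final class ExpectedAverage {
    static double averagePrice(List<Map<String, Object>> products, String category) {
        return products.stream()
            .filter(p -> category.equals(p.get("category")))
            .mapToDouble(p -> ((Number) p.get("price")).doubleValue())
            .average()
            .orElseThrow(); // no documents in the category means the fixture is wrong
    }

    public static void main(String[] args) {
        // Same fixture data as the aggregation test
        List<Map<String, Object>> products = List.of(
            Map.of("category", "electronics", "price", 299.99),
            Map.of("category", "electronics", "price", 149.99),
            Map.of("category", "books", "price", 19.99),
            Map.of("category", "books", "price", 24.99),
            Map.of("category", "electronics", "price", 89.99)
        );
        System.out.println(averagePrice(products, "electronics"));
    }
}
```

The test can then compare the avg_price bucket value against averagePrice(products, "electronics") with a small delta, so the assertion keeps following the fixture.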

Spring Data Elasticsearch Test Slices

Spring Boot provides @DataElasticsearchTest for testing the Elasticsearch layer in isolation:

@DataElasticsearchTest
@Testcontainers
class ProductRepositoryTest {

    @Container
    @ServiceConnection
    static ElasticsearchContainer elasticsearch =
        new ElasticsearchContainer(
            DockerImageName.parse("docker.elastic.co/elasticsearch/elasticsearch:8.11.0")
        )
        .withEnv("xpack.security.enabled", "false");

    @Autowired
    ProductRepository productRepository;

    @Autowired
    ElasticsearchOperations elasticsearchOperations;

    @BeforeEach
    void setUp() {
        productRepository.deleteAll();
    }

    @Test
    void shouldSaveAndRetrieveProduct() {
        Product product = new Product();
        product.setId("prod-1");
        product.setName("MacBook Pro");
        product.setCategory("electronics");
        product.setPrice(1999.99);

        productRepository.save(product);

        Optional<Product> found = productRepository.findById("prod-1");
        assertTrue(found.isPresent());
        assertEquals("MacBook Pro", found.get().getName());
    }

    @Test
    void shouldSearchProductsByCategory() {
        productRepository.saveAll(List.of(
            createProduct("1", "iPhone 15", "electronics", 999.99),
            createProduct("2", "Java Programming Book", "books", 49.99),
            createProduct("3", "iPad Pro", "electronics", 1099.99)
        ));

        // Force an index refresh so the saved documents are visible to search
        elasticsearchOperations.indexOps(Product.class).refresh();

        Page<Product> electronics = productRepository.findByCategory("electronics",
            PageRequest.of(0, 10));

        assertEquals(2, electronics.getTotalElements());
    }

    private Product createProduct(String id, String name, String category, double price) {
        Product p = new Product();
        p.setId(id);
        p.setName(name);
        p.setCategory(category);
        p.setPrice(price);
        return p;
    }
}

Testing Custom Repository Methods

@Repository
public interface ProductRepository extends ElasticsearchRepository<Product, String> {

    Page<Product> findByCategory(String category, Pageable pageable);

    @Query("{\"bool\": {\"must\": [{\"match\": {\"name\": \"?0\"}}, " +
           "{\"range\": {\"price\": {\"lte\": \"?1\"}}}]}}")
    List<Product> findByNameAndMaxPrice(String name, double maxPrice);
}

// Test the custom query
@Test
void shouldFindProductsByNameAndPriceLimit() {
    productRepository.saveAll(List.of(
        createProduct("1", "Laptop", "electronics", 999.99),
        createProduct("2", "Laptop Stand", "accessories", 49.99),
        createProduct("3", "Gaming Laptop", "electronics", 1999.99)
    ));

    // Refresh
    elasticsearchOperations.indexOps(Product.class).refresh();

    List<Product> affordable = productRepository.findByNameAndMaxPrice("Laptop", 1000.0);

    // Should find "Laptop" and "Laptop Stand" but not "Gaming Laptop" (over budget)
    assertThat(affordable).hasSize(2);
    assertThat(affordable).extracting(Product::getName)
        .containsExactlyInAnyOrder("Laptop", "Laptop Stand");
}

Testing the Refresh Problem

One of the trickiest Elasticsearch testing issues is forgetting to refresh:

@Test
void demonstratesRefreshProblem() throws IOException {
    String indexName = "refresh-demo-" + UUID.randomUUID();
    // Disable automatic refresh so the first assertion is deterministic;
    // with the default 1s interval the result would depend on timing
    client.indices().create(r -> r.index(indexName)
        .settings(s -> s.numberOfShards("1").numberOfReplicas("0")
            .refreshInterval(ri -> ri.time("-1"))));

    // Index a document
    client.index(r -> r
        .index(indexName)
        .id("1")
        .document(Map.of("name", "test"))
    );

    // Without a refresh, the document is not searchable yet
    SearchResponse<Map> immediateSearch = client.search(r -> r
        .index(indexName)
        .query(q -> q.matchAll(m -> m)),
        Map.class
    );
    assertEquals(0, immediateSearch.hits().total().value()); // Not visible to search

    // Force refresh
    client.indices().refresh(r -> r.index(indexName));

    // Now it's searchable
    SearchResponse<Map> afterRefresh = client.search(r -> r
        .index(indexName)
        .query(q -> q.matchAll(m -> m)),
        Map.class
    );
    assertEquals(1, afterRefresh.hits().total().value()); // Now 1

    client.indices().delete(r -> r.index(indexName));
}

Solution: Always call client.indices().refresh() after indexing in tests. Alternatively, set refresh=true on index requests:

client.index(r -> r
    .index(indexName)
    .id("1")
    .refresh(Refresh.True)  // Force a refresh of the affected shards before responding
    .document(doc)
);
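
Both options assume the test controls the indexing call. When indexing happens inside the application under test, a polling helper is a common fallback: retry the search until the expected result appears or a timeout expires. The sketch below is a generic, hypothetical helper (the Eventually class and its await method are not part of any Elasticsearch library); in a real test the condition would run a search and compare the hit count.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Hypothetical helper: polls a condition until it holds or the timeout expires. */
final class Eventually {
    static boolean await(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for "search returns the expected count": true from the third poll on
        int[] polls = {0};
        boolean visible = await(() -> ++polls[0] >= 3,
            Duration.ofSeconds(2), Duration.ofMillis(10));
        System.out.println("visible=" + visible + " after " + polls[0] + " polls");
    }
}
```

Polling makes tests slower and less precise than an explicit refresh, so it is best kept for cases where the refresh cannot be triggered from the test.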

Performance Testing with Elasticsearch

@Test
void shouldHandleBulkIndexingWithinSLA() throws IOException {
    String indexName = "perf-test-" + UUID.randomUUID();
    client.indices().create(r -> r
        .index(indexName)
        .settings(s -> s
            .numberOfShards("1")
            .numberOfReplicas("0")
            .refreshInterval(ri -> ri.time("-1"))  // Disable auto-refresh for bulk load
        )
    );

    int documentCount = 10_000;
    BulkRequest.Builder bulk = new BulkRequest.Builder();

    for (int i = 0; i < documentCount; i++) {
        int finalI = i;
        bulk.operations(op -> op.index(idx -> idx
            .index(indexName)
            .id(String.valueOf(finalI))
            .document(Map.of(
                "id", finalI,
                "value", "test-" + finalI,
                "timestamp", System.currentTimeMillis()
            ))
        ));
    }

    long start = System.currentTimeMillis();
    BulkResponse bulkResponse = client.bulk(bulk.build());
    long duration = System.currentTimeMillis() - start;

    assertFalse(bulkResponse.errors(), "Bulk indexing had errors");
    assertEquals(documentCount, bulkResponse.items().size());
    assertThat(duration).isLessThan(10_000L); // Should complete within 10 seconds

    // Re-enable refresh and verify count
    client.indices().putSettings(r -> r
        .index(indexName)
        .settings(s -> s.refreshInterval(ri -> ri.time("1s")))
    );
    client.indices().refresh(r -> r.index(indexName));

    CountResponse count = client.count(r -> r.index(indexName));
    assertEquals(documentCount, count.count());

    client.indices().delete(r -> r.index(indexName));
}
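
The test above sends all 10,000 documents in a single bulk request. For much larger data sets, a common approach is to split the documents into fixed-size batches and send one bulk request per batch. A minimal sketch of the batching step follows; the Batches class and the batch size are assumptions for illustration, and the right size depends on document size and cluster capacity.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits a list into consecutive batches of at most batchSize items. */
final class Batches {
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            ids.add(i);
        }
        // Each inner list would become one bulk request
        for (List<Integer> batch : partition(ids, 4)) {
            System.out.println(batch);
        }
    }
}
```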

Elasticsearch Testing with HelpMeTest

For end-to-end testing of applications backed by Elasticsearch—like search UIs, autocomplete, and filter interfaces—HelpMeTest lets you test the full user experience in plain English.

Instead of complex client code, define tests like:

*** Test Cases ***
Search Returns Relevant Results
    Go To    https://app.example.com/search
    Fill Text    [data-testid=search-input]    elasticsearch testing
    Click    [data-testid=search-button]
    Wait Until Element Contains    [data-testid=results-count]    results
    Element Should Be Visible    [data-testid=result-item]:first-child
    Element Should Contain    [data-testid=result-item]:first-child    elasticsearch

HelpMeTest handles the browser automation while your Testcontainers setup handles the backend—a clean separation of responsibilities.

Common Pitfalls

1. Not waiting for shard initialization. After creating an index, its shards need a moment to be allocated. If a test needs the index ready immediately, wait for yellow status through the cluster health API, for example client.cluster().health(h -> h.index(indexName).waitForStatus(HealthStatus.Yellow)).

2. Sharing indices between tests. Each test should create its own index with a random suffix. Shared indices cause test pollution from previous test data.

3. Ignoring analyzer effects. The same text can be indexed differently depending on the analyzer. Always test with the analyzer your production mapping uses: the standard analyzer indexes "iPhone" as the token "iphone", so an exact term query for "iPhone" on that field finds nothing.

4. Testing against the wrong Elasticsearch version. Use the same major version in tests as in production. Query syntax and behavior differ significantly between versions 7.x and 8.x.
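
To make the analyzer pitfall (item 3) concrete, the sketch below approximates what the standard analyzer does to simple Latin-script text: split on non-alphanumeric characters and lowercase each token. It is a deliberately simplified stand-in written for this guide, not Elasticsearch code; the real analyzer uses Unicode text segmentation and handles many more cases. The ApproximateStandardAnalyzer name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Simplified approximation of the standard analyzer, for illustration only. */
final class ApproximateStandardAnalyzer {
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        // Split on runs of characters that are neither letters nor digits
        for (String part : text.split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                out.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> indexed = tokens("iPhone 15 Pro");
        System.out.println(indexed);
        // A term query is not analyzed, so it compares against the stored tokens as-is
        System.out.println("term 'iPhone' matches: " + indexed.contains("iPhone"));
        System.out.println("term 'iphone' matches: " + indexed.contains("iphone"));
    }
}
```

Against a real cluster, the analyze API on the index shows the actual tokens a field's analyzer produces, which is the authoritative check.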

Summary

Effective Elasticsearch testing requires:

  • Real instances via Testcontainers—mocks miss too much behavior
  • Explicit refreshes after indexing in tests
  • Isolated indices per test (random suffix, deleted in cleanup)
  • Mapping validation before testing queries
  • Spring Data test slices for repository-layer testing
  • Bulk testing for performance characteristics

The investment in proper Elasticsearch tests pays dividends—it catches mapping conflicts, query bugs, and analyzer issues before they reach production.
