Dialogflow CX Testing: Unit, Integration, and Regression Strategies

Dialogflow CX agents require testing at three levels: the built-in Test Cases feature for conversation-level regression, the Python client library for automated route and parameter testing, and the Flask test client for webhook validation. This guide covers all three with production-ready examples.

Dialogflow CX introduced a significantly more structured approach to conversational agents than its ES predecessor — flows, pages, routes, and parameters give you explicit control over dialog management. But that structure also means more things can go wrong: a misconfigured route condition, a missing parameter extraction, or a broken webhook can silently degrade your agent's behavior. Here's how to build a test suite that catches those failures.

The CX Architecture You're Testing

Before diving into testing, it helps to be explicit about what CX components need coverage:

  • Flows — top-level conversation scopes (e.g., "Main Menu", "Booking", "Support")
  • Pages — states within a flow; each page has entry fulfillment, form parameters, and routes
  • Routes — transition conditions triggered by intent matches or condition expressions
  • Parameters — entities extracted from user utterances, with required/optional flags
  • Webhooks — HTTP callbacks for dynamic fulfillment
  • Session entities — runtime overrides to your entity types

Each of these is a test target. A complete test suite covers the happy path through each flow, parameter extraction accuracy, route transition logic, and webhook behavior under both success and failure conditions.

Built-In Test Cases in the Console

Dialogflow CX ships with a Test Cases feature accessible via the console or the API. Test cases record conversation transcripts (input utterances + expected responses) and can be replayed to detect regressions.

Create test cases programmatically using the Python client:

from google.api_core.client_options import ClientOptions
from google.cloud.dialogflowcx_v3 import TestCasesClient
from google.cloud.dialogflowcx_v3.types import (
    ConversationTurn, Intent, Page, QueryInput, TestCase, TestConfig, TextInput,
)

# Regional agents require the matching regional endpoint.
client = TestCasesClient(
    client_options=ClientOptions(api_endpoint="us-central1-dialogflow.googleapis.com")
)
agent = "projects/my-project/locations/us-central1/agents/my-agent"

test_case = TestCase(
    display_name="Booking happy path",
    test_config=TestConfig(flow=f"{agent}/flows/booking-flow"),
    test_case_conversation_turns=[
        ConversationTurn(
            user_input=ConversationTurn.UserInput(
                input=QueryInput(text=TextInput(text="I want to book a table"))
            ),
            virtual_agent_output=ConversationTurn.VirtualAgentOutput(
                session_parameters={"booking_intent": "create"},
                triggered_intent=Intent(name=f"{agent}/intents/book-table"),
                current_page=Page(name=f"{agent}/flows/booking-flow/pages/collect-date"),
            ),
        ),
        ConversationTurn(
            user_input=ConversationTurn.UserInput(
                input=QueryInput(text=TextInput(text="Tomorrow at 7pm for two people"))
            ),
            virtual_agent_output=ConversationTurn.VirtualAgentOutput(
                session_parameters={
                    "date": "2026-05-18",
                    "time": "19:00",
                    "party_size": "2",
                },
                current_page=Page(name=f"{agent}/flows/booking-flow/pages/confirm-booking"),
            ),
        ),
    ],
)

response = client.create_test_case(parent=agent, test_case=test_case)
print(f"Created: {response.name}")

To run test cases programmatically and assert on results:

from google.api_core.client_options import ClientOptions
from google.cloud.dialogflowcx_v3 import TestCasesClient
from google.cloud.dialogflowcx_v3.types import TestResult

client = TestCasesClient(
    client_options=ClientOptions(api_endpoint="us-central1-dialogflow.googleapis.com")
)

# test_case_name is the resource name of an existing test case,
# e.g. response.name from the create call above.
operation = client.run_test_case(request={"name": test_case_name})
run_result = operation.result(timeout=120).result  # a TestCaseResult

assert run_result.test_result == TestResult.PASSED, (
    f"Test case failed: {run_result.name}\n"
    f"Differences: {[list(t.virtual_agent_output.differences) for t in run_result.conversation_turns]}"
)

Page Transition Testing

The core logic of a CX agent is in its route conditions. Test that specific utterances trigger the correct page transitions:

import uuid

import pytest
from google.api_core.client_options import ClientOptions
from google.cloud.dialogflowcx_v3 import SessionsClient
from google.cloud.dialogflowcx_v3.types import DetectIntentRequest, QueryInput, TextInput

AGENT = "projects/my-project/locations/us-central1/agents/my-agent"

@pytest.fixture(scope="session")
def sessions_client():
    # Regional agents must use the matching regional endpoint.
    return SessionsClient(
        client_options=ClientOptions(api_endpoint="us-central1-dialogflow.googleapis.com")
    )

@pytest.fixture
def session_path():
    # A fresh session ID per test keeps conversation state isolated.
    return f"{AGENT}/sessions/{uuid.uuid4()}"

def detect_intent(client, session_path, text, language_code="en"):
    request = DetectIntentRequest(
        session=session_path,
        query_input=QueryInput(
            text=TextInput(text=text),
            language_code=language_code,
        ),
    )
    return client.detect_intent(request=request)

def test_booking_page_transition(sessions_client, session_path):
    response = detect_intent(sessions_client, session_path, "book a table")
    qr = response.query_result

    assert "collect-date" in qr.current_page.name, (
        f"Expected collect-date page, got: {qr.current_page.name}"
    )
    assert qr.match.intent.display_name == "book-table"

def test_confirmation_page_after_parameters(sessions_client, session_path):
    # First turn: trigger booking flow
    detect_intent(sessions_client, session_path, "book a table")
    # Second turn: fill all parameters
    response = detect_intent(sessions_client, session_path, "Friday at 8pm for 4 people")
    qr = response.query_result

    assert "confirm" in qr.current_page.name.lower()
    params = qr.parameters
    assert params["party_size"] == 4
    assert "friday" in params["date"].lower() or "2026" in str(params["date"])

Parameter Extraction Validation

Parameter extraction failures are one of the top causes of CX agent bugs. Test each entity type with boundary inputs:
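
The tests below mint a fresh session per case so parameter state never leaks between them. A minimal helper sketch, reusing the AGENT constant from the fixtures above (CX creates the session implicitly on the first detect_intent call):

def make_fresh_session(client, agent=AGENT):
    """Mint a unique session path; no explicit create call is needed."""
    return f"{agent}/sessions/{uuid.uuid4()}"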

@pytest.mark.parametrize("utterance,expected_party_size", [
    ("for two people", 2),
    ("party of 6", 6),
    ("just me", 1),
    ("table for twelve", 12),
])
def test_party_size_extraction(sessions_client, utterance, expected_party_size):
    session = make_fresh_session(sessions_client)
    detect_intent(sessions_client, session, "book a table")
    response = detect_intent(sessions_client, session, utterance)
    params = response.query_result.parameters
    assert params.get("party_size") == expected_party_size, (
        f"Utterance '{utterance}': expected {expected_party_size}, got {params.get('party_size')}"
    )

@pytest.mark.parametrize("utterance", [
    "I want to cancel",
    "never mind",
    "forget it",
    "stop",
])
def test_cancellation_route(sessions_client, utterance):
    session = make_fresh_session(sessions_client)
    detect_intent(sessions_client, session, "book a table")
    response = detect_intent(sessions_client, session, utterance)
    # Should route to cancellation confirmation, not continue booking
    page = response.query_result.current_page.name
    assert "cancel" in page.lower() or "main" in page.lower(), (
        f"Cancellation utterance '{utterance}' did not route to cancel/main, got: {page}"
    )

Webhook Testing with Flask Test Client

CX webhooks follow a strict request/response schema. Test your webhook handlers in isolation before deploying:

# webhook.py
from flask import Flask, request, jsonify

app = Flask(__name__)

def is_date_available(date):
    """Availability check stub; replace with a real lookup against your booking backend."""
    return True

@app.route('/fulfillment', methods=['POST'])
def fulfillment():
    req = request.get_json()
    tag = req.get('fulfillmentInfo', {}).get('tag', '')
    params = req.get('sessionInfo', {}).get('parameters', {})

    if tag == 'validate-booking':
        date = params.get('date')
        if not is_date_available(date):
            return jsonify({
                "fulfillmentResponse": {
                    "messages": [{"text": {"text": [f"Sorry, {date} is fully booked."]}}]
                },
                "sessionInfo": {
                    "parameters": {"booking_available": False}
                },
            })
        return jsonify({
            "fulfillmentResponse": {
                "messages": [{"text": {"text": ["Great, that date is available!"]}}]
            },
            "sessionInfo": {
                "parameters": {"booking_available": True}
            },
        })

    # Any other tag falls through to an empty (but valid) webhook response.
    return jsonify({})

# test_webhook.py
import pytest
import json
from webhook import app

@pytest.fixture
def client():
    app.config['TESTING'] = True
    with app.test_client() as c:
        yield c

def make_webhook_request(tag, parameters):
    return {
        "fulfillmentInfo": {"tag": tag},
        "sessionInfo": {
            "session": "projects/p/locations/l/agents/a/sessions/test-session",
            "parameters": parameters,
        },
    }

def test_validate_booking_available(client):
    payload = make_webhook_request("validate-booking", {"date": "2026-06-15"})
    response = client.post('/fulfillment',
        data=json.dumps(payload),
        content_type='application/json')

    assert response.status_code == 200
    data = response.get_json()
    assert data["sessionInfo"]["parameters"]["booking_available"] is True
    text = data["fulfillmentResponse"]["messages"][0]["text"]["text"][0]
    assert "available" in text.lower()

def test_validate_booking_unavailable(client, mocker):  # mocker comes from pytest-mock
    mocker.patch('webhook.is_date_available', return_value=False)
    payload = make_webhook_request("validate-booking", {"date": "2026-06-20"})
    response = client.post('/fulfillment',
        data=json.dumps(payload),
        content_type='application/json')

    data = response.get_json()
    assert data["sessionInfo"]["parameters"]["booking_available"] is False
    text = data["fulfillmentResponse"]["messages"][0]["text"]["text"][0]
    assert "fully booked" in text.lower()

Session Entity Overrides

Session entities let you inject test-specific entity values without modifying your agent. This is essential for testing with controlled entity sets:

from google.api_core.client_options import ClientOptions
from google.cloud.dialogflowcx_v3 import SessionEntityTypesClient
from google.cloud.dialogflowcx_v3.types import EntityType, SessionEntityType

def create_test_entities(client, session_path, entity_type_name, entities):
    """Override entity values for a test session."""
    session_entity_type = SessionEntityType(
        name=f"{session_path}/entityTypes/{entity_type_name}",
        entity_override_mode=SessionEntityType.EntityOverrideMode.ENTITY_OVERRIDE_MODE_OVERRIDE,
        entities=[
            EntityType.Entity(value=e["value"], synonyms=e["synonyms"])
            for e in entities
        ],
    )
    return client.create_session_entity_type(
        parent=session_path,
        session_entity_type=session_entity_type,
    )

# In your test (regional agents need the matching regional endpoint):
entity_client = SessionEntityTypesClient(
    client_options=ClientOptions(api_endpoint="us-central1-dialogflow.googleapis.com")
)
create_test_entities(entity_client, session_path, "restaurant-location", [
    {"value": "downtown", "synonyms": ["downtown", "city center", "the main one"]},
    {"value": "airport", "synonyms": ["airport", "terminal"]},
])
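
Session entity overrides expire with the session itself, but if your tests ever reuse a session ID it is safer to remove overrides in teardown. A sketch of a companion helper (delete_test_entities is an illustrative name, not part of the client library):

def delete_test_entities(client, session_path, entity_type_name):
    """Remove a session-level override so later turns use the agent's own entity values."""
    client.delete_session_entity_type(
        name=f"{session_path}/entityTypes/{entity_type_name}"
    )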

CI/CD Integration

Run your CX regression tests in GitHub Actions against a dedicated test agent (separate from production):

name: Dialogflow CX Regression
on: [push, pull_request]

jobs:
  regression:
    runs-on: ubuntu-latest
    env:
      GOOGLE_CLOUD_PROJECT: ${{ secrets.GCP_PROJECT }}
      DIALOGFLOW_AGENT_ID: ${{ secrets.TEST_AGENT_ID }}
    steps:
      - uses: actions/checkout@v4
      - uses: google-github-actions/auth@v2
        with:
          credentials_json: ${{ secrets.GCP_SA_KEY }}
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install -r requirements-test.txt
      - run: pytest tests/dialogflow/ -v --tb=short
      - name: Run CX built-in test cases
        run: python scripts/run_cx_test_cases.py
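
The workflow assumes a scripts/run_cx_test_cases.py entry point. A minimal sketch of what it might look like — list every test case on the test agent, batch-run them, and exit nonzero on any regression (the environment variables match the workflow above; adjust the location to your agent's region):

# scripts/run_cx_test_cases.py (a sketch)
import os
import sys

from google.api_core.client_options import ClientOptions
from google.cloud.dialogflowcx_v3 import TestCasesClient
from google.cloud.dialogflowcx_v3.types import TestResult

LOCATION = "us-central1"
AGENT = (
    f"projects/{os.environ['GOOGLE_CLOUD_PROJECT']}/locations/{LOCATION}"
    f"/agents/{os.environ['DIALOGFLOW_AGENT_ID']}"
)

client = TestCasesClient(
    client_options=ClientOptions(api_endpoint=f"{LOCATION}-dialogflow.googleapis.com")
)

# Collect every test case on the agent and run them in one batch.
test_case_names = [tc.name for tc in client.list_test_cases(parent=AGENT)]
operation = client.batch_run_test_cases(
    request={"parent": AGENT, "test_cases": test_case_names}
)
results = operation.result(timeout=600).results

failures = [r for r in results if r.test_result != TestResult.PASSED]
for failure in failures:
    print(f"FAILED: {failure.name}")
sys.exit(1 if failures else 0)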

Use a separate test environment agent that mirrors production. Never run regression tests against your live agent — CX test runs can affect agent metrics.

HelpMeTest can run multi-turn Dialogflow CX conversation scenarios on a schedule, alerting you when page transitions break or parameter extraction regresses after an agent update.
