Integration Testing MCP Servers End-to-End: Transports, Tool Calls, and Resource Listing

Unit tests for MCP tool handlers are necessary but not sufficient. They test the function — they don't test whether the server correctly handles the MCP protocol, whether tools are advertised correctly, whether resources are listed as expected, or whether the transport layer survives malformed messages.

Integration tests do. And for MCP servers, integration testing has an unusual property: the server itself becomes the system under test, not just the code inside it.

Here's how to write integration tests that give you real confidence in your MCP server.

The Integration Test Boundary

An integration test for an MCP server should:

  1. Start a real server process (or connect to a running one)
  2. Connect via a real transport (stdio, HTTP/SSE, or in-memory)
  3. Make actual MCP protocol calls
  4. Assert on the responses

This is different from unit testing, where you test the handler function in isolation. Integration tests catch bugs in the protocol layer, the server configuration, and the interaction between components.

Transport Options for Testing

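You can test over three transports. stdio and HTTP are what a deployed server actually speaks; the in-memory transport is a test utility that ships with the TypeScript SDK. Each has different testing characteristics.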

In-memory transport — fastest, no process spawning, best for CI.

import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { InMemoryTransport } from '@modelcontextprotocol/sdk/inMemory.js';
import { createMyServer } from './server';

async function createTestClient() {
  const [clientTransport, serverTransport] = InMemoryTransport.createLinkedPair();
  
  const server = createMyServer();
  await server.connect(serverTransport);
  
  const client = new Client({ name: 'test', version: '1.0' });
  await client.connect(clientTransport);
  
  return { client, server };
}

Use this for the majority of your integration tests. No process management, no port conflicts, runs in parallel.

stdio transport — tests the actual server binary. Catches issues with process startup, environment variable handling, and stdio buffering.

import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';
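
async function createStdioClient() {
  // The transport spawns the server process itself, so no manual spawn() is needed
  const transport = new StdioClientTransport({
    command: 'node',
    args: ['./dist/server.js'],
    // process.env values are typed string | undefined, so cast for the SDK's env option
    env: { ...process.env, NODE_ENV: 'test' } as Record<string, string>
  });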

  const client = new Client({ name: 'test', version: '1.0' });
  await client.connect(transport);
  
  return client;
}

Use this for smoke tests of your server binary: "Does it start? Does it respond to initialization?"

HTTP/SSE transport — tests the full HTTP stack. Required if your server is deployed as an HTTP service.

import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { SSEClientTransport } from '@modelcontextprotocol/sdk/client/sse.js';

async function createHTTPClient(baseUrl: string) {
  const transport = new SSEClientTransport(new URL(`${baseUrl}/sse`));
  const client = new Client({ name: 'test', version: '1.0' });
  await client.connect(transport);
  return client;
}

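Use this for staging environment tests and smoke tests against your production server. Newer revisions of the MCP specification replace the HTTP+SSE transport with Streamable HTTP; if your server speaks that instead, the SDK's StreamableHTTPClientTransport is the client-side counterpart.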

Testing Tool Listing

The first thing a client does is enumerate your tools. If tools/list returns incorrect data, the client can't use your server correctly.

describe('tool listing', () => {
  let client: Client;
  let cleanup: () => Promise<void>;

  beforeEach(async () => {
    const { client: c, server } = await createTestClient();
    client = c;
    cleanup = async () => { await c.close(); await server.close(); };
  });

  afterEach(() => cleanup());

  it('lists all expected tools', async () => {
    const { tools } = await client.listTools();
    const names = tools.map(t => t.name);
    
    expect(names).toContain('search');
    expect(names).toContain('read-file');
    expect(names).toContain('write-file');
  });

  it('tool schemas have required fields', async () => {
    const { tools } = await client.listTools();
    
    for (const tool of tools) {
      expect(tool.name).toBeTruthy();
      expect(tool.description).toBeTruthy();
      expect(tool.inputSchema).toBeDefined();
      expect(tool.inputSchema.type).toBe('object');
    }
  });

  it('input schemas are valid JSON Schema', async () => {
    const { tools } = await client.listTools();
    const Ajv = (await import('ajv')).default;
    const ajv = new Ajv();

    for (const tool of tools) {
      // Each tool's inputSchema should be a valid JSON Schema
      expect(() => ajv.compile(tool.inputSchema)).not.toThrow();
    }
  });
});

These tests encode the contract your tools expose. When you rename or remove a tool, or a schema stops being a valid object schema, these tests fail, which is exactly what you want.
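
The toContain assertions above catch a missing tool but not one that was added by accident. A minimal sketch of a stricter check follows; diffToolNames is a hypothetical helper, not part of the SDK, and the expected list is whatever contract you want to pin.

```typescript
// Hypothetical helper: compare advertised tool names against the expected contract.
interface ToolNameDiff {
  missing: string[];    // expected but not advertised by the server
  unexpected: string[]; // advertised by the server but not expected
}

function diffToolNames(
  expected: readonly string[],
  actual: readonly string[],
): ToolNameDiff {
  const expectedSet = new Set(expected);
  const actualSet = new Set(actual);
  return {
    missing: expected.filter(name => !actualSet.has(name)),
    unexpected: actual.filter(name => !expectedSet.has(name)),
  };
}

// Usage inside a test, with `tools` taken from client.listTools():
//   const diff = diffToolNames(['search', 'read-file', 'write-file'], tools.map(t => t.name));
//   expect(diff).toEqual({ missing: [], unexpected: [] });
```

Asserting on the diff object rather than on two separate arrays means a failure message shows both kinds of drift at once.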

Testing Tool Call Results

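After listing, the next layer is calling tools and asserting on outputs. The suites below assume the same beforeEach/afterEach setup as the tool listing suite, so client is in scope.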

describe('tool call assertions', () => {
  it('search returns structured results', async () => {
    const result = await client.callTool({
      name: 'search',
      arguments: { query: 'integration testing', limit: 3 }
    });

    // isError is optional and is usually omitted on success, so toBe(false) would fail
    expect(result.isError).toBeFalsy();
    expect(result.content).toHaveLength(1);
    expect(result.content[0].type).toBe('text');

    const parsed = JSON.parse(result.content[0].text);
    expect(Array.isArray(parsed.results)).toBe(true);
    expect(parsed.results.length).toBeLessThanOrEqual(3);
  });

  it('read-file returns file content', async () => {
    // Pre-condition: test file exists in the test environment
    const result = await client.callTool({
      name: 'read-file',
      arguments: { path: '/tmp/test-fixture.txt' }
    });

    // isError is optional; a successful result may leave it undefined
    expect(result.isError).toBeFalsy();
    expect(result.content[0].type).toBe('text');
    expect(result.content[0].text).toContain('test content');
  });

  it('unknown tool returns protocol error', async () => {
    // Calling a non-existent tool should not crash the server
    await expect(
      client.callTool({ name: 'nonexistent-tool', arguments: {} })
    ).rejects.toThrow(); // Should throw a protocol error, not silently fail
  });
});
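
Several of these tests repeat the same steps: check isError, find the text block, parse it. One way to factor that out is sketched below. parseJsonText and the two interfaces are hypothetical and deliberately minimal; the SDK's own result types are richer.

```typescript
// Minimal local shapes for illustration only.
interface ContentBlock {
  type: string;
  text?: string;
}

interface ToolResultLike {
  isError?: boolean;
  content: ContentBlock[];
}

// Hypothetical helper: return the parsed JSON from the first text block,
// or throw with a message that says which step failed.
function parseJsonText(result: ToolResultLike): unknown {
  if (result.isError) {
    throw new Error('tool call reported isError: true');
  }
  const text = result.content.find(block => block.type === 'text')?.text;
  if (typeof text !== 'string') {
    throw new Error('tool result has no text content block');
  }
  try {
    return JSON.parse(text);
  } catch {
    throw new Error(`text content is not valid JSON: ${text.slice(0, 80)}`);
  }
}

// Usage inside a test:
//   const parsed = parseJsonText(result) as { results: unknown[] };
//   expect(parsed.results.length).toBeLessThanOrEqual(3);
```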

Test fixture management. Integration tests often need preconditions — files on disk, database rows, network endpoints. Set these up in beforeEach and tear them down in afterEach. Don't rely on global state that persists between test runs.

import { writeFileSync, unlinkSync } from 'fs';

describe('file tool integration', () => {
  const TEST_FILE = '/tmp/mcp-test-fixture.txt';

  beforeEach(() => {
    writeFileSync(TEST_FILE, 'test content for integration test');
  });

  afterEach(() => {
    try { unlinkSync(TEST_FILE); } catch {}
  });

  it('reads the fixture file correctly', async () => {
    const result = await client.callTool({
      name: 'read-file',
      arguments: { path: TEST_FILE }
    });
    expect(result.content[0].text).toBe('test content for integration test');
  });
});
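
A fixed path under /tmp can collide when test files run in parallel, or when a crashed run leaves the file behind. One option, sketched here with a hypothetical createFixture helper, is to give every test its own temporary directory using Node's standard library.

```typescript
import { mkdtempSync, writeFileSync, rmSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Hypothetical helper: write a fixture file into a fresh, uniquely named temp directory.
function createFixture(fileName: string, content: string) {
  const dir = mkdtempSync(join(tmpdir(), 'mcp-test-'));
  const path = join(dir, fileName);
  writeFileSync(path, content);
  return {
    path,
    // Remove the whole directory; `force` avoids throwing if it is already gone.
    cleanup: () => rmSync(dir, { recursive: true, force: true }),
  };
}

// Usage:
//   let fixture: ReturnType<typeof createFixture>;
//   beforeEach(() => { fixture = createFixture('fixture.txt', 'test content'); });
//   afterEach(() => fixture.cleanup());
```

This only helps if the server is allowed to read from the temp directory, so pass the path to the tool call rather than hard-coding it in the server config.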

Testing Resource Listing

Resources are a less-tested MCP capability. If your server exposes resources — files, database records, API data — test that listing and reading them works.

describe('resource listing', () => {
  it('lists available resources', async () => {
    const { resources } = await client.listResources();
    
    expect(Array.isArray(resources)).toBe(true);
    for (const resource of resources) {
      expect(resource.uri).toBeTruthy();
      expect(resource.name).toBeTruthy();
    }
  });

  it('resources have valid URI format', async () => {
    const { resources } = await client.listResources();
    
    for (const resource of resources) {
      // URIs should be parseable
      expect(() => new URL(resource.uri)).not.toThrow();
    }
  });

  it('reads a specific resource', async () => {
    const { resources } = await client.listResources();
    if (resources.length === 0) return; // Skip if no resources

    const firstResource = resources[0];
    const result = await client.readResource({ uri: firstResource.uri });
    
    expect(result.contents).toBeDefined();
    expect(result.contents.length).toBeGreaterThan(0);
  });
});

Paginated resource listing. If your server has many resources, test pagination:

it('paginates resource listing', async () => {
  const page1 = await client.listResources();
  
  if (page1.nextCursor) {
    const page2 = await client.listResources({ cursor: page1.nextCursor });
    // Page 2 should not repeat resources from page 1
    const page1Uris = new Set(page1.resources.map(r => r.uri));
    for (const r of page2.resources) {
      expect(page1Uris.has(r.uri)).toBe(false);
    }
  }
});
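
The test above only looks at the first two pages. To check the whole listing, you can walk every cursor. This is a sketch under assumptions: collectAllPages and the Page shape are hypothetical, and the maxPages guard is an arbitrary safety limit against a server that never stops returning a cursor.

```typescript
interface Page<T> {
  items: T[];
  nextCursor?: string;
}

// Hypothetical helper: follow nextCursor until the server stops returning one.
async function collectAllPages<T>(
  fetchPage: (cursor?: string) => Promise<Page<T>>,
  maxPages = 100,
): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | undefined = undefined;
  for (let i = 0; i < maxPages; i++) {
    const page: Page<T> = await fetchPage(cursor);
    all.push(...page.items);
    if (!page.nextCursor) return all; // last page reached
    cursor = page.nextCursor;
  }
  throw new Error(`pagination did not finish within ${maxPages} pages`);
}

// Usage with the client from the tests above:
//   const resources = await collectAllPages(async cursor => {
//     const page = await client.listResources(cursor ? { cursor } : undefined);
//     return { items: page.resources, nextCursor: page.nextCursor };
//   });
//   expect(new Set(resources.map(r => r.uri)).size).toBe(resources.length);
```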

Testing Transport Resilience

Transports can fail. Test that your server handles transport-level issues gracefully.

describe('transport resilience', () => {
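  it('handles client disconnection cleanly', async () => {
    const { client: c, server } = await createTestClient();

    // Close the first client; the linked in-memory pair closes the server side as well
    await c.close();

    // Reconnect a new client to the SAME server instance. Calling createTestClient()
    // here would build a fresh server and prove nothing about the original one.
    // Assumes the server object accepts connect() again once its transport has closed.
    const [clientTransport, serverTransport] = InMemoryTransport.createLinkedPair();
    await server.connect(serverTransport);
    const c2 = new Client({ name: 'test', version: '1.0' });
    await c2.connect(clientTransport);

    const { tools } = await c2.listTools();
    expect(tools.length).toBeGreaterThan(0);
    await c2.close();
    await server.close();
  });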

  // Promise.all opens the connections concurrently, each against its own server instance
  it('handles several concurrent connections', async () => {
    const connections = await Promise.all(
      Array.from({ length: 5 }, () => createTestClient())
    );
    
    // All clients should be able to list tools
    const results = await Promise.all(
      connections.map(({ client: c }) => c.listTools())
    );
    
    results.forEach(r => expect(r.tools.length).toBeGreaterThan(0));
    
    // Cleanup
    await Promise.all(connections.map(async ({ client: c, server: s }) => {
      await c.close();
      await s.close();
    }));
  });
});
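
A transport that hangs is harder to diagnose than one that fails, because the test just sits there until the runner's global timeout fires. A small wrapper can turn a hang into a fast, named failure. withTimeout is a hypothetical helper written for this article, not an SDK function.

```typescript
// Hypothetical helper: reject if `promise` has not settled within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number, label = 'operation'): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer so it cannot keep the process alive.
  return Promise.race([promise, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}

// Usage inside a test:
//   const { tools } = await withTimeout(client.listTools(), 2000, 'listTools');
```

Note that this only stops the test from waiting; it does not cancel the underlying request.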

HTTP/SSE Server Integration Tests

If you're running an HTTP-based MCP server, add tests that start the HTTP server and connect via SSE:

import { createServer } from 'http';
import { app } from './http-server'; // Your Express app; app.listen() returns an http.Server

describe('HTTP/SSE server integration', () => {
  let server: ReturnType<typeof createServer>;
  let baseUrl: string;

  beforeAll(async () => {
    await new Promise<void>(resolve => {
      server = app.listen(0, () => {
        const addr = server.address() as { port: number };
        baseUrl = `http://localhost:${addr.port}`;
        resolve();
      });
    });
  });

  afterAll(async () => {
    await new Promise<void>(resolve => server.close(() => resolve()));
  });

  it('serves MCP via SSE transport', async () => {
    const client = await createHTTPClient(baseUrl);
    const { tools } = await client.listTools();
    expect(tools.length).toBeGreaterThan(0);
    await client.close();
  });

  it('handles concurrent SSE connections', async () => {
    const clients = await Promise.all(
      Array.from({ length: 3 }, () => createHTTPClient(baseUrl))
    );
    
    const results = await Promise.all(clients.map(c => c.listTools()));
    results.forEach(r => expect(r.tools.length).toBeGreaterThan(0));
    
    await Promise.all(clients.map(c => c.close()));
  });
});

CI Configuration

Integration tests need a bit more setup than unit tests, but they run without external services if you use in-memory transport.

# .github/workflows/test.yml
name: Test

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '22'
      
      - run: npm ci
      - run: npm run build
      
      - name: Run unit tests
        run: npm run test:unit
      
      - name: Run integration tests
        run: npm run test:integration
        env:
          NODE_ENV: test

Separate your unit and integration test runs. Unit tests should take under 5 seconds; integration tests under 30. If either grows beyond that, investigate.

What to Assert in Integration Tests

Always assert:

  • Tool list contains expected tools
  • Tool call on happy path returns a result with isError unset or false, and non-empty content
  • Tool call with invalid args returns isError: true or a protocol error (not a crash)
  • Resource listing returns valid URI format
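
For the invalid-arguments case, the failure can surface either as a result with isError: true or as a rejected promise, depending on where the server rejects the call. A sketch of a helper that accepts both follows; classifyFailure and its types are hypothetical names introduced for illustration.

```typescript
interface ToolOutcome {
  isError?: boolean;
}

type FailureKind = 'tool-error' | 'protocol-error' | 'none';

// Hypothetical helper: classify how a tool call failed, if it failed at all.
async function classifyFailure(call: () => Promise<ToolOutcome>): Promise<FailureKind> {
  try {
    const result = await call();
    return result.isError === true ? 'tool-error' : 'none';
  } catch {
    // The client rejected the promise, for example on a JSON-RPC error response.
    return 'protocol-error';
  }
}

// Usage inside a test:
//   const kind = await classifyFailure(() =>
//     client.callTool({ name: 'search', arguments: { limit: 'three' } })
//   );
//   expect(kind).not.toBe('none');
```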

Assert when relevant:

  • Response content matches a specific schema (if your tool returns structured JSON)
  • Pagination works correctly
  • Concurrent calls return independent results

Don't assert:

  • Exact string content that may change (prefer structure assertions)
  • Performance numbers in CI (too flaky)
  • Internal server state (test through the protocol only)

Integration tests are the safety net between unit tests and production. Write them before you ship, and run them on every PR.
