Solving MCP Session Persistence Across Kubernetes Pods with Redis

How I tackled stateful MCP session management in a multi-replica Kubernetes deployment using Redis as a shared session store.

If you’ve ever tried running a stateful MCP (Model Context Protocol) server in a horizontally-scaled Kubernetes deployment, you’ve probably hit the same wall I did: sessions are pod-local, and once you add a second replica, everything falls apart.

Let me walk through the problem and how I solved it.

The Problem

MCP servers maintain session state — tool registrations, context, ongoing conversations. By default, that state lives in the server process's memory. That works great for a single instance. The moment you scale to two pods and your load balancer starts round-robining requests, you get:

  • Client connects to Pod A, session is initialized
  • Next request lands on Pod B — no session found, error
  • Chaos ensues

In my setup, I was running an MCP server that handled tool calls for an internal AI workflow. We needed HA, so we needed multiple replicas. And we needed session state to survive pod restarts and work across replicas.

The Solution: Redis as a Shared Session Store

The fix is conceptually simple: externalize the session store. Instead of each pod holding sessions in memory, all pods read and write to a shared Redis instance.

Here’s the approach I used:

import { createClient } from 'redis';
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';

// Everything we persist per session — keep it JSON-serializable.
interface SessionData {
  toolDescriptors: string[]; // tool names, not function references
  context: Record<string, unknown>;
  clientMetadata: Record<string, unknown>;
}

const redis = createClient({ url: process.env.REDIS_URL });
await redis.connect();

// Serialize and store session data on each update
async function saveSession(sessionId: string, data: SessionData) {
  await redis.setEx(
    `mcp:session:${sessionId}`,
    3600, // 1-hour TTL
    JSON.stringify(data)
  );
}

async function loadSession(sessionId: string): Promise<SessionData | null> {
  const raw = await redis.get(`mcp:session:${sessionId}`);
  return raw ? JSON.parse(raw) : null;
}

The session ID comes from the client — either a header (the Streamable HTTP transport uses Mcp-Session-Id) or part of the request payload. On each request, we load the session from Redis, process the tool call with that context, and write the updated session back.
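That load/process/save cycle can be sketched as a small wrapper. This is a simplified stand-in, not my production code: the `SessionStore` interface and the Map-backed implementation exist only so the sketch is self-contained — in the real setup, the store is backed by the Redis calls shown above.

```typescript
// Simplified session shape for the sketch.
interface SessionData {
  context: Record<string, unknown>;
}

// Abstraction over the session store; Redis implements the same two
// operations (GET and SETEX on the mcp:session:* key).
interface SessionStore {
  load(sessionId: string): Promise<SessionData | null>;
  save(sessionId: string, data: SessionData): Promise<void>;
}

// Map-backed stand-in so the example runs without a Redis instance.
class InMemoryStore implements SessionStore {
  private map = new Map<string, string>();
  async load(id: string): Promise<SessionData | null> {
    const raw = this.map.get(id);
    return raw ? (JSON.parse(raw) as SessionData) : null;
  }
  async save(id: string, data: SessionData): Promise<void> {
    this.map.set(id, JSON.stringify(data));
  }
}

// Per-request flow: load the session (or start fresh), run the handler
// with that context, then persist the updated session before responding.
async function handleRequest(
  store: SessionStore,
  sessionId: string,
  handler: (s: SessionData) => SessionData
): Promise<SessionData> {
  const session = (await store.load(sessionId)) ?? { context: {} };
  const updated = handler(session);
  await store.save(sessionId, updated);
  return updated;
}
```

The key property: because the session is written back before the response is sent, any pod that receives the next request sees the latest state.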

What About Session Hydration?

One tricky bit: when a new pod picks up a session that was started on a different pod, it needs to fully reconstruct the MCP server state from the stored session data. This means you need to be intentional about what you serialize.

For my use case, I stored:

  • Active tool registrations (as descriptors, not function references)
  • Current conversation context
  • Client metadata

Function references obviously can’t be serialized, so on session load I re-register tools, using the stored descriptors as lookup keys into the pod’s local tool registry.
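The rehydration step looks roughly like this. The registry contents and descriptor shape here are hypothetical, standing in for however your server maps tool names to implementations:

```typescript
// A tool implementation is a plain function; these live in code,
// identically on every pod, and are never serialized.
type ToolImpl = (args: Record<string, unknown>) => unknown;

// Pod-local registry of every tool this server build knows about.
// (Example tools only — your registry holds your actual tools.)
const toolRegistry = new Map<string, ToolImpl>([
  ["search_docs", (args) => `searched: ${args.query}`],
  ["run_report", (args) => `report for ${args.id}`],
]);

// Descriptors are what survive serialization: just names and any
// JSON-safe config, never the functions themselves.
interface StoredSession {
  toolDescriptors: string[];
}

// On session load, resolve each stored descriptor back to a live
// function. Unknown descriptors (say, a tool removed in a newer
// deploy) are skipped rather than crashing the session.
function rehydrateTools(session: StoredSession): Map<string, ToolImpl> {
  const active = new Map<string, ToolImpl>();
  for (const name of session.toolDescriptors) {
    const impl = toolRegistry.get(name);
    if (impl) active.set(name, impl);
  }
  return active;
}
```

One design consequence worth noting: this only works if every pod runs the same build, so the registry is identical everywhere — descriptors are pointers into code, not code itself.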

Kubernetes Config

On the infra side, it’s straightforward. I added a Redis deployment (or used a managed Redis — ElastiCache works great for this) and exposed it as a ClusterIP service. The MCP server pods get REDIS_URL via a Kubernetes Secret.

env:
  - name: REDIS_URL
    valueFrom:
      secretKeyRef:
        name: mcp-secrets
        key: redis-url
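For reference, the ClusterIP Service in front of Redis might look something like this — names here are illustrative, not pulled from my actual manifests:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mcp-redis
spec:
  type: ClusterIP
  selector:
    app: mcp-redis
  ports:
    - port: 6379
      targetPort: 6379
```

With that in place, the `REDIS_URL` secret is just `redis://mcp-redis:6379` (add credentials if your Redis requires auth).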

The Result

After the change, scaling from 1 to 4 replicas was a non-event. Sessions are durable across pod restarts, requests can land on any pod, and the whole thing handles pod churn gracefully.

The extra latency from Redis is negligible — we’re talking sub-millisecond for a local Redis call, which disappears in the noise of an LLM inference round-trip.

If you’re running MCP in production at any meaningful scale, external session storage isn’t optional — it’s just part of the architecture.