🐳 Production MCP Servers: Streamable HTTP, OAuth 2.0, Zod Validation, and Docker

Graduate from stdio toy servers to a hardened, authenticated, containerised MCP service ready for the real world


Hi 👋, I'm Tushar Patil. I currently work as a Frontend Developer (Angular) and also have experience with .NET Core and .NET Framework.


This is Part 5 of the AI Engineering with TypeScript series.
Prerequisites: Part 1 · Part 2 · Part 3 · Part 4
Stack: Node.js 20+ · Express 5 · @modelcontextprotocol/sdk v1.x · Zod · TypeScript 5.x · Docker


🗺️ What we'll cover

Everything we built in Parts 1–4 used stdio transport — the MCP server ran as a child process on the same machine. That works great for local tools and CLI agents, but falls apart when you need to:

  • Deploy your server to a cloud host 🌐
  • Let multiple clients connect concurrently 🔌
  • Gate access behind real authentication 🔐
  • Validate inputs before they ever touch your business logic 🛡️
  • Ship a container image rather than asking people to clone a repo 🐳

In Part 5 we rebuild our weather server with Streamable HTTP transport, add OAuth 2.0 Bearer token authentication, harden every tool input with Zod schemas, and wrap the whole thing in a Dockerfile ready for production.

By the end you'll have:

  • 🌐 An MCP server served over HTTP with proper POST /mcp and GET /mcp (SSE) endpoints
  • 🔐 OAuth 2.0 middleware that validates Bearer tokens on every request
  • ✅ Zod schemas that validate tool inputs before execution
  • 🐳 A minimal, multi-stage Docker image you can push to any registry
  • 🔌 An updated MCP client that connects over HTTP instead of stdio

🌐 Part 1: Streamable HTTP Transport — What and Why

MCP originally shipped two transports: stdio (child process pipes) and HTTP+SSE (server-sent events, now legacy). The Streamable HTTP transport, introduced in MCP spec 2025-03-26, replaces HTTP+SSE with a single HTTP endpoint that carries traffic in both directions; stdio remains available for local use:

  • POST /mcp — client sends a JSON-RPC request, server responds (or opens a stream for long responses)
  • GET /mcp — client opens a persistent SSE connection to receive server-initiated messages

This means your MCP server is now a plain HTTP service. It can live behind a load balancer, be deployed to AWS/GCP/Fly.io, and serve hundreds of concurrent clients. All the standard HTTP infrastructure — TLS termination, rate limiting, API gateways — just works. 🎉
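To make the wire format concrete, here is a minimal sketch of the POST a client sends for one JSON-RPC call. The function name buildMcpPost and its parameters are illustrative assumptions, not SDK names; the headers follow the Streamable HTTP rules (the client accepts both JSON and SSE, and echoes a session ID when the server issued one).

```typescript
// Shape of one Streamable HTTP request, usable as the `init` argument to fetch().
interface McpHttpRequest {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the POST a client sends for a single JSON-RPC call.
// `token` and `sessionId` are illustrative parameters, not SDK names.
function buildMcpPost(
  rpcMethod: string,
  params: Record<string, unknown>,
  id: number,
  token: string,
  sessionId?: string
): McpHttpRequest {
  const headers: Record<string, string> = {
    "Content-Type": "application/json",
    // The client must accept both a plain JSON reply and an SSE stream
    Accept: "application/json, text/event-stream",
    Authorization: `Bearer ${token}`,
  };
  // Stateful servers hand out a session ID at initialization; echo it back if present
  if (sessionId) headers["Mcp-Session-Id"] = sessionId;

  return {
    method: "POST",
    headers,
    body: JSON.stringify({ jsonrpc: "2.0", id, method: rpcMethod, params }),
  };
}

const listToolsRequest = buildMcpPost("tools/list", {}, 1, "token-alice");
console.log(listToolsRequest.headers, listToolsRequest.body);
```

In practice the SDK's client transport builds these requests for you; the sketch only shows what travels over the wire.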


📦 Part 2: Project Setup

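mkdir mcp-weather-http && cd mcp-weather-http
npm init -y
# Use ES modules and add the build script that the Dockerfile's `npm run build` expects
npm pkg set type=module scripts.build="tsc" scripts.start="node dist/server.js"
# zod-to-json-schema (installed in Part 4) targets Zod 3, so pin the major version
npm install @modelcontextprotocol/sdk express zod@3 dotenv
npm install -D typescript @types/express @types/node tsx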

Your tsconfig.json:

{
  "compilerOptions": {
    "target": "ES2022",
    "module": "NodeNext",
    "moduleResolution": "NodeNext",
    "outDir": "dist",
    "strict": true,
    "esModuleInterop": true
  },
  "include": ["src"]
}

Your .env file (never commit this 🚨):

PORT=3000
VALID_TOKENS=token-alice,token-bob,token-service-account
WEATHER_API_KEY=your_openweathermap_key_here
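
A misconfigured environment is easier to debug when the process refuses to start. The sketch below validates the three variables above at startup; loadConfig and AppConfig are hypothetical names introduced here, and the rules (a valid port, at least one token, a non-empty API key) are assumptions about what this server needs.

```typescript
interface AppConfig {
  port: number;
  validTokens: Set<string>;
  weatherApiKey: string;
}

// Read and validate the three variables from an env-like object.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const port = Number(env.PORT ?? "3000");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer between 1 and 65535, got "${env.PORT}"`);
  }

  // Comma-separated list; surrounding whitespace and empty entries are dropped
  const validTokens = new Set(
    (env.VALID_TOKENS ?? "")
      .split(",")
      .map((t) => t.trim())
      .filter(Boolean)
  );
  if (validTokens.size === 0) {
    throw new Error("VALID_TOKENS must list at least one token");
  }

  const weatherApiKey = (env.WEATHER_API_KEY ?? "").trim();
  if (!weatherApiKey) {
    throw new Error("WEATHER_API_KEY is required");
  }

  return { port, validTokens, weatherApiKey };
}

// Example with the values from the .env file above
const exampleConfig = loadConfig({
  PORT: "3000",
  VALID_TOKENS: "token-alice,token-bob,token-service-account",
  WEATHER_API_KEY: "your_openweathermap_key_here",
});
console.log(exampleConfig.port, exampleConfig.validTokens.size);
```

In the real server you would call loadConfig(process.env) once, right after dotenv has loaded.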

🔐 Part 3: OAuth 2.0 Middleware

In a real system your authorization server mints JWTs and you verify them with its public key. For clarity, this part uses opaque Bearer tokens validated against a list. The middleware shape stays the same: swap the list lookup for jsonwebtoken.verify() when you're ready.

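// src/auth.ts
import type { Request, Response, NextFunction } from "express";

// Parsed once at startup from a comma-separated env var
const VALID_TOKENS = new Set(
  (process.env.VALID_TOKENS ?? "")
    .split(",")
    .map((t) => t.trim())
    .filter(Boolean)
);

export function bearerAuth(req: Request, res: Response, next: NextFunction) {
  const authHeader = req.headers.authorization ?? "";

  if (!authHeader.startsWith("Bearer ")) {
    // RFC 6750: missing credentials get a 401 with a Bearer challenge
    res.status(401).set("WWW-Authenticate", "Bearer").json({
      error: "unauthorized",
      message: "Missing or malformed Authorization header",
    });
    return;
  }

  const token = authHeader.slice(7).trim();

  if (!VALID_TOKENS.has(token)) {
    // An unknown or expired token is also a 401; 403 is reserved for insufficient scope
    res.status(401).set("WWW-Authenticate", 'Bearer error="invalid_token"').json({
      error: "invalid_token",
      message: "Invalid Bearer token",
    });
    return;
  }

  next();
}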

Three lines of real-world advice here. Never log the raw token — log a hash or the first 8 characters only. Keep VALID_TOKENS in an env var or secrets manager, never in source code. When you switch to JWTs, verify the signature and the exp claim — a valid signature on an expired token is still a rejected token. 🔐
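
As a minimal sketch of the "log a hash" advice, the helper below derives a short fingerprint from a token with SHA-256. The name tokenFingerprint and the 12-character length are assumptions chosen for illustration.

```typescript
import { createHash } from "node:crypto";

// Short, non-reversible identifier that is safe to write to logs.
// The same token always yields the same fingerprint, so log lines can be correlated.
function tokenFingerprint(token: string): string {
  return createHash("sha256").update(token).digest("hex").slice(0, 12);
}

// Example log line: the raw token never appears in the output
const fingerprint = tokenFingerprint("token-alice");
console.log(`auth rejected token_fp=${fingerprint}`);
```

Because the fingerprint is stable, you can still group failed attempts by token without ever storing the secret itself.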


✅ Part 4: Zod Schemas for Tool Inputs

Zod gives you runtime validation with TypeScript types inferred for free. Define your schemas once and use them in both the MCP tool inputSchema and the execution handler:

// src/schemas.ts
import { z } from "zod";

export const GetWeatherInput = z.object({
  city: z.string().min(1).max(100).describe("City name"),
  country: z
    .string()
    .length(2)
    .toUpperCase()
    .optional()
    .describe("ISO 3166-1 alpha-2 country code, e.g. IN"),
  units: z
    .enum(["metric", "imperial", "standard"])
    .default("metric")
    .describe("Temperature unit system"),
});

export const GetForecastInput = z.object({
  city: z.string().min(1).max(100),
  days: z.number().int().min(1).max(7).default(5),
});

export type GetWeatherInputType = z.infer<typeof GetWeatherInput>;
export type GetForecastInputType = z.infer<typeof GetForecastInput>;

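The McpServer.tool() helper used in Part 5 accepts a Zod shape directly and converts it for you. If you register tools through the SDK's low-level Server class instead, you need the JSON Schema object that MCP's inputSchema field expects, and this helper produces it from any Zod schema: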

// src/utils.ts
import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";

export function toInputSchema(schema: z.ZodTypeAny) {
  return zodToJsonSchema(schema, { target: "openApi3" }) as Record<string, unknown>;
}

Install the converter:

npm install zod-to-json-schema
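
For orientation, this is roughly the JSON Schema a client sees for get_current_weather in a tools/list response. It is a hand-written equivalent of GetWeatherInput, not actual converter output, so details such as key order or extra metadata may differ.

```typescript
// Hand-written JSON Schema mirroring GetWeatherInput (illustrative, not converter output)
const getWeatherJsonSchema = {
  type: "object",
  properties: {
    city: { type: "string", minLength: 1, maxLength: 100, description: "City name" },
    country: {
      type: "string",
      minLength: 2,
      maxLength: 2,
      description: "ISO 3166-1 alpha-2 country code, e.g. IN",
    },
    units: {
      type: "string",
      enum: ["metric", "imperial", "standard"],
      default: "metric",
      description: "Temperature unit system",
    },
  },
  // Only city is mandatory: country is optional and units has a default
  required: ["city"],
} as const;

// A tool definition as it would appear in a tools/list response
const toolDefinition = {
  name: "get_current_weather",
  description: "Get the current weather for a city",
  inputSchema: getWeatherJsonSchema,
};

console.log(JSON.stringify(toolDefinition, null, 2));
```

The describe() calls in the Zod schema matter here: they become the description fields the model reads when deciding how to fill in arguments.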

🌐 Part 5: Building the HTTP MCP Server

// src/server.ts
import "dotenv/config";
import express from "express";
import type { Request, Response } from "express";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
import { bearerAuth } from "./auth.js";
import {
  GetWeatherInput,
  GetForecastInput,
} from "./schemas.js";
import { fetchCurrentWeather, fetchForecast } from "./weather.js";

const app = express();

// Authenticate first, then parse the JSON body, for every /mcp request
app.use("/mcp", bearerAuth, express.json());

// Build an MCP server with both tools registered
function buildServer() {
  const server = new McpServer({
    name: "weather-http-server",
    version: "1.0.0",
  });

  // The SDK validates arguments against the Zod shape before the handler runs;
  // the explicit parse() is a second guard that also applies defaults
  server.tool(
    "get_current_weather",
    "Get the current weather for a city",
    GetWeatherInput.shape,
    async (args) => {
      const input = GetWeatherInput.parse(args);
      const data = await fetchCurrentWeather(input.city, input.country, input.units);
      return {
        content: [{ type: "text", text: JSON.stringify(data, null, 2) }],
      };
    }
  );

  server.tool(
    "get_forecast",
    "Get a multi-day weather forecast for a city",
    GetForecastInput.shape,
    async (args) => {
      const input = GetForecastInput.parse(args);
      const data = await fetchForecast(input.city, input.days);
      return {
        content: [{ type: "text", text: JSON.stringify(data, null, 2) }],
      };
    }
  );

  return server;
}

// Stateless mode: each request gets its own server and transport,
// so concurrent clients never share JSON-RPC state
async function handleMcp(req: Request, res: Response) {
  const server = buildServer();
  const transport = new StreamableHTTPServerTransport({
    sessionIdGenerator: undefined, // no session IDs yet
  });
  // Release both objects once the response (or SSE stream) ends
  res.on("close", () => {
    void transport.close();
    void server.close();
  });
  await server.connect(transport);
  // express.json() already consumed the request stream, so pass the parsed body
  await transport.handleRequest(req, res, req.body);
}

// Mount POST and GET handlers
// (without sessions, the GET stream carries no server-initiated messages)
app.post("/mcp", handleMcp);
app.get("/mcp", handleMcp);

// Health check, no auth required
app.get("/health", (_req, res) => res.json({ status: "ok" }));

const PORT = Number(process.env.PORT ?? 3000);
app.listen(PORT, () => {
  console.log(`Weather MCP server running on http://localhost:${PORT}`);
});

Notice that bearerAuth is mounted as Express middleware ahead of the transport handlers, so an unauthenticated request is rejected before it ever reaches the MCP server. The /health endpoint sits outside the /mcp path so load balancers and Kubernetes liveness probes can reach it without a token. ✅


🌐 Part 6: Updating the Client for HTTP Transport

Switching the client from stdio to HTTP comes down to swapping one transport class:

// src/http-client.ts
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

export async function createHttpMcpClient(serverUrl: string, token: string) {
  const transport = new StreamableHTTPClientTransport(
    new URL(serverUrl),
    {
      requestInit: {
        headers: { Authorization: `Bearer ${token}` },
      },
    }
  );

  const client = new Client(
    { name: "weather-agent", version: "1.0.0" },
    { capabilities: { sampling: {} } }
  );

  await client.connect(transport);
  console.log("Connected to HTTP MCP server");
  return client;
}

Every request the client makes — listTools, callTool, readResource — will automatically carry the Authorization header. The rest of your agent code from Parts 3 and 4 needs zero changes. 🎉
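
One detail to watch: serverUrl has to point at the endpoint path itself (for this server, the /mcp route), not just the host. A small helper like the sketch below can normalise a base URL from configuration; mcpEndpoint is a hypothetical name, and the "/mcp" default simply mirrors the route used in this article.

```typescript
// Resolve the MCP endpoint from a base URL such as "http://mcp-server:3000".
// The "/mcp" default mirrors the route the server in this article exposes.
function mcpEndpoint(baseUrl: string, path = "/mcp"): URL {
  const url = new URL(baseUrl);
  // Drop trailing slashes so "/mcp/" and "/mcp" are treated the same
  const current = url.pathname.replace(/\/+$/, "");
  // Append the path only when the URL does not already end with it
  url.pathname = current.endsWith(path) ? current : `${current}${path}`;
  return url;
}

console.log(mcpEndpoint("http://mcp-server:3000").href);
console.log(mcpEndpoint("http://localhost:3000/mcp/").href);
```

You could then pass mcpEndpoint(process.env.MCP_SERVER_URL ?? "http://localhost:3000").href to createHttpMcpClient, so the same environment variable works with or without the path.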


🐳 Part 7: Dockerfile

A multi-stage build keeps the final image lean — only compiled JS and production node_modules make it in:

# Stage 1: build
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY tsconfig.json ./
COPY src ./src
RUN npm run build

# Stage 2: runtime
FROM node:20-alpine AS runtime
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=builder /app/dist ./dist
EXPOSE 3000
ENV NODE_ENV=production
CMD ["node", "dist/server.js"]

Build and run it:

docker build -t weather-mcp-server .

docker run -p 3000:3000 \
  -e VALID_TOKENS=token-alice,token-bob \
  -e WEATHER_API_KEY=your_key \
  weather-mcp-server

Test the health endpoint:

curl http://localhost:3000/health
# {"status":"ok"}

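Test an MCP tool call directly. The Accept header matters: the Streamable HTTP transport rejects a POST that does not accept both application/json and text/event-stream:

curl -X POST http://localhost:3000/mcp \
  -H "Authorization: Bearer token-alice" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"get_current_weather","arguments":{"city":"Pune"}}}'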
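
By default the server may answer a POST with an SSE-framed body rather than bare JSON. The following sketch shows one way to pull the JSON-RPC messages out of such a body; parseSseMessages is a hypothetical helper and the sample string is made up for illustration, not captured server output.

```typescript
// Extract JSON payloads from an SSE response body.
function parseSseMessages(body: string): unknown[] {
  const messages: unknown[] = [];
  // SSE events are separated by a blank line
  for (const rawEvent of body.split(/\r?\n\r?\n/)) {
    const dataLines = rawEvent
      .split(/\r?\n/)
      .filter((line) => line.startsWith("data:"))
      // Strip the field name and the single optional space after the colon
      .map((line) => line.slice(5).replace(/^ /, ""));
    if (dataLines.length === 0) continue;
    // Several data lines within one event are joined with newlines
    messages.push(JSON.parse(dataLines.join("\n")));
  }
  return messages;
}

// Illustrative body with a single JSON-RPC result
const sampleBody =
  'event: message\ndata: {"jsonrpc":"2.0","id":1,"result":{"content":[{"type":"text","text":"{}"}]}}\n\n';
console.log(parseSseMessages(sampleBody));
```

The SDK client handles this framing for you; a parser like this is only useful when you are poking at the endpoint by hand or from a script.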

🐳 Part 8: docker-compose for Local Development

Running the server and a test client together locally is easiest with Compose:

version: "3.9"
services:
  mcp-server:
    build: .
    ports:
      - "3000:3000"
    environment:
      - VALID_TOKENS=dev-token
      - WEATHER_API_KEY=${WEATHER_API_KEY}
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:3000/health"]
      interval: 10s
      timeout: 5s
      retries: 3

  agent:
    build:
      context: ../weather-agent
    environment:
      - MCP_SERVER_URL=http://mcp-server:3000/mcp
      - MCP_TOKEN=dev-token
    depends_on:
      mcp-server:
        condition: service_healthy

docker compose up

The depends_on with service_healthy ensures the agent container only starts after the MCP server passes its health check. No more race conditions on startup. ✅


🛡️ Part 9: Input Validation Deep Dive

Let's look at what happens when a bad input hits your server — with and without Zod.

Without Zod:

// args could be anything
async (args) => {
  const weather = await fetchCurrentWeather(args.city, args.country, args.units);
  // If args.city is undefined, the failure happens deep inside fetchCurrentWeather
  // The client gets an opaque error that may leak internal details
}

With Zod:

async (args) => {
  const input = GetWeatherInput.parse(args);
  // If city is missing, a ZodError is thrown here with a clear, field-level message
  // The caller receives a structured MCP error instead of a crash or stack trace
}

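Add a global Zod error handler in Express. It covers ZodErrors thrown from your own Express routes and middleware:

import type { Request, Response, NextFunction } from "express";
import { ZodError } from "zod";

app.use((err: unknown, _req: Request, res: Response, _next: NextFunction) => {
  if (err instanceof ZodError) {
    res.status(400).json({
      error: "validation_error",
      issues: err.issues.map((i) => ({
        path: i.path.join("."),
        message: i.message,
      })),
    });
    return;
  }
  console.error(err);
  res.status(500).json({ error: "internal_server_error" });
});

Now a bad request to an Express route gets a clean 400 with field-level error messages, and your server internals stay hidden. Errors thrown inside a tool handler take a different path: the MCP SDK catches them and reports them in the JSON-RPC response, so they never reach this middleware. 🛡️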
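
For failures inside a tool call, the MCP-level equivalent of a 400 is a tool result with isError set. The sketch below turns validation issues into that shape; ValidationIssue is a minimal stand-in for the Zod issue fields used here, and toToolError is a hypothetical helper name.

```typescript
// Minimal stand-in for the Zod issue fields used here
interface ValidationIssue {
  path: (string | number)[];
  message: string;
}

// Result shape for a failed MCP tool call: isError plus readable text content
interface ToolErrorResult {
  isError: true;
  content: { type: "text"; text: string }[];
}

// Convert validation issues into a tool result the model can read and act on
function toToolError(issues: ValidationIssue[]): ToolErrorResult {
  const lines = issues.map((i) => `${i.path.join(".") || "(root)"}: ${i.message}`);
  return {
    isError: true,
    content: [{ type: "text", text: `Invalid arguments:\n${lines.join("\n")}` }],
  };
}

const exampleError = toToolError([{ path: ["city"], message: "Required" }]);
console.log(JSON.stringify(exampleError, null, 2));
```

Inside a handler you could wrap GetWeatherInput.safeParse(args) and return toToolError(result.error.issues) on failure, which gives the calling model a message it can use to correct its arguments.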


🏗️ Part 10: Final Project Structure

mcp-weather-http/
├── src/
│   ├── server.ts          ← Express app + MCP server
│   ├── auth.ts            ← Bearer token middleware
│   ├── schemas.ts         ← Zod input schemas
│   ├── weather.ts         ← OpenWeatherMap API calls
│   └── utils.ts           ← zodToJsonSchema helper
├── Dockerfile
├── docker-compose.yml
├── .env                   ← never commit!
├── .env.example           ← commit this instead
├── package.json
└── tsconfig.json

💡 Production Checklist

Before you ship this to a real environment, run through this list:

  • 🔐 Replace opaque tokens with signed JWTs and verify exp, iss, aud
  • 🔒 Terminate TLS at your load balancer or reverse proxy — never serve HTTP in production
  • 📊 Add request logging with correlation IDs (use pino or winston)
  • ⚡ Add rate limiting per token (use express-rate-limit)
  • 🏥 Add a /ready endpoint in addition to /health for Kubernetes readiness probes
  • 📦 Pin your base Docker image to a specific digest, not just node:20-alpine
  • 🔍 Scan your image with docker scout or trivy before pushing to a registry
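
To illustrate the first checklist item, here is a rough sketch of the exp, iss, and aud checks on an already-decoded JWT payload. It assumes the signature has been verified beforehand by a JWT library; checkClaims, the interface names, and the example issuer and audience values are all hypothetical.

```typescript
// Subset of registered JWT claims checked here
interface JwtClaims {
  exp?: number; // expiry, seconds since the Unix epoch
  iss?: string;
  aud?: string | string[];
}

interface ExpectedClaims {
  issuer: string;
  audience: string;
}

// Returns a list of problems; an empty list means the claims are acceptable.
// Run this only AFTER the token's signature has been verified.
function checkClaims(
  claims: JwtClaims,
  expected: ExpectedClaims,
  nowSeconds: number = Math.floor(Date.now() / 1000)
): string[] {
  const problems: string[] = [];

  if (typeof claims.exp !== "number" || claims.exp <= nowSeconds) {
    problems.push("token expired or exp missing");
  }
  if (claims.iss !== expected.issuer) {
    problems.push("unexpected issuer");
  }
  // aud may be a single string or an array of strings
  const audiences = Array.isArray(claims.aud) ? claims.aud : claims.aud ? [claims.aud] : [];
  if (!audiences.includes(expected.audience)) {
    problems.push("audience mismatch");
  }
  return problems;
}

// Hypothetical issuer and audience values
const expectedClaims = { issuer: "https://auth.example.com", audience: "weather-mcp" };
console.log(
  checkClaims({ exp: 2000, iss: "https://auth.example.com", aud: "weather-mcp" }, expectedClaims, 1000)
);
```

Most JWT libraries can perform these checks during verification when you pass the expected issuer and audience as options; the sketch just spells out what those checks amount to.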

🎯 Summary

In Part 5 you promoted the weather server from a local stdio script to a production-grade HTTP service:

  • 🌐 Streamable HTTP transport: proper POST /mcp and GET /mcp endpoints
  • 🔐 OAuth 2.0 Bearer auth: every request is authenticated before it reaches the MCP server
  • ✅ Zod validation: field-level validation errors instead of crashes on bad input
  • 🐳 Multi-stage Docker: lean runtime image, no dev dependencies shipped
  • 🔌 HTTP client: a transport swap moves your agent from stdio to HTTP

In Part 6 we'll add multi-tenant session management to the HTTP server — so each connected client gets isolated state, tool call history, and resource caches. We'll also explore horizontal scaling and what that means for MCP session stickiness. 🏗️


📚 Further Reading