🤖🤖 Multi-Agent Orchestration with MCP: Spawn, Delegate, and Aggregate

One agent cannot do everything well. Build an orchestrator that decomposes tasks, spawns specialist sub-agents, runs them in parallel, and synthesises their results into a single coherent answer

Hi 👋, I'm Tushar Patil. I currently work as a Frontend Developer (Angular) and also have expertise in .NET Core and .NET Framework.


This is Part 14 of the AI Engineering with TypeScript series.

Prerequisites: Part 3 (AI Agent) · Part 8 (MCP Client SDK) · Part 13 (Real-Time Agents)

Stack: Node.js 20+ · TypeScript 5.x · @anthropic-ai/sdk · @modelcontextprotocol/sdk v1.x · Zod


πŸ—ΊοΈ What we'll cover

Throughout this series the agent has always been a single Claude instance β€” one model, one tool loop, one answer. That works beautifully for focused tasks. But complex real-world requests often span multiple domains simultaneously:

"Research what our internal docs say about deployment, check the current weather in the target region, summarise the findings, and write the result back to the knowledge base."

A single agent tackles these steps sequentially. A multi-agent system runs the independent ones in parallel: specialists work simultaneously while an orchestrator collects their results and synthesises a final answer. How much wall-clock time this saves depends on how many steps are independent; the worked example in Part 5 saves about 25%, and workloads with more independent steps save more. 🚀

In Part 14 we build exactly this. The key insight is that sub-agents are just MCP tool calls from the orchestrator's perspective. The orchestrator calls a spawn_agent tool, the tool runs a full agent loop internally, and returns a string result, just like any other tool. The orchestrator never needs to know about Claude API calls, streaming, or message history inside the sub-agent.

By the end you will have:

  • 🏭 A SubAgentRunner: a self-contained class that runs a full agent loop and returns a typed result
  • 🔧 A spawn_agent MCP tool: the interface the orchestrator uses to delegate tasks
  • 🤖 An orchestrator agent (runOrchestrator): the top-level Claude instance that decomposes tasks, calls spawn_agent in parallel, and synthesises results
  • ⚡ Parallel execution: multiple sub-agents running simultaneously with Promise.allSettled
  • 🛡️ Failure isolation: one sub-agent failing never crashes the orchestrator
  • 📊 Result aggregation: structured output from each sub-agent fed back to the orchestrator as context
  • 🧪 A test harness for multi-agent systems using InMemoryTransport

🧠 Part 1: The Architecture (Why This Pattern Works)

The orchestrator pattern maps naturally onto MCP's tool abstraction:

User prompt
    │
    ▼
Orchestrator (Claude instance)
    │ decides to call spawn_agent three times
    │
    ├─── spawn_agent({ role: "researcher", task: "find docs on deployment" })
    │         └─── Sub-agent A runs independently
    │                   └─── calls search_knowledge_base
    │                   └─── returns findings
    │
    ├─── spawn_agent({ role: "weather-analyst", task: "check Mumbai weather" })
    │         └─── Sub-agent B runs independently
    │                   └─── calls get_current_weather + get_forecast
    │                   └─── returns weather summary
    │
    └─── spawn_agent({ role: "writer", task: "draft the final report" })
              └─── Sub-agent C runs after A and B complete
                        └─── receives A and B results as context
                        └─── calls index_document
                        └─── returns confirmation

The orchestrator sees only tool results: clean strings. It has no visibility into the sub-agent's message history or individual tool calls beyond what the result string chooses to report. This separation means you can improve any sub-agent independently without touching the orchestrator. 🎯
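The diagram implies an ordering rule: A and B have no dependencies and can share a batch, while C must wait for both. In this series the orchestrator model makes that decision itself, but the rule can be sketched in plain code. The PlannedTask type and planBatches function below are hypothetical names for illustration, not part of the series' codebase:

```typescript
// Hypothetical shape for illustration only.
interface PlannedTask {
  id: string;
  role: string;
  dependsOn: string[]; // ids of tasks whose output this task needs as context
}

// Group tasks into batches. Every task in a batch can run in parallel,
// and a batch only starts once all earlier batches have finished.
function planBatches(tasks: PlannedTask[]): PlannedTask[][] {
  const done = new Set<string>();
  const batches: PlannedTask[][] = [];
  let remaining = [...tasks];

  while (remaining.length > 0) {
    // A task is ready when everything it depends on has already been scheduled.
    const ready = remaining.filter((t) => t.dependsOn.every((d) => done.has(d)));
    if (ready.length === 0) {
      throw new Error("Circular or unknown dependency between tasks");
    }
    for (const t of ready) done.add(t.id);
    batches.push(ready);
    remaining = remaining.filter((t) => !done.has(t.id));
  }
  return batches;
}

const batches = planBatches([
  { id: "A", role: "researcher", dependsOn: [] },
  { id: "B", role: "weather-analyst", dependsOn: [] },
  { id: "C", role: "writer", dependsOn: ["A", "B"] },
]);
console.log(batches.map((b) => b.map((t) => t.id)));
```

For the three tasks in the diagram this yields two batches: A and B together, then C on its own.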


🏭 Part 2: The SubAgentRunner

The runner encapsulates a complete agent loop. It takes a task description, a set of available tools, and a system prompt defining the agent's role. It runs until the model stops requesting tool calls and returns the final text output:

// src/agents/sub-agent-runner.ts
import Anthropic from "@anthropic-ai/sdk";
import type { McpClientWrapper } from "@techtush/mcp-client";

export interface SubAgentConfig {
  role: string;
  systemPrompt: string;
  task: string;
  context?: string;          // results from sibling agents to use as context
  maxToolCalls?: number;     // safety cap: prevents runaway loops
  timeoutMs?: number;
}

export interface SubAgentResult {
  role: string;
  task: string;
  output: string;
  toolCallCount: number;
  durationMs: number;
  success: boolean;
  error?: string;
}

const anthropic = new Anthropic();

export class SubAgentRunner {
  constructor(private readonly client: McpClientWrapper) {}

  async run(config: SubAgentConfig): Promise<SubAgentResult> {
    const start = Date.now();
    const maxToolCalls = config.maxToolCalls ?? 10;

    const tools = this.client.getTools().map((t) => ({
      name: t.name,
      description: t.description ?? "",
      input_schema: t.inputSchema as Anthropic.Tool["input_schema"],
    }));

    const userContent = config.context
      ? `Context from other agents:\n${config.context}\n\nYour task: ${config.task}`
      : config.task;

    const messages: Anthropic.MessageParam[] = [
      { role: "user", content: userContent },
    ];

    let toolCallCount = 0;

    try {
      while (true) {
        if (toolCallCount >= maxToolCalls) {
          throw new Error(
            `Sub-agent "${config.role}" exceeded max tool calls (${maxToolCalls})`
          );
        }

        const response = await Promise.race([
          anthropic.messages.create({
            model: "claude-sonnet-4-20250514",
            max_tokens: 2048,
            tools,
            messages,
            system: config.systemPrompt,
          }),
          this.timeout(config.timeoutMs ?? 60_000, config.role),
        ]);

        messages.push({ role: "assistant", content: response.content });

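        // Any stop reason other than "tool_use" (end_turn, max_tokens, ...) ends the loop;
        // otherwise a truncated response would keep this loop spinning forever.
        if (response.stop_reason !== "tool_use") {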
          const output = response.content
            .filter((b): b is Anthropic.TextBlock => b.type === "text")
            .map((b) => b.text)
            .join("\n");

          return {
            role: config.role,
            task: config.task,
            output,
            toolCallCount,
            durationMs: Date.now() - start,
            success: true,
          };
        }

        if (response.stop_reason === "tool_use") {
          const toolResults: Anthropic.ToolResultBlockParam[] = [];

          for (const block of response.content) {
            if (block.type !== "tool_use") continue;
            toolCallCount++;

            const resultText = await this.callTool(block.name, block.input as Record<string, unknown>);

            toolResults.push({
              type: "tool_result",
              tool_use_id: block.id,
              content: resultText,
            });
          }

          messages.push({ role: "user", content: toolResults });
        }
      }
    } catch (err) {
      return {
        role: config.role,
        task: config.task,
        output: "",
        toolCallCount,
        durationMs: Date.now() - start,
        success: false,
        error: err instanceof Error ? err.message : String(err),
      };
    }
  }

  private async callTool(name: string, input: Record<string, unknown>): Promise<string> {
    try {
      const result = await this.client["raw"].callTool({ name, arguments: input });
      const text = result.content
        .filter((c) => c.type === "text")
        .map((c) => (c as { type: "text"; text: string }).text)
        .join("\n");
      return result.isError ? `Error: ${text}` : text;
    } catch (err) {
      return `Tool ${name} failed: ${err instanceof Error ? err.message : String(err)}`;
    }
  }

  private timeout(ms: number, role: string): Promise<never> {
    return new Promise((_, reject) =>
      setTimeout(() => reject(new Error(`Sub-agent "${role}" timed out after ${ms}ms`)), ms)
    );
  }
}

Two safety mechanisms deserve attention. The maxToolCalls cap prevents a runaway sub-agent from calling tools in an infinite loop; you always want a hard ceiling. The Promise.race against a timeout means a hung model call cannot block the orchestrator forever (note that the timeout applies to each model call, not to the sub-agent's whole run). Both are essential in a production multi-agent system. 🛡️
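One detail worth considering: a bare Promise.race leaves its timer pending after the model call has already returned. A minimal sketch of a helper that clears the timer follows; withTimeout and sleep are hypothetical names, and this is one possible shape rather than the series' implementation. It stops waiting but does not cancel the underlying request, so the SDK's own per-request timeout or abort options may be a better fit where available.

```typescript
// Race a promise against a timer, and always clear the timer afterwards so a
// finished call does not leave a pending timeout behind.
async function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    clearTimeout(timer);
  }
}

// Stand-in for a slow model or tool call.
const sleep = (ms: number): Promise<string> =>
  new Promise((resolve) => setTimeout(() => resolve("done"), ms));

async function demo(): Promise<string[]> {
  const fast = await withTimeout(sleep(10), 200, "fast-agent");
  const slow = await withTimeout(sleep(200), 20, "slow-agent").catch(
    (e: Error) => e.message
  );
  return [fast, slow];
}

demo().then((results) => console.log(results));
```

The first call resolves normally; the second rejects with the timeout error, which the demo converts to its message string.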


🔧 Part 3: The spawn_agent MCP Tool

Now expose sub-agent execution as an MCP tool that the orchestrator can call:

// src/tools/spawn-agent.ts
import { z } from "zod";
import type { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import type { McpClientWrapper } from "@techtush/mcp-client";
import { SubAgentRunner } from "../agents/sub-agent-runner.js";

const SPECIALIST_PROMPTS: Record<string, string> = {
  researcher:
    "You are a research specialist. Your only job is to search the knowledge base thoroughly and return a comprehensive summary of relevant findings. Always call search_knowledge_base at least once before answering.",
  "weather-analyst":
    "You are a weather analysis specialist. Check current conditions and forecast, then summarise weather impact on the user's plans. Always use both get_current_weather and get_forecast.",
  writer:
    "You are a technical writer. Given context from other agents, synthesise a clear, structured document and store it using index_document. Return a confirmation with the document summary.",
  analyst:
    "You are a data analyst. Given findings from other agents, identify patterns, risks, and recommendations. Return a structured analysis.",
};

export function registerSpawnAgentTool(server: McpServer, client: McpClientWrapper): void {
  const runner = new SubAgentRunner(client);

  server.tool(
    "spawn_agent",
    "Spawn a specialist sub-agent to handle a specific part of a complex task. The sub-agent runs independently and returns its result as a string. Use this to parallelise work across specialist roles.",
    {
      role: z
        .enum(["researcher", "weather-analyst", "writer", "analyst"])
        .describe("The specialist role for this sub-agent"),
      task: z
        .string()
        .min(10)
        .describe("Clear, specific task description for the sub-agent"),
      context: z
        .string()
        .optional()
        .describe("Optional context from other agents to pass to this sub-agent"),
      timeout_seconds: z
        .number()
        .int()
        .min(10)
        .max(120)
        .default(60)
        .describe("Maximum seconds to wait for this sub-agent"),
    },
    async (args) => {
      const systemPrompt = SPECIALIST_PROMPTS[args.role];

      const result = await runner.run({
        role: args.role,
        systemPrompt,
        task: args.task,
        context: args.context,
        timeoutMs: args.timeout_seconds * 1000,
      });

      if (!result.success) {
        return {
          isError: true,
          content: [
            {
              type: "text",
              text: `Sub-agent "${args.role}" failed: ${result.error ?? "unknown error"}`,
            },
          ],
        };
      }

      const summary = [
        `[Sub-agent: ${result.role}]`,
        `Task: ${result.task}`,
        `Duration: ${result.durationMs}ms (${result.toolCallCount} tool calls)`,
        ``,
        result.output,
      ].join("\n");

      return {
        content: [{ type: "text", text: summary }],
      };
    }
  );
}

The tool returns a structured string that includes the sub-agent's role, task, duration, and output. When the orchestrator receives this, it has full context about which specialist produced which result, which is essential for synthesis. ✅
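Because the summary has a fixed header, it can also be parsed back into fields for logging or test assertions. The sketch below assumes the exact four-line header built by the tool above and a single-line task; parseSubAgentSummary and ParsedSummary are hypothetical names introduced here.

```typescript
interface ParsedSummary {
  role: string;
  task: string;
  durationMs: number;
  toolCallCount: number;
  output: string;
}

// Parse the header written by spawn_agent:
//   [Sub-agent: <role>] / Task: <task> / Duration: <n>ms (<m> tool calls) / blank / output
// Returns null when the text does not follow that layout.
function parseSubAgentSummary(text: string): ParsedSummary | null {
  const lines = text.split("\n");
  const role = /^\[Sub-agent: (.+)\]$/.exec(lines[0] ?? "");
  const task = /^Task: (.*)$/.exec(lines[1] ?? "");
  const stats = /^Duration: (\d+)ms \((\d+) tool calls\)$/.exec(lines[2] ?? "");
  if (!role || !task || !stats) return null;

  return {
    role: role[1] ?? "",
    task: task[1] ?? "",
    durationMs: Number(stats[1]),
    toolCallCount: Number(stats[2]),
    output: lines.slice(4).join("\n"), // everything after the blank separator line
  };
}

const sample = [
  "[Sub-agent: researcher]",
  "Task: Find docs on Redis session TTL",
  "Duration: 3241ms (2 tool calls)",
  "",
  "Sessions use a 30-minute sliding TTL.",
].join("\n");

console.log(parseSubAgentSummary(sample));
```

If you need this in production, returning structured JSON from the tool would be more robust than parsing text; the string format is kept here because that is what the orchestrator model reads.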


🤖 Part 4: The Orchestrator Agent

The orchestrator is the top-level Claude instance. It receives the user's complex request, decides how to decompose it into specialist tasks, calls spawn_agent for each, and synthesises the collected results:

// src/agents/orchestrator.ts
import Anthropic from "@anthropic-ai/sdk";
import type { McpClientWrapper } from "@techtush/mcp-client";

const anthropic = new Anthropic();

const ORCHESTRATOR_SYSTEM = `You are an orchestration agent. Your job is to:
1. Decompose complex tasks into specialist sub-tasks
2. Call spawn_agent for each sub-task; run independent tasks IN PARALLEL by calling spawn_agent multiple times before reading results
3. Pass relevant results as context when one agent depends on another's output
4. Synthesise all results into a final coherent answer

Available specialist roles:
- researcher: searches the knowledge base
- weather-analyst: fetches and analyses weather data
- writer: synthesises findings into a document and indexes it
- analyst: analyses data and produces recommendations

Always explain your decomposition strategy before spawning agents.`;

export async function runOrchestrator(
  userQuery: string,
  client: McpClientWrapper,
  onToken?: (text: string) => void
): Promise<string> {
  const tools = client.getTools().map((t) => ({
    name: t.name,
    description: t.description ?? "",
    input_schema: t.inputSchema as Anthropic.Tool["input_schema"],
  }));

  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: userQuery },
  ];

  let finalAnswer = "";

  while (true) {
    // Use streaming so the user sees orchestrator reasoning in real time (Part 4 pattern)
    const stream = await anthropic.messages.stream({
      model: "claude-sonnet-4-20250514",
      max_tokens: 4096,
      tools,
      messages,
      system: ORCHESTRATOR_SYSTEM,
    });

    for await (const event of stream) {
      if (
        event.type === "content_block_delta" &&
        event.delta.type === "text_delta" &&
        onToken
      ) {
        onToken(event.delta.text);
      }
    }

    const response = await stream.finalMessage();
    messages.push({ role: "assistant", content: response.content });

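    // Treat every stop reason except "tool_use" as final (end_turn, max_tokens, ...);
    // otherwise a truncated response would loop forever.
    if (response.stop_reason !== "tool_use") {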
      finalAnswer = response.content
        .filter((b): b is Anthropic.TextBlock => b.type === "text")
        .map((b) => b.text)
        .join("\n");
      break;
    }

    if (response.stop_reason === "tool_use") {
      // Collect ALL tool calls from this response
      const toolBlocks = response.content.filter((b) => b.type === "tool_use");

      // Run ALL tool calls in parallel; this is where the speedup comes from
      const toolResults = await Promise.allSettled(
        toolBlocks.map(async (block) => {
          if (block.type !== "tool_use") return null;

          const result = await client["raw"].callTool({
            name: block.name,
            arguments: block.input as Record<string, unknown>,
          });

          const text = result.content
            .filter((c) => c.type === "text")
            .map((c) => (c as { type: "text"; text: string }).text)
            .join("\n");

          return {
            type: "tool_result" as const,
            tool_use_id: block.id,
            content: result.isError ? `Error: ${text}` : text,
          };
        })
      );

      // Assemble results: failed calls get an error message, not a throw
      const resultBlocks: Anthropic.ToolResultBlockParam[] = toolResults
        .map((r, i) => {
          const block = toolBlocks[i];
          if (block.type !== "tool_use") return null;

          if (r.status === "fulfilled" && r.value) {
            return r.value;
          }

          return {
            type: "tool_result" as const,
            tool_use_id: block.id,
            content: `Tool call failed: ${r.status === "rejected" ? r.reason : "unknown error"}`,
          };
        })
        .filter(Boolean) as Anthropic.ToolResultBlockParam[];

      messages.push({ role: "user", content: resultBlocks });
    }
  }

  return finalAnswer;
}

The critical line is await Promise.allSettled(toolBlocks.map(...)). When the orchestrator calls spawn_agent three times in one response, all three sub-agents run simultaneously. Promise.allSettled (not Promise.all) means one failing sub-agent never rejects the entire batch: you get partial results and the orchestrator synthesises what it has. 🛡️
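The same pattern can be seen in isolation, without the Anthropic or MCP SDKs. The ToolCall and ToolResult interfaces below are simplified local stand-ins (assumptions for this sketch), and runAll is a hypothetical name:

```typescript
// Simplified stand-ins for a tool_use block and a tool_result block.
interface ToolCall {
  id: string;
  run: () => Promise<string>;
}

interface ToolResult {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error?: boolean;
}

// Start every call at once, wait for all of them to settle, and turn each
// outcome into a result block. A rejection becomes an error string, not a throw.
async function runAll(calls: ToolCall[]): Promise<ToolResult[]> {
  const settled = await Promise.allSettled(calls.map((c) => c.run()));

  return calls.map((call, i): ToolResult => {
    const outcome = settled[i];
    if (outcome?.status === "fulfilled") {
      return { type: "tool_result", tool_use_id: call.id, content: outcome.value };
    }
    const reason: unknown = outcome?.reason;
    const message = reason instanceof Error ? reason.message : String(reason);
    return {
      type: "tool_result",
      tool_use_id: call.id,
      content: `Tool call failed: ${message}`,
      is_error: true,
    };
  });
}

async function demo(): Promise<ToolResult[]> {
  return runAll([
    { id: "toolu_1", run: async () => "Redis TTL is 30 minutes" },
    { id: "toolu_2", run: async () => { throw new Error("weather API timeout"); } },
  ]);
}

demo().then((results) => console.log(results));
```

Every call id still gets exactly one result, which matters because the model expects a tool_result for each tool_use it issued.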


⚡ Part 5: Parallel vs Sequential (The Real Performance Impact)

Let's make the speedup concrete. Without parallelism:

Task A: research docs            → 3.2s
Task B: check weather            → 1.8s
Task C: write report (needs A+B) → 2.1s
─────────────────────────────────────
Sequential total                 → 7.1s

With parallel orchestration:

Task A + Task B run simultaneously → max(3.2, 1.8) = 3.2s
Task C runs after A+B complete     → 2.1s
─────────────────────────────────────
Parallel total                     → 5.3s  (25% faster)

For longer tasks or more sub-agents the gains compound. While sub-agent A is calling the vector DB, sub-agent B is hitting the weather API, so the wall-clock time is the sum of the slowest task in each batch, not the sum of all steps. ⚡
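The arithmetic above generalises to any batch layout. Here is a small sketch using integer milliseconds; the function names are hypothetical, and it ignores orchestrator reasoning time between batches:

```typescript
// Each inner array is one batch of sub-agent durations in milliseconds.
type Batches = number[][];

// Sequential: every task runs one after another.
function sequentialMs(batches: Batches): number {
  return batches.reduce(
    (total, batch) => total + batch.reduce((sum, ms) => sum + ms, 0),
    0
  );
}

// Parallel: each batch costs only as much as its slowest task.
function parallelMs(batches: Batches): number {
  return batches.reduce((total, batch) => total + Math.max(0, ...batch), 0);
}

// Percentage of wall-clock time saved, rounded to the nearest whole number.
function percentSaved(batches: Batches): number {
  const sequential = sequentialMs(batches);
  if (sequential === 0) return 0;
  return Math.round((1 - parallelMs(batches) / sequential) * 100);
}

// The example from this section: A and B in parallel, then C.
const example: Batches = [[3200, 1800], [2100]];
console.log(sequentialMs(example)); // 7100
console.log(parallelMs(example));   // 5300
console.log(percentSaved(example)); // 25
```

A workload with four fully independent tasks of equal length, for instance [[3000, 3000, 3000, 3000]], saves 75% under the same model, which is why the benefit depends so heavily on how much of the work is independent.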

The orchestrator system prompt includes a critical instruction: "run independent tasks IN PARALLEL by calling spawn_agent multiple times before reading results." Without this nudge, Claude tends to call one tool, wait for the result, then decide to call the next: sequential by default. The explicit instruction shifts it toward batch spawning. 💡


🎬 Part 6: A Real Orchestration Run

Let's trace a real end-to-end run with this query:

"Research what our internal docs say about Redis session TTL,
 check the current weather in Pune, and write a combined
 summary document to the knowledge base."

Terminal output with streaming:

🤖 Orchestrator: I'll decompose this into three specialist tasks and
run the research and weather analysis in parallel before writing.

  🔧 spawn_agent({ role: "researcher", task: "Find all information about Redis session TTL..." })
  🔧 spawn_agent({ role: "weather-analyst", task: "Get current weather in Pune, IN..." })

[both sub-agents running simultaneously...]

  ✅ researcher (3241ms, 2 tool calls):
     [Sub-agent: researcher]
     Found: Sessions use 30-minute sliding TTL via Redis set_config...

  ✅ weather-analyst (1893ms, 2 tool calls):
     [Sub-agent: weather-analyst]
     Pune: 31°C, partly cloudy. Weekend forecast: light rain Saturday...

  🔧 spawn_agent({
       role: "writer",
       task: "Write a combined summary...",
       context: "[researcher output]\n[weather-analyst output]"
     })

  ✅ writer (2108ms, 1 tool call):
     [Sub-agent: writer]
     Document indexed as "session-ttl-pune-weather-summary.md" (312 words)

🤖 Orchestrator: I've completed all three tasks. The researcher found
that Redis session TTL is set to 30 minutes with sliding expiry...
Meanwhile, Pune is 31°C and partly cloudy today...
A combined summary has been saved to the knowledge base. ✅

Wall-clock time: 7.2 seconds (research + weather in parallel: 3.2s, then write: 2.1s, plus orchestrator reasoning: 1.9s). Run sequentially, the same work would have taken around 9 seconds, because the 1.9s weather step would no longer overlap with the research step. 🚀


πŸ›‘οΈ Part 7: Failure Isolation in Practice

What happens when the weather API is down? Without isolation:

spawn_agent(researcher) → success
spawn_agent(weather-analyst) → throws → Promise.all rejects → everything fails

With Promise.allSettled:

spawn_agent(researcher) → success: findings[]
spawn_agent(weather-analyst) → failure: "OpenWeatherMap API timeout"
─────────────────────────────────────────────────────────────────────
Orchestrator receives both results (one error string, one success)
Orchestrator: "I couldn't retrieve weather data due to an API timeout.
Based on the research findings alone, here is what I can tell you..."

The user gets a partial but useful answer. The orchestrator explains what it could and could not retrieve. No crash, no silent failure, no generic error message. This is the correct behaviour for production AI systems. ✅


🧪 Part 8: Testing Multi-Agent Systems

Multi-agent tests can be expensive: each test run fires multiple Claude API calls. The strategy is to mock sub-agent results at the spawn_agent tool level, so the orchestrator logic stays testable without paying for a full agent loop per sub-agent:

// src/__tests__/orchestrator.test.ts
import { describe, it, expect, vi } from "vitest";
import { z } from "zod";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { InMemoryTransport } from "@modelcontextprotocol/sdk/inMemory.js";
import type { McpClientWrapper } from "@techtush/mcp-client";
import { runOrchestrator } from "../agents/orchestrator.js";

// A mock spawn_agent tool that returns deterministic results
// without firing real sub-agent loops
async function setupMockOrchestrationServer(): Promise<Client> {
  const server = new McpServer({ name: "mock-orchestration", version: "0.0.1" });

  server.tool(
    "spawn_agent",
    "Spawn a specialist sub-agent",
    {
      // server.tool expects a Zod shape here, not a JSON Schema object
      role: z.string(),
      task: z.string(),
      context: z.string().optional(),
    },
    async (args) => {
      // Return deterministic mock results per role
      const mockResults: Record<string, string> = {
        researcher:
          "[Sub-agent: researcher]\nFound: Redis TTL is 30 minutes with sliding expiry.",
        "weather-analyst":
          "[Sub-agent: weather-analyst]\nPune: 31C, partly cloudy.",
        writer:
          "[Sub-agent: writer]\nDocument indexed: summary.md",
        analyst:
          "[Sub-agent: analyst]\nAnalysis: conditions are favourable.",
      };

      const role = args.role as string;
      return {
        content: [{ type: "text", text: mockResults[role] ?? "No mock result for this role" }],
      };
    }
  );

  const [clientTransport, serverTransport] = InMemoryTransport.createLinkedPair();
  await server.connect(serverTransport);

  const client = new Client({ name: "test-client", version: "0.0.1" }, { capabilities: {} });
  await client.connect(clientTransport);

  return client;
}

describe("Orchestrator tool decomposition", () => {
  it("calls spawn_agent for researcher and weather-analyst for compound queries", async () => {
    const client = await setupMockOrchestrationServer();
    const spawnCalls: string[] = [];

    // Spy on callTool to record which roles are spawned
    const originalCallTool = client.callTool.bind(client);
    vi.spyOn(client, "callTool").mockImplementation(async (params) => {
      if (params.name === "spawn_agent" && params.arguments?.role) {
        spawnCalls.push(params.arguments.role as string);
      }
      return originalCallTool(params);
    });

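    // Run a compound query through the orchestrator.
    // runOrchestrator expects the Part 8 wrapper, so adapt the raw client here.
    // Assumption: the wrapper only needs getTools() and a `raw` client, as used in runOrchestrator.
    // The orchestrator's own model call is real, so ANTHROPIC_API_KEY must be set.
    const { tools } = await client.listTools();
    const wrapper = { getTools: () => tools, raw: client } as unknown as McpClientWrapper;
    await runOrchestrator("Research Redis TTL docs and check Pune weather", wrapper);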

    expect(spawnCalls).toContain("researcher");
    expect(spawnCalls).toContain("weather-analyst");

    await client.close();
  });
});

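InMemoryTransport again proves its value: real MCP protocol, real tool calls, zero network overhead. The test verifies that the orchestrator decomposes the task into the right specialist calls without running any real sub-agent loops. Note that runOrchestrator itself still calls the Claude API once per turn, so this test is cheaper rather than free; stub the Anthropic client as well if you need zero API cost in CI. ✅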
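The orchestrator's own model call can be taken out of the test as well if the loop receives the model as a parameter. The following is a simplified sketch of that idea, not the series' runOrchestrator: ModelTurn, ModelFn, ToolFn, runLoop, and scriptedModel are all hypothetical names, and the transcript is reduced to plain strings.

```typescript
// Hypothetical, simplified shapes; the real Anthropic types are richer.
type ToolArgs = Record<string, unknown>;
type ModelTurn =
  | { kind: "final"; text: string }
  | { kind: "tool_calls"; calls: { name: string; args: ToolArgs }[] };
type ModelFn = (transcript: string[]) => Promise<ModelTurn>;
type ToolFn = (name: string, args: ToolArgs) => Promise<string>;

// Agent loop with the model and the tool caller injected, so a test can
// replace both with deterministic fakes.
async function runLoop(
  model: ModelFn,
  callTool: ToolFn,
  prompt: string,
  maxTurns = 5
): Promise<string> {
  const transcript = [prompt];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await model(transcript);
    if (reply.kind === "final") return reply.text;

    const settled = await Promise.allSettled(
      reply.calls.map((c) => callTool(c.name, c.args))
    );
    for (const s of settled) {
      transcript.push(
        s.status === "fulfilled" ? s.value : `Tool call failed: ${String(s.reason)}`
      );
    }
  }
  throw new Error(`No final answer after ${maxTurns} turns`);
}

// A fake model that replays a fixed script of turns.
function scriptedModel(turns: ModelTurn[]): ModelFn {
  let index = 0;
  return async () => {
    const next = turns[index++];
    if (!next) throw new Error("Script exhausted");
    return next;
  };
}

async function demo(): Promise<{ answer: string; spawned: string[] }> {
  const spawned: string[] = [];
  const model = scriptedModel([
    {
      kind: "tool_calls",
      calls: [
        { name: "spawn_agent", args: { role: "researcher" } },
        { name: "spawn_agent", args: { role: "weather-analyst" } },
      ],
    },
    { kind: "final", text: "Combined summary" },
  ]);
  const callTool: ToolFn = async (_name, args) => {
    spawned.push(String(args["role"]));
    return `mock result for ${String(args["role"])}`;
  };
  const answer = await runLoop(model, callTool, "Research Redis TTL and check Pune weather");
  return { answer, spawned };
}

demo().then((result) => console.log(result));
```

A scripted model checks the loop mechanics (batching, result plumbing, termination), not whether the real model chooses a sensible decomposition; that still needs a live call or an eval.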


πŸ—οΈ Part 9: Wiring Into the Production Server

Add spawn_agent to your existing MCP server alongside the weather and knowledge tools:

// src/server.ts (updated from Part 5)
import { registerSpawnAgentTool } from "./tools/spawn-agent.js";
import { McpClientFactory } from "@techtush/mcp-client";

// The spawn_agent tool needs a client to call sub-agent tools with
// Create a self-referential client that connects back to the same server
const selfClient = await new McpClientFactory()
  .named("orchestrator-internal", "1.0.0")
  .withRetries(2)
  .withTimeout(90_000)
  .connectHttp({
    url: `http://localhost:${process.env.PORT ?? 3000}`,
    token: process.env.INTERNAL_SERVICE_TOKEN!,
  });

// Register all tools
registerWeatherTools(server);
registerKnowledgeTools(server, sessionId);
registerSpawnAgentTool(server, selfClient);  // 👈 adds spawn_agent

A self-referential client (the MCP server connecting to itself) is the cleanest way to give sub-agents access to all registered tools without duplicating tool registration logic. Add INTERNAL_SERVICE_TOKEN to your env vars and include it in VALID_TOKENS. 🔑
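One thing to watch with a self-referential client: if its tool list includes spawn_agent, a sub-agent could spawn further sub-agents. Whether that happens depends on when the client reads the tool list, which the wrapper's internals determine. A defensive sketch is to filter the list before handing it to a sub-agent; toolsForSubAgent and ToolInfo below are hypothetical names, not part of the series' SDK.

```typescript
// Minimal tool shape for this sketch; real MCP tool entries carry more fields.
interface ToolInfo {
  name: string;
  description?: string;
}

// Return the tools a sub-agent may use, dropping any blocked names.
// By default this removes spawn_agent so sub-agents cannot recurse.
function toolsForSubAgent<T extends ToolInfo>(
  tools: T[],
  blocked: string[] = ["spawn_agent"]
): T[] {
  const deny = new Set(blocked);
  return tools.filter((tool) => !deny.has(tool.name));
}

const registered: ToolInfo[] = [
  { name: "get_current_weather" },
  { name: "get_forecast" },
  { name: "search_knowledge_base" },
  { name: "index_document" },
  { name: "spawn_agent" },
];

console.log(toolsForSubAgent(registered).map((t) => t.name));
```

In SubAgentRunner this filter would sit in front of the mapping from getTools() to the Anthropic tool format.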


💡 Key Takeaways

Sub-agents are just tool calls. The orchestrator does not know or care that spawn_agent internally runs a full Claude message loop. From the orchestrator's perspective it is identical to calling get_current_weather. This abstraction keeps the orchestrator clean and makes sub-agents independently testable.

Promise.allSettled over Promise.all. Always. One sub-agent failing should never prevent the orchestrator from receiving results from the others. Partial information is almost always better than a full crash.

Prompt the orchestrator to batch. Claude's default behaviour is sequential tool calls. The system prompt instruction to "run independent tasks IN PARALLEL" is what unlocks batch spawning. Without it, you get the sequential pattern and lose the performance benefit.

Cap tool calls and timeouts per sub-agent. Without caps, a misconfigured sub-agent can loop indefinitely or run for minutes. maxToolCalls = 10 and timeoutMs = 60000 are sensible defaults; tune them based on your tools' typical latency.

Keep specialist prompts focused. A researcher that also tries to write documents is worse than a researcher that only searches and a writer that only writes. Narrow role definitions produce more reliable, more testable sub-agents.


🎯 Summary

In Part 14 you built a complete multi-agent orchestration system on MCP:

  • 🏭 SubAgentRunner: encapsulates a full agent loop with maxToolCalls and timeout safety
  • 🔧 spawn_agent MCP tool: the clean interface the orchestrator uses to delegate specialist tasks
  • 🤖 Orchestrator agent (runOrchestrator): streaming Claude instance that decomposes, delegates, and synthesises
  • ⚡ Promise.allSettled parallelism: all independent sub-agents run simultaneously
  • 🛡️ Failure isolation: one failed sub-agent yields a partial answer, not a crash
  • 🧪 Mock orchestration tests: InMemoryTransport + deterministic mock tools, no sub-agent API calls in CI

This is the final building block of the series. Over 14 parts you went from understanding what MCP is to running a distributed, observable, multi-tenant, multi-agent AI system deployed to production. 🎉


πŸ† Full Series Recap

Parts 1–2: MCP concepts, tools, resources, prompts, capability negotiation Parts 3–4: Agent loop, multi-step tool calls, streaming, interactive CLI Parts 5–6: HTTP transport, OAuth, Zod validation, Docker, multi-tenant sessions, Redis Part 7: Observability β€” pino, OpenTelemetry, Prometheus, Grafana Part 8: Reusable TypeScript client SDK, npm publish Part 9: Production deployment β€” Koyeb, Railway, Render, Fly.io, GitHub Actions CD Parts 10–11: RAG with pgvector and Qdrant, multi-tenant row-level security Part 12: Eval framework β€” faithfulness, relevance, precision, CI gate Part 13: Real-time event bus, SSE, live React dashboard Part 14: Multi-agent orchestration β€” spawn, delegate, aggregate πŸ€–πŸ€–

