---
title: "Deploying MCP Servers to Production"
description: "Deploy MCP servers to production with Docker containers, cloud platforms, process management, health checks, logging, and monitoring using the official TypeScript SDK and mcp-framework."
order: 18
level: "advanced"
duration: "35 min"
keywords:
  - "MCP production deployment"
  - "MCP Docker"
  - "MCP cloud deployment"
  - "MCP process management"
  - "MCP health checks"
  - "MCP monitoring"
  - "deploy MCP server"
  - "mcp-framework deployment"
  - "@modelcontextprotocol/sdk production"
  - "MCP server Docker"
date: "2026-04-01"
---

# Deploying MCP Servers to Production
Moving an MCP server from development to production requires containerization, process management, health monitoring, and proper logging. This lesson covers Docker packaging, cloud deployment strategies (AWS, GCP, Railway, Fly.io), process managers like PM2, structured logging, health check endpoints, and graceful shutdown handling. You will learn patterns that work with both the official TypeScript SDK and mcp-framework.
## Production Readiness Checklist

Before deploying an MCP server, verify these requirements:

- **Error handling is comprehensive.** Every tool, resource, and prompt handler catches errors and returns structured responses. No unhandled exception can crash the server.
- **Environment configuration is externalized.** All secrets, API keys, and configuration values come from environment variables or a config service — never hardcoded.
- **Logging is structured and appropriate.** Use structured JSON logging. Log to stderr for stdio servers. Include request IDs for tracing.
- **Health checks are implemented.** HTTP transports expose a health endpoint. Stdio servers handle health check messages.
- **Graceful shutdown is handled.** The server cleans up connections, flushes logs, and closes database connections on SIGTERM/SIGINT.
- **Tests pass in CI.** Unit and integration tests run in your CI pipeline. No test is skipped or flaky.
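The "externalized configuration" item can be enforced at startup with a fail-fast validation helper. This is a minimal sketch — the variable names (`MCP_API_KEY`, `DATABASE_URL`) match the examples later in this lesson but are otherwise illustrative:

```typescript
// config.ts — validate required environment variables at startup so a
// misconfigured deployment fails immediately instead of at first request.
const REQUIRED_VARS = ["MCP_API_KEY", "DATABASE_URL"] as const;

export interface Config {
  apiKey: string;
  databaseUrl: string;
  port: number;
}

export function loadConfig(
  env: Record<string, string | undefined> = process.env
): Config {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(
      `Missing required environment variables: ${missing.join(", ")}`
    );
  }
  return {
    apiKey: env.MCP_API_KEY!,
    databaseUrl: env.DATABASE_URL!,
    port: parseInt(env.PORT || "3001", 10), // same default as the Dockerfile below
  };
}
```

Call `loadConfig()` once at the top of your entry point, before constructing the server, so the process exits with a clear error when a secret is missing.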
## Docker Containerization

### Dockerfile for MCP Servers

```dockerfile
# Build stage
FROM node:20-alpine AS builder
WORKDIR /app

# Copy package files first for better layer caching
COPY package.json package-lock.json ./
RUN npm ci --ignore-scripts

# Copy source and build
COPY tsconfig.json ./
COPY src/ ./src/
RUN npm run build

# Prune dev dependencies
RUN npm prune --production

# Production stage
FROM node:20-alpine AS production

# Security: run as a non-root user
RUN addgroup -S mcp && adduser -S mcp -G mcp
WORKDIR /app

# Copy only production artifacts
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./

# Set environment
ENV NODE_ENV=production
ENV MCP_TRANSPORT=sse
ENV PORT=3001

# Switch to the non-root user
USER mcp
EXPOSE 3001

# Health check (alpine ships BusyBox wget, which only supports short options)
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
  CMD wget -q --spider http://localhost:3001/health || exit 1

CMD ["node", "dist/index.js"]
```
Always use multi-stage builds for MCP servers. The build stage contains TypeScript, dev dependencies, and source files. The production stage contains only the compiled JavaScript and production dependencies. This reduces image size by 60-80% and minimizes the attack surface.
### Docker Compose for Development

```yaml
# docker-compose.yml
services:
  mcp-server:
    build: .
    ports:
      - "3001:3001"
    environment:
      - NODE_ENV=production
      - MCP_TRANSPORT=sse
      - PORT=3001
      - DATABASE_URL=postgresql://postgres:password@db:5432/mcpdata
      - MCP_API_KEY=${MCP_API_KEY}
    depends_on:
      db:
        condition: service_healthy
    restart: unless-stopped

  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: mcpdata
      POSTGRES_PASSWORD: password
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

volumes:
  pgdata:
```
### .dockerignore

```text
node_modules
dist
.git
.env
.env.*
*.md
tests
coverage
.github
```
## Health Checks

### HTTP Health Endpoint

For SSE and Streamable HTTP servers, add a dedicated health endpoint:

```typescript
import express from "express";
// `db` and `cache` are assumed to be your application's database and cache
// clients (e.g. a pg Pool and a Redis client), initialized elsewhere.

const app = express();

// Health check endpoint
app.get("/health", async (req, res) => {
  const checks: Record<string, { status: string; latency?: number }> = {};

  // Check database
  try {
    const start = Date.now();
    await db.query("SELECT 1");
    checks.database = { status: "ok", latency: Date.now() - start };
  } catch {
    checks.database = { status: "error" };
  }

  // Check cache
  try {
    const start = Date.now();
    await cache.ping();
    checks.cache = { status: "ok", latency: Date.now() - start };
  } catch {
    checks.cache = { status: "error" };
  }

  const allHealthy = Object.values(checks).every((c) => c.status === "ok");

  res.status(allHealthy ? 200 : 503).json({
    status: allHealthy ? "healthy" : "degraded",
    version: process.env.npm_package_version || "unknown",
    uptime: process.uptime(),
    checks,
    timestamp: new Date().toISOString(),
  });
});

// Liveness probe (minimal check — is the process alive?)
app.get("/healthz", (req, res) => {
  res.status(200).json({ status: "alive" });
});

// Readiness probe (is the server ready to accept traffic?)
app.get("/readyz", async (req, res) => {
  try {
    await db.query("SELECT 1");
    res.status(200).json({ status: "ready" });
  } catch {
    res.status(503).json({ status: "not ready" });
  }
});
```
A liveness probe checks if the process is alive and should be restarted if it fails. A readiness probe checks if the server can accept traffic — it may be alive but not ready (e.g., waiting for database connection). Kubernetes and other orchestrators use these differently.
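If you deploy to Kubernetes, the `/healthz` and `/readyz` endpoints map directly onto probe configuration. This is a sketch of the relevant Pod spec fragment — the image name is a placeholder, and the intervals are illustrative:

```yaml
# Pod spec fragment — wiring the probes above to Kubernetes
containers:
  - name: mcp-server
    image: mcp-server:latest   # placeholder image
    ports:
      - containerPort: 3001
    livenessProbe:             # restart the container if this fails
      httpGet:
        path: /healthz
        port: 3001
      initialDelaySeconds: 10
      periodSeconds: 30
    readinessProbe:            # remove from service endpoints if this fails
      httpGet:
        path: /readyz
        port: 3001
      periodSeconds: 10
```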
## Structured Logging

### Implementing a Logger

```typescript
// src/logger.ts
type LogLevel = "debug" | "info" | "warn" | "error";

const LOG_LEVELS: Record<LogLevel, number> = {
  debug: 0,
  info: 1,
  warn: 2,
  error: 3,
};

// Fall back to "info" when LOG_LEVEL is unset or unrecognized
const currentLevel =
  LOG_LEVELS[(process.env.LOG_LEVEL as LogLevel) || "info"] ?? LOG_LEVELS.info;

export function log(
  level: LogLevel,
  message: string,
  data?: Record<string, unknown>
) {
  if (LOG_LEVELS[level] < currentLevel) return;

  const entry = {
    timestamp: new Date().toISOString(),
    level,
    message,
    ...data,
    pid: process.pid,
    service: "mcp-server",
  };

  // Always log to stderr — stdout is reserved for the MCP protocol
  console.error(JSON.stringify(entry));
}

export const logger = {
  debug: (msg: string, data?: Record<string, unknown>) => log("debug", msg, data),
  info: (msg: string, data?: Record<string, unknown>) => log("info", msg, data),
  warn: (msg: string, data?: Record<string, unknown>) => log("warn", msg, data),
  error: (msg: string, data?: Record<string, unknown>) => log("error", msg, data),
};
```
### Logging in Tool Handlers

```typescript
import { z } from "zod";
import { logger } from "./logger.js";
// `server` is your McpServer instance and `processBatch` your business
// logic, both defined elsewhere in the application.

server.tool(
  "process-data",
  "Process a data batch",
  { batchId: z.string() },
  async ({ batchId }) => {
    const requestId = crypto.randomUUID();

    logger.info("Tool invoked", {
      tool: "process-data",
      requestId,
      batchId,
    });

    const startTime = Date.now();

    try {
      const result = await processBatch(batchId);

      logger.info("Tool completed", {
        tool: "process-data",
        requestId,
        batchId,
        durationMs: Date.now() - startTime,
        resultCount: result.length,
      });

      return {
        content: [{ type: "text", text: JSON.stringify(result) }],
      };
    } catch (error) {
      logger.error("Tool failed", {
        tool: "process-data",
        requestId,
        batchId,
        durationMs: Date.now() - startTime,
        error: error instanceof Error ? error.message : String(error),
        stack: error instanceof Error ? error.stack : undefined,
      });

      return {
        content: [{ type: "text", text: `Processing failed: ${error}` }],
        isError: true,
      };
    }
  }
);
```
In stdio MCP servers, stdout is the protocol channel. Any non-JSON-RPC data on stdout will break the client connection. Use console.error() or write to stderr explicitly. This rule applies even when using logging libraries — configure them to output to stderr.
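One defensive pattern — a sketch, not part of the SDK — is to reroute `console.log` to stderr when running over stdio, so a stray debug print left in a dependency or handler cannot corrupt the JSON-RPC stream. `MCP_TRANSPORT` is the environment variable convention used throughout this lesson:

```typescript
// Reroute console.log to stderr in stdio mode so accidental prints cannot
// break the protocol stream on stdout.
export function protectStdout() {
  const transport = process.env.MCP_TRANSPORT || "stdio"; // stdio is the default
  if (transport === "stdio") {
    console.log = (...args: unknown[]) => console.error(...args);
  }
}
```

Call `protectStdout()` first thing in your entry point, before any other module runs. This is a safety net, not a substitute for using the structured logger.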
## Graceful Shutdown

```typescript
// src/shutdown.ts
import { logger } from "./logger.js";

type CleanupFn = () => Promise<void>;

const cleanupFns: CleanupFn[] = [];

export function onShutdown(fn: CleanupFn) {
  cleanupFns.push(fn);
}

async function shutdown(signal: string) {
  logger.info("Shutdown initiated", { signal });

  const timeout = setTimeout(() => {
    logger.error("Forced shutdown after timeout");
    process.exit(1);
  }, 10_000); // 10s grace period

  for (const fn of cleanupFns) {
    try {
      await fn();
    } catch (error) {
      logger.error("Cleanup error", {
        error: error instanceof Error ? error.message : String(error),
      });
    }
  }

  clearTimeout(timeout);
  logger.info("Shutdown complete");
  process.exit(0);
}

process.on("SIGTERM", () => shutdown("SIGTERM"));
process.on("SIGINT", () => shutdown("SIGINT"));
```

Usage in server setup (`server`, `db`, and `cache` are the MCP server, database pool, and cache client created during startup):

```typescript
import { onShutdown } from "./shutdown.js";

onShutdown(async () => {
  await server.close();
  logger.info("MCP server closed");
});

onShutdown(async () => {
  await db.end();
  logger.info("Database connections closed");
});

onShutdown(async () => {
  await cache.quit();
  logger.info("Cache connection closed");
});
```
## Cloud Deployment

### Deploying to Railway

`railway.json`:

```json
{
  "build": {
    "builder": "DOCKERFILE",
    "dockerfilePath": "Dockerfile"
  },
  "deploy": {
    "startCommand": "node dist/index.js",
    "healthcheckPath": "/health",
    "healthcheckTimeout": 10,
    "restartPolicyType": "ON_FAILURE",
    "restartPolicyMaxRetries": 5
  }
}
```
### Deploying to Fly.io

```toml
# fly.toml
app = "my-mcp-server"
primary_region = "iad"

[build]
  dockerfile = "Dockerfile"

[env]
  MCP_TRANSPORT = "sse"
  NODE_ENV = "production"
  LOG_LEVEL = "info"

[http_service]
  internal_port = 3001
  force_https = true

  [[http_service.checks]]
    grace_period = "10s"
    interval = "30s"
    method = "GET"
    path = "/health"
    timeout = "5s"

[[vm]]
  size = "shared-cpu-1x"
  memory = "256mb"
```
### AWS ECS Task Definition

```json
{
  "family": "mcp-server",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "256",
  "memory": "512",
  "containerDefinitions": [
    {
      "name": "mcp-server",
      "image": "your-account.dkr.ecr.region.amazonaws.com/mcp-server:latest",
      "essential": true,
      "portMappings": [
        { "containerPort": 3001, "protocol": "tcp" }
      ],
      "environment": [
        { "name": "MCP_TRANSPORT", "value": "sse" },
        { "name": "NODE_ENV", "value": "production" }
      ],
      "secrets": [
        {
          "name": "MCP_API_KEY",
          "valueFrom": "arn:aws:secretsmanager:region:account:secret:mcp-api-key"
        }
      ],
      "healthCheck": {
        "command": ["CMD-SHELL", "wget -q --spider http://localhost:3001/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3,
        "startPeriod": 10
      },
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/mcp-server",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "mcp"
        }
      }
    }
  ]
}
```
| Platform | Best For | MCP Transport | Estimated Cost |
|---|---|---|---|
| Railway | Quick deployments, small teams | SSE, Streamable HTTP | $5-20/mo |
| Fly.io | Edge deployments, global distribution | SSE, Streamable HTTP | $5-30/mo |
| AWS ECS/Fargate | Enterprise, existing AWS infra | SSE, Streamable HTTP | $10-50/mo |
| Google Cloud Run | Auto-scaling, pay-per-request | Streamable HTTP | $0-20/mo |
| Self-hosted Docker | Full control, on-premise | Any transport | Hardware costs |
## Process Management with PM2

For servers running on VMs or bare metal:

```javascript
// ecosystem.config.cjs
module.exports = {
  apps: [
    {
      name: "mcp-server",
      script: "dist/index.js",
      instances: 1, // MCP servers are typically single-instance
      exec_mode: "fork",
      autorestart: true,
      watch: false,
      max_memory_restart: "500M",
      env_production: {
        NODE_ENV: "production",
        MCP_TRANSPORT: "sse",
        PORT: 3001,
        LOG_LEVEL: "info",
      },
      error_file: "/var/log/mcp-server/error.log",
      out_file: "/var/log/mcp-server/out.log",
      merge_logs: true,
      log_date_format: "YYYY-MM-DD HH:mm:ss Z",
    },
  ],
};
```
```bash
# Start in production
pm2 start ecosystem.config.cjs --env production

# Monitor
pm2 monit

# View logs
pm2 logs mcp-server

# Reload (zero-downtime only in cluster mode; in fork mode this restarts)
pm2 reload mcp-server
```
## mcp-framework Production Configuration

```typescript
import { MCPServer } from "mcp-framework";

const server = new MCPServer({
  name: "production-server",
  version: process.env.npm_package_version || "1.0.0",
  transport: {
    type: "sse",
    options: {
      port: parseInt(process.env.PORT || "3001", 10),
    },
  },
});

// mcp-framework handles tool/resource/prompt discovery automatically
await server.start();
```
Set your server version from package.json using process.env.npm_package_version. This ensures the version reported to MCP clients matches your actual deployed version, making debugging easier.
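One caveat: `npm_package_version` is only populated when the process is launched through npm (e.g. `npm start`). Under a bare `node dist/index.js` — as in the Dockerfile and ECS examples above — it is undefined. A sketch of a fallback that reads `package.json` directly (the path is passed in by the caller):

```typescript
import { readFileSync } from "node:fs";

// Resolve the server version: prefer npm_package_version (set when started
// via npm), otherwise read the version field from package.json directly.
export function resolveVersion(packageJsonPath: string): string {
  if (process.env.npm_package_version) return process.env.npm_package_version;
  try {
    const pkg = JSON.parse(readFileSync(packageJsonPath, "utf8"));
    return typeof pkg.version === "string" ? pkg.version : "unknown";
  } catch {
    return "unknown"; // missing or unreadable package.json
  }
}
```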
## Deployment Architecture

For production setups with multiple MCP servers:

```text
┌─────────────┐      ┌───────────────┐      ┌─────────────────┐
│  AI Client  │─────>│ Load Balancer │─────>│ MCP Server (1)  │
│  (Claude,   │      │  (nginx/ALB)  │      │ MCP Server (2)  │
│   Cursor)   │      │               │      │ MCP Server (3)  │
└─────────────┘      └───────┬───────┘      └─────────────────┘
                             │
                     ┌───────┴──────┐
                     │ Health Check │
                     │   Endpoint   │
                     └──────────────┘
```
SSE connections are long-lived. When load balancing SSE-based MCP servers, use sticky sessions (session affinity) to ensure all requests from a session hit the same server instance. Streamable HTTP does not have this limitation.
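An nginx upstream with session affinity might look like this — a sketch, with placeholder server names. `ip_hash` is the simplest built-in affinity mechanism in open-source nginx; cookie-based stickiness (`sticky`) requires NGINX Plus or a third-party module:

```nginx
upstream mcp_backend {
    ip_hash;                      # pin each client IP to one instance
    server mcp-server-1:3001;
    server mcp-server-2:3001;
    server mcp-server-3:3001;
}

server {
    listen 443 ssl;

    location / {
        proxy_pass http://mcp_backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_buffering off;      # required so SSE events stream immediately
        proxy_read_timeout 1h;    # keep long-lived SSE connections open
    }
}
```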