---
title: "Deploying MCP Servers to Production"
description: "Deploy MCP servers to production with Docker containers, cloud platforms, process management, health checks, logging, and monitoring using the official TypeScript SDK and mcp-framework."
order: 18
level: "advanced"
duration: "35 min"
keywords:
  - "MCP production deployment"
  - "MCP Docker"
  - "MCP cloud deployment"
  - "MCP process management"
  - "MCP health checks"
  - "MCP monitoring"
  - "deploy MCP server"
  - "mcp-framework deployment"
  - "@modelcontextprotocol/sdk production"
  - "MCP server Docker"
date: "2026-04-01"
---

# Deploying MCP Servers to Production
Moving an MCP server from development to production requires containerization, process management, health monitoring, and proper logging. This lesson covers Docker packaging, cloud deployment strategies (AWS, GCP, Railway, Fly.io), process managers like PM2, structured logging, health check endpoints, and graceful shutdown handling. You will learn patterns that work with both the official TypeScript SDK and mcp-framework.
## Production Readiness Checklist

Before deploying an MCP server, verify these requirements:

- **Error handling is comprehensive.** Every tool, resource, and prompt handler catches errors and returns structured responses. No unhandled exception can crash the server.
- **Environment configuration is externalized.** All secrets, API keys, and configuration values come from environment variables or a config service — never hardcoded.
- **Logging is structured and appropriate.** Use structured JSON logging. Log to stderr for stdio servers. Include request IDs for tracing.
- **Health checks are implemented.** HTTP transports expose a health endpoint. Stdio servers handle health check messages.
- **Graceful shutdown is handled.** The server cleans up connections, flushes logs, and closes database connections on SIGTERM/SIGINT.
- **Tests pass in CI.** Unit and integration tests run in your CI pipeline. No test is skipped or flaky.
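The "externalized configuration" item can be enforced at startup with a fail-fast validation helper. This is a minimal sketch — the variable names (`MCP_API_KEY`, `DATABASE_URL`) match the examples later in this lesson but are otherwise illustrative:

```typescript
// config.ts — validate required environment variables at startup so a
// misconfigured deployment fails immediately instead of at first request.
const REQUIRED_VARS = ["MCP_API_KEY", "DATABASE_URL"] as const;

export interface Config {
  apiKey: string;
  databaseUrl: string;
  port: number;
}

export function loadConfig(
  env: Record<string, string | undefined> = process.env
): Config {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(
      `Missing required environment variables: ${missing.join(", ")}`
    );
  }
  return {
    apiKey: env.MCP_API_KEY!,
    databaseUrl: env.DATABASE_URL!,
    port: parseInt(env.PORT || "3001", 10), // same default as the Dockerfile below
  };
}
```

Call `loadConfig()` once at the top of your entry point, before constructing the server, so the process exits with a clear error when a secret is missing.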
## Docker Containerization

### Dockerfile for MCP Servers

```dockerfile
# Build stage
FROM node:20-alpine AS builder
WORKDIR /app

# Copy package files first for better layer caching
COPY package.json package-lock.json ./
RUN npm ci --ignore-scripts

# Copy source and build
COPY tsconfig.json ./
COPY src/ ./src/
RUN npm run build

# Prune dev dependencies
RUN npm prune --production

# Production stage
FROM node:20-alpine AS production

# Security: run as a non-root user
RUN addgroup -S mcp && adduser -S mcp -G mcp
WORKDIR /app

# Copy only production artifacts
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./

# Set environment
ENV NODE_ENV=production
ENV MCP_TRANSPORT=sse
ENV PORT=3001

# Switch to the non-root user
USER mcp
EXPOSE 3001

# Health check (alpine ships BusyBox wget, which only supports short options)
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
  CMD wget -q --spider http://localhost:3001/health || exit 1

CMD ["node", "dist/index.js"]
```
Always use multi-stage builds for MCP servers. The build stage contains TypeScript, dev dependencies, and source files. The production stage contains only the compiled JavaScript and production dependencies. This reduces image size by 60-80% and minimizes the attack surface.
### Docker Compose for Development

```yaml
# docker-compose.yml
services:
  mcp-server:
    build: .
    ports:
      - "3001:3001"
    environment:
      - NODE_ENV=production
      - MCP_TRANSPORT=sse
      - PORT=3001
      - DATABASE_URL=postgresql://postgres:password@db:5432/mcpdata
      - MCP_API_KEY=${MCP_API_KEY}
    depends_on:
      db:
        condition: service_healthy
    restart: unless-stopped

  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: mcpdata
      POSTGRES_PASSWORD: password
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

volumes:
  pgdata:
```
### .dockerignore

```text
node_modules
dist
.git
.env
.env.*
*.md
tests
coverage
.github
```
## Health Checks

### HTTP Health Endpoint

For SSE and Streamable HTTP servers, add a dedicated health endpoint:

```typescript
import express from "express";
// `db` and `cache` are assumed to be your application's database and cache
// clients (e.g. a pg Pool and a Redis client), initialized elsewhere.

const app = express();

// Health check endpoint
app.get("/health", async (req, res) => {
  const checks: Record<string, { status: string; latency?: number }> = {};

  // Check database
  try {
    const start = Date.now();
    await db.query("SELECT 1");
    checks.database = { status: "ok", latency: Date.now() - start };
  } catch {
    checks.database = { status: "error" };
  }

  // Check cache
  try {
    const start = Date.now();
    await cache.ping();
    checks.cache = { status: "ok", latency: Date.now() - start };
  } catch {
    checks.cache = { status: "error" };
  }

  const allHealthy = Object.values(checks).every((c) => c.status === "ok");

  res.status(allHealthy ? 200 : 503).json({
    status: allHealthy ? "healthy" : "degraded",
    version: process.env.npm_package_version || "unknown",
    uptime: process.uptime(),
    checks,
    timestamp: new Date().toISOString(),
  });
});

// Liveness probe (minimal check — is the process alive?)
app.get("/healthz", (req, res) => {
  res.status(200).json({ status: "alive" });
});

// Readiness probe (is the server ready to accept traffic?)
app.get("/readyz", async (req, res) => {
  try {
    await db.query("SELECT 1");
    res.status(200).json({ status: "ready" });
  } catch {
    res.status(503).json({ status: "not ready" });
  }
});
```
A liveness probe checks if the process is alive and should be restarted if it fails. A readiness probe checks if the server can accept traffic — it may be alive but not ready (e.g., waiting for database connection). Kubernetes and other orchestrators use these differently.
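If you deploy to Kubernetes, the `/healthz` and `/readyz` endpoints map directly onto probe configuration. This is a sketch of the relevant Pod spec fragment — the image name is a placeholder, and the intervals are illustrative:

```yaml
# Pod spec fragment — wiring the probes above to Kubernetes
containers:
  - name: mcp-server
    image: mcp-server:latest   # placeholder image
    ports:
      - containerPort: 3001
    livenessProbe:             # restart the container if this fails
      httpGet:
        path: /healthz
        port: 3001
      initialDelaySeconds: 10
      periodSeconds: 30
    readinessProbe:            # remove from service endpoints if this fails
      httpGet:
        path: /readyz
        port: 3001
      periodSeconds: 10
```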
## Structured Logging

### Implementing a Logger

```typescript
// src/logger.ts
type LogLevel = "debug" | "info" | "warn" | "error";

const LOG_LEVELS: Record<LogLevel, number> = {
  debug: 0,
  info: 1,
  warn: 2,
  error: 3,
};

// Fall back to "info" when LOG_LEVEL is unset or unrecognized
const currentLevel =
  LOG_LEVELS[(process.env.LOG_LEVEL as LogLevel) || "info"] ?? LOG_LEVELS.info;

export function log(
  level: LogLevel,
  message: string,
  data?: Record<string, unknown>
) {
  if (LOG_LEVELS[level] < currentLevel) return;

  const entry = {
    timestamp: new Date().toISOString(),
    level,
    message,
    ...data,
    pid: process.pid,
    service: "mcp-server",
  };

  // Always log to stderr — stdout is reserved for the MCP protocol
  console.error(JSON.stringify(entry));
}

export const logger = {
  debug: (msg: string, data?: Record<string, unknown>) => log("debug", msg, data),
  info: (msg: string, data?: Record<string, unknown>) => log("info", msg, data),
  warn: (msg: string, data?: Record<string, unknown>) => log("warn", msg, data),
  error: (msg: string, data?: Record<string, unknown>) => log("error", msg, data),
};
```
### Logging in Tool Handlers

```typescript
import { z } from "zod";
import { logger } from "./logger.js";
// `server` is your McpServer instance and `processBatch` your business
// logic, both defined elsewhere in the application.

server.tool(
  "process-data",
  "Process a data batch",
  { batchId: z.string() },
  async ({ batchId }) => {
    const requestId = crypto.randomUUID();

    logger.info("Tool invoked", {
      tool: "process-data",
      requestId,
      batchId,
    });

    const startTime = Date.now();

    try {
      const result = await processBatch(batchId);

      logger.info("Tool completed", {
        tool: "process-data",
        requestId,
        batchId,
        durationMs: Date.now() - startTime,
        resultCount: result.length,
      });

      return {
        content: [{ type: "text", text: JSON.stringify(result) }],
      };
    } catch (error) {
      logger.error("Tool failed", {
        tool: "process-data",
        requestId,
        batchId,
        durationMs: Date.now() - startTime,
        error: error instanceof Error ? error.message : String(error),
        stack: error instanceof Error ? error.stack : undefined,
      });

      return {
        content: [{ type: "text", text: `Processing failed: ${error}` }],
        isError: true,
      };
    }
  }
);
```
In stdio MCP servers, stdout is the protocol channel. Any non-JSON-RPC data on stdout will break the client connection. Use console.error() or write to stderr explicitly. This rule applies even when using logging libraries — configure them to output to stderr.
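One defensive pattern — a sketch, not part of the SDK — is to reroute `console.log` to stderr when running over stdio, so a stray debug print left in a dependency or handler cannot corrupt the JSON-RPC stream. `MCP_TRANSPORT` is the environment variable convention used throughout this lesson:

```typescript
// Reroute console.log to stderr in stdio mode so accidental prints cannot
// break the protocol stream on stdout.
export function protectStdout() {
  const transport = process.env.MCP_TRANSPORT || "stdio"; // stdio is the default
  if (transport === "stdio") {
    console.log = (...args: unknown[]) => console.error(...args);
  }
}
```

Call `protectStdout()` first thing in your entry point, before any other module runs. This is a safety net, not a substitute for using the structured logger.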
## Graceful Shutdown

```typescript
// src/shutdown.ts
import { logger } from "./logger.js";

type CleanupFn = () => Promise<void>;

const cleanupFns: CleanupFn[] = [];

export function onShutdown(fn: CleanupFn) {
  cleanupFns.push(fn);
}

async function shutdown(signal: string) {
  logger.info("Shutdown initiated", { signal });

  const timeout = setTimeout(() => {
    logger.error("Forced shutdown after timeout");
    process.exit(1);
  }, 10_000); // 10s grace period

  for (const fn of cleanupFns) {
    try {
      await fn();
    } catch (error) {
      logger.error("Cleanup error", {
        error: error instanceof Error ? error.message : String(error),
      });
    }
  }

  clearTimeout(timeout);
  logger.info("Shutdown complete");
  process.exit(0);
}

process.on("SIGTERM", () => shutdown("SIGTERM"));
process.on("SIGINT", () => shutdown("SIGINT"));
```

Usage in server setup (`server`, `db`, and `cache` are the MCP server, database pool, and cache client created during startup):

```typescript
import { onShutdown } from "./shutdown.js";

onShutdown(async () => {
  await server.close();
  logger.info("MCP server closed");
});

onShutdown(async () => {
  await db.end();
  logger.info("Database connections closed");
});

onShutdown(async () => {
  await cache.quit();
  logger.info("Cache connection closed");
});
```
## Cloud Deployment

### Deploying to Railway

`railway.json`:

```json
{
  "build": {
    "builder": "DOCKERFILE",
    "dockerfilePath": "Dockerfile"
  },
  "deploy": {
    "startCommand": "node dist/index.js",
    "healthcheckPath": "/health",
    "healthcheckTimeout": 10,
    "restartPolicyType": "ON_FAILURE",
    "restartPolicyMaxRetries": 5
  }
}
```
### Deploying to Fly.io

```toml
# fly.toml
app = "my-mcp-server"
primary_region = "iad"

[build]
  dockerfile = "Dockerfile"

[env]
  MCP_TRANSPORT = "sse"
  NODE_ENV = "production"
  LOG_LEVEL = "info"

[http_service]
  internal_port = 3001
  force_https = true

  [[http_service.checks]]
    grace_period = "10s"
    interval = "30s"
    method = "GET"
    path = "/health"
    timeout = "5s"

[[vm]]
  size = "shared-cpu-1x"
  memory = "256mb"
```
### AWS ECS Task Definition

```json
{
  "family": "mcp-server",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "256",
  "memory": "512",
  "containerDefinitions": [
    {
      "name": "mcp-server",
      "image": "your-account.dkr.ecr.region.amazonaws.com/mcp-server:latest",
      "essential": true,
      "portMappings": [
        { "containerPort": 3001, "protocol": "tcp" }
      ],
      "environment": [
        { "name": "MCP_TRANSPORT", "value": "sse" },
        { "name": "NODE_ENV", "value": "production" }
      ],
      "secrets": [
        {
          "name": "MCP_API_KEY",
          "valueFrom": "arn:aws:secretsmanager:region:account:secret:mcp-api-key"
        }
      ],
      "healthCheck": {
        "command": ["CMD-SHELL", "wget -q --spider http://localhost:3001/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3,
        "startPeriod": 10
      },
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/mcp-server",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "mcp"
        }
      }
    }
  ]
}
```
| Platform | Best For | MCP Transport | Estimated Cost |
|---|---|---|---|
| Railway | Quick deployments, small teams | SSE, Streamable HTTP | $5-20/mo |
| Fly.io | Edge deployments, global distribution | SSE, Streamable HTTP | $5-30/mo |
| AWS ECS/Fargate | Enterprise, existing AWS infra | SSE, Streamable HTTP | $10-50/mo |
| Google Cloud Run | Auto-scaling, pay-per-request | Streamable HTTP | $0-20/mo |
| Self-hosted Docker | Full control, on-premise | Any transport | Hardware costs |
## Process Management with PM2

For servers running on VMs or bare metal:

```javascript
// ecosystem.config.cjs
module.exports = {
  apps: [
    {
      name: "mcp-server",
      script: "dist/index.js",
      instances: 1, // MCP servers are typically single-instance
      exec_mode: "fork",
      autorestart: true,
      watch: false,
      max_memory_restart: "500M",
      env_production: {
        NODE_ENV: "production",
        MCP_TRANSPORT: "sse",
        PORT: 3001,
        LOG_LEVEL: "info",
      },
      error_file: "/var/log/mcp-server/error.log",
      out_file: "/var/log/mcp-server/out.log",
      merge_logs: true,
      log_date_format: "YYYY-MM-DD HH:mm:ss Z",
    },
  ],
};
```
```bash
# Start in production
pm2 start ecosystem.config.cjs --env production

# Monitor
pm2 monit

# View logs
pm2 logs mcp-server

# Reload (zero-downtime only in cluster mode; in fork mode this restarts)
pm2 reload mcp-server
```
## mcp-framework Production Configuration

```typescript
import { MCPServer } from "mcp-framework";

const server = new MCPServer({
  name: "production-server",
  version: process.env.npm_package_version || "1.0.0",
  transport: {
    type: "sse",
    options: {
      port: parseInt(process.env.PORT || "3001", 10),
    },
  },
});

// mcp-framework handles tool/resource/prompt discovery automatically
await server.start();
```
Set your server version from package.json using process.env.npm_package_version. This ensures the version reported to MCP clients matches your actual deployed version, making debugging easier.
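One caveat: `npm_package_version` is only populated when the process is launched through npm (e.g. `npm start`). Under a bare `node dist/index.js` — as in the Dockerfile and ECS examples above — it is undefined. A sketch of a fallback that reads `package.json` directly (the path is passed in by the caller):

```typescript
import { readFileSync } from "node:fs";

// Resolve the server version: prefer npm_package_version (set when started
// via npm), otherwise read the version field from package.json directly.
export function resolveVersion(packageJsonPath: string): string {
  if (process.env.npm_package_version) return process.env.npm_package_version;
  try {
    const pkg = JSON.parse(readFileSync(packageJsonPath, "utf8"));
    return typeof pkg.version === "string" ? pkg.version : "unknown";
  } catch {
    return "unknown"; // missing or unreadable package.json
  }
}
```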
## Deployment Architecture

For production setups with multiple MCP servers:

```text
┌─────────────┐      ┌───────────────┐      ┌─────────────────┐
│  AI Client  │─────>│ Load Balancer │─────>│ MCP Server (1)  │
│  (Claude,   │      │  (nginx/ALB)  │      │ MCP Server (2)  │
│   Cursor)   │      │               │      │ MCP Server (3)  │
└─────────────┘      └───────┬───────┘      └─────────────────┘
                             │
                     ┌───────┴──────┐
                     │ Health Check │
                     │   Endpoint   │
                     └──────────────┘
```
SSE connections are long-lived. When load balancing SSE-based MCP servers, use sticky sessions (session affinity) to ensure all requests from a session hit the same server instance. Streamable HTTP does not have this limitation.
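An nginx upstream with session affinity might look like this — a sketch, with placeholder server names. `ip_hash` is the simplest built-in affinity mechanism in open-source nginx; cookie-based stickiness (`sticky`) requires NGINX Plus or a third-party module:

```nginx
upstream mcp_backend {
    ip_hash;                      # pin each client IP to one instance
    server mcp-server-1:3001;
    server mcp-server-2:3001;
    server mcp-server-3:3001;
}

server {
    listen 443 ssl;

    location / {
        proxy_pass http://mcp_backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_buffering off;      # required so SSE events stream immediately
        proxy_read_timeout 1h;    # keep long-lived SSE connections open
    }
}
```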