
Introduction 🎯
Your modular monolith is growing. You’re hitting scale limits. You’ve heard the warnings: “Don’t do microservices too early,” “Distributed systems are hard,” “Stick with the monolith.”
But here’s the truth: microservices aren’t about scale—they’re about team autonomy, deployment independence, and technology diversity. And you can practice these patterns locally before production demands them.
This guide builds a production-grade local development infrastructure using k3s and k3d. You’ll run PostgreSQL HA, Redis Sentinel, RabbitMQ, and your microservices—all on your laptop, all accessible from your host machine, all mirroring production architecture.
No more “it works on my machine.” No more docker-compose limitations. No more guessing how things will behave in Kubernetes.
Let’s build infrastructure that scales with your ambition. 🚀
Part 1: Why Local Microservices Infrastructure Matters
1.1 The Modular Monolith → Microservices Journey
flowchart LR
subgraph Stage1["Stage 1: Modular Monolith"]
direction TB
M1["Auth Module"]
M2["User Module"]
M3["Inventory Module"]
M4["Order Module"]
DB1[("Single Database<br/>PostgreSQL")]
M1 --- M2 --- M3 --- M4
M4 --- DB1
end
subgraph Stage2["Stage 2: Distributed Monolith"]
direction TB
S1["Auth Service"]
S2["User Service"]
S3["Inventory Service"]
S4["Order Service"]
DB2[("Single Database<br/>PostgreSQL")]
S1 --- S2 --- S3 --- S4
S4 --- DB2
end
subgraph Stage3["Stage 3: Microservices"]
direction TB
MS1["Auth Svc"]
MS2["User Svc"]
MS3["Inventory Svc"]
MS4["Order Svc"]
DB3A[("Auth DB")]
DB3B[("User DB")]
DB3C[("Inventory DB")]
DB3D[("Order DB")]
MS1 --- DB3A
MS2 --- DB3B
MS3 --- DB3C
MS4 --- DB3D
end
Stage1 --> Stage2 --> Stage3
style Stage1 fill:#dbeafe,stroke:#333,stroke-width:2px
style Stage2 fill:#fef3c7,stroke:#333,stroke-width:2px
style Stage3 fill:#d1fae5,stroke:#333,stroke-width:2px
The Problem: Most teams jump from Stage 1 to Stage 3 without practicing Stage 2. They deploy to production Kubernetes and discover:
- Network latency they never tested locally
- Database connection pool exhaustion
- Service discovery failures
- Distributed tracing gaps
- Configuration management nightmares
The Solution: Build your local infrastructure to mirror production from day one.
1.2 Why Docker Compose Isn’t Enough
# docker-compose.yml - The Limitations
version: '3.8'
services:
  api:
    build: .
    ports:
      - "3000:3000"
    depends_on:
      - postgres
      - redis
  postgres:
    image: postgres:15
    ports:
      - "5432:5432"
  redis:
    image: redis:7
    ports:
      - "6379:6379"
What docker-compose lacks:
- ❌ No Kubernetes Service discovery
- ❌ No ConfigMaps or Secrets management
- ❌ No Horizontal Pod Autoscaling
- ❌ No Ingress controllers
- ❌ No namespace isolation
- ❌ No RBAC or service accounts
- ❌ No Helm charts or Kustomize
- ❌ No production-like networking
Result: Your local setup teaches you nothing about production Kubernetes.
1.3 k3s vs k3d: When to Use What
| Feature | k3d (Docker-based) | k3s (Multipass VMs) |
|---|---|---|
| Startup Time | ~10-30 seconds | ~2-5 minutes |
| Resource Usage | ~500MB RAM | ~2-4GB RAM |
| Production Fidelity | Medium | High |
| Multi-node Support | Real nodes (as containers) | Real VMs |
| Persistent Storage | Volume mounts | Longhorn possible |
| Best For | Quick dev, CI/CD | Production rehearsal |
| Platform | macOS, Linux, Windows | Linux, macOS (Multipass) |
Rule of thumb:
- k3d for daily development, fast iteration, CI pipelines
- k3s on Multipass for production testing, multi-node scenarios, storage testing
Part 2: Architecture Overview
2.1 Full Local Infrastructure Stack
flowchart TB
subgraph Host["Your Host Machine"]
direction TB
AuthAPI["Auth API<br/>Port 3001"]
UserAPI["User API<br/>Port 3002"]
InvAPI["Inventory API<br/>Port 3003"]
end
subgraph Cluster["k3s/k3d Cluster"]
direction TB
MetalLB["MetalLB L2 Mode<br/>IP Pool: 10.46.7.100-120"]
subgraph Services["Shared Services"]
PG["PostgreSQL HA<br/>10.46.7.100:5432"]
Redis["Redis Sentinel<br/>10.46.7.101:6379"]
RMQ["RabbitMQ<br/>10.46.7.102:5672"]
Mail["Mailpit UI<br/>10.46.7.103:8025"]
SigNoz["SigNoz APM<br/>10.46.7.104:3301"]
Grafana["Grafana<br/>10.46.7.105:3000"]
end
subgraph Ingress["Ingress Layer"]
Traefik["Traefik Ingress<br/>auth.local, api.local"]
end
end
Host -->|LoadBalancer IPs| MetalLB
MetalLB --> Services
MetalLB --> Ingress
style Host fill:#dbeafe,stroke:#333,stroke-width:2px
style Cluster fill:#f0fdf4,stroke:#333,stroke-width:2px
style Services fill:#fef3c7,stroke:#333,stroke-width:1px
style Ingress fill:#fae8ff,stroke:#333,stroke-width:1px
2.2 Network Flow
sequenceDiagram
participant Dev as Developer
participant Host as Host Machine<br/>NestJS App
participant PG as PostgreSQL<br/>10.46.7.100
participant Redis as Redis<br/>10.46.7.101
participant RMQ as RabbitMQ<br/>10.46.7.102
participant Mail as Mailpit UI
participant SigNoz as SigNoz APM
Dev->>Host: Write code + npm run dev
Host->>PG: Connect: 10.46.7.100:5432
Host->>Redis: Connect: 10.46.7.101:6379
Host->>RMQ: Publish events: 10.46.7.102:5672
RMQ->>Host: Consume events async
Dev->>Mail: Access UI: http://10.46.7.103:8025
Dev->>SigNoz: Access: http://10.46.7.104:3301
Host->>SigNoz: OpenTelemetry traces
Note over Dev,SigNoz: Production-like distributed system<br/>running locally
Result: Production-like distributed system, running locally.
Part 3: k3d Setup (Docker-based, Fastest)
3.1 Installation
# macOS with Homebrew
brew install k3d
# Linux
curl -s https://raw.githubusercontent.com/k3d-io/k3d/main/install.sh | bash
# Verify installation
k3d version
3.2 Create k3d Cluster
# Create cluster with MetalLB configuration
k3d cluster create dev-cluster \
--port "80:80@loadbalancer" \
--port "443:443@loadbalancer" \
--agents 2 \
--k3s-arg "--disable=traefik@server:0" \
--k3s-arg "--disable=servicelb@server:0" \
--network k3d-network
# Verify cluster
kubectl cluster-info
kubectl get nodes
Expected output:
NAME STATUS ROLES AGE VERSION
k3d-dev-cluster-server-0 Ready control-plane,master 30s v1.29.2+k3s1
k3d-dev-cluster-agent-0 Ready <none> 25s v1.29.2+k3s1
k3d-dev-cluster-agent-1 Ready <none> 25s v1.29.2+k3s1
3.3 Configure MetalLB
MetalLB provides LoadBalancer IPs in bare-metal clusters.
# Install MetalLB
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.13.12/config/manifests/metallb-native.yaml
# Wait for MetalLB to be ready
kubectl wait --namespace metallb-system \
--for=condition=ready pod \
--selector=app=metallb \
--timeout=90s
# Get Docker network subnet
docker network inspect k3d-network \
-f '{{(index .IPAM.Config 0).Subnet}}'
Example output:
172.28.0.0/16
# We'll use 172.28.255.200-172.28.255.250 for LoadBalancer IPs
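The choice of range generalizes: carve the pool out of a high corner of the Docker subnet so it cannot collide with the addresses Docker hands out to containers. A small illustrative helper (the function name and defaults are mine, not part of k3d or MetalLB):

```typescript
// Pick a MetalLB LoadBalancer range from the tail end of a Docker subnet.
// For a /16 we use the highest /24 inside it (x.y.255.0/24), which Docker
// is unlikely to allocate to containers.
function metalLbRange(subnetCidr: string, start = 200, end = 250): string {
  const [base, prefix] = subnetCidr.split("/");
  if (Number(prefix) > 24) throw new Error("need at least a /24 to carve a pool");
  const octets = base.split(".").map(Number);
  const third = Number(prefix) <= 16 ? 255 : octets[2];
  const net = `${octets[0]}.${octets[1]}.${third}`;
  return `${net}.${start}-${net}.${end}`;
}

console.log(metalLbRange("172.28.0.0/16")); // 172.28.255.200-172.28.255.250
```

The same helper reproduces the Multipass pool used later: `metalLbRange("10.46.7.0/24", 100, 120)` yields `10.46.7.100-10.46.7.120`.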
Create MetalLB configuration:
# metallb-config.yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: dev-pool
  namespace: metallb-system
spec:
  addresses:
    - 172.28.255.200-172.28.255.250
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: dev-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - dev-pool
kubectl apply -f metallb-config.yaml
3.4 Install Ingress Controller
# Install Traefik (lightweight, Kubernetes-native)
helm repo add traefik https://traefik.github.io/charts
helm repo update
helm install traefik traefik/traefik \
--namespace traefik-system \
--create-namespace \
--set ports.web.port=80 \
--set ports.websecure.port=443 \
--set service.type=LoadBalancer
# Wait for LoadBalancer IP
kubectl get svc traefik -n traefik-system -w
Expected output:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
traefik LoadBalancer 10.43.123.45 172.28.255.200 80:30001/TCP 30s
Now you have 172.28.255.200 as your ingress IP. Configure DNS:
# Add to /etc/hosts (requires sudo)
echo "172.28.255.200 auth.local api.local inventory.local mail.local" | sudo tee -a /etc/hosts
Part 4: k3s on Multipass (VM-based, Production-like)
4.1 When to Choose k3s Over k3d
Choose k3s on Multipass when:
- ✅ Testing multi-node failures
- ✅ Validating persistent storage with Longhorn
- ✅ Rehearsing production deployments
- ✅ Testing network policies
- ✅ Benchmarking performance under load
Stick with k3d when:
- ✅ Daily development
- ✅ Quick prototyping
- ✅ CI/CD pipelines
- ✅ Limited RAM (< 8GB)
4.2 Installation
# Install Multipass (Ubuntu VM manager)
# macOS
brew install --cask multipass
# Linux (Ubuntu)
sudo snap install multipass
# Windows
# Download from https://multipass.run/install
# Verify
multipass version
4.3 Create Multi-Node Cluster
# Create control plane node (2 CPU, 4GB RAM)
multipass launch 22.04 --name k3s-server --cpus 2 --memory 4G --disk 20G
# Create worker nodes (2 CPU, 2GB RAM each)
multipass launch 22.04 --name k3s-agent-1 --cpus 2 --memory 2G --disk 10G
multipass launch 22.04 --name k3s-agent-2 --cpus 2 --memory 2G --disk 10G
# Get server IP
SERVER_IP=$(multipass info k3s-server --format json | jq -r '.info["k3s-server"].ipv4[0]')
# Install k3s on server
# (bash -c keeps the pipe inside the VM; without it, `sh -` would run on the host.
#  ServiceLB is disabled so MetalLB can own LoadBalancer IPs, as in the k3d setup.)
multipass exec k3s-server -- \
  bash -c "curl -sfL https://get.k3s.io | sh -s - --disable servicelb"
# Get node token
NODE_TOKEN=$(multipass exec k3s-server -- sudo cat /var/lib/rancher/k3s/server/node-token)
# Join agents
multipass exec k3s-agent-1 -- \
  bash -c "curl -sfL https://get.k3s.io | K3S_URL=https://${SERVER_IP}:6443 K3S_TOKEN=${NODE_TOKEN} sh -"
multipass exec k3s-agent-2 -- \
  bash -c "curl -sfL https://get.k3s.io | K3S_URL=https://${SERVER_IP}:6443 K3S_TOKEN=${NODE_TOKEN} sh -"
# Get kubeconfig
multipass exec k3s-server -- sudo cat /etc/rancher/k3s/k3s.yaml > ~/.kube/config-k3s
# Replace server IP (BSD/macOS sed syntax; on Linux drop the empty '' argument)
sed -i '' "s/127.0.0.1/${SERVER_IP}/g" ~/.kube/config-k3s
# Set context
export KUBECONFIG=~/.kube/config-k3s
kubectl cluster-info
4.4 Configure MetalLB for Multipass
Install MetalLB exactly as in Section 3.3, then get the subnet from the Multipass network:
multipass networks
# Find the subnet, e.g., 10.46.7.0/24
# Create MetalLB config
cat > metallb-config.yaml <<EOF
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: multipass-pool
  namespace: metallb-system
spec:
  addresses:
    - 10.46.7.100-10.46.7.120
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: multipass-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - multipass-pool
EOF
kubectl apply -f metallb-config.yaml
Part 5: Deploying Shared Services
5.1 PostgreSQL HA with Bitnami
# Add Bitnami Helm repository
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
# Create namespace
kubectl create namespace database
# Install PostgreSQL HA
helm install postgresql bitnami/postgresql-ha \
--namespace database \
--set auth.username=devuser \
--set auth.password=devpassword \
--set auth.database=devdb \
--set persistence.size=10Gi \
--set metrics.enabled=true \
--set service.type=LoadBalancer \
--set service.loadBalancerIP=10.46.7.100
# Wait for PostgreSQL to be ready
kubectl wait --namespace database \
--for=condition=ready pod \
--selector=app.kubernetes.io/name=postgresql-ha \
--timeout=300s
# Get connection details
# The chart fronts the cluster with a Pgpool service named <release>-postgresql-ha-pgpool
export POSTGRES_HOST=$(kubectl get svc postgresql-postgresql-ha-pgpool -n database -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
export POSTGRES_PORT=5432
export POSTGRES_USER=devuser
export POSTGRES_PASSWORD=devpassword
export POSTGRES_DB=devdb
echo "PostgreSQL HA ready at: ${POSTGRES_HOST}:${POSTGRES_PORT}"
Test connection from host machine:
# Install psql client
# macOS: brew install postgresql
# Linux: sudo apt install postgresql-client
psql -h ${POSTGRES_HOST} -U ${POSTGRES_USER} -d ${POSTGRES_DB} -c "SELECT version();"
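In practice your pods (or your host app) may start before PostgreSQL finishes electing a primary, so a bounded retry loop at startup saves a lot of CrashLoopBackOff confusion. A sketch, assuming any async probe function (a `pg` pool query, Prisma `$connect`, etc.); the helper name, attempt count, and delays are illustrative:

```typescript
// Block startup until the database accepts connections, with exponential backoff.
async function waitFor(
  connect: () => Promise<void>,
  attempts = 10,
  baseDelayMs = 200,
): Promise<number> {
  for (let i = 1; i <= attempts; i++) {
    try {
      await connect();
      return i; // number of attempts it took
    } catch (err) {
      if (i === attempts) {
        throw new Error(`database unreachable after ${attempts} attempts: ${err}`);
      }
      // Backoff: 200ms, 400ms, 800ms, ...
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** (i - 1)));
    }
  }
  throw new Error("unreachable"); // satisfies the type checker
}
```

Call it once in your bootstrap before wiring routes, e.g. `await waitFor(() => pool.query("SELECT 1").then(() => undefined))`.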
5.2 Redis Sentinel for High Availability
# Install Redis with Sentinel
helm install redis bitnami/redis \
--namespace database \
--set auth.enabled=true \
--set auth.password=redispassword \
--set architecture=replication \
--set master.count=1 \
--set replica.replicaCount=2 \
--set sentinel.enabled=true \
--set sentinel.quorum=2 \
--set sentinel.downAfterMilliseconds=5000 \
--set sentinel.failoverTimeout=60000 \
--set master.service.type=LoadBalancer \
--set master.service.loadBalancerIP=10.46.7.101 \
--set persistence.enabled=true \
--set persistence.size=5Gi
# Wait for Redis to be ready
kubectl wait --namespace database \
--for=condition=ready pod \
--selector=app.kubernetes.io/name=redis \
--timeout=120s
# Get Redis connection
export REDIS_HOST=$(kubectl get svc redis-master -n database -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
export REDIS_PASSWORD=redispassword
echo "Redis Sentinel ready at: ${REDIS_HOST}:6379"
Test connection:
# Install redis-cli
# macOS: brew install redis
# Linux: sudo apt install redis-tools
redis-cli -h ${REDIS_HOST} -a ${REDIS_PASSWORD} ping
# Should return: PONG
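A typical use of this Redis instance is a read-through cache in front of PostgreSQL. The pattern is sketched below with an in-memory `Map` standing in for the Redis client (same `get`/`set` shape), so it runs without a live cluster; the names are illustrative:

```typescript
// Anything with Redis-like get/set semantics satisfies this interface.
type Store = {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
};

// In-memory stand-in for a Redis client, used here for demonstration.
const mapStore = (m = new Map<string, string>()): Store => ({
  get: async (key) => m.get(key) ?? null,
  set: async (key, value) => { m.set(key, value); },
});

// Read-through: serve from cache on a hit, load and populate on a miss.
async function readThrough(
  store: Store,
  key: string,
  load: () => Promise<string>,
): Promise<string> {
  const hit = await store.get(key);
  if (hit !== null) return hit; // cache hit: skip the database
  const value = await load();   // cache miss: go to the source of truth
  await store.set(key, value);
  return value;
}
```

With a real client you would also pass a TTL on `set`; the structure is the same.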
5.3 RabbitMQ for Message Queues
# Install RabbitMQ with management plugin
helm install rabbitmq bitnami/rabbitmq \
--namespace messaging \
--create-namespace \
--set auth.username=rabbitmq \
--set auth.password=rabbitmqpassword \
--set auth.erlangCookie=secretcookie \
--set clustering.enabled=true \
--set replicaCount=3 \
--set service.type=LoadBalancer \
--set service.loadBalancerIP=10.46.7.102 \
--set persistence.enabled=true \
--set persistence.size=5Gi \
  --set managementPlugin.enabled=true
# The management UI is served by the same Service on port 15672
# Wait for RabbitMQ
kubectl wait --namespace messaging \
--for=condition=ready pod \
--selector=app.kubernetes.io/name=rabbitmq \
--timeout=180s
# Get connection details
export RABBITMQ_HOST=$(kubectl get svc rabbitmq -n messaging -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
export RABBITMQ_USER=rabbitmq
export RABBITMQ_PASSWORD=rabbitmqpassword
echo "RabbitMQ AMQP: ${RABBITMQ_HOST}:5672"
echo "RabbitMQ Management: http://${RABBITMQ_HOST}:15672"
echo "Username: ${RABBITMQ_USER}"
echo "Password: ${RABBITMQ_PASSWORD}"
5.4 Mailpit for Email Testing
# Install Mailpit (lightweight email catcher)
helm repo add mailpit https://sj26.github.io/mailpit/charts
helm repo update
helm install mailpit mailpit/mailpit \
--namespace messaging \
--set service.type=LoadBalancer \
--set service.loadBalancerIP=10.46.7.104 \
--set persistence.enabled=true \
--set persistence.size=2Gi
# Wait for Mailpit
kubectl wait --namespace messaging \
--for=condition=ready pod \
--selector=app.kubernetes.io/name=mailpit \
--timeout=60s
export MAILPIT_HOST=$(kubectl get svc mailpit -n messaging -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo "Mailpit SMTP: ${MAILPIT_HOST}:1025"
echo "Mailpit UI: http://${MAILPIT_HOST}:8025"
5.5 SigNoz for APM & Distributed Tracing
# Install SigNoz (OpenTelemetry-native APM) from its Helm repository
helm repo add signoz https://charts.signoz.io
helm repo update
helm install signoz signoz/signoz \
--namespace signoz \
--create-namespace \
--set frontend.service.type=LoadBalancer \
--set frontend.service.loadBalancerIP=10.46.7.105 \
--set otelCollector.service.type=LoadBalancer \
--set otelCollector.service.loadBalancerIP=10.46.7.106
# Wait for SigNoz (takes 3-5 minutes)
kubectl wait --namespace signoz \
--for=condition=ready pod \
--selector=app.kubernetes.io/instance=signoz \
--timeout=300s
export SIGNOZ_HOST=$(kubectl get svc signoz-frontend -n signoz -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo "SigNoz UI: http://${SIGNOZ_HOST}:3301"
Part 6: Deploying Your Microservices
6.1 Containerizing NestJS Apps
# Dockerfile (multi-stage build)
FROM node:20-alpine AS builder
WORKDIR /app
# Install pnpm
RUN corepack enable && corepack prepare pnpm@latest --activate
# Copy package files
COPY package.json pnpm-lock.yaml ./
# Install dependencies
RUN pnpm install --frozen-lockfile
# Copy source code
COPY . .
# Build application
RUN pnpm build
# Production stage
FROM node:20-alpine AS runner
WORKDIR /app
ENV NODE_ENV=production
# Install pnpm
RUN corepack enable && corepack prepare pnpm@latest --activate
# Copy package files
COPY package.json pnpm-lock.yaml ./
# Install production dependencies only
RUN pnpm install --prod --frozen-lockfile
# Copy built application
COPY --from=builder /app/dist ./dist
# Create non-root user
RUN addgroup -g 1001 -S nodejs && \
adduser -S nodejs -u 1001
USER nodejs
EXPOSE 3000
CMD ["node", "dist/main.js"]
6.2 Helm Chart for Microservices
# charts/auth-service/Chart.yaml
apiVersion: v2
name: auth-service
description: Authentication Service Helm Chart
type: application
version: 0.1.0
appVersion: "1.0.0"
dependencies:
  - name: common
    version: 2.x
    repository: https://charts.bitnami.com/bitnami
# charts/auth-service/values.yaml
replicaCount: 2

image:
  repository: your-registry/auth-service
  tag: latest
  pullPolicy: IfNotPresent

service:
  type: ClusterIP
  port: 3000

ingress:
  enabled: true
  className: traefik
  hosts:
    - host: auth.local
      paths:
        - path: /
          pathType: Prefix

resources:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 100m
    memory: 128Mi

autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80
  targetMemoryUtilizationPercentage: 80

env:
  DATABASE_URL: "postgresql://devuser:devpassword@10.46.7.100:5432/devdb"
  REDIS_URL: "redis://default:redispassword@10.46.7.101:6379"
  RABBITMQ_URL: "amqp://rabbitmq:rabbitmqpassword@10.46.7.102:5672"
  OTEL_EXPORTER_OTLP_ENDPOINT: "http://10.46.7.106:4317"
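Services usually need the pieces of these connection URLs separately (host, port, credentials). The WHATWG `URL` parser in Node handles the non-HTTP schemes fine; the values below are the dev defaults from this chart, not real secrets:

```typescript
// Unpack a DATABASE_URL-style connection string into driver-friendly fields.
const dbUrl = new URL("postgresql://devuser:devpassword@10.46.7.100:5432/devdb");

const config = {
  host: dbUrl.hostname,              // "10.46.7.100"
  port: Number(dbUrl.port),          // 5432
  user: dbUrl.username,              // "devuser"
  password: dbUrl.password,
  database: dbUrl.pathname.slice(1), // "devdb" (strip the leading "/")
};

console.log(config.host, config.port, config.database);
```

The same approach works for `REDIS_URL` and `RABBITMQ_URL`, which keeps the Helm values as single strings while the app decomposes them at boot.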
# charts/auth-service/templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}
  labels:
    app.kubernetes.io/name: {{ .Release.Name }}
spec:
  {{- if not .Values.autoscaling.enabled }}
  replicas: {{ .Values.replicaCount }}
  {{- end }}
  selector:
    matchLabels:
      app.kubernetes.io/name: {{ .Release.Name }}
  template:
    metadata:
      labels:
        app.kubernetes.io/name: {{ .Release.Name }}
      annotations:
        # Trigger redeploy on config change
        checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
    spec:
      containers:
        - name: {{ .Release.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - name: http
              containerPort: 3000
              protocol: TCP
          envFrom:
            - configMapRef:
                name: {{ .Release.Name }}-config
          livenessProbe:
            httpGet:
              path: /health
              port: http
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: http
            initialDelaySeconds: 5
            periodSeconds: 5
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
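The probe paths in the template have to exist in the app: liveness means "the process is up", readiness means "my dependencies are reachable". A minimal sketch of those semantics (the handler names and the `dependenciesReady` flag are illustrative; in NestJS you would likely reach for `@nestjs/terminus` instead):

```typescript
import { createServer } from "node:http";

// Flip this once the DB/Redis/RabbitMQ connections succeed at startup.
let dependenciesReady = false;

// Pure routing logic, kept separate from sockets so it is easy to unit-test.
function probeResponse(path: string): { status: number; body: string } {
  if (path === "/health") return { status: 200, body: "ok" }; // liveness
  if (path === "/ready") {
    return dependenciesReady
      ? { status: 200, body: "ready" }
      : { status: 503, body: "not ready" }; // pod stays out of Service endpoints
  }
  return { status: 404, body: "not found" };
}

function markReady(): void {
  dependenciesReady = true;
}

// Wire it to the container port from the Deployment; call server.listen(3000) in main().
const server = createServer((req, res) => {
  const { status, body } = probeResponse(req.url ?? "/");
  res.writeHead(status, { "Content-Type": "text/plain" }).end(body);
});
```

Returning 503 from `/ready` until dependencies connect is what lets Kubernetes hold traffic back from a pod that started but cannot yet serve.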
6.3 Deploy Your Services
# Package and deploy auth service
cd charts/auth-service
helm package .
helm upgrade --install auth-service ./auth-service-0.1.0.tgz \
--namespace services \
--create-namespace
# Deploy user service
helm upgrade --install user-service ../user-service \
--namespace services \
--create-namespace
# Deploy inventory service
helm upgrade --install inventory-service ../inventory-service \
--namespace services \
--create-namespace
# Check deployment status
kubectl get deployments -n services
kubectl get pods -n services
kubectl get svc -n services
Part 7: Local-to-Prod Parity
7.1 Environment Management
# ConfigMap for non-sensitive config
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: services
data:
  NODE_ENV: "development"
  LOG_LEVEL: "debug"
  CORS_ORIGIN: "http://localhost:3000"
  JWT_EXPIRES_IN: "7d"
  RATE_LIMIT_TTL: "60"
  RATE_LIMIT_MAX: "100"
---
# Secret for sensitive data
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
  namespace: services
type: Opaque
stringData:
  JWT_SECRET: "your-super-secret-key-change-in-prod"
  DATABASE_PASSWORD: "devpassword"
  REDIS_PASSWORD: "redispassword"
  RABBITMQ_PASSWORD: "rabbitmqpassword"
Best Practice: Use different secrets for dev/prod:
# Development
kubectl apply -f secrets.dev.yaml
# Production (with sealed-secrets or external-secrets)
kubectl apply -f secrets.prod.yaml
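Whichever secrets backend you use, the app should fail fast when a required secret is missing rather than boot half-configured. A sketch mirroring the ConfigMap/Secret split above (key names come from the manifests; the loader itself is illustrative):

```typescript
type Env = Record<string, string | undefined>;

// ConfigMap keys get safe defaults; Secret keys must be present or startup aborts.
function loadConfig(env: Env) {
  const required = (key: string): string => {
    const value = env[key];
    if (!value) throw new Error(`Missing required secret: ${key}`);
    return value;
  };
  return {
    nodeEnv: env.NODE_ENV ?? "development",
    logLevel: env.LOG_LEVEL ?? "debug",
    rateLimitMax: Number(env.RATE_LIMIT_MAX ?? "100"),
    jwtSecret: required("JWT_SECRET"),
    databasePassword: required("DATABASE_PASSWORD"),
  };
}
```

Call `loadConfig(process.env)` once at bootstrap; a missing `JWT_SECRET` then fails the readiness probe immediately instead of surfacing as a 500 later.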
7.2 Database Migrations in Kubernetes
# migration-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: run-migrations
  namespace: database
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-weight": "-1"
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  template:
    spec:
      containers:
        - name: migrations
          image: your-registry/auth-service:latest
          command: ["pnpm", "prisma", "migrate", "deploy"]
          envFrom:
            - secretRef:
                name: app-secrets
            - configMapRef:
                name: app-config
      restartPolicy: Never
  backoffLimit: 3
7.3 Hot Reload for Development
Option 1: Skaffold (Recommended)
# skaffold.yaml
apiVersion: skaffold/v4beta6
kind: Config
metadata:
  name: auth-service
build:
  artifacts:
    - image: auth-service
      context: apps/auth
      docker:
        dockerfile: Dockerfile.dev
deploy:
  helm:
    releases:
      - name: auth-service
        chartPath: charts/auth-service
        namespace: services
        setValueTemplates:
          image.repository: "{{.IMAGE_REPO_auth_service}}"
          image.tag: "{{.IMAGE_TAG_auth_service}}"
profiles:
  - name: dev
    activation:
      - command: dev
    build:
      artifacts:
        - image: auth-service
          sync:
            manual:
              - src: 'src/**/*.ts'
                dest: .
    deploy:
      helm:
        releases:
          - name: auth-service
            setValueTemplates:
              image.tag: "{{.IMAGE_TAG_auth_service}}"
# Run with hot reload
skaffold dev --profile dev
Option 2: Tilt (Alternative)
# Tiltfile
docker_build('auth-service', './apps/auth')
k8s_yaml('charts/auth-service/templates/deployment.yaml')
k8s_resource('auth-service', port_forwards=3000)
tilt up
Part 8: Observability Stack
8.1 OpenTelemetry Instrumentation
// src/telemetry.ts
import { NodeSDK } from '@opentelemetry/sdk-node';
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-grpc';
import { Resource } from '@opentelemetry/resources';
import { SEMRESATTRS_SERVICE_NAME } from '@opentelemetry/semantic-conventions';

const sdk = new NodeSDK({
  resource: new Resource({
    [SEMRESATTRS_SERVICE_NAME]: 'auth-service',
  }),
  traceExporter: new OTLPTraceExporter({
    url: 'http://10.46.7.106:4317', // SigNoz collector
  }),
  instrumentations: [
    getNodeAutoInstrumentations({
      '@opentelemetry/instrumentation-http': {
        enabled: true,
        ignoreIncomingRequestHook: (request) => {
          // Ignore health checks
          return request.url?.includes('/health') ?? false;
        },
      },
      '@opentelemetry/instrumentation-express': { enabled: true },
      '@opentelemetry/instrumentation-pg': { enabled: true },
      '@opentelemetry/instrumentation-redis': { enabled: true },
    }),
  ],
});

sdk.start();

// Graceful shutdown
process.on('SIGTERM', () => {
  sdk.shutdown()
    .then(() => console.log('Tracing terminated'))
    .catch((error) => console.log('Error terminating tracing', error))
    .finally(() => process.exit(0));
});
8.2 Prometheus Metrics
// src/metrics.ts
import client from 'prom-client';

const register = new client.Registry();
client.collectDefaultMetrics({ register });

// Custom metrics
const httpRequestDuration = new client.Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'route', 'status_code'],
  buckets: [0.01, 0.05, 0.1, 0.5, 1, 2, 5],
  registers: [register],
});

const activeConnections = new client.Gauge({
  name: 'active_connections',
  help: 'Number of active database connections',
  registers: [register],
});

export { register, httpRequestDuration, activeConnections };
# servicemonitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: auth-service
  namespace: services
  labels:
    release: prometheus
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: auth-service
  endpoints:
    - port: http
      path: /metrics
      interval: 15s
Part 9: Developer Workflow
9.1 Daily Development Loop
# 1. Start cluster (if not running)
k3d cluster start dev-cluster
# 2. Point kubectl at the cluster (k3d prints the kubeconfig path it wrote)
export KUBECONFIG=$(k3d kubeconfig write dev-cluster)
# 3. Deploy shared services (one-time setup)
helm upgrade --install postgresql bitnami/postgresql-ha -n database
helm upgrade --install redis bitnami/redis -n database
helm upgrade --install rabbitmq bitnami/rabbitmq -n messaging
# 4. Run apps locally (on host machine)
cd apps/auth
pnpm dev
App connects to cluster services:
# PostgreSQL at 172.28.255.200:5432
# Redis at 172.28.255.201:6379
# RabbitMQ at 172.28.255.202:5672
# 5. Access services
# Mailpit UI: http://172.28.255.203:8025
# SigNoz: http://172.28.255.204:3301
# Grafana: http://172.28.255.205:3000
9.2 Port-Forward for Debugging
# Forward PostgreSQL to localhost
kubectl port-forward svc/postgresql-postgresql-ha-pgpool -n database 5432:5432
# Forward Redis
kubectl port-forward svc/redis-master -n database 6379:6379
# Forward SigNoz
kubectl port-forward svc/signoz-frontend -n signoz 3301:3301
# In another terminal, run your app
pnpm dev
9.3 Viewing Logs
# Tail all pods in namespace
kubectl logs -f -n services -l app.kubernetes.io/name=auth-service
# Specific pod
kubectl logs -f auth-service-7d9f8b6c5-xk2pl -n services
# Previous instance (if crashed)
kubectl logs -p auth-service-7d9f8b6c5-xk2pl -n services
# With timestamps
kubectl logs -f auth-service-7d9f8b6c5-xk2pl -n services --timestamps
Part 10: Scaling & Performance Testing
10.1 Horizontal Pod Autoscaler
# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: auth-service-hpa
  namespace: services
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: auth-service
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Percent
          value: 100
          periodSeconds: 60
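For intuition, the HPA's core decision is a single formula from the Kubernetes documentation: desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), clamped to the min/max bounds (stabilization windows then dampen how fast it is applied). Sketched:

```typescript
// The HPA scaling rule, with utilization given as percentages (e.g. 140 = 140%).
function desiredReplicas(
  current: number,
  currentUtil: number,
  targetUtil: number,
  min = 2,
  max = 10,
): number {
  const raw = Math.ceil(current * (currentUtil / targetUtil));
  return Math.min(max, Math.max(min, raw)); // clamp to [minReplicas, maxReplicas]
}

console.log(desiredReplicas(2, 140, 70)); // 4: CPU at double the target doubles the pods
```

This is why the load test below matters: watching `kubectl get hpa -w` while k6 ramps up lets you check the arithmetic against real behavior.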
10.2 Load Testing with k6
// tests/load/auth-load-test.ts
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate } from 'k6/metrics';

const errorRate = new Rate('errors');

export const options = {
  stages: [
    { duration: '30s', target: 10 },  // Ramp to 10 users
    { duration: '1m', target: 10 },   // Stay at 10 users
    { duration: '30s', target: 50 },  // Ramp to 50 users
    { duration: '2m', target: 50 },   // Stay at 50 users
    { duration: '30s', target: 100 }, // Ramp to 100 users
    { duration: '2m', target: 100 },  // Stay at 100 users
    { duration: '30s', target: 0 },   // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'], // 95% of requests < 500ms
    errors: ['rate<0.1'],             // Error rate < 10%
  },
};

export default function () {
  const payload = {
    email: 'test@example.com',
    password: 'password123',
  };
  const res = http.post('http://auth.local/api/auth/login', JSON.stringify(payload), {
    headers: { 'Content-Type': 'application/json' },
  });
  check(res, {
    'status is 200': (r) => r.status === 200,
    // Guard the parse so a non-JSON error body doesn't abort the iteration
    'has token': (r) => r.status === 200 && JSON.parse(String(r.body)).token !== undefined,
  });
  errorRate.add(res.status !== 200);
  sleep(1);
}
# Run load test
k6 run tests/load/auth-load-test.ts
Part 11: When to Move to Production K8s
11.1 Production Checklist
## Infrastructure
- [ ] Managed Kubernetes (EKS, GKE, AKS, DigitalOcean)
- [ ] Managed Database (RDS, Cloud SQL, managed PostgreSQL)
- [ ] Managed Redis (ElastiCache, Memorystore)
- [ ] Managed Message Queue (SQS, Pub/Sub, managed RabbitMQ)
- [ ] Load Balancer (ALB, NLB, Cloud Load Balancing)
- [ ] CDN for static assets
## Security
- [ ] Network Policies enabled
- [ ] Pod Security Standards enforced
- [ ] RBAC configured with least privilege
- [ ] Secrets management (Vault, AWS Secrets Manager)
- [ ] TLS certificates (cert-manager)
- [ ] Image scanning in CI/CD
## Observability
- [ ] Centralized logging (ELK, Loki, Cloud Logging)
- [ ] Distributed tracing (SigNoz, Jaeger, Cloud Trace)
- [ ] Metrics & alerting (Prometheus, Cloud Monitoring)
- [ ] Error tracking (Sentry, Rollbar)
- [ ] Uptime monitoring
## CI/CD
- [ ] GitOps with ArgoCD or Flux
- [ ] Automated deployments on merge
- [ ] Rollback strategy
- [ ] Blue-green or canary deployments
- [ ] Database migration automation
11.2 GitOps with ArgoCD
# argocd-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: auth-service
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/infrastructure.git
    targetRevision: HEAD
    path: apps/auth-service
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
Conclusion 🎯
You now have production-grade infrastructure running locally. You can:
- ✅ Develop microservices with real service discovery
- ✅ Test database migrations safely
- ✅ Validate distributed tracing end-to-end
- ✅ Load test before production
- ✅ Practice GitOps workflows
- ✅ Rehearse deployments and rollbacks
Your Local Stack:
k3d/k3s # Kubernetes cluster
MetalLB # LoadBalancer IPs
Traefik # Ingress controller
PostgreSQL HA # Primary database
Redis Sentinel # Cache & sessions
RabbitMQ # Message queues
Mailpit # Email testing
SigNoz # APM & tracing
Grafana # Metrics dashboards
Next Steps:
- Start small — Deploy one service, connect to cluster databases
- Add observability — Instrument with OpenTelemetry
- Practice failures — Kill pods, test resilience
- Automate — Set up Skaffold or Tilt for hot reload
- Scale up — Test HPA with k6 load tests
The gap between local and production isn’t inevitable. With k3s and k3d, you can build infrastructure that teaches you production patterns before you deploy.
Stop guessing. Start practicing. 🚀
Further Reading
- k3d Documentation
- k3s Documentation
- MetalLB
- SigNoz
- Skaffold
- lite-ims Repository - Real-world example
- MoonERP Repository - Modular monolith architecture