
Introduction 🎯
Your modular monolith is growing. You’re hitting scale limits. You’ve heard the warnings: “Don’t do microservices too early,” “Distributed systems are hard,” “Stick with the monolith.”
But here’s the truth: microservices aren’t about scale—they’re about team autonomy, deployment independence, and technology diversity. And you can practice these patterns locally before production demands them.
This guide builds a production-grade local development infrastructure using k3s and k3d. You’ll run PostgreSQL HA, Redis Sentinel, RabbitMQ, and your microservices—all on your laptop, all accessible from your host machine, all mirroring production architecture.
No more “it works on my machine.” No more docker-compose limitations. No more guessing how things will behave in Kubernetes.
Let’s build infrastructure that scales with your ambition. 🚀
Part 1: Why Local Microservices Infrastructure Matters
1.1 The Modular Monolith → Microservices Journey
flowchart LR
subgraph Stage1["Stage 1: Modular Monolith"]
direction TB
M1["Auth Module"]
M2["User Module"]
M3["Inventory Module"]
M4["Order Module"]
DB1[("Single Database<br/>PostgreSQL")]
M1 --- M2 --- M3 --- M4
M4 --- DB1
end
subgraph Stage2["Stage 2: Distributed Monolith"]
direction TB
S1["Auth Service"]
S2["User Service"]
S3["Inventory Service"]
S4["Order Service"]
DB2[("Single Database<br/>PostgreSQL")]
S1 --- S2 --- S3 --- S4
S4 --- DB2
end
subgraph Stage3["Stage 3: Microservices"]
direction TB
MS1["Auth Svc"]
MS2["User Svc"]
MS3["Inventory Svc"]
MS4["Order Svc"]
DB3A[("Auth DB")]
DB3B[("User DB")]
DB3C[("Inventory DB")]
DB3D[("Order DB")]
MS1 --- DB3A
MS2 --- DB3B
MS3 --- DB3C
MS4 --- DB3D
end
Stage1 --> Stage2 --> Stage3
style Stage1 fill:#dbeafe,stroke:#333,stroke-width:2px
style Stage2 fill:#fef3c7,stroke:#333,stroke-width:2px
style Stage3 fill:#d1fae5,stroke:#333,stroke-width:2px
The Problem: Most teams jump from Stage 1 to Stage 3 without practicing Stage 2. They deploy to production Kubernetes and discover:
- Network latency they never tested locally
- Database connection pool exhaustion
- Service discovery failures
- Distributed tracing gaps
- Configuration management nightmares
The Solution: Build your local infrastructure to mirror production from day one.
1.2 Why Docker Compose Isn’t Enough
# docker-compose.yml - The Limitations
version: '3.8'
services:
  api:
    build: .
    ports:
      - "3000:3000"
    depends_on:
      - postgres
      - redis
  postgres:
    image: postgres:15
    ports:
      - "5432:5432"
  redis:
    image: redis:7
    ports:
      - "6379:6379"
What docker-compose lacks:
- ❌ No Kubernetes Service discovery
- ❌ No ConfigMaps or Secrets management
- ❌ No Horizontal Pod Autoscaling
- ❌ No Ingress controllers
- ❌ No namespace isolation
- ❌ No RBAC or service accounts
- ❌ No Helm charts or Kustomize
- ❌ No production-like networking
Result: Your local setup teaches you nothing about production Kubernetes.
1.3 k3s vs k3d: When to Use What
| Feature | k3d (Docker-based) | k3s (Multipass VMs) |
|---|---|---|
| Startup Time | ~10-30 seconds | ~2-5 minutes |
| Resource Usage | ~500MB RAM | ~2-4GB RAM |
| Production Fidelity | Medium | High |
| Multi-node Support | Real nodes (as containers) | Real VMs |
| Persistent Storage | Volume mounts | Longhorn possible |
| Best For | Quick dev, CI/CD | Production rehearsal |
| Platform | macOS, Linux, Windows | Linux, macOS (Multipass) |
Rule of thumb:
- k3d for daily development, fast iteration, CI pipelines
- k3s on Multipass for production testing, multi-node scenarios, storage testing
Part 2: Architecture Overview
2.1 Full Local Infrastructure Stack
flowchart TB
subgraph Host["Your Host Machine"]
direction TB
AuthAPI["Auth API<br/>Port 3001"]
UserAPI["User API<br/>Port 3002"]
InvAPI["Inventory API<br/>Port 3003"]
end
subgraph Cluster["k3s/k3d Cluster"]
direction TB
MetalLB["MetalLB L2 Mode<br/>IP Pool: 10.46.7.100-120"]
subgraph Services["Shared Services"]
PG["PostgreSQL HA<br/>10.46.7.100:5432"]
Redis["Redis Sentinel<br/>10.46.7.101:6379"]
RMQ["RabbitMQ<br/>10.46.7.102:5672"]
Mail["Mailpit UI<br/>10.46.7.103:8025"]
SigNoz["SigNoz APM<br/>10.46.7.104:3301"]
Grafana["Grafana<br/>10.46.7.105:3000"]
end
subgraph Ingress["Ingress Layer"]
Traefik["Traefik Ingress<br/>auth.local, api.local"]
end
end
Host -->|LoadBalancer IPs| MetalLB
MetalLB --> Services
MetalLB --> Ingress
style Host fill:#dbeafe,stroke:#333,stroke-width:2px
style Cluster fill:#f0fdf4,stroke:#333,stroke-width:2px
style Services fill:#fef3c7,stroke:#333,stroke-width:1px
style Ingress fill:#fae8ff,stroke:#333,stroke-width:1px
2.2 Network Flow
sequenceDiagram
participant Dev as Developer
participant Host as Host Machine<br/>NestJS App
participant PG as PostgreSQL<br/>10.46.7.100
participant Redis as Redis<br/>10.46.7.101
participant RMQ as RabbitMQ<br/>10.46.7.102
participant Mail as Mailpit UI
participant SigNoz as SigNoz APM
Dev->>Host: Write code + npm run dev
Host->>PG: Connect: 10.46.7.100:5432
Host->>Redis: Connect: 10.46.7.101:6379
Host->>RMQ: Publish events: 10.46.7.102:5672
RMQ->>Host: Consume events async
Dev->>Mail: Access UI: http://10.46.7.103:8025
Dev->>SigNoz: Access: http://10.46.7.104:3301
Host->>SigNoz: OpenTelemetry traces
Note over Dev,SigNoz: Production-like distributed system<br/>running locally
Result: Production-like distributed system, running locally.
Part 3: k3d Setup (Docker-based, Fastest)
3.1 Installation
# macOS with Homebrew
brew install k3d
# Linux
curl -s https://raw.githubusercontent.com/k3d-io/k3d/main/install.sh | bash
# Verify installation
k3d version
3.2 Create k3d Cluster
# Create cluster with MetalLB configuration
k3d cluster create dev-cluster \
--port "80:80@loadbalancer" \
--port "443:443@loadbalancer" \
--agents 2 \
--k3s-arg "--disable=traefik@server:0" \
--k3s-arg "--disable=servicelb@server:0" \
--network k3d-network
# Verify cluster
kubectl cluster-info
kubectl get nodes
Expected output:
NAME STATUS ROLES AGE VERSION
k3d-dev-cluster-server-0 Ready control-plane,master 30s v1.29.2+k3s1
k3d-dev-cluster-agent-0 Ready <none> 25s v1.29.2+k3s1
k3d-dev-cluster-agent-1 Ready <none> 25s v1.29.2+k3s1
3.3 Configure MetalLB
MetalLB provides LoadBalancer IPs in bare-metal clusters.
# Install MetalLB
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.13.12/config/manifests/metallb-native.yaml
# Wait for MetalLB to be ready
kubectl wait --namespace metallb-system \
--for=condition=ready pod \
--selector=app=metallb \
--timeout=90s
# Get Docker network subnet
docker network inspect k3d-network \
-f '{{(index .IPAM.Config 0).Subnet}}'
Example output:
172.28.0.0/16
# We'll use 172.28.255.200-172.28.255.250 for LoadBalancer IPs
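The choice of range generalizes: carve the pool out of a high corner of the Docker subnet so it cannot collide with the addresses Docker hands out to containers. A small illustrative helper (the function name and defaults are mine, not part of k3d or MetalLB):

```typescript
// Pick a MetalLB LoadBalancer range from the tail end of a Docker subnet.
// For a /16 we use the highest /24 inside it (x.y.255.0/24), which Docker
// is unlikely to allocate to containers.
function metalLbRange(subnetCidr: string, start = 200, end = 250): string {
  const [base, prefix] = subnetCidr.split("/");
  if (Number(prefix) > 24) throw new Error("need at least a /24 to carve a pool");
  const octets = base.split(".").map(Number);
  const third = Number(prefix) <= 16 ? 255 : octets[2];
  const net = `${octets[0]}.${octets[1]}.${third}`;
  return `${net}.${start}-${net}.${end}`;
}

console.log(metalLbRange("172.28.0.0/16")); // 172.28.255.200-172.28.255.250
```

The same helper reproduces the Multipass pool used later: `metalLbRange("10.46.7.0/24", 100, 120)` yields `10.46.7.100-10.46.7.120`.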
Create MetalLB configuration:
# metallb-config.yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: dev-pool
  namespace: metallb-system
spec:
  addresses:
    - 172.28.255.200-172.28.255.250
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: dev-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - dev-pool
kubectl apply -f metallb-config.yaml
3.4 Install Ingress Controller
# Install Traefik (lightweight, Kubernetes-native)
helm repo add traefik https://traefik.github.io/charts
helm repo update
helm install traefik traefik/traefik \
--namespace traefik-system \
--create-namespace \
--set ports.web.port=80 \
--set ports.websecure.port=443 \
--set service.type=LoadBalancer
# Wait for LoadBalancer IP
kubectl get svc traefik -n traefik-system -w
Expected output:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
traefik LoadBalancer 10.43.123.45 172.28.255.200 80:30001/TCP 30s
Now you have 172.28.255.200 as your ingress IP. Configure DNS:
# Add to /etc/hosts (requires sudo)
echo "172.28.255.200 auth.local api.local inventory.local mail.local" | sudo tee -a /etc/hosts
Part 4: k3s on Multipass (VM-based, Production-like)
4.1 When to Choose k3s Over k3d
Choose k3s on Multipass when:
- ✅ Testing multi-node failures
- ✅ Validating persistent storage with Longhorn
- ✅ Rehearsing production deployments
- ✅ Testing network policies
- ✅ Benchmarking performance under load
Stick with k3d when:
- ✅ Daily development
- ✅ Quick prototyping
- ✅ CI/CD pipelines
- ✅ Limited RAM (< 8GB)
4.2 Installation
# Install Multipass (Ubuntu VM manager)
# macOS
brew install --cask multipass
# Linux (Ubuntu)
sudo snap install multipass
# Windows
# Download from https://multipass.run/install
# Verify
multipass version
4.3 Create Multi-Node Cluster
# Create control plane node (2 CPU, 4GB RAM)
multipass launch 22.04 --name k3s-server --cpus 2 --memory 4G --disk 20G
# Create worker nodes (2 CPU, 2GB RAM each)
multipass launch 22.04 --name k3s-agent-1 --cpus 2 --memory 2G --disk 10G
multipass launch 22.04 --name k3s-agent-2 --cpus 2 --memory 2G --disk 10G
# Get server IP
SERVER_IP=$(multipass info k3s-server --format json | jq -r '.info["k3s-server"].ipv4[0]')
# Install k3s on server
# (bash -c keeps the pipe inside the VM; without it, `sh -` would run on the host.
#  ServiceLB is disabled so MetalLB can own LoadBalancer IPs, as in the k3d setup.)
multipass exec k3s-server -- \
  bash -c "curl -sfL https://get.k3s.io | sh -s - --disable servicelb"
# Get node token
NODE_TOKEN=$(multipass exec k3s-server -- sudo cat /var/lib/rancher/k3s/server/node-token)
# Join agents
multipass exec k3s-agent-1 -- \
  bash -c "curl -sfL https://get.k3s.io | K3S_URL=https://${SERVER_IP}:6443 K3S_TOKEN=${NODE_TOKEN} sh -"
multipass exec k3s-agent-2 -- \
  bash -c "curl -sfL https://get.k3s.io | K3S_URL=https://${SERVER_IP}:6443 K3S_TOKEN=${NODE_TOKEN} sh -"
# Get kubeconfig
multipass exec k3s-server -- sudo cat /etc/rancher/k3s/k3s.yaml > ~/.kube/config-k3s
# Replace server IP (BSD/macOS sed syntax; on Linux drop the empty '' argument)
sed -i '' "s/127.0.0.1/${SERVER_IP}/g" ~/.kube/config-k3s
# Set context
export KUBECONFIG=~/.kube/config-k3s
kubectl cluster-info
4.4 Configure MetalLB for Multipass
Install MetalLB exactly as in Section 3.3, then get the subnet from the Multipass network:
multipass networks
# Find the subnet, e.g., 10.46.7.0/24
# Create MetalLB config
cat > metallb-config.yaml <<EOF
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: multipass-pool
  namespace: metallb-system
spec:
  addresses:
    - 10.46.7.100-10.46.7.120
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: multipass-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - multipass-pool
EOF
kubectl apply -f metallb-config.yaml
Part 5: Deploying Shared Services
5.1 PostgreSQL HA with Bitnami
# Add Bitnami Helm repository
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
# Create namespace
kubectl create namespace database
# Install PostgreSQL HA
helm install postgresql bitnami/postgresql-ha \
--namespace database \
--set auth.username=devuser \
--set auth.password=devpassword \
--set auth.database=devdb \
--set persistence.size=10Gi \
--set metrics.enabled=true \
--set service.type=LoadBalancer \
--set service.loadBalancerIP=10.46.7.100
# Wait for PostgreSQL to be ready
kubectl wait --namespace database \
--for=condition=ready pod \
--selector=app.kubernetes.io/name=postgresql-ha \
--timeout=300s
# Get connection details
# The chart fronts the cluster with a Pgpool service named <release>-postgresql-ha-pgpool
export POSTGRES_HOST=$(kubectl get svc postgresql-postgresql-ha-pgpool -n database -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
export POSTGRES_PORT=5432
export POSTGRES_USER=devuser
export POSTGRES_PASSWORD=devpassword
export POSTGRES_DB=devdb
echo "PostgreSQL HA ready at: ${POSTGRES_HOST}:${POSTGRES_PORT}"
Test connection from host machine:
# Install psql client
# macOS: brew install postgresql
# Linux: sudo apt install postgresql-client
psql -h ${POSTGRES_HOST} -U ${POSTGRES_USER} -d ${POSTGRES_DB} -c "SELECT version();"
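In practice your pods (or your host app) may start before PostgreSQL finishes electing a primary, so a bounded retry loop at startup saves a lot of CrashLoopBackOff confusion. A sketch, assuming any async probe function (a `pg` pool query, Prisma `$connect`, etc.); the helper name, attempt count, and delays are illustrative:

```typescript
// Block startup until the database accepts connections, with exponential backoff.
async function waitFor(
  connect: () => Promise<void>,
  attempts = 10,
  baseDelayMs = 200,
): Promise<number> {
  for (let i = 1; i <= attempts; i++) {
    try {
      await connect();
      return i; // number of attempts it took
    } catch (err) {
      if (i === attempts) {
        throw new Error(`database unreachable after ${attempts} attempts: ${err}`);
      }
      // Backoff: 200ms, 400ms, 800ms, ...
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** (i - 1)));
    }
  }
  throw new Error("unreachable"); // satisfies the type checker
}
```

Call it once in your bootstrap before wiring routes, e.g. `await waitFor(() => pool.query("SELECT 1").then(() => undefined))`.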
5.2 Redis Sentinel for High Availability
# Install Redis with Sentinel
helm install redis bitnami/redis \
--namespace database \
--set auth.enabled=true \
--set auth.password=redispassword \
--set architecture=replication \
--set master.count=1 \
--set replica.replicaCount=2 \
--set sentinel.enabled=true \
--set sentinel.quorum=2 \
--set sentinel.downAfterMilliseconds=5000 \
--set sentinel.failoverTimeout=60000 \
--set master.service.type=LoadBalancer \
--set master.service.loadBalancerIP=10.46.7.101 \
--set persistence.enabled=true \
--set persistence.size=5Gi
# Wait for Redis to be ready
kubectl wait --namespace database \
--for=condition=ready pod \
--selector=app.kubernetes.io/name=redis \
--timeout=120s
# Get Redis connection
export REDIS_HOST=$(kubectl get svc redis-master -n database -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
export REDIS_PASSWORD=redispassword
echo "Redis Sentinel ready at: ${REDIS_HOST}:6379"
Test connection:
# Install redis-cli
# macOS: brew install redis
# Linux: sudo apt install redis-tools
redis-cli -h ${REDIS_HOST} -a ${REDIS_PASSWORD} ping
# Should return: PONG
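A typical use of this Redis instance is a read-through cache in front of PostgreSQL. The pattern is sketched below with an in-memory `Map` standing in for the Redis client (same `get`/`set` shape), so it runs without a live cluster; the names are illustrative:

```typescript
// Anything with Redis-like get/set semantics satisfies this interface.
type Store = {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
};

// In-memory stand-in for a Redis client, used here for demonstration.
const mapStore = (m = new Map<string, string>()): Store => ({
  get: async (key) => m.get(key) ?? null,
  set: async (key, value) => { m.set(key, value); },
});

// Read-through: serve from cache on a hit, load and populate on a miss.
async function readThrough(
  store: Store,
  key: string,
  load: () => Promise<string>,
): Promise<string> {
  const hit = await store.get(key);
  if (hit !== null) return hit; // cache hit: skip the database
  const value = await load();   // cache miss: go to the source of truth
  await store.set(key, value);
  return value;
}
```

With a real client you would also pass a TTL on `set`; the structure is the same.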
5.3 RabbitMQ for Message Queues
# Install RabbitMQ with management plugin
helm install rabbitmq bitnami/rabbitmq \
--namespace messaging \
--create-namespace \
--set auth.username=rabbitmq \
--set auth.password=rabbitmqpassword \
--set auth.erlangCookie=secretcookie \
--set clustering.enabled=true \
--set replicaCount=3 \
--set service.type=LoadBalancer \
--set service.loadBalancerIP=10.46.7.102 \
--set persistence.enabled=true \
--set persistence.size=5Gi \
  --set managementPlugin.enabled=true
# The management UI is served by the same Service on port 15672
# Wait for RabbitMQ
kubectl wait --namespace messaging \
--for=condition=ready pod \
--selector=app.kubernetes.io/name=rabbitmq \
--timeout=180s
# Get connection details
export RABBITMQ_HOST=$(kubectl get svc rabbitmq -n messaging -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
export RABBITMQ_USER=rabbitmq
export RABBITMQ_PASSWORD=rabbitmqpassword
echo "RabbitMQ AMQP: ${RABBITMQ_HOST}:5672"
echo "RabbitMQ Management: http://${RABBITMQ_HOST}:15672"
echo "Username: ${RABBITMQ_USER}"
echo "Password: ${RABBITMQ_PASSWORD}"
5.4 Mailpit for Email Testing
# Install Mailpit (lightweight email catcher)
helm repo add mailpit https://sj26.github.io/mailpit/charts
helm repo update
helm install mailpit mailpit/mailpit \
--namespace messaging \
--set service.type=LoadBalancer \
--set service.loadBalancerIP=10.46.7.104 \
--set persistence.enabled=true \
--set persistence.size=2Gi
# Wait for Mailpit
kubectl wait --namespace messaging \
--for=condition=ready pod \
--selector=app.kubernetes.io/name=mailpit \
--timeout=60s
export MAILPIT_HOST=$(kubectl get svc mailpit -n messaging -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo "Mailpit SMTP: ${MAILPIT_HOST}:1025"
echo "Mailpit UI: http://${MAILPIT_HOST}:8025"
5.5 SigNoz for APM & Distributed Tracing
# Install SigNoz (OpenTelemetry-native APM) from its Helm repository
helm repo add signoz https://charts.signoz.io
helm repo update
helm install signoz signoz/signoz \
--namespace signoz \
--create-namespace \
--set frontend.service.type=LoadBalancer \
--set frontend.service.loadBalancerIP=10.46.7.105 \
--set otelCollector.service.type=LoadBalancer \
--set otelCollector.service.loadBalancerIP=10.46.7.106
# Wait for SigNoz (takes 3-5 minutes)
kubectl wait --namespace signoz \
--for=condition=ready pod \
--selector=app.kubernetes.io/instance=signoz \
--timeout=300s
export SIGNOZ_HOST=$(kubectl get svc signoz-frontend -n signoz -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo "SigNoz UI: http://${SIGNOZ_HOST}:3301"
Part 6: Deploying Your Microservices
6.1 Containerizing NestJS Apps
# Dockerfile (multi-stage build)
FROM node:20-alpine AS builder
WORKDIR /app
# Install pnpm
RUN corepack enable && corepack prepare pnpm@latest --activate
# Copy package files
COPY package.json pnpm-lock.yaml ./
# Install dependencies
RUN pnpm install --frozen-lockfile
# Copy source code
COPY . .
# Build application
RUN pnpm build
# Production stage
FROM node:20-alpine AS runner
WORKDIR /app
ENV NODE_ENV=production
# Install pnpm
RUN corepack enable && corepack prepare pnpm@latest --activate
# Copy package files
COPY package.json pnpm-lock.yaml ./
# Install production dependencies only
RUN pnpm install --prod --frozen-lockfile
# Copy built application
COPY --from=builder /app/dist ./dist
# Create non-root user
RUN addgroup -g 1001 -S nodejs && \
adduser -S nodejs -u 1001
USER nodejs
EXPOSE 3000
CMD ["node", "dist/main.js"]
6.2 Helm Chart for Microservices
# charts/auth-service/Chart.yaml
apiVersion: v2
name: auth-service
description: Authentication Service Helm Chart
type: application
version: 0.1.0
appVersion: "1.0.0"
dependencies:
  - name: common
    version: 2.x
    repository: https://charts.bitnami.com/bitnami
# charts/auth-service/values.yaml
replicaCount: 2

image:
  repository: your-registry/auth-service
  tag: latest
  pullPolicy: IfNotPresent

service:
  type: ClusterIP
  port: 3000

ingress:
  enabled: true
  className: traefik
  hosts:
    - host: auth.local
      paths:
        - path: /
          pathType: Prefix

resources:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 100m
    memory: 128Mi

autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80
  targetMemoryUtilizationPercentage: 80

env:
  DATABASE_URL: "postgresql://devuser:devpassword@10.46.7.100:5432/devdb"
  REDIS_URL: "redis://default:redispassword@10.46.7.101:6379"
  RABBITMQ_URL: "amqp://rabbitmq:rabbitmqpassword@10.46.7.102:5672"
  OTEL_EXPORTER_OTLP_ENDPOINT: "http://10.46.7.106:4317"
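Services usually need the pieces of these connection URLs separately (host, port, credentials). The WHATWG `URL` parser in Node handles the non-HTTP schemes fine; the values below are the dev defaults from this chart, not real secrets:

```typescript
// Unpack a DATABASE_URL-style connection string into driver-friendly fields.
const dbUrl = new URL("postgresql://devuser:devpassword@10.46.7.100:5432/devdb");

const config = {
  host: dbUrl.hostname,              // "10.46.7.100"
  port: Number(dbUrl.port),          // 5432
  user: dbUrl.username,              // "devuser"
  password: dbUrl.password,
  database: dbUrl.pathname.slice(1), // "devdb" (strip the leading "/")
};

console.log(config.host, config.port, config.database);
```

The same approach works for `REDIS_URL` and `RABBITMQ_URL`, which keeps the Helm values as single strings while the app decomposes them at boot.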
# charts/auth-service/templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}
  labels:
    app.kubernetes.io/name: {{ .Release.Name }}
spec:
  {{- if not .Values.autoscaling.enabled }}
  replicas: {{ .Values.replicaCount }}
  {{- end }}
  selector:
    matchLabels:
      app.kubernetes.io/name: {{ .Release.Name }}
  template:
    metadata:
      labels:
        app.kubernetes.io/name: {{ .Release.Name }}
      annotations:
        # Trigger redeploy on config change
        checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
    spec:
      containers:
        - name: {{ .Release.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - name: http
              containerPort: 3000
              protocol: TCP
          envFrom:
            - configMapRef:
                name: {{ .Release.Name }}-config
          livenessProbe:
            httpGet:
              path: /health
              port: http
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: http
            initialDelaySeconds: 5
            periodSeconds: 5
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
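The probe paths in the template have to exist in the app: liveness means "the process is up", readiness means "my dependencies are reachable". A minimal sketch of those semantics (the handler names and the `dependenciesReady` flag are illustrative; in NestJS you would likely reach for `@nestjs/terminus` instead):

```typescript
import { createServer } from "node:http";

// Flip this once the DB/Redis/RabbitMQ connections succeed at startup.
let dependenciesReady = false;

// Pure routing logic, kept separate from sockets so it is easy to unit-test.
function probeResponse(path: string): { status: number; body: string } {
  if (path === "/health") return { status: 200, body: "ok" }; // liveness
  if (path === "/ready") {
    return dependenciesReady
      ? { status: 200, body: "ready" }
      : { status: 503, body: "not ready" }; // pod stays out of Service endpoints
  }
  return { status: 404, body: "not found" };
}

function markReady(): void {
  dependenciesReady = true;
}

// Wire it to the container port from the Deployment; call server.listen(3000) in main().
const server = createServer((req, res) => {
  const { status, body } = probeResponse(req.url ?? "/");
  res.writeHead(status, { "Content-Type": "text/plain" }).end(body);
});
```

Returning 503 from `/ready` until dependencies connect is what lets Kubernetes hold traffic back from a pod that started but cannot yet serve.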
6.3 Deploy Your Services
# Package and deploy auth service
cd charts/auth-service
helm package .
helm upgrade --install auth-service ./auth-service-0.1.0.tgz \
--namespace services \
--create-namespace
# Deploy user service
helm upgrade --install user-service ../user-service \
--namespace services \
--create-namespace
# Deploy inventory service
helm upgrade --install inventory-service ../inventory-service \
--namespace services \
--create-namespace
# Check deployment status
kubectl get deployments -n services
kubectl get pods -n services
kubectl get svc -n services
Part 7: Local-to-Prod Parity
7.1 Environment Management
# ConfigMap for non-sensitive config
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: services
data:
  NODE_ENV: "development"
  LOG_LEVEL: "debug"
  CORS_ORIGIN: "http://localhost:3000"
  JWT_EXPIRES_IN: "7d"
  RATE_LIMIT_TTL: "60"
  RATE_LIMIT_MAX: "100"
---
# Secret for sensitive data
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
  namespace: services
type: Opaque
stringData:
  JWT_SECRET: "your-super-secret-key-change-in-prod"
  DATABASE_PASSWORD: "devpassword"
  REDIS_PASSWORD: "redispassword"
  RABBITMQ_PASSWORD: "rabbitmqpassword"
Best Practice: Use different secrets for dev/prod:
# Development
kubectl apply -f secrets.dev.yaml
# Production (with sealed-secrets or external-secrets)
kubectl apply -f secrets.prod.yaml
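Whichever secrets backend you use, the app should fail fast when a required secret is missing rather than boot half-configured. A sketch mirroring the ConfigMap/Secret split above (key names come from the manifests; the loader itself is illustrative):

```typescript
type Env = Record<string, string | undefined>;

// ConfigMap keys get safe defaults; Secret keys must be present or startup aborts.
function loadConfig(env: Env) {
  const required = (key: string): string => {
    const value = env[key];
    if (!value) throw new Error(`Missing required secret: ${key}`);
    return value;
  };
  return {
    nodeEnv: env.NODE_ENV ?? "development",
    logLevel: env.LOG_LEVEL ?? "debug",
    rateLimitMax: Number(env.RATE_LIMIT_MAX ?? "100"),
    jwtSecret: required("JWT_SECRET"),
    databasePassword: required("DATABASE_PASSWORD"),
  };
}
```

Call `loadConfig(process.env)` once at bootstrap; a missing `JWT_SECRET` then fails the readiness probe immediately instead of surfacing as a 500 later.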
7.2 Database Migrations in Kubernetes
# migration-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: run-migrations
  namespace: database
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-weight": "-1"
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  template:
    spec:
      containers:
        - name: migrations
          image: your-registry/auth-service:latest
          command: ["pnpm", "prisma", "migrate", "deploy"]
          envFrom:
            - secretRef:
                name: app-secrets
            - configMapRef:
                name: app-config
      restartPolicy: Never
  backoffLimit: 3
7.3 Hot Reload for Development
Option 1: Skaffold (Recommended)
# skaffold.yaml
apiVersion: skaffold/v4beta6
kind: Config
metadata:
  name: auth-service
build:
  artifacts:
    - image: auth-service
      context: apps/auth
      docker:
        dockerfile: Dockerfile.dev
deploy:
  helm:
    releases:
      - name: auth-service
        chartPath: charts/auth-service
        namespace: services
        setValueTemplates:
          image.repository: "{{.IMAGE_REPO_auth_service}}"
          image.tag: "{{.IMAGE_TAG_auth_service}}"
profiles:
  - name: dev
    activation:
      - command: dev
    build:
      artifacts:
        - image: auth-service
          sync:
            manual:
              - src: 'src/**/*.ts'
                dest: .
    deploy:
      helm:
        releases:
          - name: auth-service
            setValueTemplates:
              image.tag: "{{.IMAGE_TAG_auth_service}}"
# Run with hot reload
skaffold dev --profile dev
Option 2: Tilt (Alternative)
# Tiltfile
docker_build('auth-service', './apps/auth')
k8s_yaml('charts/auth-service/templates/deployment.yaml')
k8s_resource('auth-service', port_forwards=3000)
tilt up
Part 8: Observability Stack
8.1 OpenTelemetry Instrumentation
// src/telemetry.ts
import { NodeSDK } from '@opentelemetry/sdk-node';
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-grpc';
import { Resource } from '@opentelemetry/resources';
import { SEMRESATTRS_SERVICE_NAME } from '@opentelemetry/semantic-conventions';

const sdk = new NodeSDK({
  resource: new Resource({
    [SEMRESATTRS_SERVICE_NAME]: 'auth-service',
  }),
  traceExporter: new OTLPTraceExporter({
    url: 'http://10.46.7.106:4317', // SigNoz collector
  }),
  instrumentations: [
    getNodeAutoInstrumentations({
      '@opentelemetry/instrumentation-http': {
        enabled: true,
        ignoreIncomingRequestHook: (request) => {
          // Ignore health checks
          return request.url?.includes('/health') ?? false;
        },
      },
      '@opentelemetry/instrumentation-express': { enabled: true },
      '@opentelemetry/instrumentation-pg': { enabled: true },
      '@opentelemetry/instrumentation-redis': { enabled: true },
    }),
  ],
});

sdk.start();

// Graceful shutdown
process.on('SIGTERM', () => {
  sdk.shutdown()
    .then(() => console.log('Tracing terminated'))
    .catch((error) => console.log('Error terminating tracing', error))
    .finally(() => process.exit(0));
});
8.2 Prometheus Metrics
// src/metrics.ts
import client from 'prom-client';

const register = new client.Registry();
client.collectDefaultMetrics({ register });

// Custom metrics
const httpRequestDuration = new client.Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'route', 'status_code'],
  buckets: [0.01, 0.05, 0.1, 0.5, 1, 2, 5],
  registers: [register],
});

const activeConnections = new client.Gauge({
  name: 'active_connections',
  help: 'Number of active database connections',
  registers: [register],
});

export { register, httpRequestDuration, activeConnections };
# servicemonitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: auth-service
  namespace: services
  labels:
    release: prometheus
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: auth-service
  endpoints:
    - port: http
      path: /metrics
      interval: 15s
Part 9: Developer Workflow
9.1 Daily Development Loop
# 1. Start cluster (if not running)
k3d cluster start dev-cluster
# 2. Point kubectl at the cluster (k3d prints the kubeconfig path it wrote)
export KUBECONFIG=$(k3d kubeconfig write dev-cluster)
# 3. Deploy shared services (one-time setup)
helm upgrade --install postgresql bitnami/postgresql-ha -n database
helm upgrade --install redis bitnami/redis -n database
helm upgrade --install rabbitmq bitnami/rabbitmq -n messaging
# 4. Run apps locally (on host machine)
cd apps/auth
pnpm dev
App connects to cluster services:
# PostgreSQL at 172.28.255.200:5432
# Redis at 172.28.255.201:6379
# RabbitMQ at 172.28.255.202:5672
# 5. Access services
# Mailpit UI: http://172.28.255.203:8025
# SigNoz: http://172.28.255.204:3301
# Grafana: http://172.28.255.205:3000
9.2 Port-Forward for Debugging
# Forward PostgreSQL to localhost
kubectl port-forward svc/postgresql-postgresql-ha-pgpool -n database 5432:5432
# Forward Redis
kubectl port-forward svc/redis-master -n database 6379:6379
# Forward SigNoz
kubectl port-forward svc/signoz-frontend -n signoz 3301:3301
# In another terminal, run your app
pnpm dev
9.3 Viewing Logs
# Tail all pods in namespace
kubectl logs -f -n services -l app.kubernetes.io/name=auth-service
# Specific pod
kubectl logs -f auth-service-7d9f8b6c5-xk2pl -n services
# Previous instance (if crashed)
kubectl logs -p auth-service-7d9f8b6c5-xk2pl -n services
# With timestamps
kubectl logs -f auth-service-7d9f8b6c5-xk2pl -n services --timestamps
Part 10: Scaling & Performance Testing
10.1 Horizontal Pod Autoscaler
# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: auth-service-hpa
  namespace: services
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: auth-service
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Percent
          value: 100
          periodSeconds: 60
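For intuition, the HPA's core decision is a single formula from the Kubernetes documentation: desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), clamped to the min/max bounds (stabilization windows then dampen how fast it is applied). Sketched:

```typescript
// The HPA scaling rule, with utilization given as percentages (e.g. 140 = 140%).
function desiredReplicas(
  current: number,
  currentUtil: number,
  targetUtil: number,
  min = 2,
  max = 10,
): number {
  const raw = Math.ceil(current * (currentUtil / targetUtil));
  return Math.min(max, Math.max(min, raw)); // clamp to [minReplicas, maxReplicas]
}

console.log(desiredReplicas(2, 140, 70)); // 4: CPU at double the target doubles the pods
```

This is why the load test below matters: watching `kubectl get hpa -w` while k6 ramps up lets you check the arithmetic against real behavior.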
10.2 Load Testing with k6
// tests/load/auth-load-test.ts
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate } from 'k6/metrics';

const errorRate = new Rate('errors');

export const options = {
  stages: [
    { duration: '30s', target: 10 },  // Ramp to 10 users
    { duration: '1m', target: 10 },   // Stay at 10 users
    { duration: '30s', target: 50 },  // Ramp to 50 users
    { duration: '2m', target: 50 },   // Stay at 50 users
    { duration: '30s', target: 100 }, // Ramp to 100 users
    { duration: '2m', target: 100 },  // Stay at 100 users
    { duration: '30s', target: 0 },   // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'], // 95% of requests < 500ms
    errors: ['rate<0.1'],             // Error rate < 10%
  },
};

export default function () {
  const payload = {
    email: 'test@example.com',
    password: 'password123',
  };
  const res = http.post('http://auth.local/api/auth/login', JSON.stringify(payload), {
    headers: { 'Content-Type': 'application/json' },
  });
  check(res, {
    'status is 200': (r) => r.status === 200,
    // Guard the parse so a non-JSON error body doesn't abort the iteration
    'has token': (r) => r.status === 200 && JSON.parse(String(r.body)).token !== undefined,
  });
  errorRate.add(res.status !== 200);
  sleep(1);
}
# Run load test
k6 run tests/load/auth-load-test.ts
Part 11: When to Move to Production K8s
11.1 Production Checklist
## Infrastructure
- [ ] Managed Kubernetes (EKS, GKE, AKS, DigitalOcean)
- [ ] Managed Database (RDS, Cloud SQL, managed PostgreSQL)
- [ ] Managed Redis (ElastiCache, Memorystore)
- [ ] Managed Message Queue (SQS, Pub/Sub, managed RabbitMQ)
- [ ] Load Balancer (ALB, NLB, Cloud Load Balancing)
- [ ] CDN for static assets
## Security
- [ ] Network Policies enabled
- [ ] Pod Security Standards enforced
- [ ] RBAC configured with least privilege
- [ ] Secrets management (Vault, AWS Secrets Manager)
- [ ] TLS certificates (cert-manager)
- [ ] Image scanning in CI/CD
## Observability
- [ ] Centralized logging (ELK, Loki, Cloud Logging)
- [ ] Distributed tracing (SigNoz, Jaeger, Cloud Trace)
- [ ] Metrics & alerting (Prometheus, Cloud Monitoring)
- [ ] Error tracking (Sentry, Rollbar)
- [ ] Uptime monitoring
## CI/CD
- [ ] GitOps with ArgoCD or Flux
- [ ] Automated deployments on merge
- [ ] Rollback strategy
- [ ] Blue-green or canary deployments
- [ ] Database migration automation
11.2 GitOps with ArgoCD
# argocd-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: auth-service
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/infrastructure.git
    targetRevision: HEAD
    path: apps/auth-service
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
Conclusion 🎯
You now have production-grade infrastructure running locally. You can:
- ✅ Develop microservices with real service discovery
- ✅ Test database migrations safely
- ✅ Validate distributed tracing end-to-end
- ✅ Load test before production
- ✅ Practice GitOps workflows
- ✅ Rehearse deployments and rollbacks
Your Local Stack:
k3d/k3s # Kubernetes cluster
MetalLB # LoadBalancer IPs
Traefik # Ingress controller
PostgreSQL HA # Primary database
Redis Sentinel # Cache & sessions
RabbitMQ # Message queues
Mailpit # Email testing
SigNoz # APM & tracing
Grafana # Metrics dashboards
Next Steps:
- Start small — Deploy one service, connect to cluster databases
- Add observability — Instrument with OpenTelemetry
- Practice failures — Kill pods, test resilience
- Automate — Set up Skaffold or Tilt for hot reload
- Scale up — Test HPA with k6 load tests
The gap between local and production isn’t inevitable. With k3s and k3d, you can build infrastructure that teaches you production patterns before you deploy.
Stop guessing. Start practicing. 🚀
Further Reading
- k3d Documentation
- k3s Documentation
- MetalLB
- SigNoz
- Skaffold
- lite-ims Repository - Real-world example
- MoonERP Repository - Modular monolith architecture