Kubernetes Deployment LIVE
This website is served from a k3s Kubernetes cluster on EC2. Production-grade manifests with autoscaling, health checks, and NodePort routing.
Architecture
Kubernetes Manifests (6 files)
namespace.yaml
Isolates all AI Advisory resources in a dedicated ai-advisory namespace, preventing conflicts with other workloads.
kind: Namespace
metadata:
  name: ai-advisory
deployment.yaml
Pod spec with readiness and liveness probes (60s initial delay to allow for model load), a memory request of 384Mi with a 768Mi limit, and environment injection from the ConfigMap and Secret.
readinessProbe:
  httpGet:
    path: /
    port: 5000
  initialDelaySeconds: 60
resources:
  limits:
    memory: "768Mi"
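The excerpt above shows only the readiness probe and the memory limit. A fuller Deployment might look like the sketch below. It is not the site's actual manifest: the resource names, the app label, the ConfigMap and Secret names, the CPU request, and the liveness timings are all assumptions, and 384Mi is read here as the memory request.

```yaml
# Sketch only. Names, labels, CPU values, and liveness timings are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-advisory                 # assumed name
  namespace: ai-advisory
spec:
  # replicas is left unset so the HorizontalPodAutoscaler owns the count
  selector:
    matchLabels:
      app: ai-advisory              # assumed label
  template:
    metadata:
      labels:
        app: ai-advisory
    spec:
      containers:
        - name: web                 # assumed container name
          image: ai-advisory:latest
          # A :latest tag defaults to pullPolicy Always, which fails for an
          # image that was imported locally rather than pushed to a registry.
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 5000
          envFrom:
            - configMapRef:
                name: ai-advisory-config    # assumed ConfigMap name
            - secretRef:
                name: ai-advisory-secrets   # assumed Secret name
          readinessProbe:           # gates traffic until the model has loaded
            httpGet:
              path: /
              port: 5000
            initialDelaySeconds: 60
          livenessProbe:            # restarts the container if it stops responding
            httpGet:
              path: /
              port: 5000
            initialDelaySeconds: 60
            periodSeconds: 20       # assumed
            failureThreshold: 3     # assumed
          resources:
            requests:
              memory: "384Mi"       # assumed to be the request
              cpu: "250m"           # assumed; HPA utilization is measured against this
            limits:
              memory: "768Mi"
```

A CPU request matters here because the autoscaler computes utilization as a percentage of the requested CPU. Without one, a CPU utilization target cannot be evaluated.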
service.yaml
NodePort service exposing the app on port 30080. An nginx reverse proxy terminates HTTPS and forwards external traffic to this port.
spec:
  type: NodePort
  ports:
    - port: 80
      targetPort: 5000
      nodePort: 30080
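For context, a complete Service needs a selector that matches the pod labels. The sketch below fills in the fields the excerpt omits. The Service name and the app label are assumptions, not taken from the site's manifests.

```yaml
# Sketch only. The name and the selector label are assumptions.
apiVersion: v1
kind: Service
metadata:
  name: ai-advisory          # assumed name
  namespace: ai-advisory
spec:
  type: NodePort
  selector:
    app: ai-advisory         # assumed; must match the Deployment's pod labels
  ports:
    - port: 80               # port inside the cluster
      targetPort: 5000       # container port the app listens on
      nodePort: 30080        # host port; must fall in the default 30000-32767 range
```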
hpa.yaml
HorizontalPodAutoscaler scales between 1 and 3 replicas, adding pods when average CPU utilization exceeds 70% of the requested CPU. k3s includes metrics-server by default.
spec:
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - resource:
        name: cpu
        target:
          averageUtilization: 70
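The excerpt leaves out the fields that tie the autoscaler to its target. A complete autoscaling/v2 manifest could look like the following sketch. The HPA name and the target Deployment name are assumptions.

```yaml
# Sketch only. The HPA name and the target Deployment name are assumptions.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ai-advisory            # assumed name
  namespace: ai-advisory
spec:
  scaleTargetRef:              # the workload being scaled
    apiVersion: apps/v1
    kind: Deployment
    name: ai-advisory          # assumed Deployment name
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization    # percentage of the pods' CPU requests
          averageUtilization: 70
```

The controller computes desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization) and clamps the result to the min/max bounds. With one replica running at 150% of its CPU request, that gives ceil(1 * 150 / 70) = 3 replicas.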
configmap.yaml
Non-secret configuration: LLM backend selection, Flask environment, gunicorn worker count and timeout.
data:
  RAG_LLM_BACKEND: "huggingface"
  GUNICORN_WORKERS: "1"
  GUNICORN_TIMEOUT: "120"
kustomization.yaml
Kustomize applies all 6 manifests with a single command: k3s kubectl apply -k k8s/. No Helm needed.
kind: Kustomization
resources:
  - namespace.yaml
  - deployment.yaml
  - service.yaml
  - hpa.yaml
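The excerpt lists four resources, while the deploy step names six manifests (namespace, configmap, secret, deployment, service, HPA). A kustomization covering all six might look like this sketch. The secret.yaml file name is an assumption, since that file is not shown on this page.

```yaml
# Sketch only. secret.yaml is an assumed file name.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: ai-advisory       # applied to every namespaced resource listed below
resources:
  - namespace.yaml
  - configmap.yaml
  - secret.yaml              # assumed file name
  - deployment.yaml
  - service.yaml
  - hpa.yaml
```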
How to Run
Install k3s (single command)
curl -sfL https://get.k3s.io | sh -s - --disable traefik
Build + import Docker image
docker build -t ai-advisory:latest .
docker save ai-advisory:latest | k3s ctr images import -
Deploy to k3s
k3s kubectl apply -k k8s/
Applies all 6 manifests via Kustomize: namespace, configmap, secret, deployment, service, HPA.
Access the app
curl http://localhost:30080/
Tear down
k3s kubectl delete namespace ai-advisory
Why k3s?
k3s is a lightweight, CNCF-certified Kubernetes distribution that ships as a single binary and needs roughly 512MB of RAM for the server. It is production-ready and well suited to single-node deployments where a full Kubernetes install would be overkill.
Fits on t3.small (2GB) alongside the app
One curl command to install, no dependencies
Certified Kubernetes API compatibility: the same manifests should work on EKS
metrics-server included, so HPA works out of the box
Skills Demonstrated
Deployments, Services, Namespaces, ConfigMaps, Secrets
NodePort routing, nginx reverse proxy, SSL termination
HPA with CPU metrics, min/max replica control
Readiness + liveness probes with tuned delays for ML model loading
CPU/memory requests and limits per container
Declarative manifest management without Helm complexity