Kubernetes Deployment | Sandeep Chilukuri

Architecture

External Traffic

Browser / curl

↓

SSL Termination

nginx (:443, Let's Encrypt): proxy_pass 127.0.0.1:30080

↓

k3s Cluster

Service (NodePort): port 30080 → 5000

↓

Pod 1: Flask + RAG

Pod 2: HPA scaled

Pod 3: HPA scaled

ConfigMap: LLM backend, Flask env

Secret: HF API token

HPA: 1–3 pods, CPU 70%

Kubernetes Manifests (6 files)

namespace.yaml

Isolates all AI Advisory resources in a dedicated ai-advisory namespace, preventing conflicts with other workloads.

apiVersion: v1
kind: Namespace
metadata:
  name: ai-advisory

deployment.yaml

Pod spec with readiness and liveness probes (60s initial delay to allow for model load), memory requests/limits (384Mi–768Mi), and environment variables injected from the ConfigMap and Secret.

readinessProbe:
  httpGet:
    path: /
    port: 5000
  initialDelaySeconds: 60
resources:
  limits:
    memory: "768Mi"
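
The excerpt above omits the liveness probe and the environment injection mentioned in the description. A minimal sketch of how that part of the container spec might look, assuming a ConfigMap named ai-advisory-config and a Secret named ai-advisory-secrets (both names, the container name, and the probe period are hypothetical and not taken from the actual deployment.yaml):

```yaml
# Fragment of spec.template.spec in a Deployment (illustrative sketch only)
containers:
  - name: ai-advisory                 # hypothetical container name
    image: ai-advisory:latest
    ports:
      - containerPort: 5000
    envFrom:
      - configMapRef:
          name: ai-advisory-config    # hypothetical ConfigMap name
      - secretRef:
          name: ai-advisory-secrets   # hypothetical Secret name
    livenessProbe:
      httpGet:
        path: /
        port: 5000
      initialDelaySeconds: 60         # same 60s allowance for model load
      periodSeconds: 30               # assumed value
```

With envFrom, every key in the ConfigMap and Secret becomes an environment variable in the container, so the Flask app can read them without any Kubernetes-specific code.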

service.yaml

NodePort Service exposing the app on node port 30080 and routing to container port 5000. The nginx reverse proxy forwards external HTTPS traffic to this node port.

spec:
  type: NodePort
  ports:
    - port: 80
      targetPort: 5000
      nodePort: 30080

hpa.yaml

HorizontalPodAutoscaler scales between 1 and 3 replicas, targeting 70% average CPU utilization across pods. k3s includes metrics-server by default.

spec:
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - resource:
        name: cpu
        target:
          averageUtilization: 70
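
The excerpt above leaves out the fields that tie the autoscaler to a workload. A fuller sketch of what a complete autoscaling/v2 manifest could look like, assuming the Deployment is named ai-advisory (the object names here are hypothetical):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ai-advisory            # hypothetical HPA name
  namespace: ai-advisory
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ai-advisory          # hypothetical Deployment name
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Utilization is measured as a percentage of each container's CPU request, so the target Deployment needs resources.requests.cpu set for this metric to be computed.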

configmap.yaml

Non-secret configuration: LLM backend selection, Flask environment, gunicorn worker count and timeout.

data:
  RAG_LLM_BACKEND: "huggingface"
  GUNICORN_WORKERS: "1"
  GUNICORN_TIMEOUT: "120"
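
The HF API token is kept out of the ConfigMap and stored in a Secret instead. That manifest is not shown here; a minimal sketch of what it might contain, assuming a Secret named ai-advisory-secrets and a key named HF_API_TOKEN (both hypothetical):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: ai-advisory-secrets    # hypothetical Secret name
  namespace: ai-advisory
type: Opaque
stringData:
  # Hypothetical key with a placeholder value; never commit a real token
  HF_API_TOKEN: "replace-me"
```

Using stringData lets the value be written as plain text; the API server stores it base64-encoded under data.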

kustomization.yaml

The Kustomization lists the manifests so that all 6 are applied in order with a single command: k3s kubectl apply -k k8s/. No Helm needed.

kind: Kustomization
resources:
  - namespace.yaml
  - deployment.yaml
  - service.yaml
  - hpa.yaml
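
The excerpt above lists four resources, while the deploy step covers six (namespace, configmap, secret, deployment, service, HPA). A sketch of what the full file might look like, assuming the Secret manifest is stored as secret.yaml (that filename is an assumption):

```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - namespace.yaml
  - configmap.yaml
  - secret.yaml        # assumed filename for the Secret manifest
  - deployment.yaml
  - service.yaml
  - hpa.yaml
```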

How to Run

Install k3s (single command)

curl -sfL https://get.k3s.io | sh -s - --disable traefik

Build + import Docker image

docker build -t ai-advisory:latest .
docker save ai-advisory:latest | k3s ctr images import -
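
Because the image is imported straight into the k3s containerd store rather than pushed to a registry, the pod must not attempt a remote pull. Kubernetes defaults imagePullPolicy to Always for images tagged :latest, so a locally imported image typically needs an explicit policy. A sketch of the relevant container fields, assuming a container named ai-advisory (hypothetical; the actual deployment.yaml may already handle this differently):

```yaml
# Fragment of the Deployment's container spec (illustrative sketch only)
containers:
  - name: ai-advisory                # hypothetical container name
    image: ai-advisory:latest
    imagePullPolicy: IfNotPresent    # use the imported image; Never also works
```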

Deploy to k3s

k3s kubectl apply -k k8s/

Applies all 6 manifests via Kustomize: namespace, configmap, secret, deployment, service, HPA.

Access the app

curl http://localhost:30080/

Tear down

k3s kubectl delete namespace ai-advisory

Why k3s?

k3s is a lightweight, CNCF-certified Kubernetes distribution. It ships as a single binary, runs in roughly 512MB of RAM, and is production-ready, which suits single-node deployments where a full Kubernetes install would be overkill.

~512MB RAM

Fits on t3.small (2GB) alongside the app

Single Binary

One curl command to install, no dependencies

CNCF Certified

100% Kubernetes API compatible — same manifests work on EKS

Built-in Metrics

metrics-server included — HPA works out of the box

Skills Demonstrated

Kubernetes Core

Deployments, Services, Namespaces, ConfigMaps, Secrets

Networking

NodePort routing, nginx reverse proxy, SSL termination

Autoscaling

HPA with CPU metrics, min/max replica control

Health Checks

Readiness + liveness probes with tuned delays for ML model loading

Resource Management

CPU/memory requests and limits per container

Kustomize

Declarative manifest management without Helm complexity
