🚀 Day 08: Essential Kubernetes (K8s) Commands for MLOps Engineers

I am Bittu Sharma, a DevOps & AI Engineer with a keen interest in building intelligent, automated systems. My goal is to bridge the gap between software engineering and data science, ensuring scalable deployments and efficient model operations in production. Let's connect: feel free to DM me on LinkedIn or reach out at bittush9534@gmail.com. I look forward to networking with people across the tech world.
As an MLOps Engineer, you’ll constantly work with Kubernetes to deploy, monitor, and scale ML workloads — from training containers to serving APIs.
Knowing the right kubectl commands saves time, reduces errors, and helps automate your workflows efficiently.
In this blog, we'll cover the most practical kubectl commands, from beginner to advanced, tailored for Machine Learning Operations (MLOps) use cases.
⚙️ 1️⃣ Basic Kubernetes Commands
These commands help you start working with any Kubernetes cluster.
# Check Kubernetes version
kubectl version
# Get cluster info
kubectl cluster-info
# List all nodes in the cluster
kubectl get nodes
# Get detailed info about a node
kubectl describe node <node-name>
# View all namespaces
kubectl get namespaces
# Set the default namespace for the current context
kubectl config set-context --current --namespace=<namespace-name>
🧱 2️⃣ Working with Pods
Pods are the smallest deployable units in Kubernetes.
In ML, they often represent your model training or inference containers.
# List all pods
kubectl get pods
# Get pods in all namespaces
kubectl get pods --all-namespaces
# Describe a specific pod
kubectl describe pod <pod-name>
# View pod logs (useful for ML training logs)
kubectl logs <pod-name>
# Stream logs in real-time
kubectl logs -f <pod-name>
# Open a shell inside a pod, e.g. to check model outputs (use /bin/sh if the image has no bash)
kubectl exec -it <pod-name> -- /bin/bash
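When a model server runs as several replicas, reading logs pod by pod gets tedious. Here is a minimal sketch of a helper that tails logs from every pod with a given label. The function name `tail_app_logs` is made up for this post, and it assumes your pods carry an `app=<name>` label, which is what `kubectl create deployment <name>` sets by default.

```shell
# Tail recent logs from every pod labelled app=<name>.
# Assumption: the pods carry an app=<name> label.
tail_app_logs() {
  local app="$1"
  local lines="${2:-100}"   # number of lines per container, default 100
  kubectl logs -l "app=${app}" --all-containers=true --prefix=true --tail="${lines}"
}

# Usage (needs a reachable cluster):
#   tail_app_logs ml-api 50
```

The `--prefix` flag puts the pod and container name in front of each line, so you can tell which replica produced it.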
🧩 3️⃣ Deployments and ReplicaSets
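Deployments keep long-running workloads, such as model-serving APIs, running at the desired replica count and make them easy to scale and update. Each Deployment manages its ReplicaSets for you. (One-off training runs are better handled by Jobs, covered in section 7.)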
# List all deployments
kubectl get deployments
# Create a deployment
kubectl create deployment ml-api --image=mlops/serve:latest
# Scale deployment (e.g., 3 replicas for model serving)
kubectl scale deployment ml-api --replicas=3
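# Update deployment with a new image (format: <container-name>=<image>;
# "kubectl create deployment" names the container after the image, here "serve")
kubectl set image deployment/ml-api serve=mlops/serve:v2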
# Rollback deployment
kubectl rollout undo deployment/ml-api
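In practice you usually want the update and the rollback tied together: push the new image, wait for the rollout, and undo it if the new pods never become ready. The sketch below is one way to do that. `rollout_image` is a hypothetical helper name, and the container name you pass is an assumption you should verify with `kubectl get deployment ml-api -o jsonpath='{.spec.template.spec.containers[*].name}'`.

```shell
# Roll out a new image and undo it automatically if the rollout stalls.
# Arguments: deployment name, container name, new image, optional timeout.
rollout_image() {
  local deploy="$1"
  local container="$2"
  local image="$3"
  local timeout="${4:-120s}"

  kubectl set image "deployment/${deploy}" "${container}=${image}" || return 1

  # "rollout status" blocks until the rollout finishes or the timeout expires.
  if kubectl rollout status "deployment/${deploy}" --timeout="${timeout}"; then
    echo "rollout of ${image} succeeded"
  else
    echo "rollout of ${image} failed, rolling back" >&2
    kubectl rollout undo "deployment/${deploy}"
    return 1
  fi
}

# Usage (needs a reachable cluster):
#   rollout_image ml-api serve mlops/serve:v2 180s
```

You can review earlier revisions at any time with `kubectl rollout history deployment/ml-api`.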
🌐 4️⃣ Services and Networking
Services expose your ML model to internal or external traffic.
# List all services
kubectl get svc
# Expose a deployment as a service
kubectl expose deployment ml-api --type=LoadBalancer --port=80 --target-port=5000
# Describe a service
kubectl describe svc ml-api
# Get the external IP of a LoadBalancer service
kubectl get svc ml-api -o wide
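If a script needs the endpoint (for a smoke test against the model API, say), parsing the table output is fragile. A small sketch using `-o jsonpath` is shown below. The name `service_address` is made up for illustration. It checks both the `ip` and `hostname` fields because some cloud load balancers report a hostname instead of an IP.

```shell
# Print the external address of a LoadBalancer service, or "pending".
service_address() {
  local svc="$1"
  local ip host
  ip="$(kubectl get svc "${svc}" -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null)"
  host="$(kubectl get svc "${svc}" -o jsonpath='{.status.loadBalancer.ingress[0].hostname}' 2>/dev/null)"

  if [ -n "${ip}" ]; then
    echo "${ip}"
  elif [ -n "${host}" ]; then
    echo "${host}"
  else
    # The cloud provider has not assigned an address yet.
    echo "pending"
    return 1
  fi
}

# Usage (needs a reachable cluster):
#   until addr="$(service_address ml-api)"; do sleep 5; done
#   echo "model API reachable at http://${addr}"
```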
⚡ 5️⃣ ConfigMaps and Secrets
These are essential for ML configurations — like data paths, credentials, or API keys.
# Create a ConfigMap
kubectl create configmap ml-config --from-literal=DATA_PATH=/data
# View ConfigMaps
kubectl get configmaps
# Create a Secret (for API keys, DB credentials)
kubectl create secret generic ml-secret --from-literal=API_KEY=abcd1234
# Describe a Secret (shows key names and sizes, not the values)
kubectl describe secret ml-secret
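Keep in mind that Secret values are stored base64-encoded, which is an encoding and not encryption. To read a value back while debugging, you can decode it yourself. This is a minimal sketch; `read_secret_key` is a hypothetical helper name, and it assumes the key contains no dots (dotted keys need escaping in jsonpath).

```shell
# Print the decoded value of one key in a Secret.
# Assumption: the key name has no dots in it.
read_secret_key() {
  local secret="$1"
  local key="$2"
  kubectl get secret "${secret}" -o "jsonpath={.data.${key}}" | base64 --decode
}

# Usage (needs a reachable cluster):
#   read_secret_key ml-secret API_KEY
```

Anyone who can run this command can read your credentials, so restrict `get` access to Secrets with RBAC.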
📈 6️⃣ Autoscaling and Resource Management
Kubernetes makes ML model scaling easy with the Horizontal Pod Autoscaler.
# Apply a manifest that sets CPU/memory requests and limits
# (CPU-based autoscaling needs CPU requests on the containers)
kubectl apply -f ml-deployment.yaml
# Set up autoscaling based on CPU
kubectl autoscale deployment ml-api --min=2 --max=10 --cpu-percent=80
# Check current autoscalers
kubectl get hpa
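The CPU percentage used by the autoscaler is measured against each container's CPU request, so the Deployment manifest has to declare one. Here is a sketch of what `ml-deployment.yaml` could look like, written from a shell helper so it can be regenerated. The helper name `write_ml_deployment`, the container name `serve`, and the request and limit numbers are illustrative assumptions; port 5000 matches the target port used in the Services section.

```shell
# Write an example Deployment manifest with resource requests and limits.
# The resource numbers are placeholders; size them from real measurements.
write_ml_deployment() {
  local out="${1:-ml-deployment.yaml}"
  cat > "${out}" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ml-api
  template:
    metadata:
      labels:
        app: ml-api
    spec:
      containers:
        - name: serve
          image: mlops/serve:latest
          ports:
            - containerPort: 5000
          resources:
            requests:
              cpu: 250m
              memory: 512Mi
            limits:
              cpu: "1"
              memory: 1Gi
EOF
}

# Usage:
#   write_ml_deployment ml-deployment.yaml
#   kubectl apply -f ml-deployment.yaml
```

With a 250m request and an 80% target, the autoscaler adds replicas once average usage per pod goes above roughly 200m CPU.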
🧮 7️⃣ Jobs and CronJobs (For Training ML Models)
Jobs are perfect for running ML training tasks that need to complete once.
CronJobs automate model retraining at intervals.
# Run a one-time training job
kubectl create job ml-train --image=mlops/train:v1
# List jobs
kubectl get jobs
# Create a CronJob for daily retraining
kubectl create cronjob ml-retrain --image=mlops/train:v1 --schedule="0 2 * * *"
# List CronJobs
kubectl get cronjobs
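Two things come up a lot with training workloads: waiting for a Job to finish before the next pipeline step, and triggering a CronJob by hand without waiting for its schedule. The sketch below covers both. The function names are made up for this post, and note one limitation: `kubectl wait --for=condition=complete` keeps waiting until the timeout if the Job fails instead of completing.

```shell
# Start a training Job, wait for it to complete, then show the last log lines.
run_training_job() {
  local name="$1"
  local image="$2"
  local timeout="${3:-3600s}"

  kubectl create job "${name}" --image="${image}" || return 1

  if kubectl wait --for=condition=complete "job/${name}" --timeout="${timeout}"; then
    kubectl logs "job/${name}" --tail=20
  else
    echo "job ${name} did not complete within ${timeout}" >&2
    kubectl describe job "${name}"
    return 1
  fi
}

# Run a CronJob's template right now as a one-off Job.
trigger_cronjob_now() {
  local cronjob="$1"
  # A timestamp suffix keeps the Job name unique across manual runs.
  kubectl create job "${cronjob}-manual-$(date +%s)" --from="cronjob/${cronjob}"
}

# Usage (needs a reachable cluster):
#   run_training_job ml-train mlops/train:v1 2h
#   trigger_cronjob_now ml-retrain
```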
🔍 8️⃣ Monitoring & Debugging
Useful commands for inspecting resources during model deployment or training.
# View events (troubleshooting training pods)
kubectl get events --sort-by=.metadata.creationTimestamp
# Check all resources in current namespace
kubectl get all
# Port-forward a pod to access locally
kubectl port-forward pod/<pod-name> 8080:80
# Force-delete a stuck pod (skips graceful shutdown, so use it as a last resort)
kubectl delete pod <pod-name> --force --grace-period=0
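Before force-deleting anything, it helps to collect the basics about the pod in one go. Here is a rough sketch of a triage helper; `debug_pod` is a hypothetical name. It prints the pod phase, the warning events that mention the pod, and the logs of the previous container instance, which is where the stack trace usually sits after a crash or an OOM kill during training.

```shell
# Collect quick triage info for one pod.
debug_pod() {
  local pod="$1"

  echo "== phase =="
  kubectl get pod "${pod}" -o jsonpath='{.status.phase}'
  echo

  echo "== warning events =="
  kubectl get events \
    --field-selector "involvedObject.name=${pod},type=Warning" \
    --sort-by=.lastTimestamp

  echo "== previous container logs =="
  # --previous only works if the container has restarted at least once.
  kubectl logs "${pod}" --previous --tail=50 || echo "(no previous container)"
}

# Usage (needs a reachable cluster):
#   debug_pod <pod-name>
```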
🧰 9️⃣ Custom Resource Definitions (CRDs) for ML
MLOps platforms like Kubeflow or Seldon Core use CRDs to manage ML pipelines.
# Get all CRDs
kubectl get crds
# Apply a custom resource (e.g., a SeldonDeployment)
kubectl apply -f seldon-deploy.yaml
# Check the status of a custom resource
kubectl describe seldondeployment <name>
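Applying a custom resource fails with a "no matches for kind" error when the CRD behind it is not installed. A small guard, sketched below, makes that failure easier to read. `apply_custom_resource` is a made-up helper, and the CRD name in the usage line is an assumption; confirm the real name on your cluster with `kubectl get crds`.

```shell
# Apply a custom resource only if its CRD is installed.
apply_custom_resource() {
  local crd="$1"    # full CRD name, as listed by "kubectl get crds"
  local file="$2"   # manifest containing the custom resource

  if ! kubectl get crd "${crd}" >/dev/null 2>&1; then
    echo "CRD ${crd} is not installed; install the operator first" >&2
    return 1
  fi
  kubectl apply -f "${file}"
}

# Usage (needs a reachable cluster; the CRD name is an assumption):
#   apply_custom_resource seldondeployments.machinelearning.seldon.io seldon-deploy.yaml
```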
🧱 10️⃣ YAML Management
As an MLOps engineer, you often use YAML manifests to define resources.
# Apply a YAML file
kubectl apply -f deployment.yaml
# Delete the resources defined in a YAML file
kubectl delete -f deployment.yaml
# View the live YAML of a resource
kubectl get deployment ml-api -o yaml
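For manifests that live in Git, it is worth previewing what an apply would change before running it. The sketch below wraps `kubectl diff`, which exits with 0 when nothing differs, 1 when there are differences, and a higher code on errors. The helper name `apply_if_changed` is hypothetical.

```shell
# Show the diff against the cluster and apply only when something changed.
apply_if_changed() {
  local file="$1"
  local rc=0

  kubectl diff -f "${file}" || rc=$?

  case "${rc}" in
    0) echo "no changes in ${file}" ;;
    1) kubectl apply -f "${file}" ;;
    *) echo "kubectl diff failed (exit ${rc})" >&2
       return "${rc}" ;;
  esac
}

# Usage (needs a reachable cluster):
#   apply_if_changed deployment.yaml
```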
🚀 Bonus: Useful Shortcuts
# Delete all pods in the current namespace (pods owned by a Deployment are recreated automatically)
kubectl delete pods --all
# Restart deployment
kubectl rollout restart deployment/ml-api
# Switch context
kubectl config use-context <context-name>
# View current context
kubectl config current-context
🧩 Real-World Use Case
Imagine you have a TensorFlow model deployed as an API using FastAPI.
Here’s what you might do:
Build & push the model Docker image.
Deploy it on Kubernetes using:
kubectl create deployment tf-api --image=mlops/tf-api:v1
kubectl expose deployment tf-api --type=LoadBalancer --port=80 --target-port=8000
Set up autoscaling:
kubectl autoscale deployment tf-api --min=2 --max=10 --cpu-percent=75
Configure model retraining via CronJob:
kubectl create cronjob tf-retrain --image=mlops/train:v2 --schedule="0 0 * * SUN"
💡 Interview Questions
Q1: What’s the difference between a Deployment and a Job in Kubernetes?
Q2: How do you perform rolling updates for ML model APIs?
Q3: How can you autoscale a training pipeline in Kubernetes?
Q4: How do ConfigMaps and Secrets improve ML workflow security?
Q5: What’s the role of CronJobs in MLOps pipelines?
🧭 Conclusion
Kubernetes is the backbone of modern MLOps.
With these commands, you can manage ML model training, deployment, scaling, and monitoring in production environments confidently.
Master these commands — and you’ll be ready to handle real-world ML pipelines at scale!
✍️ Author: Bittu Sharma
DevOps | MLOps | AIOps Engineer passionate about automating AI pipelines.
Keep learning!