Discussion on Horizontal Pod Autoscaler with a demo on local k8s cluster

Photo Credit: Ryo Yoshitake https://unsplash.com/photos/cusz0Bg-5mQ

Algorithm to calculate Replicas

desired_replicas = ceil(current_replicas * (current_value / target_value))

Example 1, scale out:
Target CPU utilization: 60%
Current utilization: 90%
Current pods: 3
Desired pods = ceil(3 * (0.9 / 0.6)) = ceil(4.5) = 5

Example 2, scale in:
Target CPU utilization: 60%
Current utilization: 20%
Current pods: 5
Desired pods = ceil(5 * (0.2 / 0.6)) = ceil(1.67) = 2
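A quick shell sanity check of the arithmetic above (bc prints the raw ratio; read the ceiling off manually):

$ echo "3 * 0.9 / 0.6" | bc -l   # 4.50 -> ceil -> 5 pods
$ echo "5 * 0.2 / 0.6" | bc -l   # 1.67 -> ceil -> 2 pods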

Lab Setup

We use kind to build the lab cluster: one ingress-ready control-plane node with ports 80 and 443 mapped to the host, plus four workers.

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: hpacluster
nodes:
- role: control-plane
  kubeadmConfigPatches:
  - |
    kind: InitConfiguration
    nodeRegistration:
      kubeletExtraArgs:
        node-labels: "ingress-ready=true"
  extraPortMappings:
  - containerPort: 80
    hostPort: 80
    protocol: TCP
  - containerPort: 443
    hostPort: 443
    protocol: TCP
- role: worker
- role: worker
- role: worker
- role: worker
asishs-MacBook-Air:kind$ kind create cluster --config hpa-lab.yaml
Creating cluster "hpacluster" ...
✓ Ensuring node image (kindest/node:v1.21.1) 🖼
✓ Preparing nodes 📦 📦 📦 📦 📦
✓ Writing configuration 📜
✓ Starting control-plane 🕹️
✓ Installing CNI 🔌
✓ Installing StorageClass 💾
✓ Joining worker nodes 🚜
Set kubectl context to "kind-hpacluster"
You can now use your cluster with:
kubectl cluster-info --context kind-hpacluster

Have a nice day! 👋
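A quick check (mine, not part of the original session) that all five nodes joined:

$ kubectl get nodes --context kind-hpacluster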
Next, deploy a simple nginx frontend. Note that resources is left empty for now; this becomes important later.

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: frontend
  name: frontend
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: frontend
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
      - image: nginx
        imagePullPolicy: Always
        name: nginx
        ports:
        - containerPort: 80
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
And a ClusterIP service in front of it:

apiVersion: v1
kind: Service
metadata:
  labels:
    app: frontend
  name: frontend-svc
spec:
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: frontend
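Assuming the Deployment and Service above are saved together as frontend.yaml (the filename is mine), apply them:

$ kubectl apply -f frontend.yaml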
Install the NGINX ingress controller built for kind, and wait for it to become ready:

kubectl -n ingress-nginx apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/static/provider/kind/deploy.yaml
kubectl wait --namespace ingress-nginx \
  --for=condition=ready pod \
  --selector=app.kubernetes.io/component=controller \
  --timeout=90s
Then expose the service through an ingress:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: frontend-ingress
spec:
  rules:
  - http:
      paths:
      - path: /
        backend:
          serviceName: frontend-svc
          servicePort: 80
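Apply it the same way, again with a filename of my choosing:

$ kubectl apply -f frontend-ingress.yaml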
asishs-MacBook-Air:hpa$ kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
frontend-86968456b9-p7nc2   1/1     Running   0          60m
asishs-MacBook-Air:hpa$ kubectl get svc
NAME           TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
frontend-svc   ClusterIP   10.96.161.97   <none>        80/TCP    60m
kubernetes     ClusterIP   10.96.0.1      <none>        443/TCP   61m
asishs-MacBook-Air:hpa$ kubectl get ingress
NAME               CLASS    HOSTS   ADDRESS     PORTS   AGE
frontend-ingress   <none>   *       localhost   80      57m
asishs-MacBook-Air:hpa$ curl -I http://localhost
HTTP/1.1 200 OK
Date: Sun, 20 Jun 2021 16:28:19 GMT
Content-Type: text/html
Content-Length: 612
Connection: keep-alive
Last-Modified: Tue, 25 May 2021 12:28:56 GMT
ETag: "60aced88-264"
Accept-Ranges: bytes
Now create the HPA targeting the frontend deployment. The CPU target is set to an aggressively low 10% so that scaling is easy to trigger in the lab:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: frontend-hpa
  namespace: default
spec:
  minReplicas: 3
  maxReplicas: 10
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: frontend
  targetCPUUtilizationPercentage: 10
asishs-MacBook-Air:hpa$ kubectl apply -f hpa.yaml
horizontalpodautoscaler.autoscaling/frontend-hpa created
asishs-MacBook-Air:hpa$ kubectl get hpa
NAME           REFERENCE             TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
frontend-hpa   Deployment/frontend   <unknown>/10%   3         10        1          29s

The TARGETS column shows <unknown> because nothing in the cluster is serving resource metrics yet, so install metrics-server:
asishs-MacBook-Air:hpa$ kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
asishs-MacBook-Air:hpa$ k top nodes
W0620 22:53:30.277142 49629 top_node.go:119] Using json format to get metrics. Next release will switch to protocol-buffers, switch early by passing --use-protocol-buffers flag
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)

The metrics-server pod logs show why it cannot scrape the kind nodes:

E0620 17:29:41.525715 1 scraper.go:139] "Failed to scrape node" err="Get \"https://172.18.0.4:10250/stats/summary?only_cpu_and_memory=true\": x509: cannot validate certificate for 172.18.0.4 because it doesn't contain any IP SANs" node="hpacluster-worker3"
E0620 17:29:41.534082 1 scraper.go:139] "Failed to scrape node" err="Get \"https://172.18.0.6:10250/stats/summary?only_cpu_and_memory=true\": x509: cannot validate certificate for 172.18.0.6 because it doesn't contain any IP SANs" node="hpacluster-worker4"
kind's kubelet serving certificates do not contain IP SANs, so tell metrics-server to skip TLS verification by adding the --kubelet-insecure-tls flag to its container args:

...
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls
...
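One way to add that flag without hand-editing the manifest (a sketch; kubectl edit works just as well):

$ kubectl -n kube-system patch deployment metrics-server --type=json \
    -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'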
asishs-MacBook-Air:hpa$ kubectl top nodes
W0621 07:24:43.894564 52298 top_node.go:119] Using json format to get metrics. Next release will switch to protocol-buffers, switch early by passing --use-protocol-buffers flag
NAME                       CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
hpacluster-control-plane   184m         4%     573Mi           28%
hpacluster-worker          126m         3%     122Mi           6%
hpacluster-worker2         25m          0%     106Mi           5%
hpacluster-worker3         85m          2%     93Mi            4%
hpacluster-worker4         74m          1%     93Mi            4%
asishs-MacBook-Air:hpa$ kubectl get hpa
NAME           REFERENCE             TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
frontend-hpa   Deployment/frontend   <unknown>/10%   3         10        3          85s
Node metrics now flow, but the HPA target is still <unknown>: the nginx container declares no CPU request, and the HPA computes utilization as a percentage of the request. Update the deployment's pod spec with requests and limits:

spec:
  containers:
  - image: nginx
    imagePullPolicy: Always
    name: nginx
    ports:
    - containerPort: 80
      protocol: TCP
    resources:
      limits:
        cpu: 600m
        memory: 128Mi
      requests:
        cpu: 200m
        memory: 64Mi
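The same resources change can also be applied imperatively (a sketch of the equivalent command):

$ kubectl set resources deployment frontend -c nginx \
    --requests=cpu=200m,memory=64Mi --limits=cpu=600m,memory=128Mi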
asishs-MacBook-Air:hpa$ kubectl get hpa
NAME           REFERENCE             TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
frontend-hpa   Deployment/frontend   <unknown>/10%   3         10        3          9m51s
asishs-MacBook-Air:hpa$ kubectl get hpa
NAME           REFERENCE             TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
frontend-hpa   Deployment/frontend   0%/10%    3         10        3          10m
Now, start the traffic to the service with ApacheBench:
asishs-MacBook-Air:kind$ ab -n 1000000 -c 100 http://localhost/
This is ApacheBench, Version 2.3 <$Revision: 1879490 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking localhost (be patient)

Server Software:
Server Hostname: localhost
Server Port: 80
Document Path: /
Document Length: 0 bytes
Concurrency Level: 100
Time taken for tests: 224.176 seconds
Complete requests: 98073
Failed requests: 0
Total transferred: 0 bytes
HTML transferred: 0 bytes
Requests per second: 437.48 [#/sec] (mean)
Time per request: 228.581 [ms] (mean)
Time per request: 2.286 [ms] (mean, across all concurrent requests)
Transfer rate: 0.00 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 1 129.0 0 19662
Processing: 0 1 4.3 1 544
Waiting: 0 0 0.0 0 0
Total: 0 2 129.1 1 19663
Percentage of the requests served within a certain time (ms)
50% 1
66% 1
75% 1
80% 1
90% 1
95% 1
98% 2
99% 2
100% 19663 (longest request)
While the benchmark runs, the pods' CPU usage climbs:

asishs-MacBook-Air:hpa$ kubectl top pods
W0621 09:56:37.103195 53592 top_pod.go:140] Using json format to get metrics. Next release will switch to protocol-buffers, switch early by passing --use-protocol-buffers flag
NAME                       CPU(cores)   MEMORY(bytes)
frontend-78764b4d8-5k5ln   0m           1Mi
frontend-78764b4d8-fzmsd   0m           1Mi
frontend-78764b4d8-grdjr   0m           1Mi
asishs-MacBook-Air:hpa$ kubectl top pods
W0621 09:56:48.363569 53619 top_pod.go:140] Using json format to get metrics. Next release will switch to protocol-buffers, switch early by passing --use-protocol-buffers flag
NAME                       CPU(cores)   MEMORY(bytes)
frontend-78764b4d8-5k5ln   38m          2Mi
frontend-78764b4d8-fzmsd   35m          3Mi
frontend-78764b4d8-grdjr   73m          1Mi
And watching the HPA shows the scale-out kicking in, to 6 and then 8 replicas:

asishs-MacBook-Air:hpa$ kubectl get hpa
NAME           REFERENCE             TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
frontend-hpa   Deployment/frontend   0%/10%    3         10        3          138m
asishs-MacBook-Air:hpa$ k get hpa -w
NAME           REFERENCE             TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
frontend-hpa   Deployment/frontend   0%/10%    3         10        3          138m
frontend-hpa   Deployment/frontend   3%/10%    3         10        3          138m
frontend-hpa   Deployment/frontend   25%/10%   3         10        3          138m
frontend-hpa   Deployment/frontend   4%/10%    3         10        6          138m
frontend-hpa   Deployment/frontend   0%/10%    3         10        8          138m
frontend-hpa   Deployment/frontend   0%/10%    3         10        8          139m
asishs-MacBook-Air:hpa$ kubectl get events
LAST SEEN   TYPE     REASON              OBJECT                MESSAGE
26m         Normal   ScalingReplicaSet   deployment/frontend   Scaled up replica set frontend-78764b4d8 to 6
26m         Normal   ScalingReplicaSet   deployment/frontend   Scaled up replica set frontend-78764b4d8 to 8
Once the traffic stops, the HPA eventually scales back in:

asishs-MacBook-Air:hpa$ k get hpa -w
NAME           REFERENCE             TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
frontend-hpa   Deployment/frontend   0%/10%    3         10        8          145m
frontend-hpa   Deployment/frontend   0%/10%    3         10        3
asishs-MacBook-Air:hpa$ kubectl get events
LAST SEEN   TYPE     REASON              OBJECT                MESSAGE
2m51s       Normal   ScalingReplicaSet   deployment/frontend   Scaled up replica set frontend-78764b4d8 to 6
2m36s       Normal   ScalingReplicaSet   deployment/frontend   Scaled up replica set frontend-78764b4d8 to 8
4m6s        Normal   ScalingReplicaSet   deployment/frontend   Scaled down replica set frontend-78764b4d8 to 3
The timings above reflect the HPA controller's default behavior:
  1. 30 seconds as the interval between metrics checks
  2. 3 minutes of cooldown for scale-out operations
  3. 5 minutes of cooldown for scale-in operations

HPA thrashing

  • If the HPA monitored the deployment and reacted to every metric change immediately, it would thrash the service, adding and removing pods in quick succession and destabilizing it.
  • We need to find a balance where the cluster responds to a trend in the metrics, not to every momentary reading.
  • We want to scale out fairly quickly to handle spikes, and scale in a bit more slowly.
  • This is accomplished with “cool down” periods: delays between two scale-out or scale-in operations that give the cluster a chance to stabilize while honoring other in-flight scaling operations, as the sketch below shows.
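On clusters that serve the autoscaling/v2beta2 API (available on the v1.21 cluster used here), these cool down periods can be tuned per HPA through the behavior field. A minimal sketch, reusing the frontend target from this demo (the window values shown are the upstream defaults, made explicit):

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: frontend-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: frontend
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 10
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0    # react to spikes right away
    scaleDown:
      stabilizationWindowSeconds: 300  # wait 5 minutes of low readings before scaling in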

Best Practices

  • Resource requests must be specified on the pods. The HPA computes CPU utilization as a percentage of the container's request, so without requests it cannot act (as we saw with the <unknown> target above).
  • The minimum replica count should be calculated properly and set explicitly.
  • If your application needs to scale on a metric other than CPU, dig into custom metrics and use those, for example by integrating a solution like Prometheus (see the sketch after this list).
  • Remember that your application takes its own sweet time to start up (consider its liveness probe, for example), so autoscaling is not immediate; scaling out can take several minutes. Give a suitable buffer so your application can absorb sudden spikes in traffic.
  • If the cluster itself cannot handle the load, consider vertically scaling the nodes or adding nodes via the cluster autoscaler.
  • Your application should be stateless, with short requests and no coupling between requests.
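If a metrics adapter such as prometheus-adapter exposes application metrics through the custom metrics API, an autoscaling/v2beta2 HPA can target them. A sketch, with a hypothetical metric name:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: frontend-hpa-custom
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: frontend
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second   # hypothetical metric served by the adapter
      target:
        type: AverageValue
        averageValue: "100"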

Conclusion

The HPA is a simple but effective primitive: give it a metric target and a metrics source, and it adjusts the replica count using the formula we started with. In this lab we went through the whole loop on a local kind cluster: installing metrics-server (with the --kubelet-insecure-tls workaround), setting resource requests so utilization can be computed, creating the HPA, and watching it scale the frontend out under ApacheBench load and back in once the load stopped.
