February 23, 2020

Set Up an HPA (Horizontal Pod Autoscaler)

The Horizontal Pod Autoscaler automatically scales the number of pods in a replication controller, deployment, replica set or stateful set based on observed CPU utilization (or, with custom metrics support, on some other application-provided metrics).

In prior examples we've created deployments that push container images out with a Replicas value that determines how many copies of a pod you want running. But in the real world things are not always black and white. Sometimes you'll want to start a deployment with the same number of replicas as you have data-center zones, so that you have fault tolerance in case of a zone outage at your cloud or on-prem data-center provider. Beyond that baseline, load changes over time, and the promise of Kubernetes (k8s) is that it manages the tedious things, like adjusting replica counts, that you don't have the time or money to invest in.

One of the neat things you can do with a deployment is to create an HPA (Horizontal Pod Autoscaler). This allows you to set rules for when a deployment scales its number of pods up or down.

Let's create an autoscaler for our hello-minikube deployment:

PS ...\kubernetes\kubernetesTraining> kubectl autoscale deployment/hello-minikube --min=3 --max=10
horizontalpodautoscaler.autoscaling/hello-minikube autoscaled
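
If you would rather keep the autoscaler in source control than create it with a one-off command, the same settings can be written as a manifest. The sketch below builds an autoscaling/v1 HorizontalPodAutoscaler object in Python and prints it as JSON, which kubectl also accepts. The helper name hpa_manifest and the 50% CPU target are assumptions made for illustration; the min and max values mirror the command above.

```python
import json


def hpa_manifest(name, min_replicas, max_replicas, target_cpu_percent):
    """Build an autoscaling/v1 HPA manifest for a Deployment of the same name.

    hpa_manifest is a hypothetical helper for this example, not part of
    any Kubernetes library.
    """
    return {
        "apiVersion": "autoscaling/v1",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            # The workload this autoscaler controls.
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": name,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            # Assumed target; the kubectl command above did not set one.
            "targetCPUUtilizationPercentage": target_cpu_percent,
        },
    }


manifest = hpa_manifest("hello-minikube", 3, 10, 50)
print(json.dumps(manifest, indent=2))
```

Saving the output to a file (for example hello-minikube-hpa.json, a name chosen here) and running kubectl apply -f on it should create an equivalent autoscaler in place of the kubectl autoscale command.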

Now that we've created an HPA, let's make sure it was created properly:

PS ...\kubernetes\kubernetesTraining> kubectl get hpa
NAME             REFERENCE                   TARGETS        MINPODS   MAXPODS   REPLICAS   AGE
hello-minikube   Deployment/hello-minikube   <unknown>/1%   3         10        3          22s

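We didn't pass in a --cpu-percent value, so Kubernetes (k8s) is going to manage scaling based on its default autoscaling policy. The <unknown> in the TARGETS column means the HPA has not received CPU metrics for the pods yet. The Kubernetes documentation on the Horizontal Pod Autoscaler covers those defaults in more detail.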
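
To get a feel for how the autoscaler picks a replica count, here is a simplified sketch of the calculation described in the Kubernetes documentation: scale the current replica count by the ratio of observed to target utilization, round up, and clamp the result to the min and max we configured. The function name desired_replicas and the 50% target are assumptions for illustration. The real controller also handles missing metrics, pods that are not ready, and stabilization windows, none of which are modeled here.

```python
import math


def desired_replicas(current_replicas, current_cpu_percent, target_cpu_percent,
                     min_replicas=3, max_replicas=10, tolerance=0.1):
    """Simplified HPA replica calculation (illustrative, not the controller code)."""
    ratio = current_cpu_percent / target_cpu_percent
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: leave the replica count alone.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Never go below --min or above --max.
    return max(min_replicas, min(max_replicas, desired))


# 3 pods averaging 100% CPU against an assumed 50% target: scale up.
print(desired_replicas(3, 100, 50))   # 6
# 6 pods averaging 200% CPU: the raw answer is 24, capped at --max=10.
print(desired_replicas(6, 200, 50))   # 10
# 3 pods averaging 10% CPU: the raw answer is 1, held at --min=3.
print(desired_replicas(3, 10, 50))    # 3
```

The last case is why our hello-minikube deployment stays at 3 replicas even when idle: the --min value acts as a floor.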

Now we have the basics of how to create an HPA, so we can make sure our deployments and workloads scale as needed without any human intervention.