Set up an HPA (Horizontal Pod Autoscaler)
The Horizontal Pod Autoscaler automatically scales the number of pods in a replication controller, deployment, replica set or stateful set based on observed CPU utilization (or, with custom metrics support, on some other application-provided metrics).
In prior examples we've created deployments that let us push container images out with a replicas value that determines how many copies of a pod we want running. But in the real world things are not always black and white. Sometimes you'll want to start a deployment with the same number of replicas as you have data-center zones, so that you have fault tolerance in case of a zone outage at your cloud or on-prem data-center provider. But the promise of Kubernetes (k8s) is that it manages the tedious things that you don't have the time or money to invest in.
One of the neat things you can do with a deployment is to create an HPA (Horizontal Pod Autoscaler). This lets you set rules for when a deployment scales its number of pods up or down.
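To make the scaling rule concrete, here is a rough sketch of the calculation the Kubernetes documentation describes for the HPA: desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric), clamped to the configured minimum and maximum. The desired_replicas helper below is hypothetical and exists only for illustration; it is written for a POSIX shell rather than PowerShell, and it leaves out the tolerance and stabilization behaviour of the real controller.

```shell
# Illustrative only: approximates the documented HPA scaling rule,
#   desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)
# and then clamps the result to the min/max bounds.
# desired_replicas is a hypothetical helper, not a kubectl command.
desired_replicas() {
  current=$1   # replicas running now
  metric=$2    # observed average CPU utilization, in percent
  target=$3    # target CPU utilization, in percent
  min=$4       # lower bound (the --min flag)
  max=$5       # upper bound (the --max flag)

  # Integer ceiling division: ceil(a / b) == (a + b - 1) / b
  desired=$(( (current * metric + target - 1) / target ))

  # Never go below the minimum or above the maximum.
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi

  echo "$desired"
}

desired_replicas 3 90 50 3 10   # 3 pods at 90% CPU, 50% target: prints 6
desired_replicas 3 10 50 3 10   # load drops to 10%: held at the minimum, prints 3
```

The second call shows why the minimum matters: the raw formula would ask for a single pod, but the lower bound keeps three running.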
Let's create an autoscaler for our hello-minikube deployment:
PS ...\kubernetes\kubernetesTraining> kubectl autoscale deployment/hello-minikube --min=3 --max=10
horizontalpodautoscaler.autoscaling/hello-minikube autoscaled
Now that we've created an HPA, let's make sure it was created properly:
PS ...\kubernetes\kubernetesTraining> kubectl get hpa
NAME             REFERENCE                   TARGETS        MINPODS   MAXPODS   REPLICAS   AGE
hello-minikube   Deployment/hello-minikube   <unknown>/1%   3         10        3          22s
We didn't pass in a --cpu-percent value, so Kubernetes (k8s) will manage scaling based on its default autoscaling policy. Read more about those defaults here.
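To set the CPU target yourself, the same command accepts --cpu-percent. The following is a minimal sketch, assuming the hello-minikube HPA created earlier is deleted first (kubectl autoscale will not overwrite an existing HPA of the same name); the value 50 is an arbitrary example target, not one taken from this tutorial.

```shell
# Remove the HPA created earlier so it can be recreated with an explicit target.
kubectl delete hpa hello-minikube

# Recreate it with a CPU target; 50 is an illustrative value only
# (average utilization as a percentage of the pods' requested CPU).
kubectl autoscale deployment/hello-minikube --min=3 --max=10 --cpu-percent=50
```

If TARGETS still shows <unknown> afterwards, the HPA most likely cannot read CPU metrics yet, which typically happens when no metrics server is running in the cluster or the pods do not declare CPU requests.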
Now we have the basics of how to create an HPA, so we can make sure our deployments and workloads scale as needed without any human intervention.