Horizontal Pod Autoscaling in Kubernetes based on External Metrics, using Prometheus Adapter

Orchestrators manage modern-day microservice architectures. Kubernetes is one of them, providing benefits such as resource optimization, minimal or zero-downtime deployments, reliability, and auto-scaling, to name a few. Auto-scaling solutions are feedback loops driven by specific metrics, typically traffic throughput or resource utilization (CPU/memory) of the services. These metrics live inside the cluster and are monitored to make auto-scaling decisions, but what about external metrics? This blog covers both kinds of metrics for deploying an auto-scaling solution that we used in production for a client.

One of our clients was using a Redis server that was outside of the Kubernetes cluster. We had to collect metrics from the Redis queues and auto-scale the Pods based on a threshold.

What is Horizontal Pod Autoscaling (HPA)?

Kubernetes is inherently scalable, providing a number of tools that allow both applications and infrastructure to scale up and down depending on demand, efficiency, and various other metrics. What I’m going to discuss in this article is one such feature that allows the user to horizontally scale Pods based on certain metrics, which can either be provided by Kubernetes itself or be custom metrics generated by the user.

The Horizontal Pod Autoscaler automatically scales the number of Pods in a replication controller, deployment, replica set or stateful set based on some metrics. It is implemented as a Kubernetes API resource and a controller.

The HPA controller retrieves metrics from a series of APIs, which include:

  • metrics.k8s.io API for resource metrics
    • These include metrics like CPU/memory usage of a Pod.
  • custom.metrics.k8s.io API for custom metrics
    • These can be defined by using operators and are generated from within the cluster, for example Prometheus Operator.
  • external.metrics.k8s.io API for external metrics
    • These metrics originate from outside the Kubernetes cluster, for example the number of pending jobs in an external Redis queue, and have to be made available to the cluster so that the HPA controller can monitor them.
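
A quick way to check which of these metric APIs are actually registered in a cluster (assuming kubectl access) is to list the available API group versions:

kubectl api-versions | grep metrics.k8s.io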

How are we going to implement HPA?

For this article, we will be using the Prometheus Adapter in order to have the Prometheus metric available to the Kubernetes cluster as an external metric.

The following steps outline how HPA can be implemented in the cluster:

  • There will be an application running in the cluster, which connects to the external Redis service to pick up the next job from the queue.
  • After picking up the job from the queue, the application will send the number of pending jobs remaining in the queue to StatsD as a gauge metric.
  • The external Prometheus will scrape StatsD and now has the metric.
    • For example, the metric that will be used to trigger the autoscaling event looks roughly like this in Prometheus (the sample value shown below is illustrative):
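# Metric as stored in Prometheus; the value (10 here) represents the number of pending jobs in the queue
trigger_prod_hpa{instance="sh119.global.temp.domains/~onetwoni",job="trigger_prod_hpa"} 10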
  • Now, you can deploy the Prometheus Adapter in the cluster to query the external Prometheus and expose the metric to the cluster via the external metric API.
  • The manifests for deploying the Prometheus Adapter can be found here.
    • The following changes are to be made to the Prometheus Adapter’s manifests:
      • Update the URL for the Prometheus service in the deploy/manifests/custom-metrics-apiserver-deployment.yaml
      • Update the deploy/manifests/custom-metrics-config-map.yaml with the correct rule for querying Prometheus.
        • For example, our metric is named trigger_prod_hpa, which has the labels {instance="sh119.global.temp.domains/~onetwoni",job="trigger_prod_hpa"}.
        • The corresponding Prometheus Adapter rule for the above metric would be:
externalRules:
  - seriesQuery: 'trigger_prod_hpa{instance="sh119.global.temp.domains/~onetwoni",job="trigger_prod_hpa"}'
    resources:
      overrides:
        namespace: {resource: "namespace"}
        service: {resource: "service"}
    name:
      matches: ^trigger_prod_(.*)$
      as: "trigger_prod_$1"
    metricsQuery: 'trigger_prod_hpa{instance="sh119.global.temp.domains/~onetwoni",job="trigger_prod_hpa"}'
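  • In the rule above, seriesQuery tells the adapter which Prometheus series to discover, the name section rewrites the metric’s name as exposed to the cluster, and metricsQuery is the PromQL query the adapter runs when the HPA asks for the metric’s value.
  • The adapter’s API server is served over TLS, so serving certificates need to be generated and packaged as a Kubernetes secret. One way to do this (assuming a Makefile with a certs target is available, as used in the common Prometheus Adapter walkthroughs) is: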
mkdir custom-metrics-api
touch metrics-ca.key metrics-ca.crt metrics-ca-config.json custom-metrics-api/cm-adapter-serving-certs.yaml
make certs
  • The above commands will generate cm-adapter-serving-certs.yaml, which contains the serving-certificate secret.
  • Copy the generated cm-adapter-serving-certs.yaml to Prometheus Adapter’s deploy/manifests directory.
    • Note: Make sure that the namespace of the generated secret is the same as the namespace for the manifests in Prometheus Adapter’s deploy/manifests.
  • After the adapter has been successfully deployed, we now have to confirm that the adapter configuration is applied correctly:
    • Confirm that the external.metrics.k8s.io API is active and aware of the metric:
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1" | jq .

{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "external.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "trigger_prod_hpa",
      "singularName": "",
      "namespaced": true,
      "kind": "ExternalMetricValueList",
      "verbs": [
        "get"
      ]
    }
  ]
}
  • Next, confirm that the metric’s value from Prometheus is correctly available to the cluster:
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/*/trigger_prod_hpa" | jq .

{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/external.metrics.k8s.io/v1beta1/namespaces/%2A/trigger_prod_hpa"
  },
  "items": [
    {
      "metricName": "trigger_prod_hpa",
      "metricLabels": {
        "__name__": "trigger_prod_hpa",
        "instance": "localhost",
        "job": "trigger_prod_hpa"
      },
      "timestamp": "2020-11-01T04:17:01Z",
      "value": "10"
    }
  ]
}
  • Now that we have confirmed that our external metric is available to the cluster, we are ready to define the HPA configuration which will have:
    • threshold value to trigger the autoscaling event, which is compared against the external metric
    • minimum number of Pods that must be running when the value is below the threshold
    • maximum number of Pods that can be scaled up to when the value crosses the threshold
  • Below is the HPA configuration yaml:
    • hpa.yaml
---
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache-hpa  # Name of the HPA config
  namespace: hpademo    # Namespace of the deployment on which HPA is to be applied
spec:
  scaleTargetRef:
    apiVersion: apps/v1 
    kind: Deployment
    name: php-apache    # Name of the deployment
  minReplicas: 1        # Minimum number of running Pods
  maxReplicas: 5        # Maximum number of Pods that can be scaled 
  metrics:
    - type: External    
      external:
        metricName: trigger_prod_hpa      # Name of the external metric as it is available to the cluster   
        targetValue: "40"                 # Threshold value for the autoscaling to trigger
  • The above configuration can be applied by the following command: kubectl apply -f hpa.yaml
  • We can check the applied HPA configuration by running the following command: kubectl describe hpa -n <namespace>
kubectl describe hpa -n hpademo

Name:                                 php-apache-hpa
Namespace:                            hpademo
Labels:                               <none>
Annotations:                          <none>
CreationTimestamp:                    Sun, 01 Nov 2020 10:06:20 +0530
Reference:                            Deployment/php-apache
Metrics:                              ( current / target )
  "trigger_prod_hpa" (target value):  10 / 40
Min replicas:                         1
Max replicas:                         5
Deployment pods:                      1 current / 1 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    recommended size matches current size
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from external metric trigger_prod_hpa(nil)
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:           <none>

As can be seen above, the HPA configuration has been applied to the cluster and the HPA controller is able to access the external metric correctly. It will compare the value of the external metric against the threshold, and when the value crosses the threshold, it will trigger a scale-up action. Similarly, when the external metric’s value goes below the threshold, the HPA controller will trigger a scale-down action.

The HPA controller computes the desired number of Pods based on the following formula:

desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]
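
For example, with the target value at 40 and the metric at 50 (the values used in the test below), successive evaluations of this formula ramp the replica count up one step at a time, which matches the scale-up events shown later in this article:

desiredReplicas = ceil[1*(50/40)] = ceil(1.25) = 2 replicas
---
desiredReplicas = ceil[2*(50/40)] = ceil(2.5)  = 3 replicas
---
desiredReplicas = ceil[3*(50/40)] = ceil(3.75) = 4 replicas
---
desiredReplicas = ceil[4*(50/40)] = ceil(5)    = 5 replicas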

Test the HPA

In order to test our HPA configuration and make sure that the scaling up/down occurs correctly, we will update the value of the trigger_prod_hpa metric to a value above the threshold.

  • Update the value of the trigger_prod_hpa metric in Prometheus to a value above “40” (the threshold we have set). Let’s set the value to “50”, and after the scale-up event has happened, update it to “30”.
  • As we can see below, the metric value has been updated and the HPA controller has already started scaling up our Pods to the maximum allowed number:
kubectl describe hpa -n hpademo

Name:                                 php-apache-hpa
Namespace:                            hpademo
Labels:                               <none>
Annotations:                          <none>
CreationTimestamp:                    Sun, 01 Nov 2020 10:06:20 +0530
Reference:                            Deployment/php-apache
Metrics:                              ( current / target )
  "trigger_prod_hpa" (target value):  50 / 40
Min replicas:                         1
Max replicas:                         5
Deployment pods:                      5 current / 5 desired
Conditions:
  Type            Status  Reason            Message
  ----            ------  ------            -------
  AbleToScale     True    ReadyForNewScale  recommended size matches current size
  ScalingActive   True    ValidMetricFound  the HPA was able to successfully calculate a replica count from external metric trigger_prod_hpa(nil)
  ScalingLimited  True    TooManyReplicas   the desired replica count is more than the maximum replica count
Events:
  Type    Reason             Age                    From                       Message
  ----    ------             ----                   ----                       -------
  Normal  SuccessfulRescale  3m20s                  horizontal-pod-autoscaler  New size: 1; reason: All metrics below target
  Normal  SuccessfulRescale  2m49s (x2 over 8m56s)  horizontal-pod-autoscaler  New size: 2; reason: external metric trigger_prod_hpa(nil) above target
  Normal  SuccessfulRescale  2m34s (x2 over 8m41s)  horizontal-pod-autoscaler  New size: 3; reason: external metric trigger_prod_hpa(nil) above target
  Normal  SuccessfulRescale  2m18s (x2 over 8m25s)  horizontal-pod-autoscaler  New size: 4; reason: external metric trigger_prod_hpa(nil) above target
  Normal  SuccessfulRescale  2m3s                   horizontal-pod-autoscaler  New size: 5; reason: external metric trigger_prod_hpa(nil) above target
  • When the value of the trigger_prod_hpa metric eventually falls below the threshold, the HPA controller will start scaling down the Pods based on the formula mentioned above:
  • It is possible to track the number of replicas you will end up with after scaling down:
desiredReplicas = ceil[5*(30/40)] = ceil(3.75) = 4 replicas
---
desiredReplicas = ceil[4*(30/40)] = ceil(3)    = 3 replicas
---
desiredReplicas = ceil[3*(30/40)] = ceil(2.25) = 3 replicas

As can be seen, HPA is now maintaining 3 replicas:

kubectl describe hpa -n hpademo

Name:                                 php-apache-hpa
Namespace:                            hpademo
Labels:                               <none>
Annotations:                          <none>
CreationTimestamp:                    Sun, 01 Nov 2020 10:06:20 +0530
Reference:                            Deployment/php-apache
Metrics:                              ( current / target )
  "trigger_prod_hpa" (target value):  30 / 40
Min replicas:                         1
Max replicas:                         5
Deployment pods:                      3 current / 3 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    recommended size matches current size
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from external metric trigger_prod_hpa(nil)
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:
  Type    Reason             Age                From                       Message
  ----    ------             ----               ----                       -------
  Normal  SuccessfulRescale  37m                horizontal-pod-autoscaler  New size: 1; reason: All metrics below target
  Normal  SuccessfulRescale  36m (x2 over 42m)  horizontal-pod-autoscaler  New size: 2; reason: external metric trigger_prod_hpa(nil) above target
  Normal  SuccessfulRescale  36m (x2 over 42m)  horizontal-pod-autoscaler  New size: 3; reason: external metric trigger_prod_hpa(nil) above target
  Normal  SuccessfulRescale  36m (x2 over 42m)  horizontal-pod-autoscaler  New size: 4; reason: external metric trigger_prod_hpa(nil) above target
  Normal  SuccessfulRescale  36m                horizontal-pod-autoscaler  New size: 5; reason: external metric trigger_prod_hpa(nil) above target
  Normal  SuccessfulRescale  28m                horizontal-pod-autoscaler  New size: 4; reason: All metrics below target
  Normal  SuccessfulRescale  23m                horizontal-pod-autoscaler  New size: 3; reason: All metrics below target

Conclusion

  • We were able to use the Horizontal Pod Autoscaling feature of Kubernetes by ingesting external metrics into the cluster, defining an HPA configuration, and letting Kubernetes auto-scale the Pods.
  • Always set the maximum and minimum number of replicas carefully. There can be situations where the maximum replicas are not enough, or the minimum replicas are more than desired.
Akshay Srivastava