Why is HPA request-based?

  • In the current implementation, the targetCPUUtilization and targetMemoryUtilization values specified in an HPA are percentages of requests.cpu and requests.memory, respectively. A detailed example can be found here.
  • There have been several upstream discussions about configuring the HPA algorithm to be based on limits instead of requests.
  • However, one such issue is still open because there is no agreement on which base (limits or requests) would be more efficient.
  • When we think of a utilization percentage in the usual sense, we tend to assume that the target must be less than or equal to 100%.
  • However, this is *not* necessary for HPA. I was a bit confused about the “why” and “how” of this part myself, but reading through multiple discussions helped me understand it.
  • Because HPA currently uses resources.requests as the base for calculating and comparing resource utilization, setting a target above 100% should not cause any problem as long as the resulting threshold (targetUtilization applied to resources.requests) is less than or equal to resources.limits.
  • For example, deploy an application with resources.requests.cpu=200m and resources.limits.cpu="4" for each container, and configure an HPA with targetCPUUtilization=300% for it. Now, each time the average CPU consumption across all application pods goes beyond 300% of 200m (requests.cpu), i.e. 600m, HPA scales the application up by adding new pods. The 600m threshold is well below the 4-CPU limit, so the pods can actually reach it.
  • With the enhancements, the waiting period can be tuned as required. Curious to learn about them, with pseudocode? Check them out here.
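
The 200m / 300% example above can be worked through numerically. The following is a minimal Python sketch, not HPA's actual code: it assumes the replica formula given in the Kubernetes documentation (desired = ceil(current replicas × current utilization / target utilization)) and a default tolerance of 10%. The function names and the sample usage figures are made up for illustration.

```python
import math


def scale_threshold_millicores(request_m, target_percent):
    """Absolute per-pod usage at which the target is hit (target % of the request)."""
    return request_m * target_percent / 100.0


def desired_replicas(current_replicas, avg_usage_m, request_m,
                     target_percent, tolerance=0.1):
    """Sketch of the documented HPA formula; tolerance=0.1 is an assumed default."""
    # Utilization is measured against requests, so it can exceed 100%.
    current_percent = 100.0 * avg_usage_m / request_m
    ratio = current_percent / target_percent
    # Within the tolerance band around 1.0, no scaling is done.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


REQUEST_M = 200      # resources.requests.cpu = 200m
LIMIT_M = 4000       # resources.limits.cpu = "4"
TARGET = 300         # targetCPUUtilization = 300%

threshold = scale_threshold_millicores(REQUEST_M, TARGET)
print(threshold)                      # 600.0 millicores per pod
print(threshold <= LIMIT_M)           # True: the threshold is reachable

# Hypothetical load: 2 pods averaging 900m each (450% of the request).
print(desired_replicas(2, 900, REQUEST_M, TARGET))   # 3
# Hypothetical load: 2 pods averaging exactly 600m each (300%).
print(desired_replicas(2, 600, REQUEST_M, TARGET))   # 2
```

If the threshold came out above the limit (say, a 300% target on a 200m request with a 500m limit), the pods would be throttled before ever reaching it and the HPA would never scale up on CPU, which is why the target has to stay within resources.limits.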