Kubernetes resource requests and limits

When creating resources in a Kubernetes cluster, you may have encountered the following scenarios:

  1. No CPU requests, or very low ones, are specified for the workloads, which means that more pods can "seemingly" run on the same node. During bursts of traffic, the CPU becomes saturated, latency increases, and some of your nodes may even hit a CPU soft lockup.
  2. Similarly, no memory requests, or very low ones, are specified for workloads. Some pods, particularly those running Java business applications, keep restarting even though they run normally in local tests.
  3. In a Kubernetes cluster, workloads are typically not evenly distributed across nodes. Memory, in particular, is often unevenly distributed, which means that some nodes can have much higher memory usage than others. As the de facto standard in container orchestration, Kubernetes ought to have an effective scheduler that ensures an even distribution of resources. But is that really so?

In general, when the above issues strike during traffic spikes and all the machines fail with SSH logins hanging, cluster administrators can do little other than restart the cluster. In this article, we dive deeper into Kubernetes requests and limits, look at the potential issues, and discuss best practices for avoiding them. If you are interested in the underlying mechanisms, you will also find an analysis from the point of view of the source code. I hope this article helps you understand how Kubernetes requests and limits work, and why they might not work as expected.

Concepts of requests and limits

To fully utilize the resources in a Kubernetes cluster and improve scheduling efficiency, Kubernetes uses requests and limits to control how resources are allocated to containers. Each container can have its own requests and limits, specified through the resources.requests and resources.limits fields. In general, requests matter more during scheduling, while limits matter more during runtime enforcement.

resources:
  requests:
    cpu: 50m
    memory: 50Mi
  limits:
    cpu: 100m
    memory: 100Mi

Requests define the minimum amount of resources that a container needs. For example, for a container running a Spring Boot service, the specified memory request must cover the minimum amount of memory that the Java Virtual Machine (JVM) in the container image consumes. If you specify a memory request that is too low, the Kubernetes scheduler may well place the pod on a node that does not have enough free memory to run the JVM. The JVM initialization process then fails to allocate the memory it needs, and the pod keeps restarting.

Limits, on the other hand, determine the maximum amount of resources a container can use, preventing resource starvation or machine crashes caused by excessive consumption. A limit set to 0 means that there is no resource limit for the container. Notably, if you define limits without specifying requests, Kubernetes considers the value of requests to be the same as that of limits by default.
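For illustration, here is a minimal pod spec that sets only limits; the pod name and image are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: limits-only-demo       # illustrative name
spec:
  containers:
  - name: app
    image: nginx               # illustrative image
    resources:
      limits:
        cpu: 100m
        memory: 100Mi
      # No requests specified: Kubernetes defaults requests.cpu to 100m
      # and requests.memory to 100Mi, i.e. the same values as the limits.
```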

Kubernetes requests and limits apply to two types of resources: compressible (e.g. CPU) and incompressible (e.g. memory). Setting proper limits is extremely important for incompressible resources.

Here is a brief summary of the requests and limits:

  • If the services in a pod use more CPU than the specified limit, the pod is throttled but not killed. If no limit is set, a pod can use all idle CPU resources on the node.
  • If a pod uses more memory than the specified limit, a container process in the pod is terminated by the OOM killer. In that case, Kubernetes tends to restart the container on the original node or simply create another pod.
  • 0 <= requests <= node allocatable; requests <= limits <= infinity.

Scenario analysis

Now that we've covered the concepts of requests and limits, let's go back to the three scenarios mentioned above.

Scenario 1

First of all, you should know that CPU and memory resources behave completely differently. CPU is compressible; its allocation and management are based on the Completely Fair Scheduler (CFS) and cgroups. Simply put, if a service in a pod uses more CPU than its specified limit, Kubernetes throttles it. For pods with no CPU limit, the CPU available to them gradually shrinks as the node's idle CPU resources are exhausted. In both situations, the pods eventually become unable to process external requests in time, resulting in increased latency and response time.
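To make the throttling mechanism concrete, here is a small sketch of how a millicore CPU limit translates into a CFS quota. This is a simplified version of the kubelet's conversion, not its exact code (the real helper also enforces a minimum quota):

```go
package main

import "fmt"

// cfsQuotaMicros sketches how a CPU limit in millicores maps to the CFS
// quota (cpu.cfs_quota_us) configured for a container, assuming the
// default 100ms CFS period. Simplified; not the kubelet's exact helper.
func cfsQuotaMicros(milliCPU int64) int64 {
	const quotaPeriodMicros = 100000 // default cpu.cfs_period_us (100ms)
	if milliCPU == 0 {
		return 0 // no CPU limit set: no quota is applied
	}
	return milliCPU * quotaPeriodMicros / 1000
}

func main() {
	// A 100m limit allows 10ms of CPU time per 100ms period:
	fmt.Println(cfsQuotaMicros(100)) // 10000
	// A 2-core (2000m) limit allows 200ms per 100ms period (two CPUs' worth):
	fmt.Println(cfsQuotaMicros(2000)) // 200000
}
```

A container that burns through its quota before the period ends sits idle until the next period begins, which is exactly the latency effect described above.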

Scenario 2

Memory, on the other hand, cannot be compressed, and pods cannot share memory resources. This means that allocating new memory will simply fail once the node runs out of memory.


Some processes in a pod require a certain amount of memory at startup. For example, a JVM requests a certain amount of memory when it starts up. If the specified memory request is lower than what the JVM allocates, the pod may be scheduled onto a node without enough free memory; the allocation then fails and the container is OOM-killed. As a result, the pod keeps restarting and failing.
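As a sketch of how to avoid this: a JVM container whose heap is capped at 384 MiB should request noticeably more memory than the heap alone, since the JVM also needs non-heap memory (metaspace, thread stacks, code cache, and so on). All names and values below are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: java-app                    # illustrative name
spec:
  containers:
  - name: app
    image: example/java-app:1.0     # illustrative image
    args: ["-Xmx384m", "-jar", "app.jar"]
    resources:
      requests:
        memory: 512Mi   # headroom above -Xmx for non-heap memory
      limits:
        memory: 768Mi   # hard ceiling; the container is OOM-killed above this
```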

Scenario 3

When a pod is created, Kubernetes must allocate various resources, including CPU and memory, in a balanced and holistic way. The Kubernetes scheduling algorithm weighs a variety of factors, such as NodeResourcesLeastAllocated and pod affinity. The reason memory is often unevenly distributed is that, for typical applications, memory is considered scarcer than other resources.

Also, the Kubernetes scheduler works against the current state of the cluster. In other words, when new pods are created, the scheduler chooses an optimal node for them according to the cluster's resource situation at that moment. This is where potential problems arise, because Kubernetes clusters are highly dynamic. For example, to maintain a node you might cordon and drain it, and all pods running on it are rescheduled onto other nodes. The problem is that these pods are not automatically rescheduled back to the original node after maintenance: once a running pod is bound to a node, Kubernetes will not move it to another node on its own.

Best practices for configuring Kubernetes resource requests and limits

From the analysis above, we know that cluster stability has a direct impact on the performance of your application. Temporary resource shortages are often the root cause of cluster instability, which can lead to application malfunction or even node failure. Here we would like to introduce two ways to improve cluster stability.

First, reserve a certain amount of system resources by editing the kubelet configuration file. This is especially important when dealing with incompressible resources, such as memory or disk space.
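A minimal sketch of such a kubelet configuration might look like this; the reserved amounts are illustrative and should be tuned to the node size:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
systemReserved:            # reserved for OS daemons (sshd, systemd, ...)
  cpu: 500m
  memory: 1Gi
kubeReserved:              # reserved for Kubernetes components (kubelet, runtime)
  cpu: 500m
  memory: 1Gi
evictionHard:              # start evicting pods before the node itself runs dry
  memory.available: "500Mi"
  nodefs.available: "10%"
```

With this in place, the node's allocatable capacity seen by the scheduler is its total capacity minus the reserved amounts and the eviction threshold.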

Second, configure Quality of Service (QoS) classes for pods accordingly. Kubernetes uses QoS classes to determine pod scheduling and eviction priority. Different pods can be assigned different QoS classes: Guaranteed (highest priority), Burstable, and BestEffort (lowest priority).

  • Guaranteed. Every container in the pod, including init containers, must have CPU and memory requests and limits specified, and requests must equal limits.
  • Burstable. At least one container in the pod has a CPU or memory request specified.
  • BestEffort. No container in the pod has any CPU or memory requests or limits specified.
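The rules above can be illustrated with three minimal pod specs (names and images are placeholders):

```yaml
# Guaranteed: requests and limits present and equal for every container.
apiVersion: v1
kind: Pod
metadata:
  name: qos-guaranteed
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests: {cpu: 100m, memory: 128Mi}
      limits:   {cpu: 100m, memory: 128Mi}
---
# Burstable: at least one request set, but not meeting the Guaranteed rule.
apiVersion: v1
kind: Pod
metadata:
  name: qos-burstable
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests: {cpu: 100m, memory: 128Mi}
      limits:   {cpu: 200m, memory: 256Mi}
---
# BestEffort: no requests or limits at all.
apiVersion: v1
kind: Pod
metadata:
  name: qos-besteffort
spec:
  containers:
  - name: app
    image: nginx
```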


Kubelet CPU management policies allow you to set CPU affinity for a specific pod. For more information, see the Kubernetes documentation.

When a node runs out of resources, the cluster first evicts pods with a QoS class of BestEffort, followed by Burstable. In other words, the pods with the lowest priority are evicted first. If you have enough resources, you can give all pods the Guaranteed class. This can be seen as a trade-off between computing resources on one side and performance and stability on the other: you can expect more overhead, but your cluster operates more predictably. At the same time, to improve resource utilization, you can assign the Guaranteed class to pods running business services, and assign Burstable or BestEffort to other services according to their priority.

Next, we use the KubeSphere container platform as an example to see how to correctly configure resources for pods.

Use KubeSphere to allocate resources

As mentioned above, requests and limits are two important building blocks of cluster stability. As one of the leading Kubernetes distributions, KubeSphere offers a concise, clean, and interactive user interface that significantly shortens the Kubernetes learning curve.


Before you start

KubeSphere has a highly functional multi-tenant system for fine-grained access control across different users. In KubeSphere 3.0, you can set requests and limits for namespaces (ResourceQuotas) and containers (LimitRanges). To perform these operations, you need a workspace, a project (that is, a namespace), and a user (ws-admin). For more information, see Create Workspaces, Projects, Users and Roles.

Define resource quotas

  1. On the Overview page of your project, go to Basic Information under Project Settings, and select Edit Quotas from the Manage Project drop-down menu.

    (Screenshot: the Edit Quotas entry in the Manage Project menu)

  2. In the dialog that appears, set requests and limits for your project.

    (Screenshot: the project quota dialog)

    Remember this:

    • The requests or limits you set on this page must be greater than the total requests or limits of all pods in the project.
    • If you create a container in the project without specifying any requests or limits, an error message (recorded in events) appears at creation time.

    After project quotas are set, requests and limits must be specified for all containers created in the project. As the saying goes, "Code is law." Project quotas set a rule that all containers must follow.


    Project quotas in KubeSphere correspond to ResourceQuotas in Kubernetes. In addition to CPU and memory, you can separately set resource quotas for other objects such as Deployments and ConfigMaps. For more information, see Project Quotas.
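Under the hood, such a project quota corresponds to a ResourceQuota object along these lines; the namespace and values are illustrative:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: project-quota
  namespace: demo-project       # illustrative namespace
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 4Gi
    limits.cpu: "4"
    limits.memory: 8Gi
    configmaps: "20"                # quotas can also cap object counts
    count/deployments.apps: "10"
```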

Define default requests and limits

As mentioned above, once project quotas are specified, you must configure requests and limits for the pods accordingly. In fact, in testing and even in production, the values of requests and limits are very close or even identical for most pods. To simplify workload creation, KubeSphere lets users define default container requests and limits in advance. That way, you don't have to define requests and limits every time pods are created.

Follow these steps to define the default requests and limits:


  1. Still on the Basic Information page, click Edit Default Resource Requests from the Manage Project drop-down menu.

  2. In the dialog that appears, set the default container requests and limits.

    (Screenshot: the default container request dialog)


    The default container requests and limits in KubeSphere correspond to LimitRanges in Kubernetes. For more information, see Container Limit Ranges.

  3. When you create workloads later, the requests and limits are pre-filled automatically. For more information about creating workloads in KubeSphere, see the KubeSphere documentation.

    (Screenshot: pre-filled requests and limits during workload creation)
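Under the hood, the default container requests and limits correspond to a LimitRange object along these lines; the namespace and values are illustrative:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: demo-project     # illustrative namespace
spec:
  limits:
  - type: Container
    defaultRequest:           # applied when a container specifies no requests
      cpu: 50m
      memory: 50Mi
    default:                  # applied when a container specifies no limits
      cpu: 100m
      memory: 100Mi
```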

Containers running critical business processes need to handle more traffic than other containers. In reality, there is no one-size-fits-all solution, and you must make careful, case-by-case decisions about the requests and limits of these containers. Think about the following questions:

  1. Are your containers CPU or IO intensive?
  2. Do they need high availability?
  3. What are the upstream and downstream objects of your service?

If you look at container load over time, you may find that it is periodic. In this sense, historical monitoring data serves as an important reference when setting requests and limits. Backed by the Prometheus instance built into the platform, KubeSphere has a powerful and holistic observability system that monitors resources at a granular level. Vertically, it covers data from clusters down to pods. Horizontally, it tracks CPU, memory, network, and storage information. In general, you can set the request based on the average of the historical usage, while the limit should be higher than that average. You may still need to adjust the final values as necessary.
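As a sketch of this sizing heuristic: derive the request from the historical average and the limit from a higher percentile with some headroom. The percentile choice and the 20% headroom factor below are illustrative assumptions, not a KubeSphere API:

```go
package main

import (
	"fmt"
	"sort"
)

// suggestRequestAndLimit sketches the sizing heuristic described above:
// the request tracks the historical average usage, and the limit sits
// above it (here, the 95th percentile plus 20% headroom). These rules
// are illustrative assumptions, not a KubeSphere or Kubernetes API.
func suggestRequestAndLimit(usageMi []float64) (request, limit float64) {
	var sum float64
	for _, u := range usageMi {
		sum += u
	}
	request = sum / float64(len(usageMi))

	sorted := append([]float64(nil), usageMi...)
	sort.Float64s(sorted)
	p95 := sorted[int(0.95*float64(len(sorted)-1))]
	limit = p95 * 1.2
	if limit < request {
		limit = request
	}
	return request, limit
}

func main() {
	// Hypothetical hourly memory usage samples (Mi) pulled from monitoring.
	usage := []float64{210, 230, 250, 240, 260, 500}
	req, lim := suggestRequestAndLimit(usage)
	fmt.Printf("suggested request: %.0fMi, limit: %.0fMi\n", req, lim)
}
```

Note how the single 500Mi spike pulls the average (and thus the request) up only slightly, while the percentile-based limit remains tied to typical load; for spiky workloads you may prefer a higher percentile.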

Source code analysis

Now that you know some best practices for configuring requests and limits, let's dive into the source code.

Requests and scheduling

The following code shows the relationship between a pod's requests and the requests of the containers in the pod.


func computePodResourceRequest(pod *v1.Pod) *preFilterState {
	result := &preFilterState{}
	for _, container := range pod.Spec.Containers {
		result.Add(container.Resources.Requests)
	}
	// take max_resource(sum_pod, any_init_container)
	for _, container := range pod.Spec.InitContainers {
		result.SetMaxResource(container.Resources.Requests)
	}
	// If Overhead is being utilized, add it to the total requests for the pod.
	if pod.Spec.Overhead != nil && utilfeature.DefaultFeatureGate.Enabled(features.PodOverhead) {
		result.Add(pod.Spec.Overhead)
	}
	return result
}

func (f *Fit) PreFilter(ctx context.Context, cycleState *framework.CycleState, pod *v1.Pod) *framework.Status {
	cycleState.Write(preFilterStateKey, computePodResourceRequest(pod))
	return nil
}

func getPreFilterState(cycleState *framework.CycleState) (*preFilterState, error) {
	c, err := cycleState.Read(preFilterStateKey)
	if err != nil {
		// preFilterState doesn't exist, likely PreFilter wasn't invoked.
		return nil, fmt.Errorf("error reading %q from cycleState: %v", preFilterStateKey, err)
	}
	s, ok := c.(*preFilterState)
	if !ok {
		return nil, fmt.Errorf("%+v convert to NodeResourcesFit.preFilterState error", c)
	}
	return s, nil
}

func (f *Fit) Filter(ctx context.Context, cycleState *framework.CycleState, pod *v1.Pod, nodeInfo *framework.NodeInfo) *framework.Status {
	s, err := getPreFilterState(cycleState)
	if err != nil {
		return framework.NewStatus(framework.Error, err.Error())
	}
	insufficientResources := fitsRequest(s, nodeInfo, f.ignoredResources, f.ignoredResourceGroups)
	if len(insufficientResources) != 0 {
		// We will keep all failure reasons.
		failureReasons := make([]string, 0, len(insufficientResources))
		for _, r := range insufficientResources {
			failureReasons = append(failureReasons, r.Reason)
		}
		return framework.NewStatus(framework.Unschedulable, failureReasons...)
	}
	return nil
}

From the code above, you can see that the scheduler calculates the resources needed to schedule the pod. In particular, it computes both the sum of all application container requests and each init container's requests according to the pod specification, and takes the larger of the two. Note that for lightweight virtual machines (for example, Kata Containers), the virtualization overhead itself must also be counted via the pod's Overhead field. In the subsequent Filter phase, all nodes are checked to see whether they can satisfy the pod's requests.


The scheduling process involves several extension points, including PreFilter, Filter, PostFilter, and Score. For more information, see Filtering and Scoring Nodes.

After filtering, if only one node is feasible, the pod is scheduled onto it. If multiple nodes are feasible, the scheduler picks the node with the highest weighted sum of scores. Scoring is driven by a variety of factors, implemented as scheduling plugins that register one or more extension points. Note that the values of requests and limits directly influence the final result of the NodeResourcesLeastAllocated plugin. Here is its source code:

func leastResourceScorer(resToWeightMap resourceToWeightMap) func(resourceToValueMap, resourceToValueMap, bool, int, int) int64 {
	return func(requested, allocable resourceToValueMap, includeVolumes bool, requestedVolumes int, allocatableVolumes int) int64 {
		var nodeScore, weightSum int64
		for resource, weight := range resToWeightMap {
			resourceScore := leastRequestedScore(requested[resource], allocable[resource])
			nodeScore += resourceScore * weight
			weightSum += weight
		}
		return nodeScore / weightSum
	}
}

func leastRequestedScore(requested, capacity int64) int64 {
	if capacity == 0 {
		return 0
	}
	if requested > capacity {
		return 0
	}
	return ((capacity - requested) * int64(framework.MaxNodeScore)) / capacity
}

For NodeResourcesLeastAllocated, a node scores higher when it has more free resources for the same pod. In other words, a pod is more likely to be scheduled onto a node with plenty of unallocated resources.
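The effect of this scorer can be checked with a small standalone reproduction of the formula; maxNodeScore mirrors the framework's MaxNodeScore constant of 100:

```go
package main

import "fmt"

// maxNodeScore mirrors framework.MaxNodeScore (100) from the scheduler.
const maxNodeScore = 100

// leastRequestedScore reproduces the formula from the plugin above:
// the more unused capacity a node has, the higher it scores.
func leastRequestedScore(requested, capacity int64) int64 {
	if capacity == 0 {
		return 0
	}
	if requested > capacity {
		return 0
	}
	return ((capacity - requested) * maxNodeScore) / capacity
}

func main() {
	// A pod requesting 250m CPU on a node with 1000m allocatable:
	fmt.Println(leastRequestedScore(250, 1000)) // 75
	// The same pod on a node with only 500m allocatable scores lower:
	fmt.Println(leastRequestedScore(250, 500)) // 50
}
```

The emptier node wins, which is why setting realistic requests is what actually spreads load across nodes.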

When a pod is created, Kubernetes must allocate different resources, including CPU and memory. Each type of resource has a weight (the resToWeightMap structure in the source code). Together, they tell the Kubernetes scheduler what the best decision may be in order to achieve resource balance. In the Score phase, the scheduler also uses other plugins besides NodeResourcesLeastAllocated, such as InterPodAffinity.

QoS and scheduling

As a resource protection mechanism in Kubernetes, QoS is primarily used to control incompressible resources such as memory. It also affects the OOM scores of different pods and containers. When a node runs out of memory, the kernel's OOM killer terminates low-priority pods first (a higher score means lower priority). Here is the source code:

func GetContainerOOMScoreAdjust(pod *v1.Pod, container *v1.Container, memoryCapacity int64) int {
	if types.IsCriticalPod(pod) {
		// Critical pods should be the last to get killed.
		return guaranteedOOMScoreAdj
	}

	switch v1qos.GetPodQOS(pod) {
	case v1.PodQOSGuaranteed:
		// Guaranteed containers should be the last to get killed.
		return guaranteedOOMScoreAdj
	case v1.PodQOSBestEffort:
		return besteffortOOMScoreAdj
	}

	// Burstable containers are a middle tier, between Guaranteed and Best-Effort. Ideally,
	// we want to protect Burstable containers that consume less memory than requested.
	// The formula below is a heuristic. A container requesting 10% of a system's
	// memory will have an OOM score adjust of 900. If a process in container Y
	// uses over 10% of memory, its OOM score will be 1000. The idea is that containers
	// which use more than their request will have an OOM score of 1000 and will be prime
	// targets for OOM kills.
	// Note that this is a heuristic, it won't work if a container has many small processes.
	memoryRequest := container.Resources.Requests.Memory().Value()
	oomScoreAdjust := 1000 - (1000*memoryRequest)/memoryCapacity
	// A guaranteed pod using 100% of memory can have an OOM score of 10. Ensure
	// that burstable pods have a higher OOM score adjustment.
	if int(oomScoreAdjust) < (1000 + guaranteedOOMScoreAdj) {
		return (1000 + guaranteedOOMScoreAdj)
	}
	// Give burstable pods a higher chance of survival over besteffort pods.
	if int(oomScoreAdjust) == besteffortOOMScoreAdj {
		return int(oomScoreAdjust - 1)
	}
	return int(oomScoreAdjust)
}
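The Burstable branch of this heuristic can be reproduced standalone for a worked example; the two constants mirror the kubelet's fixed QoS adjustments:

```go
package main

import "fmt"

// OOM score adjustments the kubelet uses for the two fixed QoS classes.
const (
	guaranteedOOMScoreAdj = -998
	besteffortOOMScoreAdj = 1000
)

// burstableOOMScoreAdj reproduces the Burstable branch of
// GetContainerOOMScoreAdjust above, as a worked example.
func burstableOOMScoreAdj(memoryRequest, memoryCapacity int64) int {
	adj := 1000 - (1000*memoryRequest)/memoryCapacity
	if int(adj) < 1000+guaranteedOOMScoreAdj {
		// Never let a Burstable container outrank a Guaranteed one.
		return 1000 + guaranteedOOMScoreAdj
	}
	if int(adj) == besteffortOOMScoreAdj {
		// Keep Burstable containers just below BestEffort ones.
		return int(adj) - 1
	}
	return int(adj)
}

func main() {
	// A container requesting 10% of node memory gets an adjustment of 900:
	fmt.Println(burstableOOMScoreAdj(100, 1000)) // 900
	// Requesting almost all node memory is clamped just above Guaranteed:
	fmt.Println(burstableOOMScoreAdj(999, 1000)) // 2
}
```

So the larger a container's memory request relative to the node, the lower its OOM score adjustment, and the longer it survives under memory pressure: another reason to set honest requests.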


As a portable and extensible open-source platform, Kubernetes was built for managing containerized workloads and services. Its extensive and rapidly growing ecosystem has helped secure its position as the de facto standard in container orchestration. However, Kubernetes is not always easy to learn, and that is where KubeSphere comes in. KubeSphere allows users to perform virtually any operation from its dashboard, while also offering a built-in kubectl web terminal for executing commands. This article focused on Kubernetes resource requests and limits, their underlying logic, and how to use KubeSphere to configure them to make your cluster easier to operate and maintain.







