A common risk is deploying Pods without setting a CPU request. While it may seem like a minor, low-severity issue, a missing CPU request can have a big impact, up to and including preventing your Pod from running. In this blog, we explain why missing CPU requests are a risk, how you can detect them using Gremlin, and how you can address them.

Looking for more Kubernetes risks lurking in your system? Grab a copy of our comprehensive ebook, “Kubernetes Reliability at Scale.” 

What are CPU requests and why are they important?

In Kubernetes, you can control how resources are allocated to individual Deployments, Pods, and even containers. When you specify a limit, Kubernetes won't allocate more than that amount to the Pod. Conversely, when you specify a request, you're declaring the minimum amount the Pod needs to run, and the scheduler reserves that amount on whichever node it places the Pod.

Kubernetes measures CPU request values as CPU units, where 1 CPU unit equals 1 physical or virtual CPU core. This value can be fractional: 0.5 is half of one core, 0.1 is one tenth of a core, and so on. Fractional values are commonly written in millicores (thousandths of a core), so 0.5 CPU is equivalent to 500m.
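
For example, here's a minimal container snippet that declares both a request and a limit (the values are illustrative):

YAML

resources:
  requests:
    cpu: '500m'   # scheduler reserves at least half a core for this container
  limits:
    cpu: '1'      # the container is throttled above one full core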

Requests serve two key purposes:

  1. They tell Kubernetes the minimum amount of the resource to allocate to a Pod. This helps Kubernetes determine which node to schedule the Pod on and how to schedule it relative to other Pods.
  2. They protect your nodes from resource shortages by preventing over-allocating Pods on a single node.

Without a request, Kubernetes might schedule a Pod onto a node that doesn't have enough spare capacity for it. Even if the Pod uses a small amount of CPU at first, that amount could increase over time, leading to CPU exhaustion on the node.
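
With requests set, a Pod that doesn't fit on any node stays in the Pending state instead of overloading one. Running kubectl describe pod on such a Pod shows a scheduling event along these lines (abbreviated; the node count and exact wording vary by cluster and Kubernetes version):

SH

Warning  FailedScheduling  default-scheduler  0/3 nodes are available: 3 Insufficient cpu.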

How do I mitigate missing CPU requests?

To mitigate this risk, specify an appropriate resource request for each of your containers using spec.containers[].resources.requests.cpu. If you're not sure what to set as a value, you can get a baseline estimate using this process:

  1. Run your Pod normally.
  2. Collect metrics using the Kubernetes Metrics API, an observability tool, or a cloud platform. An easy way to do this is by running kubectl top pod (see the example after this list). Ideally, you should gather these metrics from a production system for the most accurate results.
  3. Find the CPU usage for your Pod, then use that value as the CPU request amount. You may want to increase this amount to leave some overhead, especially if the Pod wasn't under much load when you measured it.
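
For instance, checking a single Pod might look like this (the Pod name and the numbers shown are hypothetical):

SH

kubectl top pod nginx

The output will look similar to this:

SH

NAME    CPU(cores)   MEMORY(bytes)
nginx   200m         12Mi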

For example, imagine we have a Pod running Nginx that we want to set CPU requests for. After some testing, we determined that the container uses 200m of CPU time. To be safe, we'll request 250m by adding it to our Kubernetes manifest (see the resources section below):

YAML

# nginx-manifest.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
    - name: nginx
      image: nginx:1.25.2
      resources:
        requests:
          cpu: '250m'
      ports:
        - containerPort: 80
        

Then, apply the change and wait for the Pod to restart. (Note that most fields of a running Pod are immutable, so for a standalone Pod like this one you may need to delete and recreate it; a Deployment would roll the change out automatically.)

SH

kubectl apply -f nginx-manifest.yaml
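
Once the Pod is scheduled, you can confirm that it's running:

SH

kubectl get pod nginx

You should see output similar to this (the AGE value is illustrative):

SH

NAME    READY   STATUS    RESTARTS   AGE
nginx   1/1     Running   0          30s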

How do I validate that I'm resilient?

Once your Pod finishes restarting, you can use the Kubernetes Dashboard (or kubectl describe node <node name>) to list each Pod running on the specified node, along with their resource requests and limits. If your CPU request applied successfully, then the Nginx Pod should have a value listed in the "CPU Requests" column:

SH

Non-terminated Pods: (23 in total)
  Namespace                   Name                                        CPU Requests  CPU Limits  Memory Requests   Memory Limits  Age
  ---------                   ----                                        ------------  ----------  ---------------   -------------  ---
  default                     nginx-767687bc57-b4g6w                      250m (3%)     0 (0%)      1Gi (7%)          0 (0%)         143d
  kubevirt                    virt-api-66859f4c8d-4c2pn                   5m (0%)       0 (0%)      500Mi (3%)        0 (0%)         10d
  kube-system                 coredns-59b4f5bbd5-ns25b                    100m (1%)     0 (0%)      70Mi (0%)         170Mi (1%)     124d
  kubevirt                    virt-controller-8545966675-2fjd9            10m (0%)      0 (0%)      275Mi (1%)        0 (0%)         10d
  kubevirt                    virt-operator-6c649b9567-9l7g4              10m (0%)      0 (0%)      450Mi (3%)        0 (0%)         10d
  kubevirt                    virt-handler-g9phl                          10m (0%)      0 (0%)      325Mi (2%)        0 (0%)         10d
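
If you just want to check a single Pod's request directly, kubectl can print it with a JSONPath query (a minimal sketch; this assumes the request is set on the first container):

SH

kubectl get pod nginx -o jsonpath='{.spec.containers[0].resources.requests.cpu}'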

You can also use Gremlin to verify your mitigation. Gremlin's Detected Risks feature immediately detects any high-priority reliability concerns in your environment. These can include misconfigurations, bad default values, or reliability anti-patterns. If you've addressed this risk, then the CPU requests risk will show as "Mitigated" instead of "At Risk".

A more thorough way to validate this is by seeing how Kubernetes responds when the Pod grows beyond its request. For example, what happens when our Pod uses exactly 250m of CPU time? What about 300m? This requires an active approach to testing using a method called fault injection.

Using fault injection to validate your fix

With fault injection, you can consume specific amounts of CPU time within a Pod or container to ensure your Pod doesn't get evicted or moved to a different node. In Gremlin, an ad-hoc fault injection is called an experiment.

To test this scenario:

  1. Log into the Gremlin web app at app.gremlin.com.
  2. Select Experiments in the left-hand menu and select New Experiment.
  3. Select Kubernetes, then select our Nginx Pod.
  4. Expand Choose a Gremlin, select the Resource category, then select the CPU experiment.
  5. Change CPU Capacity to the percentage of CPU we want to consume. We want to use 250m of CPU time, which equates to 1/4 of a single core. In other words, we want to use 25%. In Gremlin, we'll set CPU Capacity to 25 and keep the number of cores set to 1.
  6. Click Run Experiment to start the experiment.

Now, we keep an eye on our Nginx Pod. We should see usage increase above 250m, but the Pod itself should keep running just fine. If it gets evicted or rescheduled instead, that tells us one of several things:

  • We're requesting an unnecessarily high number of CPU units.
  • We don't have enough capacity to run our workloads, and we need to scale our cluster vertically.
  • We're not leaving enough overhead for this Pod to let it grow, and so we should increase our minimum requested CPU.

To monitor the Pod's CPU usage during the experiment, run kubectl top pod again:

SH

kubectl top pod

The output will look similar to this:

SH

NAME                                  CPU(cores)   MEMORY(bytes)
...
frontend-6f7b5f7f88-cn5xr             293m         66Mi
...

What similar risks should I be looking for?

You can use these same methods to test for memory requests. In fact, Gremlin's Detected Risks automatically finds Kubernetes resources that don't have memory requests defined, just like it finds resources without CPU requests. A memory request follows the same pattern as the CPU request above (the memory value in this snippet is illustrative):
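
YAML

resources:
  requests:
    cpu: '250m'
    memory: '128Mi'   # baseline this the same way as CPU, e.g. with kubectl top pod

For a complete list of the most critical Kubernetes risks, download a free copy of our ebook.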

Ready to find out which of your Kubernetes resources are missing CPU request definitions? Sign up for a free 30-day trial, install the Gremlin agent, and get a report of your reliability risks in minutes.

Andre Newman
Sr. Reliability Specialist

Start your free trial

Gremlin's automated reliability platform empowers you to find and fix availability risks before they impact your users. Start finding hidden risks in your systems with a free 30-day trial.
