⚖️ Kubernetes Resource Requests and Limits Explained (with Best Practices)


When running workloads in Kubernetes, one of the most important things to configure is how much CPU and memory a pod can use. Without proper settings, a single greedy pod can starve others, or your nodes may crash under heavy load.

This is where requests and limits come in. Let’s break it down step by step.


🟢 What Are Requests and Limits?

Resource Requests

  • The minimum amount of CPU and memory a container is guaranteed.

  • The scheduler uses these values to decide on which node to place the pod.

  • Example:

      resources:
        requests:
          cpu: "500m"
          memory: "512Mi"
    

    ➝ Pod gets at least 0.5 CPU and 512Mi RAM.


Resource Limits

  • The maximum resources a container can consume.

  • Prevents one pod from hogging all resources.

  • Example:

      resources:
        limits:
          cpu: "1"
          memory: "1Gi"
    

    ➝ Pod can’t use more than 1 CPU and 1Gi RAM.
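
Putting the two stanzas together, a complete Pod spec might look like the sketch below — the pod name and image are placeholders, not from this article:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-app            # hypothetical name
spec:
  containers:
  - name: app
    image: nginx            # placeholder image
    resources:
      requests:
        cpu: "500m"         # guaranteed minimum: 0.5 CPU
        memory: "512Mi"     # guaranteed minimum: 512 MiB
      limits:
        cpu: "1"            # throttled above 1 CPU
        memory: "1Gi"       # OOMKilled above 1 GiB
```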


🖥️ CPU vs 💾 Memory

CPU

  • 1 CPU = 1 vCPU on cloud providers (AWS, Azure, GCP) or 1 hyperthread on bare metal.

  • Can be specified in fractions: 100m = 0.1 CPU.

  • If a pod exceeds its CPU limit, it gets throttled (slowed down), not killed.

Memory

  • Units: Mi (Mebibytes), Gi (Gibibytes).

  • If pod exceeds its memory limit, it is killed (OOMKilled).

  • Memory cannot be throttled like CPU.


⚖️ CPU Scenarios to Know

  1. No requests, no limits

    • Pod can take everything → others may starve. ❌ Not safe.
  2. Limits only (no requests)

    • Kubernetes treats request = limit.

    • Pod is guaranteed exactly that much.

    • Safe, but not flexible.

  3. Requests + Limits set

    • Pod always gets its request.

    • Can burst up to the limit.

    • Balanced, but unused limits may waste resources.

  4. Requests only (no limits)

    • Pod guaranteed its request.

    • Can use more if node has free capacity.

    • Best for most cases, but requires discipline (all pods should set requests).
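
Scenario 4 boils down to a requests stanza with no limits block — the value here is illustrative:

```yaml
resources:
  requests:
    cpu: "250m"   # guaranteed minimum; the pod may burst above this when the node has spare CPU
```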

⚖️ Memory Scenarios to Know

1. No requests, no limits

  • Pod can take all memory on the node if available → may cause other pods or system processes to OOM. ❌ Not safe.

2. Limits only (no requests)

  • Kubernetes treats request = limit.

  • Pod is guaranteed exactly that much memory.

  • Safe, but can waste memory if pod doesn’t need the full limit.

  • If pod exceeds limit → killed (OOMKilled).

3. Requests + Limits set

  • Pod guaranteed its request.

  • Can use memory up to the limit.

  • Balanced, but if pod uses more than limit → killed.

  • Best practice for workloads with predictable memory usage.

4. Requests only (no limits)

  • Pod guaranteed its request.

  • Can use more memory if node has free capacity, but if it grows too much → node might run out of memory, and pod or others may be OOMKilled.

  • Safer than no requests, but requires careful monitoring and discipline.

✅ CPU (requests only)

  • Pod is guaranteed its request (minimum CPU it needs).

  • Can use more CPU if node has spare capacity.

  • CPU is throttled, so even if a pod uses more, it won’t crash others.

  • Pros: Flexible, efficient, lets pods burst when resources are available.

  • Cons: If a pod consumes too much CPU, it may slow down other pods sharing the same node.


⚠️ Memory (requests only)

  • Pod is guaranteed its request (minimum memory).

  • Can use more memory if node has free capacity, but no limit means it can potentially use all memory on the node.

  • Memory cannot be throttled, so if it grows too much → pod or other pods may be OOMKilled.

  • Pros: Flexible if memory usage is predictable and you trust all pods to behave.

  • Cons: Risky if some pods may have memory leaks or high spikes.


💡 Best Practice

  1. Always set requests — ensures pods get guaranteed resources.

  2. Set limits for memory if workload can spike — prevents a single pod from crashing the node.

  3. CPU limits are optional — only needed if you want to prevent noisy neighbors or enforce strict resource isolation.
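
These three rules combine into a pattern like the following sketch (values are illustrative): requests for both resources, a memory limit, and no CPU limit:

```yaml
resources:
  requests:
    cpu: "500m"
    memory: "512Mi"
  limits:
    memory: "1Gi"   # cap memory so a spike can't take down the node; no CPU limit, so bursting is allowed
```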


In short:

  • ✅ CPU: requests only is safe and flexible. ⚠️ Set limits only when necessary → e.g., in multi-tenant clusters or public labs.

  • ⚠️ Memory: requests only is flexible but can be risky → consider setting a reasonable limit.


  • ✅ Use LimitRange at the namespace level to enforce defaults:

apiVersion: v1
kind: LimitRange
metadata:
  name: cpu-mem-defaults
  namespace: dev
spec:
  limits:
  - default:
      cpu: "1"
      memory: "512Mi"
    defaultRequest:
      cpu: "500m"
      memory: "256Mi"
    type: Container
This ensures every pod has a baseline, even if developers forget to specify resources.


📊 Visual Flow: How Scheduling Works

  1. Pod created → Scheduler checks requests.

  2. Scheduler finds a node with enough free resources.

  3. Pod is scheduled onto that node.

  4. At runtime:

    • If pod exceeds CPU limit → throttled.

    • If pod exceeds memory limit → killed.

    • If pod stays within requests → always guaranteed that much.


🚀 Conclusion

  • Requests = Minimum guarantee

  • Limits = Maximum cap

  • Always set requests to prevent resource starvation.

  • Use limits carefully, only when you need strict isolation.

With the right balance, you’ll keep your Kubernetes cluster fair, efficient, and stable.



LimitRange & ResourceQuota

LimitRange (CPU & Memory)

A LimitRange is a namespace-level policy that defines default resource requests and limits for pods and containers if they are not explicitly set. It ensures that no pod in the namespace runs without some resource constraints.

Key points:

  • Applies at namespace level.

  • Can define default requests and limits for CPU and memory.

  • Can define minimum and maximum allowed values for requests and limits.

  • Only affects new pods created after the LimitRange is applied. Existing pods are not affected.

Example:

apiVersion: v1
kind: LimitRange
metadata:
  name: example-limitrange
  namespace: my-namespace
spec:
  limits:
  - type: Container
    default:
      cpu: 500m
      memory: 512Mi
    defaultRequest:
      cpu: 250m
      memory: 256Mi
    max:
      cpu: 1
      memory: 1Gi
    min:
      cpu: 100m
      memory: 128Mi

  • defaultRequest → the request applied if the pod doesn’t specify one (this is what the scheduler guarantees).

  • default → runtime limit if pod doesn’t specify.

  • max → maximum allowed resource (ceiling).

  • min → minimum allowed resource (floor).

Explanation:

  • If a pod does not specify requests/limits, it gets defaultRequest and default.

  • Pods cannot request more than max or less than min.

✅ Ensures fair resource usage and prevents runaway pods.


ResourceQuota

A ResourceQuota is a namespace-level limit on the total resources that all pods/containers together can consume.

Key points:

  • Limits the sum of resources used by all pods in the namespace.

  • Can limit CPU, memory, number of pods, services, persistent volumes, etc.

  • Prevents a single namespace from consuming all cluster resources.

Example:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: example-quota
  namespace: my-namespace
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 4Gi
    limits.cpu: "10"
    limits.memory: 10Gi
    pods: "10"

Explanation:

  • The sum of CPU requests across all pods ≤ 4 CPU.

  • The sum of memory requests across all pods ≤ 4 GiB.

  • The sum of limits across all pods ≤ 10 CPU and 10 GiB memory.

  • Max 10 pods in this namespace.

✅ Ensures overall resource governance at the namespace level.


⚖️ Summary

Object          Scope        What it controls                      Notes
LimitRange      Namespace    Default requests & limits, min/max    Applied to new pods only
ResourceQuota   Namespace    Total resource usage by all pods      Controls aggregate CPU, memory, pod count, etc.

💡 Tip:

  • Use LimitRange to ensure each pod gets sane defaults and to prevent very small or very large pods.

  • Use ResourceQuota to prevent a namespace from consuming all cluster resources.
