Published April 26, 2026 · Comparison
KuberNap vs KEDA scale-to-zero: when to use which
KEDA scales event-driven Kubernetes workloads up from zero on queue depth, HTTP traffic, or custom metrics. KuberNap auto-sleeps idle dev/staging namespaces on activity signals and wakes them via an explicit HTTP call. They solve different problems and stack cleanly in the same cluster.
What KEDA does well
KEDA - Kubernetes Event-driven Autoscaling - is a CNCF graduated project that scales workloads from zero to N replicas based on signals from an external event source, such as queue depth. It ships with more than 70 built-in scalers covering RabbitMQ, Kafka, AWS SQS, Azure Service Bus, Google Pub/Sub, Redis Streams, Prometheus, cron, and many more. You declare a ScaledObject that targets a Deployment, point it at a trigger source, and KEDA's controller polls that source and adjusts the replica count accordingly.
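As a rough illustration of that declaration, the sketch below builds a ScaledObject manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts as well as YAML. The Deployment name `order-worker`, the queue name `orders`, the `RABBITMQ_URL` environment variable, and the threshold values are all hypothetical. The field names follow KEDA's `keda.sh/v1alpha1` ScaledObject and RabbitMQ scaler as we understand them, so verify them against the KEDA documentation for your version.

```python
import json


def scaled_object(deployment: str, queue: str, max_replicas: int = 20) -> dict:
    """Build a KEDA ScaledObject manifest for a RabbitMQ queue consumer.

    All names and numbers passed in or hard-coded here are illustrative.
    """
    return {
        "apiVersion": "keda.sh/v1alpha1",
        "kind": "ScaledObject",
        "metadata": {"name": f"{deployment}-scaler"},
        "spec": {
            # The Deployment whose replica count KEDA should manage.
            "scaleTargetRef": {"name": deployment},
            "minReplicaCount": 0,           # allow scale-to-zero
            "maxReplicaCount": max_replicas,
            "pollingInterval": 30,          # seconds between trigger checks
            "triggers": [
                {
                    "type": "rabbitmq",
                    "metadata": {
                        "queueName": queue,
                        "mode": "QueueLength",
                        "value": "50",      # target messages per replica (assumed)
                        "hostFromEnv": "RABBITMQ_URL",  # hypothetical env var
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(scaled_object("order-worker", "orders"), indent=2))
```

One manifest like this is needed per workload, each with its own trigger settings.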
For event-driven workloads, KEDA is the right answer. A queue consumer with zero pending messages should run zero pods; a workload with 10,000 pending messages should fan out. KEDA models that natively. The KEDA HTTP add-on (HTTP-Add-On / KEDA HTTP scaler) extends the same model to HTTP services, queueing requests at an interceptor while the target deployment scales from zero. This works well for synchronous services where cold-start latency is acceptable.
KEDA also integrates cleanly with HPA - under the hood, a ScaledObject creates an HPA managed by KEDA, so existing CPU/memory-based scaling rules continue to work alongside event-driven triggers.
What KuberNap does that KEDA doesn't
KuberNap targets a different question: not "what should the replica count of this workload be in response to events?" but "should this dev-cluster workload exist at all right now?"
Dev and staging Kubernetes clusters run 24 hours a day, 7 days a week, but engineers actively use them roughly 24% of the time - weekday business hours minus meetings, focus time, and breaks. The remaining ~76% of hours are pure idle waste. The workloads in those clusters are rarely event-driven; they are HTTP APIs, internal tools, databases, and supporting microservices that exist by default and are accessed by exception.
KuberNap auto-detects per-deployment idleness using a composite score (40% CPU usage from metrics-server, 40% recent traffic via pod restart frequency, 20% pod age) and scales sleep candidates to zero replicas. State lives in kubernap.com/* annotations on the workloads themselves - no external database, no CRDs, no webhooks. Wake is a single HTTP POST that restores the saved replica count.
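To make the weighting concrete, here is a minimal sketch of a composite score using the 40/40/20 split quoted above. It is not KuberNap's implementation: the normalization of each signal to a 0-1 "idleness" value, the `Signals` type, and the `0.8` threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Weights quoted in the text: 40% CPU, 40% traffic, 20% pod age.
WEIGHTS = {"cpu": 0.4, "traffic": 0.4, "age": 0.2}


@dataclass
class Signals:
    """Per-deployment signals, each pre-normalized to [0, 1] (assumed)."""
    cpu_idle: float      # 0.0 = busy, 1.0 = no measurable CPU use
    traffic_idle: float  # 0.0 = recent activity, 1.0 = none observed
    age_idle: float      # 0.0 = freshly started pods, 1.0 = long-running pods


def idle_score(s: Signals) -> float:
    """Weighted sum of the three idleness signals, clamped to [0, 1]."""
    raw = (WEIGHTS["cpu"] * s.cpu_idle
           + WEIGHTS["traffic"] * s.traffic_idle
           + WEIGHTS["age"] * s.age_idle)
    return max(0.0, min(1.0, raw))


def is_sleep_candidate(s: Signals, threshold: float = 0.8) -> bool:
    """Return True if the score meets the (hypothetical) threshold."""
    return idle_score(s) >= threshold


if __name__ == "__main__":
    quiet = Signals(cpu_idle=1.0, traffic_idle=1.0, age_idle=0.5)
    busy = Signals(cpu_idle=0.2, traffic_idle=0.0, age_idle=1.0)
    print(round(idle_score(quiet), 2), is_sleep_candidate(quiet))
    print(round(idle_score(busy), 2), is_sleep_candidate(busy))
```

Because pod age carries only 20% of the weight, an old but busy deployment stays well below the threshold in this sketch.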
Critically, KuberNap can also act on a whole namespace at once, on demand: one POST sleeps every Deployment, StatefulSet, and CronJob in a staging namespace. KEDA requires a ScaledObject per workload, each with its own trigger configured. For a 20-deployment staging namespace, that's 20 ScaledObject manifests to author and maintain.
When to choose KEDA
- You have queue consumers, stream processors, or webhook receivers that should scale from zero to N pods on event arrival.
- Your workload's idleness can be precisely expressed as "the trigger source has zero items" (queue depth = 0, no pending HTTP requests in the last N seconds, etc.).
- You're comfortable authoring a ScaledObject per workload and maintaining trigger config alongside your deployments.
- The scale-from-zero path is acceptable: cold start latency, image pull, and readiness probe duration are tolerable for the use case.
- You want fine-grained replica scaling (not just on/off): KEDA can scale to 5, 50, or 500 replicas based on event volume.
When to choose KuberNap
- You have dev, staging, preview, or sandbox clusters that idle for nights and weekends and want to recover that ~76% of compute spend without per-workload config.
- Your dev workloads aren't naturally event-driven - they're HTTP APIs, internal dashboards, supporting services, and ephemeral environments accessed by humans, not queues.
- You want wake-on-demand: a Slack bot, a CI hook, or an engineer's script can wake a whole namespace via one HTTP call.
- You want to avoid managing a ScaledObject per Deployment in a 20-deployment staging namespace.
- You want an audit trail of every sleep/wake action with timestamps and prior-state info, exposed at GET /api/v1/events.
Can you use both?
Yes - they're complementary in real clusters. A typical setup:
- KEDA on the workloads that genuinely scale on events: queue consumers, stream processors, webhook handlers.
- KuberNap on the rest of the dev/staging namespace: the HTTP APIs, databases, monitoring stack, internal admin tools.
- KuberNap's protected-namespace list (kube-system, production, etc.) covers the boundary; KEDA scaling rules in production are unaffected.
Because both tools keep their state in the Kubernetes API itself (KEDA in ScaledObject resources, KuberNap in kubernap.com/* annotations on the workloads), there is no shared store to reconcile. They can run side by side on the same cluster without coordination.
Common pitfalls when picking between them
A few patterns we see teams fall into when choosing between KEDA and KuberNap for dev clusters:
- Treating KEDA as a generic dev-cluster sleeper. KEDA's HTTP add-on can technically scale arbitrary HTTP services to zero, but the model doesn't fit non-HTTP workloads (databases, internal cron processors, monitoring stack components) and the per-service config overhead is real. If you're authoring a dozen ScaledObjects to cover a dev namespace, you're doing KuberNap's job with KEDA's tools.
- Trying to use KuberNap for event-driven workloads. KuberNap's idle score doesn't observe queue depth or external metrics. A queue consumer with a backlog of 10,000 messages but low CPU and no recent restarts will read as idle and become a sleep candidate - the wrong call for that workload class. Manage event-driven workloads with KEDA and let KuberNap handle the rest.
- Forgetting wake latency budgets. Both tools start workloads from zero on demand. The cold-start latency (image pull + readiness probe) is the same for both. Budget for it explicitly when designing UX flows that depend on a from-zero scale-up.
Quick reference
| Capability | KEDA | KuberNap |
|---|---|---|
| Scale trigger | External event sources (70+) | Activity (CPU + traffic + pod age) |
| Scope | Per-workload (ScaledObject) | Per-deployment + per-namespace (one API call) |
| Wake-on-demand | Event-triggered | HTTP POST |
| State storage | ScaledObject CRD | Resource annotations only |
| Best for | Event-driven workloads | Auto-sleeping dev/staging clusters |
| License | Apache 2.0 | Apache 2.0 |
Related reading
- KuberNap vs kube-downscaler - schedule-based vs activity-based scaling.
- How to scale idle Kubernetes namespaces to zero - canonical setup guide.
- Why your dev cluster wastes 75% of compute - original-data piece on idle waste math.
FAQ
- Is KEDA the same as KuberNap?
- No. KEDA scales workloads up and down based on external events - queue depth, HTTP request rate, custom metrics. KuberNap scales dev/staging workloads to zero based on observed idleness (low CPU, no recent traffic, old pods) and wakes them via an explicit HTTP API call.
- Can I run KEDA and KuberNap in the same cluster?
- Yes. They target different workload classes. Use KEDA for event-driven services like queue consumers, and KuberNap for the rest of the dev/staging namespace. KEDA's own HTTP add-on can scale individual HTTP services to zero, but it doesn't sleep an entire namespace blanket-style the way KuberNap does.
- When should I pick KEDA over KuberNap?
- When your workload is event-driven and 'idle' can be expressed as 'no events': queue consumers waiting on RabbitMQ, Kafka, or SQS; webhook receivers; cron-triggered batch jobs. KEDA was built for these and has 70+ scalers that read directly from event sources.
- When should I pick KuberNap over KEDA?
- When you have a dev or staging cluster with web services, internal tools, and supporting infrastructure that idle for nights and weekends. KEDA can scale these on HTTP traffic but requires a per-service trigger config; KuberNap auto-detects per-deployment idleness and sleeps a whole namespace with a single API call.
- Does KuberNap support event-based wake-ups like KEDA?
- Wake is a single HTTP POST to /api/v1/namespaces/{namespace}/wake, so any system that can fire an HTTP request - Slack bot, GitHub webhook, CI pipeline - can wake a namespace. The wake itself is not driven by KEDA-style metric polling; it's an explicit imperative call.
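A minimal sketch of such a caller, using only the Python standard library, might look like the following. The base URL `http://kubernap.internal:8080` is a placeholder, and the empty request body and absence of authentication are assumptions; only the path `/api/v1/namespaces/{namespace}/wake` comes from the text above.

```python
import urllib.request


def build_wake_request(base_url: str, namespace: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST that wakes one namespace."""
    url = f"{base_url.rstrip('/')}/api/v1/namespaces/{namespace}/wake"
    # Empty body is an assumption; the article does not describe a payload.
    return urllib.request.Request(url, data=b"", method="POST")


def wake_namespace(base_url: str, namespace: str, timeout: float = 10.0) -> int:
    """Send the wake request and return the HTTP status code."""
    req = build_wake_request(base_url, namespace)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Placeholder host; replace with wherever your KuberNap API is reachable.
    req = build_wake_request("http://kubernap.internal:8080", "staging")
    print(req.get_method(), req.full_url)
```

Splitting request construction from sending keeps the URL logic easy to test without a live cluster; a Slack bot or CI step would call `wake_namespace` directly.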