Published April 26, 2026 · Comparison

    KuberNap vs KEDA scale-to-zero: when to use which

    KEDA scales event-driven Kubernetes workloads up from zero on queue depth, HTTP traffic, or custom metrics. KuberNap auto-sleeps idle dev/staging namespaces on activity signals and wakes them via an explicit HTTP call. They solve different problems and stack cleanly in the same cluster.

    What KEDA does well

    KEDA (Kubernetes Event-driven Autoscaling) is a CNCF graduated project that scales workloads from zero to N replicas based on the state of an external event source, such as queue depth. It ships with more than 70 built-in scalers covering RabbitMQ, Kafka, AWS SQS, Azure Service Bus, Google Pub/Sub, Redis Streams, Prometheus, cron, and many more. You declare a ScaledObject that targets a Deployment, point it at a trigger source, and KEDA polls that source on a configurable interval and adjusts the replica count to match.
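
    To make the ScaledObject idea concrete, here is a minimal sketch that builds one manifest as a Python dict and prints it as JSON, which kubectl accepts as well as YAML. The deployment, namespace, and queue names, the RABBITMQ_URL environment variable, and the replica and threshold numbers are all assumptions for illustration; check the trigger metadata against the scaler documentation for your KEDA version before relying on it.

```python
import json


def scaled_object(deployment: str, namespace: str, queue: str) -> dict:
    """Build a KEDA ScaledObject manifest for a RabbitMQ queue consumer."""
    return {
        "apiVersion": "keda.sh/v1alpha1",
        "kind": "ScaledObject",
        "metadata": {"name": f"{deployment}-scaler", "namespace": namespace},
        "spec": {
            "scaleTargetRef": {"name": deployment},  # targets a Deployment by default
            "minReplicaCount": 0,    # allow scale to zero when the queue is empty
            "maxReplicaCount": 20,   # illustrative ceiling
            "pollingInterval": 30,   # seconds between checks of the trigger source
            "triggers": [
                {
                    "type": "rabbitmq",
                    "metadata": {
                        "queueName": queue,
                        "mode": "QueueLength",
                        "value": "50",                  # target messages per replica
                        "hostFromEnv": "RABBITMQ_URL",  # assumed env var on the target
                    },
                }
            ],
        },
    }


# Hypothetical workload names, for illustration only.
manifest = scaled_object("order-worker", "staging", "orders")
print(json.dumps(manifest, indent=2))
```

    One such object is needed per scaled workload, which is the per-workload cost discussed later in this comparison.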

    For event-driven workloads, KEDA is the right answer. A queue consumer with zero pending messages should run zero pods; a workload with 10,000 pending messages should fan out. KEDA models that natively. The KEDA HTTP Add-on extends the same model to HTTP services, holding incoming requests at an interceptor while the target deployment scales up from zero. This works well for synchronous services where cold-start latency is acceptable.

    KEDA also integrates cleanly with HPA - under the hood, a ScaledObject creates an HPA managed by KEDA, so existing CPU/memory-based scaling rules continue to work alongside event-driven triggers.

    What KuberNap does that KEDA doesn't

    KuberNap targets a different question: not "what should the replica count of this workload be in response to events?" but "should this dev-cluster workload exist at all right now?"

    Dev and staging Kubernetes clusters run 24 hours a day, 7 days a week, but engineers actively use them roughly 24% of the time - weekday business hours minus meetings, focus time, and breaks. The remaining ~76% of hours are pure idle waste. The workloads in those clusters are rarely event-driven; they are HTTP APIs, internal tools, databases, and supporting microservices that exist by default and are accessed by exception.
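
    The 24% figure lines up with a plain 40-hour work week measured against a 168-hour calendar week. The short check below assumes that derivation (8 working hours on 5 weekdays); it is a sanity check on the arithmetic, not a measurement of any particular cluster.

```python
HOURS_PER_WEEK = 7 * 24   # 168: the cluster runs around the clock
ACTIVE_HOURS = 5 * 8      # assumption: 8 working hours on 5 weekdays

active_share = ACTIVE_HOURS / HOURS_PER_WEEK
idle_share = 1 - active_share

print(f"active: {active_share:.0%}, idle: {idle_share:.0%}")
# prints "active: 24%, idle: 76%"
```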

    KuberNap auto-detects per-deployment idleness using a composite score (40% CPU usage from metrics-server, 40% recent traffic with pod restart frequency as a proxy, 20% pod age) and scales sleep candidates to zero replicas. State lives in kubernap.com/* annotations on the workloads themselves - no external database, no CRDs, no webhooks. Wake is a single HTTP POST that restores the saved replica count.
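
    A weighted score of this kind can be sketched in a few lines. Only the 40/40/20 weights come from the description above; the 0..1 normalisation of each signal, the Signals type, and the 0.8 threshold are hypothetical and are not KuberNap's actual implementation.

```python
from dataclasses import dataclass

# Weights from the text; everything else here is an illustrative assumption.
WEIGHTS = {"cpu": 0.4, "traffic": 0.4, "age": 0.2}


@dataclass
class Signals:
    cpu_idle: float      # 0.0 (busy) .. 1.0 (no CPU use)
    traffic_idle: float  # 0.0 (recent activity) .. 1.0 (none)
    age_idle: float      # 0.0 (fresh pod) .. 1.0 (old pod)


def idle_score(s: Signals) -> float:
    """Weighted sum of the three normalised signals, in the range 0..1."""
    return (WEIGHTS["cpu"] * s.cpu_idle
            + WEIGHTS["traffic"] * s.traffic_idle
            + WEIGHTS["age"] * s.age_idle)


def is_sleep_candidate(s: Signals, threshold: float = 0.8) -> bool:
    # The 0.8 threshold is hypothetical, not a documented KuberNap default.
    return idle_score(s) >= threshold


quiet = Signals(cpu_idle=0.95, traffic_idle=1.0, age_idle=0.5)
busy = Signals(cpu_idle=0.2, traffic_idle=0.0, age_idle=1.0)
print(round(idle_score(quiet), 2), is_sleep_candidate(quiet))
print(round(idle_score(busy), 2), is_sleep_candidate(busy))
```

    Note that nothing in such a score looks at queue depth, which is why a backlogged but quiet consumer can still score as idle.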

    Critically, KuberNap also operates on a whole namespace on demand: one POST sleeps every Deployment, StatefulSet, and CronJob in a staging namespace. KEDA requires a ScaledObject per workload, each with its own trigger configured. For a 20-deployment staging namespace, that's 20 ScaledObject manifests to author and maintain.

    When to choose KEDA

    • You have queue consumers, stream processors, or webhook receivers that should scale from zero to N pods on event arrival.
    • Your workload's idleness can be precisely expressed as "the trigger source has zero items" (queue depth = 0, no pending HTTP requests in the last N seconds, etc.).
    • You're comfortable authoring a ScaledObject per workload and maintaining trigger config alongside your deployments.
    • The scale-from-zero path is acceptable: cold start latency, image pull, and readiness probe duration are tolerable for the use case.
    • You want fine-grained replica scaling (not just on/off): KEDA can scale to 5, 50, or 500 replicas based on event volume.

    When to choose KuberNap

    • You have dev, staging, preview, or sandbox clusters that idle for nights and weekends and want to recover that ~76% of compute spend without per-workload config.
    • Your dev workloads aren't naturally event-driven - they're HTTP APIs, internal dashboards, supporting services, and ephemeral environments accessed by humans, not queues.
    • You want wake-on-demand: a Slack bot, a CI hook, or an engineer's script can wake a whole namespace via one HTTP call.
    • You want to avoid managing a ScaledObject per Deployment in a 20-deployment staging namespace.
    • You want an audit trail of every sleep/wake action with timestamps and prior-state info, exposed at GET /api/v1/events.
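
    The wake-on-demand and audit-trail bullets above map to two HTTP calls. The sketch below builds both requests with Python's standard library, using the wake path given in the FAQ and the events path above. The base URL is a hypothetical in-cluster address, and the empty request body and absence of authentication are assumptions; adapt them to however your KuberNap instance is exposed.

```python
import urllib.request

BASE_URL = "http://kubernap.kubernap.svc:8080"  # hypothetical in-cluster address


def wake_request(namespace: str) -> urllib.request.Request:
    """Build the POST that wakes the sleeping workloads in a namespace."""
    url = f"{BASE_URL}/api/v1/namespaces/{namespace}/wake"
    return urllib.request.Request(url, data=b"", method="POST")


def events_request() -> urllib.request.Request:
    """Build the GET that reads the sleep/wake audit trail."""
    return urllib.request.Request(f"{BASE_URL}/api/v1/events", method="GET")


req = wake_request("staging")
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req, timeout=10)
```

    A Slack bot or CI hook would do the same thing in whatever HTTP client it already uses.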

    Can you use both?

    Yes - they're complementary in real clusters. A typical setup:

    • KEDA on the workloads that genuinely scale on events: queue consumers, stream processors, webhook handlers.
    • KuberNap on the rest of the dev/staging namespace: the HTTP APIs, databases, monitoring stack, internal admin tools.
    • KuberNap's protected-namespace list (kube-system, production, etc.) keeps it out of namespaces it must never sleep, so KEDA scaling rules in production are unaffected.

    Because both tools keep their state in the Kubernetes API (KEDA in ScaledObject resources, KuberNap in kubernap.com/* annotations on the workloads), there is no shared external store to reconcile. They can run side by side on the same cluster without coordination, as long as each workload is managed by only one of them.
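
    When planning such a split, it can help to list which workloads each tool should own. The helper below is a planning sketch over plain tuples, not a description of how KuberNap itself excludes KEDA-managed workloads; the workload names are hypothetical and the protected list repeats the two examples given above.

```python
PROTECTED_NAMESPACES = {"kube-system", "production"}  # examples from the text


def kubernap_targets(deployments, scaled_objects):
    """Return the (namespace, name) pairs left for KuberNap to manage.

    deployments:    iterable of (namespace, name) tuples
    scaled_objects: iterable of (namespace, target_name) tuples, i.e. the
                    Deployment each KEDA ScaledObject points at
    """
    keda_managed = set(scaled_objects)
    return [
        (ns, name)
        for ns, name in deployments
        if ns not in PROTECTED_NAMESPACES and (ns, name) not in keda_managed
    ]


# Hypothetical inventory, for illustration only.
deployments = [
    ("staging", "api"),
    ("staging", "order-worker"),
    ("staging", "admin-ui"),
    ("production", "api"),
]
scaled_objects = [("staging", "order-worker")]
print(kubernap_targets(deployments, scaled_objects))
```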

    Common pitfalls when picking between them

    A few patterns we see teams fall into when choosing between KEDA and KuberNap for dev clusters:

    • Treating KEDA as a generic dev-cluster sleeper. KEDA's HTTP add-on can technically scale arbitrary HTTP services to zero, but the model doesn't fit non-HTTP workloads (databases, internal cron processors, monitoring stack components) and the per-service config overhead is real. If you're authoring a dozen ScaledObjects to cover a dev namespace, you're doing KuberNap's job with KEDA's tools.
    • Trying to use KuberNap for event-driven workloads. KuberNap's idle score doesn't observe queue depth or external metrics. A queue consumer with a backlog of 10,000 messages but low CPU and no recent restarts will read as idle and become a sleep candidate - the wrong call for that workload class. Manage event-driven workloads with KEDA and let KuberNap handle the rest.
    • Forgetting wake latency budgets. Both tools start workloads from zero on demand. The cold-start latency (image pull + readiness probe) is the same for both. Budget for it explicitly when designing UX flows that depend on a from-zero scale-up.
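
    A latency budget can be roughed out before any measurement. The sketch below adds up the usual cold-start phases and compares the total with a budget. The phase breakdown is a simplification, and every number in the example is illustrative rather than measured; replace them with timings from your own cluster.

```python
from dataclasses import dataclass


@dataclass
class ColdStart:
    schedule_s: float         # scheduler places the pod on a node
    image_pull_s: float       # 0 if the image is already cached on the node
    container_start_s: float  # process boot until it can pass a probe
    probe_delay_s: float      # readiness probe initialDelaySeconds
    probe_period_s: float     # readiness probe periodSeconds

    def worst_case_s(self) -> float:
        """Upper bound on seconds from scale-up request to a Ready pod."""
        # Probing starts after the initial delay; the first passing probe
        # can land up to one full period after the app is actually ready.
        ready_after = max(self.container_start_s, self.probe_delay_s)
        return (self.schedule_s + self.image_pull_s
                + ready_after + self.probe_period_s)


# Illustrative numbers only, not measurements.
api = ColdStart(schedule_s=2, image_pull_s=25, container_start_s=8,
                probe_delay_s=5, probe_period_s=10)
budget_s = 30
estimate = api.worst_case_s()
print(f"estimate {estimate}s vs budget {budget_s}s:",
      "ok" if estimate <= budget_s else "over budget")
```

    In this made-up case the image pull dominates, so pre-pulling or slimming the image would matter more than tuning the probe.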

    Quick reference

    Capability     | KEDA                         | KuberNap
    Scale trigger  | External event sources (70+) | Activity (CPU + traffic + pod age)
    Scope          | Per-workload (ScaledObject)  | Per-deployment + per-namespace (one API call)
    Wake-on-demand | Event-triggered              | HTTP POST
    State storage  | ScaledObject CRD             | Resource annotations only
    Best for       | Event-driven workloads       | Auto-sleeping dev/staging clusters
    License        | Apache 2.0                   | Apache 2.0

    FAQ

    Is KEDA the same as KuberNap?
    No. KEDA scales workloads up and down based on external events - queue depth, HTTP request rate, custom metrics. KuberNap scales dev/staging workloads to zero based on observed idleness (low CPU, no recent traffic, old pods) and wakes them via an explicit HTTP API call.

    Can I run KEDA and KuberNap in the same cluster?
    Yes. They target different workload classes. Use KEDA for event-driven services like queue consumers, and KuberNap for the rest of the dev/staging namespace. KEDA's own HTTP add-on can scale individual HTTP services to zero, but it doesn't sleep an entire namespace blanket-style the way KuberNap does.

    When should I pick KEDA over KuberNap?
    When your workload is event-driven and 'idle' can be expressed as 'no events': queue consumers waiting on RabbitMQ, Kafka, or SQS; webhook receivers; cron-triggered batch jobs. KEDA was built for these and has 70+ scalers that read directly from event sources.

    When should I pick KuberNap over KEDA?
    When you have a dev or staging cluster with web services, internal tools, and supporting infrastructure that idle for nights and weekends. KEDA can scale these on HTTP traffic but requires a per-service trigger config; KuberNap auto-detects per-deployment idleness and sleeps a whole namespace with a single API call.

    Does KuberNap support event-based wake-ups like KEDA?
    Wake is a single HTTP POST to /api/v1/namespaces/{namespace}/wake, so any system that can fire an HTTP request - Slack bot, GitHub webhook, CI pipeline - can wake a namespace. The wake itself is not driven by KEDA-style metric polling; it's an explicit imperative call.

    Built by KuberNap - Kubby naps so your cluster doesn't have to. kubernap.com