Comparing KEDA, HPA, VPA & Custom Adapters for Real-World Scaling with Cost, Complexity & Best Practices
| Category | KEDA (Event-driven) | Prometheus-based (Adapter/HPA) | Datadog-based (Cluster Agent) | CloudWatch-based (Adapter/HPA) |
|---|---|---|---|---|
| Primary Function | Event-driven autoscaler (creates HPAs dynamically from external metrics like ALB, SQS, Kafka, etc.) | Uses HPA with Prometheus metrics via adapter (e.g., kube-metrics-adapter or prometheus-adapter) | Uses Datadog metrics (via Cluster Agent) as external metrics for HPA | Uses CloudWatch metrics (via AWS CloudWatch Metrics Adapter) for HPA |
| Metric Source | Multiple external sources: CloudWatch, Prometheus, SQS, Kafka, HTTP, etc. (50+ scalers) | Prometheus time-series metrics (scraped from exporters or apps) | Datadog platform metrics (ingested from AWS, custom apps, APM) | AWS CloudWatch (e.g., ALB metrics, RDS, SQS, Lambda, etc.) |
| Data Flow Model | Pull metrics or events → internal HPA → scale | Prometheus scrapes → adapter → Kubernetes Metrics API → HPA | Datadog agent → Cluster Agent → External Metrics API → HPA | CloudWatch Adapter → External Metrics API → HPA |
| Setup Complexity | 🟢 Medium (Helm + few YAMLs; no exporter needed) | 🔵 Medium-High (need Prometheus + adapter configuration) | 🟣 Medium (if Datadog is already deployed) | 🟠 Medium (adapter installation + IAM + mappings) |
| Integration with ALB Traffic | ✅ Native (via CloudWatch scaler – uses RequestCountPerTarget, TargetResponseTime) | ⚠️ Requires Prometheus CloudWatch exporter (YACE or similar) | ✅ Native (Datadog already pulls ALB metrics) | ✅ Native (direct access to ALB metrics) |
| Supports Scale-to-Zero | ✅ Yes | ❌ No (HPA cannot scale to zero) | ❌ No | ❌ No |
| Responsiveness / Latency | ~30–60 seconds (depends on CloudWatch polling) | ~15–30 seconds (depends on scrape interval) | ~30–60 seconds (depends on Datadog ingestion) | ~60 seconds (CloudWatch metric delay) |
| Operational Cost | 💲 Low (CloudWatch API calls only) | 💲💲 Medium (Prometheus infra + storage + exporter costs) | 💲💲💲 High (Datadog licensing per host/container) | 💲 Low (CloudWatch API calls) |
| Infrastructure Overhead | Lightweight (1 KEDA controller) | Heavy (Prometheus, exporters, adapter) | Moderate (Datadog Cluster Agent) | Moderate (Adapter deployment) |
| Ease of Maintenance | 🟢 Easy – one Helm upgrade for all namespaces | 🔵 Moderate – maintain adapter & Prometheus | 🟣 Easy if Datadog already managed | 🟠 Moderate – periodic IAM & adapter updates |
| EKS Auto Mode Compatibility | ✅ Fully compatible – scales pods; NodePools handle nodes | ✅ Compatible | ✅ Compatible | ✅ Compatible |
| Multi-Namespace Scaling | ✅ Native support (Scoped per namespace) | ✅ Supported | ✅ Supported | ✅ Supported |
| Security / IAM | Uses IRSA or static keys for AWS APIs | No AWS permissions required (depends on Prometheus) | Uses Datadog API key & IAM integration | Uses IRSA for AWS CloudWatch read access |
| Supported Triggers / Metrics | 50+ sources (CloudWatch, Kafka, RabbitMQ, HTTP, Redis, MySQL, etc.) | Limited to Prometheus metrics | Limited to Datadog metrics | Limited to AWS metrics |
| Scales on Events (not metrics) | ✅ Yes | ❌ No | ❌ No | ❌ No |
| Can Combine Multiple Triggers | ✅ Yes (multi-trigger scaling rules) | ⚠️ Only via complex PromQL expressions | ⚠️ Limited (Datadog composite metrics) | ⚠️ Limited (one metric per HPA) |
| Recommended For | Event-driven / traffic-based workloads (ALB, queues, web APIs) | Resource or app-metric-based workloads | Organizations using Datadog for monitoring & APM | AWS-centric workloads without Prometheus |
| Learning Curve | 🟢 Low | 🔵 Medium | 🟣 Low | 🟠 Medium |
| Vendor Lock-in | Low (Open Source) | Low (OSS ecosystem) | High (Datadog SaaS) | Medium (AWS-only) |
| Community & Ecosystem | Very active (CNCF Graduated project) | Large (K8s ecosystem standard) | Proprietary (Datadog documentation) | AWS-maintained (moderate community) |
| Use with WAF + ALB | ✅ Seamless (uses ALB TG metrics directly) | ⚠️ Need exporter for ALB metrics | ✅ Seamless (Datadog ALB integration) | ✅ Seamless (ALB metrics native in CloudWatch) |
| Example Metric | CloudWatch → RequestCountPerTarget, TargetResponseTime | Prometheus → nginx_ingress_controller_requests_total | Datadog → aws.applicationelb.request_count | CloudWatch → RequestCountPerTarget |
| Scale Behavior Visualization | KEDA Metrics API + Grafana dashboards | Prometheus / Grafana | Datadog Dashboards | CloudWatch Dashboards |
| Maturity (as of 2025) | ⭐⭐⭐⭐⭐ (CNCF Graduated) | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| Overall Recommendation (for EKS + ALB) | ✅✅✅ Best Option | ✅ Good (if Prometheus is already in place) | ⚙️ Suitable for Datadog-native orgs | ✅ Good fallback if KEDA not allowed |
🌐 1 | Architecture Overview
Flow: Client → DNS → WAF → ALB → Target Group → EKS Service/Pods
| Layer | Purpose | Key AWS / K8s Component |
|---|---|---|
| Edge Security | Filter malicious traffic | AWS WAF (Web ACL) |
| Load Balancing | Distribute inbound requests | ALB (AWS Load Balancer Controller) |
| Routing | Path/host-based dispatch to namespaces | Kubernetes Ingress |
| Compute | Run workloads | EKS Pods/Deployments |
| Node Capacity | Provision nodes automatically | EKS Auto Mode NodePools (Karpenter) |
| Autoscaling Brain | Adjust replicas dynamically | KEDA / HPA / VPA / Custom Adapter |
With EKS Auto Mode, AWS manages node scaling.
Your responsibility is pod-level scaling — deciding how many replicas each service needs based on traffic or resource metrics.
🧩 2 | Namespace-Scoped Design Pattern
- Each microservice (e.g., `booking`, `auth`, `medical`, `telematics`) lives in its own namespace.
- Each namespace has its own Ingress, Service, Deployment, ConfigMap, and autoscaler objects.
- Optionally, multiple namespaces can share one ALB via `alb.ingress.kubernetes.io/group.name` to save cost while keeping per-namespace isolation.
⚙️ 3 | Ingress & WAF Setup (Shared ALB Example)
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: booking-ing
  namespace: booking
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/group.name: shared-edge
    alb.ingress.kubernetes.io/group.order: "20"
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/wafv2-acl-arn: arn:aws:wafv2:ap-northeast-1:111111111111:regional/webacl/mywebacl/abcd1234
spec:
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /booking
            pathType: Prefix
            backend:
              service:
                name: booking-svc
                port:
                  number: 80
```
Each namespace can repeat this pattern with a different path (`/auth`, `/legal`, etc.) while sharing the same `group.name`, so all services sit behind one ALB under one WAF.
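For example, a sketch of how the `auth` namespace could join the same ALB group (the Service name and `group.order` value are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: auth-ing
  namespace: auth
  annotations:
    kubernetes.io/ingress.class: alb
    # Same group.name as booking-ing, so both Ingresses merge into one shared ALB;
    # the WAF ACL annotation on booking-ing applies to that ALB, so it need not be repeated here
    alb.ingress.kubernetes.io/group.name: shared-edge
    alb.ingress.kubernetes.io/group.order: "30"
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/scheme: internet-facing
spec:
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /auth
            pathType: Prefix
            backend:
              service:
                name: auth-svc   # hypothetical Service in the auth namespace
                port:
                  number: 80
```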
🚀 4 | Autoscaling Options for Pod Level Control
Below are five viable mechanisms for pod autoscaling inside EKS.
| # | Method | Scaling Source | Scales To Zero | Works with ALB Metrics | Typical Latency | Setup Time | Maint. Effort | Approx. Cost* | Skill Level |
|---|---|---|---|---|---|---|---|---|---|
| 1 | KEDA | External events (CloudWatch ALB, SQS, Prometheus, etc.) | ✅ | ✅ (native scaler) | 30-60 s | ⚙️ Medium | 🧩 Low (once installed) | 💲💲 CloudWatch API calls | Intermediate |
| 2 | HPA | CPU / memory / custom metrics | ❌ | ⚠️ via adapter | 15-30 s | ⚙️ Low | 🧩 Low | 💲 free | Beginner |
| 3 | VPA | Internal resource usage | ❌ | ❌ | N/A | ⚙️ Medium | 🧩 Low | 💲 free | Intermediate |
| 4 | Custom Metric Adapter | Prometheus / CloudWatch | ❌ | ✅ with manual mapping | 45-60 s | ⚙️ High | 🧩 High | 💲💲 metrics infra | Advanced |
| 5 | Manual Scaling | Human input | ❌ | ❌ | N/A | ⚙️ Instant | 🧩 High Opex | 💲 none | Basic |
* Cost = relative AWS service charges + operational overhead
🧮 5 | Detailed Analysis of Each Approach
🔹 A | KEDA (Event-Driven Autoscaler)
How it works:
KEDA reads external metrics (CloudWatch ALB RequestCountPerTarget, TargetResponseTime, SQS depth, PromQL queries, etc.) and creates an internal HPA.
Pros
- Supports 50+ scalers (AWS, Azure, Kafka, Prometheus, etc.).
- Scales to zero during idle.
- Simple YAML (`ScaledObject`) per Deployment.
- Works seamlessly with EKS Auto Mode and NodePools.
- Natively integrates with CloudWatch ALB metrics.
Cons
- Extra component to operate.
- CloudWatch polling → small metric costs and ≈ 1 min delay.
- Needs IRSA permissions for CloudWatch API.
Setup time: ≈ 1 hr (Helm install + ScaledObject YAMLs)
Maintenance: Low (central Helm upgrade + namespace YAMLs)
Recommended for: Multi-namespace EKS clusters with real-traffic scaling.
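As an illustration of scale-to-zero, here is a minimal `ScaledObject` sketch for an idle-capable queue worker. The Deployment name and queue URL are placeholders, the trigger reuses the IRSA-based `TriggerAuthentication` from section 6, and the IAM policy there would additionally need `sqs:GetQueueAttributes` for this scaler:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: booking-worker-queue
  namespace: booking
spec:
  scaleTargetRef:
    name: booking-worker          # hypothetical worker Deployment
  minReplicaCount: 0              # allow scale-to-zero while the queue is empty
  maxReplicaCount: 20
  cooldownPeriod: 300             # wait 5 minutes after the last activity before dropping to zero
  triggers:
    - type: aws-sqs-queue
      authenticationRef:
        name: alb-cw-auth         # IRSA-based auth from section 6
      metadata:
        queueURL: https://sqs.ap-northeast-1.amazonaws.com/111111111111/booking-jobs
        queueLength: "10"         # target messages per replica
        awsRegion: ap-northeast-1
```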
🔹 B | HPA (Native Horizontal Pod Autoscaler)
How it works:
Built into Kubernetes; scales based on CPU and memory by default.
Can also use custom metrics with an adapter.
Pros
- Native, stable, zero extra components.
- Predictable behavior and fine-grained control.
Cons
- Default metrics = CPU / memory only.
- Cannot scale to zero.
- Needs a metric adapter to use ALB metrics.
- Not event-driven; it reacts only after load has already pushed up CPU.
Setup time: ≈ 30 min
Maintenance: Minimal
Recommended for: Steady workloads or CPU-bound apps.
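A minimal CPU-based HPA sketch for the booking Deployment (the 70% utilization target is an illustrative value, not a recommendation):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: booking-hpa
  namespace: booking
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: booking-deployment
  minReplicas: 2
  maxReplicas: 30
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU exceeds 70% of requests
```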
🔹 C | VPA (Vertical Pod Autoscaler)
How it works:
Adjusts CPU and memory requests/limits per pod automatically.
Pros
- Prevents over/under-provisioning.
- Complements KEDA/HPA.
Cons
- No replica count scaling.
- Not suited for traffic bursts.
Setup time: ≈ 45 min
Maintenance: Low
Recommended for: Batch or steady apps to optimize resources.
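A minimal VPA sketch, assuming the VPA components are installed in the cluster; `updateMode: "Off"` only records recommendations, which is a safe starting point:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: booking-vpa
  namespace: booking
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: booking-deployment
  updatePolicy:
    updateMode: "Off"            # recommend-only; switch to "Auto" to let VPA evict and resize pods
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:
          cpu: 100m
          memory: 128Mi
        maxAllowed:
          cpu: "2"
          memory: 2Gi
```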
🔹 D | Custom Metric Adapters (Prometheus / CloudWatch)
How it works:
Deploy an external-metrics adapter exposing selected metrics to HPA.
HPA then scales on those metrics.
Pros
- Fine control; use any metric you own.
- Integrates into existing monitoring plane.
Cons
- Complex to deploy and maintain.
- Harder to debug.
- No scale-to-zero.
- Usually delayed by scrape interval + adapter polling.
Setup time: 1 – 2 hrs
Maintenance: High
Recommended for: Large orgs with centralized Prometheus or Datadog.
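For example, with prometheus-adapter exposing a per-pod request-rate metric, the HPA could scale on it directly. The metric name `http_requests_per_second` is hypothetical and depends entirely on your adapter rules:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: booking-custom-hpa
  namespace: booking
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: booking-deployment
  minReplicas: 2
  maxReplicas: 30
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second   # must match a metric exposed by the adapter
        target:
          type: AverageValue
          averageValue: "100"              # target 100 requests/second per pod
```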
🔹 E | Manual Scaling
```bash
kubectl scale deployment <name> --replicas=N
```
Pros: 100 % control, simple to understand.
Cons: No automation; wastes capacity; high operational risk.
Use only for: testing or stable low-traffic sites.
💡 6 | KEDA Setup Walkthrough (for EKS + ALB)
- Install KEDA

```bash
helm repo add kedacore https://kedacore.github.io/charts
helm install keda kedacore/keda -n keda --create-namespace
```

- Enable IRSA (for CloudWatch)

```bash
eksctl utils associate-iam-oidc-provider --cluster my-eks --approve
```

- IAM Policy

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "cloudwatch:GetMetricData",
        "cloudwatch:GetMetricStatistics",
        "cloudwatch:ListMetrics",
        "cloudwatch:DescribeAlarms"
      ],
      "Resource": "*"
    }
  ]
}
```

Attach this policy to an IAM role (e.g. `eks-traffic-autoscale`) that trusts the cluster's OIDC provider; the ServiceAccount below references that role.

- ServiceAccount + TriggerAuthentication

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: svc-traffic-autoscale
  namespace: booking
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<ACCOUNT_ID>:role/eks-traffic-autoscale
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: alb-cw-auth
  namespace: booking
spec:
  podIdentity:
    provider: aws
```

- ScaledObject

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: booking-traffic
  namespace: booking
spec:
  scaleTargetRef:
    name: booking-deployment
  minReplicaCount: 2
  maxReplicaCount: 30
  triggers:
    - type: aws-cloudwatch
      authenticationRef:
        name: alb-cw-auth
      metadata:
        namespace: AWS/ApplicationELB
        metricName: RequestCountPerTarget
        dimensionName: TargetGroup
        dimensionValue: targetgroup/k8s-xyz/abc123456
        metricStat: Sum
        metricStatPeriod: "60"
        metricUnit: Count
        targetMetricValue: "100"
        awsRegion: ap-northeast-1
```

- Observe Scaling

```bash
kubectl get hpa -n booking
kubectl get pods -n booking -w
```
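To confirm the ScaledObject is healthy, you can also inspect the HPA that KEDA manages for it (by convention KEDA names it `keda-hpa-<scaledobject-name>`):

```bash
# ScaledObject status should show READY=True, and ACTIVE=True under load
kubectl get scaledobject booking-traffic -n booking

# Inspect the HPA generated from the ScaledObject
kubectl describe hpa keda-hpa-booking-traffic -n booking
```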
📈 7 | Performance & Cost Considerations
| Factor | KEDA | HPA | VPA | Custom Adapter |
|---|---|---|---|---|
| Responsiveness | 30-60 s | 15-30 s | N/A | 45-60 s |
| Infra Cost | Low (CloudWatch polling) | None | None | Medium (Prometheus infra) |
| Setup Overhead | Medium | Low | Medium | High |
| Maintenance | Low | Low | Low | High |
| Complexity | Medium | Low | Low | High |
| Best for | Traffic / Event driven | CPU/Mem | Resource tuning | Centralized metrics |
| Scale-to-Zero | ✅ | ❌ | ❌ | ❌ |
🧠 8 | Decision Matrix
| Requirement | Best Choice | Reason |
|---|---|---|
| Real ALB traffic scaling | KEDA | Direct CloudWatch integration |
| CPU/memory bound apps | HPA | Native simple autoscaler |
| Optimize pod resources over time | VPA | Adjusts requests/limits |
| Central metrics team wants Prometheus-based control | Custom Adapter + HPA | Full metric plane |
| Low-traffic or manual control | Manual | No automation needed |
🧰 9 | Combining Approaches
A production-grade EKS stack often mixes them:
| Layer | Tool | Role |
|---|---|---|
| Replica Scaling | KEDA + HPA | Respond to traffic & CPU |
| Resource Tuning | VPA | Adjust limits automatically |
| Node Scaling | EKS Auto Mode (NodePools) | Provide capacity |
| Monitoring | CloudWatch + AMP + Grafana | Visibility into metrics |
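As a sketch of the replica-scaling layer, KEDA can combine the ALB traffic trigger from section 6 with a CPU trigger in one ScaledObject, so either signal can drive scale-out (the 70% utilization target is illustrative):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: booking-traffic-and-cpu
  namespace: booking
spec:
  scaleTargetRef:
    name: booking-deployment
  minReplicaCount: 2
  maxReplicaCount: 30
  triggers:
    # Traffic signal: ALB RequestCountPerTarget via CloudWatch (auth setup as in section 6)
    - type: aws-cloudwatch
      authenticationRef:
        name: alb-cw-auth
      metadata:
        namespace: AWS/ApplicationELB
        metricName: RequestCountPerTarget
        dimensionName: TargetGroup
        dimensionValue: targetgroup/k8s-xyz/abc123456
        metricStat: Sum
        metricStatPeriod: "60"
        targetMetricValue: "100"
        awsRegion: ap-northeast-1
    # Resource signal: CPU utilization, equivalent to a native HPA rule
    - type: cpu
      metricType: Utilization
      metadata:
        value: "70"
```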
🔒 10 | Security and Auth Notes
- Keep Firebase OIDC authentication at the pod/application level (not on the ALB listener) to avoid ALB auth redirect limits.
- Enable IRSA for KEDA & pods requiring AWS API access.
- WAF rules protect ALB from volumetric attacks before KEDA reacts.
- Monitor 5xx errors + TargetResponseTime to guard against scaling loops.
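For the last point, a hedged example of a CloudWatch alarm on ALB target 5xx errors via the AWS CLI; the load balancer dimension value and SNS topic ARN are placeholders for your own resources:

```bash
# Alarm when the shared ALB returns more than 50 target 5xx responses
# in each of two consecutive 1-minute periods
aws cloudwatch put-metric-alarm \
  --alarm-name shared-edge-alb-5xx \
  --namespace AWS/ApplicationELB \
  --metric-name HTTPCode_Target_5XX_Count \
  --dimensions Name=LoadBalancer,Value=app/k8s-sharededge/abc123456 \
  --statistic Sum \
  --period 60 \
  --evaluation-periods 2 \
  --threshold 50 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:ap-northeast-1:111111111111:platform-alerts
```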
🧭 11 | Final Recommendation
For your multi-namespace, WAF-protected, ALB-routed EKS cluster running in EKS Auto Mode,
KEDA is the best fit for traffic-driven autoscaling:
- Event-driven and responsive to real user load.
- Scales independently per namespace/service.
- Integrates cleanly with EKS NodePools for capacity.
- Minimizes cost via scale-to-zero and fine-grained rules.
Use HPA as a fallback for CPU-based logic, VPA for optimization, and custom adapters only when you already maintain Prometheus or Datadog metric infrastructure.
🏁 Summary Matrix
| Dimension | Best Fit |
|---|---|
| Speed to implement | HPA |
| Responsiveness to traffic | KEDA |
| Ease of maintenance | KEDA / HPA |
| Cost efficiency | KEDA (scale-to-zero) |
| Complex metric logic | Custom Adapter |
| Resource tuning | VPA |
📚 References & Further Reading
AWS Blog – Autoscaling EKS with KEDA & CloudWatch
AWS Docs – EKS Auto Mode & NodePools
KEDA Docs – CloudWatch Scaler
SpectroCloud – Kubernetes Autoscaling Patterns: HPA, VPA & KEDA
✅ Final Takeaway:
If you need hands-off, event-driven, traffic-aware, namespace-isolated scaling for an ALB-fronted EKS cluster,
KEDA + EKS Auto Mode (NodePools) is the modern production-grade combination—
balancing performance, cost, and operational simplicity for any multi-service cloud platform.