The Numbers: Why Kubernetes Is Under Siege
The shift to cloud-native infrastructure has been underway for years, but 2026 marks the inflection point where attackers have fully caught up. In Q1 2026, Cloudflare reported mitigating 14.2 million Layer 7 DDoS attacks targeting Kubernetes-hosted workloads. That figure represents a 312% increase over the same period in 2025, and it only accounts for traffic Cloudflare sees at its edge. The real number of attacks hitting K8s clusters globally is almost certainly higher.
Why the explosion? Kubernetes adoption has crossed a critical threshold. The majority of new production workloads at mid-to-large enterprises now deploy on K8s, which means attackers who want to disrupt revenue-generating services are, by default, targeting containerized environments. But it is not just about where workloads happen to live. Kubernetes introduces entirely new attack surfaces that do not exist in traditional VM or bare-metal deployments, and threat actors have learned to exploit them with precision.
Concurrently, Kubernetes-related threat actor operations have surged. Operations involving stolen Kubernetes tokens, compromised service accounts, and abused RBAC configurations increased 282% over the last year. Attackers are not merely flooding K8s from the outside; they are infiltrating clusters and using that access to amplify, redirect, or sustain DDoS campaigns from within.
Yo-Yo Attacks: Exploiting Auto-Scaling for Financial Damage
Perhaps the most insidious attack pattern in cloud-native environments is the "yo-yo attack." Traditional DDoS aims to take services offline. Yo-yo attacks have a different objective entirely: they exploit Kubernetes Horizontal Pod Autoscaler (HPA) and cloud provider auto-scaling groups to generate massive cloud bills without causing visible downtime.
The attack works in cycles. The attacker sends a burst of traffic that triggers HPA to scale pods from, say, 3 replicas to 40. Cloud provider node auto-scaling kicks in to accommodate the new pods, spinning up additional VMs. Once the scaling event completes, the attacker backs off. The cluster starts to cool down and scale back in. Before it finishes, another burst arrives. The cluster scales out again.
A yo-yo attack can run for hours, keeping your cluster in a perpetual scaling loop. The service stays up, monitoring dashboards look green, and nobody notices until the cloud bill arrives at 8x to 15x the normal monthly spend.
This pattern is especially effective because it bypasses most alerting. CPU and memory utilization stay within healthy ranges (the autoscaler is working as designed). Latency remains acceptable. Error rates stay flat. The only signal is the rate of scaling events and the accumulating compute cost, which most teams do not monitor in real time.
Here is what a yo-yo attack looks like in HPA event logs:
$ kubectl get events --field-selector reason=SuccessfulRescale -w
LAST SEEN   TYPE     REASON              OBJECT                   MESSAGE
2m          Normal   SuccessfulRescale   deployment/api-gateway   New size: 38; reason: cpu above target
4m          Normal   SuccessfulRescale   deployment/api-gateway   New size: 6; reason: All metrics below target
6m          Normal   SuccessfulRescale   deployment/api-gateway   New size: 41; reason: cpu above target
8m          Normal   SuccessfulRescale   deployment/api-gateway   New size: 5; reason: All metrics below target
10m         Normal   SuccessfulRescale   deployment/api-gateway   New size: 44; reason: cpu above target
The oscillation pattern is unmistakable once you know to look for it. Defending against yo-yo attacks requires keeping the HPA's behavior.scaleDown.stabilizationWindowSeconds at or above its 300-second default (a shortened window is what lets the cluster scale in and out this quickly), configuring maximum replica limits, and alerting on scaling event frequency rather than just resource utilization.
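The oscillation in the event log above can also be flagged programmatically. The following is a minimal sketch, not a production detector: the function names, the `min_swing` threshold of 10 replicas, and the `max_reversals` limit are illustrative assumptions, and the input is simply the sequence of "New size" values in chronological order.

```python
from typing import Sequence


def count_reversals(sizes: Sequence[int], min_swing: int = 10) -> int:
    """Count large scale-direction reversals in a chronological replica series."""
    reversals = 0
    last_direction = 0  # +1 for scale-out, -1 for scale-in, 0 before the first swing
    for previous, current in zip(sizes, sizes[1:]):
        delta = current - previous
        if abs(delta) < min_swing:
            continue  # ignore small adjustments that ordinary load produces
        direction = 1 if delta > 0 else -1
        if last_direction and direction != last_direction:
            reversals += 1
        last_direction = direction
    return reversals


def looks_like_yoyo(sizes: Sequence[int], min_swing: int = 10,
                    max_reversals: int = 2) -> bool:
    """Assumed rule of thumb: more than `max_reversals` big swings is suspicious."""
    return count_reversals(sizes, min_swing) > max_reversals


# Replica counts from the event log above, oldest first.
observed = [44, 5, 41, 6, 38]
suspicious = looks_like_yoyo(observed)
```

Counting reversals rather than raw scaling events keeps a steady ramp (3, 4, 5, 6 replicas) from raising the alarm, since it never changes direction by a large margin.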
Token Theft and Insider Amplification
The 282% increase in Kubernetes token theft operations is not a coincidence. Service account tokens in Kubernetes grant API access to the cluster, and in many deployments they are more permissive than they need to be. When an attacker compromises a pod (through an application vulnerability, supply chain attack, or misconfigured RBAC), they inherit that pod's service account token and can use it to interact with the Kubernetes API.
From a DDoS perspective, stolen tokens enable several dangerous capabilities:
- Pod creation for amplification: An attacker with create pods permission can spin up dozens of pods running UDP amplification tools, generating outbound attack traffic from within your cluster. Your nodes become unwitting participants in attacks against other targets.
- Service disruption via API flooding: The Kubernetes API server itself can be DDoSed from inside the cluster. Rapid, repeated list and watch calls against large resource sets (like listing all pods across all namespaces) can overwhelm etcd and render the control plane unresponsive.
- Network policy manipulation: If the compromised service account has permissions to modify NetworkPolicy resources, the attacker can open network paths that were previously blocked, enabling external DDoS traffic to reach internal services that were supposed to be isolated.
- DNS record poisoning: CoreDNS configurations can be altered to redirect traffic, enabling the attacker to reroute legitimate requests through attacker-controlled endpoints that log credentials or inject malicious responses.
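The capabilities listed above map onto a small set of RBAC verbs and resources, which makes them easy to screen for. The sketch below is an assumption-laden illustration: the `RISKY` table and function name are hypothetical, the input is the `rules` list of a Role or ClusterRole as parsed from JSON, and `apiGroups` and `resourceNames` are deliberately ignored for brevity.

```python
from typing import Dict, List

# Hypothetical mapping of DDoS-relevant capabilities to (resource, verbs).
RISKY = {
    "pod creation": ("pods", {"create"}),
    "network policy manipulation": ("networkpolicies",
                                    {"create", "update", "patch", "delete"}),
    "CoreDNS config changes": ("configmaps", {"update", "patch"}),
}


def risky_capabilities(rules: List[Dict]) -> List[str]:
    """Return the risky capabilities granted by a list of RBAC rules."""
    found = []
    for label, (resource, verbs) in RISKY.items():
        for rule in rules:
            resources = set(rule.get("resources", []))
            granted = set(rule.get("verbs", []))
            resource_hit = "*" in resources or resource in resources
            verb_hit = "*" in granted or bool(granted & verbs)
            if resource_hit and verb_hit:
                found.append(label)
                break  # one matching rule is enough for this capability
    return found
```

A read-only rule such as `get` on `pods` returns an empty list, while a wildcard rule returns every entry, which is usually the first thing worth fixing.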
The mitigation here is straightforward but rarely implemented fully: disable automatic service account token mounting with automountServiceAccountToken: false, use bound service account tokens with short TTLs, and enforce least-privilege RBAC policies. Audit your cluster with:
$ kubectl get pods --all-namespaces -o json | \
    jq -r '.items[] | select(.spec.automountServiceAccountToken != false) | "\(.metadata.namespace)/\(.metadata.name) mounts SA token"'
Any pod that does not explicitly need API access should have token mounting disabled. In practice, that applies to the vast majority of application workloads.
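If the audit needs to feed a report or a CI check, the same filter can be expressed in a few lines of Python. This is a sketch that mirrors the jq expression above; the function name is an assumption, and the input is the parsed JSON produced by `kubectl get pods --all-namespaces -o json`.

```python
from typing import Dict, List


def pods_mounting_token(pod_list: Dict) -> List[str]:
    """List namespace/name for pods that do not set automountServiceAccountToken: false."""
    flagged = []
    for pod in pod_list.get("items", []):
        # A missing field and an explicit true both mean the pod spec does not opt out.
        if pod.get("spec", {}).get("automountServiceAccountToken") is not False:
            meta = pod["metadata"]
            flagged.append(f"{meta['namespace']}/{meta['name']}")
    return flagged
```

Like the jq version, this only inspects the pod spec. The same field can also be set on the ServiceAccount object, so a flagged pod may still be covered at that level and is worth checking before it is treated as a finding.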
Attack Vectors: How DDoS Traffic Reaches Kubernetes
Kubernetes environments face the same volumetric and application-layer attacks as traditional infrastructure, but the containerized architecture introduces new paths and new complications for each vector.
SYN Floods
SYN floods remain the most common volumetric attack against K8s clusters. Traffic typically enters through a cloud load balancer, hits the ingress controller (NGINX, Envoy, or Traefik), and is then distributed to backend pods. The challenge is that each layer in this chain has different connection limits and different failure modes. A SYN flood that saturates the ingress controller's connection table will take down every service behind it, not just the targeted one.
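Pods on a node share that node's kernel, so SYN handling is not fully isolated per pod. A flood aimed at one pod's service consumes node-wide resources such as connection tracking entries and packet-processing CPU, and host-network components (many ingress controllers run this way) use the node's own SYN backlog settings, so every pod scheduled there can be affected. You can check and tune the backlog with: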
# Check current SYN backlog on a K8s node
$ cat /proc/sys/net/ipv4/tcp_max_syn_backlog
1024
# Increase it (requires privileged access or init container)
$ sysctl -w net.ipv4.tcp_max_syn_backlog=65535
$ sysctl -w net.core.somaxconn=65535
ICMP Floods
ICMP floods target the node network stack directly. In cloud environments, most providers filter ICMP at the VPC level, but on-prem K8s clusters and hybrid deployments often leave ICMP wide open. A sustained ICMP flood can degrade the node's ability to process legitimate traffic, and because Kubernetes health checks (liveness and readiness probes) may traverse the same network path, pod health checks can start failing, triggering unnecessary restarts and rescheduling cascades.
HTTP-Based Application Layer Attacks
HTTP floods are the fastest-growing category and the hardest to detect in K8s. Attackers send high volumes of legitimate-looking HTTP requests to expensive endpoints (complex search queries, report generation, GraphQL resolvers with deep nesting). These requests pass through the load balancer and ingress controller without triggering rate limits because each individual request looks normal.
The damage occurs at the pod level: CPU spikes, database connection pools exhaust, and response latency climbs until the readiness probe fails. Kubernetes then removes the pod from the service endpoint list, sending more traffic to the remaining pods, which accelerates their failure. This cascade can take an entire deployment offline in under a minute.
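The cascade described above can be illustrated with a toy model. It is a deliberate simplification under stated assumptions: load is spread evenly, every pod has the same hypothetical capacity, and exactly one pod drops out of the endpoint list per round whenever the per-pod load exceeds that capacity.

```python
from typing import List


def readiness_cascade(total_rps: float, pods: int, capacity_rps: float) -> List[int]:
    """Return the number of ready pods after each round of readiness failures."""
    history = [pods]
    while pods > 0 and total_rps / pods > capacity_rps:
        pods -= 1  # an overloaded pod fails its readiness probe and leaves the endpoints
        history.append(pods)
    return history


# 10 pods that each handle 100 rps, hit with 1,050 rps: 5% over total capacity.
steps = readiness_cascade(total_rps=1050, pods=10, capacity_rps=100)
```

In this model a 5% overload is enough to remove every pod, because each removal raises the load on the survivors. The same deployment at 900 rps never loses a pod, which is why the tipping point is so abrupt.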
DNS Amplification
DNS amplification attacks targeting K8s are particularly effective when clusters run their own CoreDNS instances (which is the default). Attackers can send spoofed DNS queries to open resolvers that generate amplified responses directed at the cluster's ingress IPs. If the cluster's CoreDNS is exposed externally (a surprisingly common misconfiguration), it can also be used as an amplification reflector.
Why Detection Is Harder in Containerized Environments
Traditional DDoS detection relies on monitoring traffic at the network perimeter: border routers, firewalls, or dedicated scrubbing appliances. In Kubernetes, this model breaks down in several fundamental ways.
Ephemeral workloads. Pods are created and destroyed constantly. A detection system that tracks traffic by IP address loses context every time a pod is rescheduled or scaled. The IP that was your payment service 30 seconds ago is now your image resizer. Baseline traffic models built on per-IP patterns become useless when IPs are recycled every few minutes.
East-west traffic dominance. In a microservices architecture, the majority of traffic is internal (east-west) rather than external (north-south). A DDoS attack that compromises one service and uses it to flood another stays entirely within the cluster network. Perimeter-only detection never sees it.
Encrypted service mesh traffic. If you run Istio, Linkerd, or another service mesh with mTLS enabled (and you should), east-west traffic is encrypted. Network-level inspection tools cannot distinguish between legitimate inter-service calls and a compromised pod flooding a neighbor at 50,000 requests per second. You need visibility at the node or sidecar level to detect anomalies in encrypted mesh traffic.
On March 3, 2026, Radware launched a dedicated cloud-based Web DDoS Protection service for encrypted traffic, underscoring how critical this visibility gap has become. The industry is actively racing to solve encrypted-traffic DDoS detection because the problem is only growing as more organizations adopt service meshes and zero-trust networking.
Shared tenancy and blast radius. Most K8s clusters run multiple services on the same nodes. An attack targeting one service can degrade every other service co-located on those nodes. Without per-pod traffic visibility, it is difficult to isolate which service is under attack and which are simply collateral damage.
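One way around the ephemeral-workload problem above is to key traffic baselines on workload identity instead of IP address. The sketch below assumes traffic samples are already labeled with a namespace and an app label (for example, by joining flow data with pod metadata); the class name, the smoothing factor, and the 3x threshold are illustrative choices rather than recommended values.

```python
from typing import Dict, Tuple

WorkloadKey = Tuple[str, str]  # (namespace, app label)


class WorkloadBaseline:
    """Exponentially weighted average request rate per workload, not per IP."""

    def __init__(self, alpha: float = 0.2) -> None:
        self.alpha = alpha
        self.rates: Dict[WorkloadKey, float] = {}

    def observe(self, namespace: str, app: str, rps: float) -> float:
        """Fold one sample into the baseline and return the updated value."""
        key = (namespace, app)
        previous = self.rates.get(key, rps)  # the first sample seeds the baseline
        self.rates[key] = (1 - self.alpha) * previous + self.alpha * rps
        return self.rates[key]

    def is_anomalous(self, namespace: str, app: str, rps: float,
                     factor: float = 3.0) -> bool:
        """Flag a rate that exceeds `factor` times the learned baseline."""
        baseline = self.rates.get((namespace, app))
        return baseline is not None and rps > factor * baseline
```

Because the key survives rescheduling, a pod that comes back with a new IP inherits the baseline of the workload it belongs to instead of starting from zero.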
Practical Defense Strategies for Kubernetes
Defending Kubernetes against DDoS requires a layered approach that addresses the unique characteristics of containerized environments. No single tool or technique is sufficient.
1. Enforce Network Policies at the CNI Level
Network policies are your first line of defense for east-west traffic. By default, Kubernetes allows all pod-to-pod communication. Define explicit ingress and egress policies for every namespace:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-gateway-policy
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api-gateway
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              zone: public
      ports:
        - protocol: TCP
          port: 8080
  egress:
    # Only backend-api is reachable. DNS is blocked as well, so add a rule
    # for port 53 if the pod needs to resolve service names.
    - to:
        - podSelector:
            matchLabels:
              app: backend-api
      ports:
        - protocol: TCP
          port: 3000
This limits the blast radius of a compromised pod. Even if an attacker gains code execution in the api-gateway, they can only reach backend-api on port 3000. Lateral movement for amplification is constrained.
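To reason about what a policy like this permits, it can help to evaluate candidate destinations against its egress rules. The following is a heavily simplified sketch, not the CNI's actual algorithm: it only understands `podSelector.matchLabels` peers and explicit `ports` entries, ignores namespaces, `ipBlock`, and `matchExpressions`, and the function name is an assumption.

```python
from typing import Dict, List


def egress_allowed(egress_rules: List[Dict], dest_labels: Dict[str, str],
                   port: int, protocol: str = "TCP") -> bool:
    """Check a destination against podSelector/matchLabels egress rules only."""
    for rule in egress_rules:
        peer_ok = any(
            all(dest_labels.get(key) == value
                for key, value in peer["podSelector"].get("matchLabels", {}).items())
            for peer in rule.get("to", [])
            if "podSelector" in peer
        )
        port_ok = any(
            entry.get("port") == port and entry.get("protocol", "TCP") == protocol
            for entry in rule.get("ports", [])
        )
        if peer_ok and port_ok:
            return True
    return False  # nothing matched: a pod selected by an Egress policy is denied by default


# The egress section of the api-gateway policy, as parsed YAML.
rules = [{
    "to": [{"podSelector": {"matchLabels": {"app": "backend-api"}}}],
    "ports": [{"protocol": "TCP", "port": 3000}],
}]
```

With these rules, `egress_allowed(rules, {"app": "backend-api"}, 3000)` is true, while a database pod or a DNS lookup on UDP 53 is rejected, which matches the default-deny behavior of an Egress policy.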
2. Configure Ingress Rate Limiting
Your ingress controller should enforce per-client rate limits. For NGINX Ingress Controller:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  annotations:
    nginx.ingress.kubernetes.io/limit-rps: "50"
    nginx.ingress.kubernetes.io/limit-burst-multiplier: "3"
    nginx.ingress.kubernetes.io/limit-connections: "20"
spec:
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-gateway
                port:
                  number: 8080
Rate limiting at the ingress layer stops HTTP floods before they reach your application pods. However, it does not protect against volumetric attacks that saturate bandwidth before reaching the ingress controller.
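The interaction between the rate and the burst multiplier is easier to see with numbers. With limit-rps at 50 and a burst multiplier of 3, the burst size is 150 requests. The sketch below models that with a plain token bucket; it is a simplification for intuition, not NGINX's actual implementation, and the class name and explicit `now` timestamps are assumptions made to keep the example deterministic.

```python
class TokenBucket:
    """Simplified per-client limiter: refill at `rate` per second, hold at most `burst`."""

    def __init__(self, rate: float, burst: float) -> None:
        self.rate = rate
        self.burst = burst
        self.tokens = burst
        self.updated = 0.0

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` (seconds) is admitted."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


limit_rps, burst_multiplier = 50, 3
bucket = TokenBucket(rate=limit_rps, burst=limit_rps * burst_multiplier)

# One client sends 200 requests at once, then 200 more a second later.
admitted_at_once = sum(bucket.allow(now=0.0) for _ in range(200))
admitted_one_second_later = sum(bucket.allow(now=1.0) for _ in range(200))
```

In this model the first wave gets 150 requests through and the second only 50, so a sustained flood from a single client is held to the configured rate once the burst allowance is spent.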
3. Harden Auto-Scaling Against Yo-Yo Abuse
Configure HPA with conservative scale-down behavior and absolute maximums:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-gateway-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-gateway
  minReplicas: 3
  maxReplicas: 20  # Hard ceiling prevents runaway scaling
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 600  # 10 min cooldown
      policies:
        - type: Percent
          value: 25
          periodSeconds: 120
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
        - type: Pods
          value: 4
          periodSeconds: 60  # Max 4 pods per minute
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
The key parameters: maxReplicas prevents unbounded scaling. scaleDown.stabilizationWindowSeconds at 600 seconds prevents the rapid oscillation yo-yo attacks depend on. scaleUp.policies limits how fast the cluster can grow, giving you time to detect the attack before costs spiral.
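The effect of the scale-up policy can be worked out by hand, or with a short helper. This sketch computes the highest replica count a Pods-type policy permits after each successive policy period; the function name is an assumption, and the model ignores the stabilization window and metric lag, so it describes an upper bound on growth rather than what the autoscaler will actually do.

```python
from typing import List


def scale_up_ceiling(current: int, max_replicas: int, pods_per_period: int) -> List[int]:
    """Replica ceiling after each successive policy period under a Pods policy."""
    if pods_per_period < 1:
        raise ValueError("pods_per_period must be at least 1")
    ceilings = []
    while current < max_replicas:
        current = min(max_replicas, current + pods_per_period)
        ceilings.append(current)
    return ceilings


# Values from the manifest above: minReplicas 3, maxReplicas 20, 4 pods per period.
growth = scale_up_ceiling(current=3, max_replicas=20, pods_per_period=4)
```

For this configuration the ceiling moves through 7, 11, 15, 19, and 20 replicas, so a burst needs five policy periods' worth of allowance to drive the deployment from its floor to its cap.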
4. Disable Unnecessary Service Account Tokens
Add this to every pod spec that does not require Kubernetes API access:
spec:
  automountServiceAccountToken: false
  containers:
    - name: app
      image: myapp:latest
For pods that do need API access, use bound tokens with short lifetimes instead of the default long-lived tokens:
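apiVersion: v1
kind: Pod
metadata:
  name: api-consumer
spec:
  containers:
    - name: app
      image: myapp:latest
      volumeMounts:
        - name: bound-token
          mountPath: /var/run/secrets/tokens
          readOnly: true
  volumes:
    - name: bound-token
      projected:
        sources:
          - serviceAccountToken:
              path: token
              expirationSeconds: 3600  # 1 hour; the kubelet rotates the token before it expires
              # The token is only valid for this audience. To call the Kubernetes API,
              # it must be an audience the API server accepts.
              audience: api-server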
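Projected service account tokens are JWTs, so an application can confirm how much lifetime its token has left by reading the `exp` claim. The sketch below decodes the payload without verifying the signature, which is fine for a self-check but must never be used to authenticate anyone; the function name and the optional `now` parameter are assumptions added for testability.

```python
import base64
import json
import time
from typing import Optional


def seconds_until_expiry(token: str, now: Optional[float] = None) -> float:
    """Decode a JWT payload WITHOUT verifying the signature and return exp - now."""
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore stripped padding
    claims = json.loads(base64.urlsafe_b64decode(padded))
    return claims["exp"] - (time.time() if now is None else now)
```

With the manifest above, the token would be read from /var/run/secrets/tokens/token. Because the kubelet replaces the file as the token approaches expiry, the application should re-read it periodically rather than caching the value for the life of the process.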
5. Deploy Node-Level Traffic Monitoring
Perimeter monitoring misses east-west attacks, encrypted mesh traffic, and per-pod anomalies. The most effective detection strategy places lightweight monitoring agents on each Kubernetes node. These agents observe traffic at the node's network interfaces, where every pod's ingress and egress traffic is visible. Where a sidecar mesh encrypts payloads with mTLS, the node still sees per-pod packet rates, flow counts, and byte volumes, which are the signals volumetric anomaly detection relies on.
Node-level agents can build per-pod traffic baselines, detect anomalous packet rates, flag unusual protocol distributions, and correlate patterns across multiple nodes to distinguish distributed attacks from localized spikes. This visibility is critical for identifying attacks that perimeter tools cannot see.
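The two steps described above, spotting an anomalous packet rate and then correlating across nodes, can be sketched in a few lines. This is an illustration under simple assumptions, not a description of any particular agent: it uses a z-score against recent samples, and the 3-sigma threshold and the 50% node share that separates "distributed" from "localized" are arbitrary example values.

```python
from statistics import mean, pstdev
from typing import Dict, List


def is_spike(history: List[float], current: float, threshold: float = 3.0) -> bool:
    """Flag `current` if it sits more than `threshold` standard deviations above the mean."""
    if len(history) < 2:
        return False  # not enough samples for a baseline
    sigma = pstdev(history)
    if sigma == 0:
        return current > history[0]  # flat history: any increase stands out
    return (current - mean(history)) / sigma > threshold


def classify(per_node_spike: Dict[str, bool], distributed_fraction: float = 0.5) -> str:
    """Label a workload's anomaly by how many nodes report a spike."""
    if not per_node_spike or not any(per_node_spike.values()):
        return "normal"
    share = sum(per_node_spike.values()) / len(per_node_spike)
    return "distributed" if share >= distributed_fraction else "localized"
```

A spike reported by one node out of four is labeled localized and likely points at a single noisy pod or neighbor, whereas the same spike on most nodes at once suggests a coordinated flood.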
6. Monitor Scaling Events as a Security Signal
Treat HPA scaling events as security-relevant signals, not just operational metrics. Alert when:
- A deployment scales more than 3 times in a 30-minute window
- Scale-up events occur during off-peak hours
- Multiple unrelated deployments scale simultaneously (carpet-bombing pattern)
- The ratio of scaling events to legitimate traffic increase is disproportionate
You can export HPA events to your SIEM or monitoring stack by watching the Kubernetes event stream and filtering for SuccessfulRescale events.
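The first alert condition in the list above (more than 3 rescales in 30 minutes) is a sliding-window count. A minimal sketch follows, assuming events arrive in time order with timestamps in seconds; the class and method names are hypothetical, and wiring it to the Kubernetes event stream is left out.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class RescaleAlert:
    """Alert when one deployment rescales more than `limit` times within `window` seconds."""

    def __init__(self, limit: int = 3, window: float = 1800.0) -> None:
        self.limit = limit
        self.window = window
        self.events: Dict[str, Deque[float]] = defaultdict(deque)

    def record(self, deployment: str, timestamp: float) -> bool:
        """Record one SuccessfulRescale event; return True if the alert should fire."""
        times = self.events[deployment]
        times.append(timestamp)
        while times and timestamp - times[0] > self.window:
            times.popleft()  # drop events that fell out of the window
        return len(times) > self.limit
```

Fed with the two-minute cadence from the earlier event log, the alert fires on the fourth rescale, roughly six minutes into the attack rather than when the invoice arrives.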
7. Isolate Critical Workloads with Dedicated Node Pools
Place revenue-critical services on dedicated node pools with taints and tolerations. This ensures that a DDoS targeting a less critical service cannot degrade performance for your most important workloads through resource contention:
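apiVersion: v1
kind: Pod
metadata:
  name: payment-service
spec:
  nodeSelector:
    workload-type: critical  # Schedules the pod onto the dedicated pool
  tolerations:
    # Matches a dedicated=critical:NoSchedule taint, which must be applied to the pool's nodes
    - key: "dedicated"
      operator: "Equal"
      value: "critical"
      effect: "NoSchedule"
  containers:
    - name: app
      image: myapp:latest  # Placeholder image, as in the earlier examples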
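The isolation only works if the toleration actually matches the taint on the node pool. The matching rules are simple enough to sketch; the version below follows the documented semantics for the Equal and Exists operators but is an illustrative model with assumed function names, not the scheduler's code, and it ignores `tolerationSeconds`.

```python
from typing import Dict, List


def tolerates(toleration: Dict, taint: Dict) -> bool:
    """Return True if a single toleration matches a single taint."""
    if toleration.get("effect") and toleration["effect"] != taint.get("effect"):
        return False  # an empty effect on the toleration matches every effect
    operator = toleration.get("operator", "Equal")
    if operator == "Exists":
        # An empty key with Exists tolerates every taint.
        return not toleration.get("key") or toleration["key"] == taint.get("key")
    return (toleration.get("key") == taint.get("key")
            and toleration.get("value", "") == taint.get("value", ""))


def can_schedule(tolerations: List[Dict], taints: List[Dict]) -> bool:
    """A pod fits only if every NoSchedule or NoExecute taint on the node is tolerated."""
    blocking = [t for t in taints if t.get("effect") in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, taint) for tol in tolerations) for taint in blocking)
```

Note the direction of the guarantee: the taint (applied with something like `kubectl taint nodes <node> dedicated=critical:NoSchedule`) keeps other workloads off the pool, while the nodeSelector is what steers the payment service onto it. A toleration on its own does neither.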
The Detection Gap: Why Cloud Provider Tools Fall Short
AWS Shield, Azure DDoS Protection, and GCP Cloud Armor provide valuable perimeter defense, but they share a common limitation: they only see traffic at the load balancer boundary. They have zero visibility into east-west cluster traffic, no ability to correlate pod-level metrics with network anomalies, and no awareness of Kubernetes-specific attack patterns like yo-yo scaling abuse or API server flooding.
This is not a criticism of those tools. They do what they were designed to do. But Kubernetes defense requires an additional detection layer that operates inside the cluster, at the node level, with awareness of pod identity, service relationships, and container orchestration events.
The most effective architecture combines cloud provider DDoS protection at the perimeter with node-level agents inside the cluster. The perimeter layer handles volumetric attacks before they reach your infrastructure. The node-level layer catches application-layer floods, east-west attacks, and the subtle patterns (like yo-yo scaling) that only become visible from inside the cluster.
Flowtriq's agent (ftagent) runs directly on Kubernetes nodes as a DaemonSet, providing per-pod traffic visibility, per-node packet and flow analysis, and real-time anomaly detection that works alongside your existing cloud provider DDoS protection. It sees east-west traffic, detects yo-yo scaling patterns, and correlates network anomalies with K8s events.