Why L7 Is Eating the Threat Landscape
For years, DDoS meant raw bandwidth. Botnets would hurl terabits of UDP traffic at a target and hope the pipe burst. That approach still exists, but it is no longer where the growth is. Cloudflare mitigated 20.5 million DDoS attacks in Q1 2025 alone, a 358% year-over-year spike, and the dominant trend across every major provider's data is the same: attackers are shifting from L3/L4 to L7 because protective mechanisms are most limited at the application layer.
The logic is straightforward. Volumetric attacks are increasingly commoditized and increasingly easy to absorb. Any organization behind a major CDN can soak up a multi-terabit UDP flood without blinking. But application-layer attacks exploit business logic, protocol semantics, and computational asymmetry. A single crafted HTTP request can force a server to execute an expensive database query, render a complex template, or resolve a deeply nested API call. The attacker's cost per unit of damage is orders of magnitude lower at L7 than at L3.
DDoS attacks targeting APIs specifically increased by over 200% in 2025. This is not a coincidence. APIs are the backbone of modern applications, and they expose far more surface area than a traditional web page. A login endpoint, a search function, a GraphQL resolver — each one is a potential amplification point where a small request triggers disproportionate backend work.
GraphQL: the API Protocol Attackers Love
GraphQL was designed to give clients flexible, efficient data fetching. That same flexibility makes it a natural target for abuse. Unlike REST, where each endpoint has a fixed scope, GraphQL lets the client define the shape and depth of the response. An attacker does not need to find a vulnerable endpoint. The query language itself is the vulnerability.
GraphQL APIs saw a 140% increase in abuse attempts in 2025, and the attack surface breaks down into several distinct patterns.
Deeply Nested Recursive Queries
The most common GraphQL DDoS vector exploits self-referential type relationships. If your schema allows a User type to have a friends field that returns [User], an attacker can nest that relationship arbitrarily deep:
{
  user(id: "1") {
    friends {
      friends {
        friends {
          friends {
            friends {
              friends {
                friends {
                  friends {
                    friends {
                      friends {
                        id
                        name
                        email
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
Each level of nesting triggers a separate database query or resolver call. Ten levels deep against a social graph with an average of 200 connections per user means the server is theoretically resolving 200^10 nodes, roughly 10^23. In practice the server will exhaust memory or time out long before completing, but the damage is done: a single HTTP request has consumed seconds of CPU time and triggered potentially gigabytes of memory allocation attempts. Multiply that by a few thousand concurrent connections and the application is down.
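To make the fan-out concrete, here is a rough back-of-the-envelope sketch. It assumes every user has exactly 200 friends and that the server does no deduplication or caching, which real resolvers often do, so treat the numbers as an upper bound rather than a measurement. The function names are illustrative.

```python
# Illustrative fan-out arithmetic for a nested friends query.
# Assumption: every user has exactly `fanout` friends, with no dedup or caching.

def nodes_at_depth(fanout: int, depth: int) -> int:
    """Number of User nodes resolved at one nesting level."""
    return fanout ** depth


def total_nodes(fanout: int, depth: int) -> int:
    """Nodes resolved across all levels 1..depth (a geometric series)."""
    return sum(fanout ** level for level in range(1, depth + 1))


if __name__ == "__main__":
    for depth in (1, 2, 3, 5, 10):
        print(f"depth {depth:2d}: {nodes_at_depth(200, depth):.3e} nodes at that level")
```

Even at depth 3 the deepest level alone holds 8 million nodes, which is why depth limits in the single digits are usually enough to matter.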
Batch Query Abuse
Many GraphQL implementations accept batched queries in a single HTTP request. This is intended to reduce round trips for legitimate clients, but it gives attackers a way to bypass per-request rate limits entirely:
[
{ "query": "{ user(id: \"1\") { orders { items { product { reviews { author { orders { items { product { name } } } } } } } } } }" },
{ "query": "{ user(id: \"2\") { orders { items { product { reviews { author { orders { items { product { name } } } } } } } } } }" },
{ "query": "{ user(id: \"3\") { orders { items { product { reviews { author { orders { items { product { name } } } } } } } } } }" },
... // hundreds more in a single POST
]
A WAF that limits clients to 100 requests per second sees one HTTP request. The GraphQL server sees hundreds of independent operations, each one computationally expensive. This discrepancy between the network view and the application view is exactly the kind of gap that makes L7 attacks so effective.
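One way to close that gap is to count operations rather than HTTP requests before the body ever reaches the GraphQL executor. The sketch below is a minimal illustration using only the standard library; the cap of 10 and the function names are assumptions, not part of any particular GraphQL server.

```python
import json

# Hypothetical cap on operations per HTTP request; tune it for your clients.
MAX_BATCH_OPERATIONS = 10


def count_operations(body: str) -> int:
    """Return how many GraphQL operations a JSON request body carries.

    A JSON array is treated as a batch; any other payload counts as one.
    """
    payload = json.loads(body)
    if isinstance(payload, list):
        return len(payload)
    return 1


def allow_request(body: str, limit: int = MAX_BATCH_OPERATIONS) -> bool:
    """Reject bodies that carry more operations than the configured cap."""
    return count_operations(body) <= limit
```

The same count can feed a rate limiter, so that a batch of 500 operations is charged as 500 units instead of one.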
Introspection as Reconnaissance
GraphQL introspection lets clients query the schema itself. In production, this is almost always a liability:
{
  __schema {
    types {
      name
      fields {
        name
        type {
          name
          fields {
            name
            type {
              name
            }
          }
        }
      }
    }
  }
}
Attackers use introspection to map every type, field, and relationship in your API before crafting their payload. It is the equivalent of handing someone a blueprint of your building before they attempt a break-in. Worse, the introspection query itself can be expensive to resolve on large schemas, making it both a reconnaissance tool and a weapon.
Alias-Based Multiplication
Even without batching, a single GraphQL query can multiply its cost using aliases:
{
  a1: expensiveOperation(input: "payload1")
  a2: expensiveOperation(input: "payload2")
  a3: expensiveOperation(input: "payload3")
  a4: expensiveOperation(input: "payload4")
  # ... repeat hundreds of times
  a500: expensiveOperation(input: "payload500")
}
This is a single valid GraphQL query that executes 500 operations. Traditional rate limiting, WAF rules, and even most GraphQL-aware firewalls will count it as one request. The server resolves all 500 aliases sequentially or in parallel, depending on implementation, but either way the computational cost is 500x what the network layer perceives.
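A server can spot this pattern by counting aliases before execution. The sketch below is a heuristic rather than a real GraphQL parser: it assumes that any colon outside parentheses, strings, and comments marks an alias, and it does not fully handle block strings. A production check would walk the parsed AST instead. The function name is illustrative.

```python
def count_aliases(query: str) -> int:
    """Count aliased fields by counting ':' outside arguments, strings, and comments."""
    aliases = 0
    paren_depth = 0
    i, n = 0, len(query)
    while i < n:
        ch = query[i]
        if ch == "#":
            # A comment runs to the end of the line.
            while i < n and query[i] != "\n":
                i += 1
            continue
        if ch == '"':
            # Skip a string literal, honoring backslash escapes.
            i += 1
            while i < n and query[i] != '"':
                i += 2 if query[i] == "\\" else 1
            i += 1
            continue
        if ch == "(":
            paren_depth += 1
        elif ch == ")":
            paren_depth -= 1
        elif ch == ":" and paren_depth == 0:
            # Colons inside parentheses belong to arguments, not aliases.
            aliases += 1
        i += 1
    return aliases
```

Rejecting or heavily costing any query above a small alias budget brings the server's view back in line with what the network layer counted.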
Slowloris: Still Deadly After All These Years
Slowloris was first described in 2009. Seventeen years later, it remains effective against a surprisingly large number of production deployments. The attack is elegant in its simplicity: open an HTTP connection, send headers slowly (one byte at a time, or one header line every few seconds), and never complete the request. The server holds the connection open, waiting for the rest of the headers, consuming a thread or connection slot.
Apache's default threading model is particularly vulnerable. Each Slowloris connection ties up a worker thread, and Apache's default MaxRequestWorkers is typically 150 to 256. An attacker with 300 concurrent connections — trivial to maintain from a single machine — can exhaust every available worker. Legitimate users get 503 Service Unavailable while the server sits at near-zero CPU utilization. It is not overloaded in the traditional sense. It is simply out of connection slots.
The reason Slowloris persists in 2026 is that many organizations deploy Apache as a legacy component behind a reverse proxy but leave the Apache instance directly reachable on an alternate port, or exposed to internal networks where attackers have gained a foothold. The proxy handles public traffic; the Apache backend is "internal" and therefore unprotected. One compromised internal host running Slowloris takes down the entire application tier.
Defending Apache Against Slowloris
Apache's mod_reqtimeout module is the primary defense. It enforces time limits on how long a client can take to send headers and body:
# /etc/apache2/conf-enabled/reqtimeout.conf
# Headers: allow 20 seconds, extended by 1 second per 500 bytes received, capped at 40 seconds
# Body: allow 20 seconds, extended by 1 second per 500 bytes received
RequestReadTimeout header=20-40,MinRate=500 body=20,MinRate=500
# Also reduce the global timeout and keepalive
Timeout 60
KeepAliveTimeout 5
MaxKeepAliveRequests 100
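To see why this setting cuts off slow senders, it helps to model the header deadline. The sketch below is a simplified model of header=20-40,MinRate=500 as the Apache documentation describes it: the deadline starts at 20 seconds, grows by 1 second for every 500 bytes received, and never exceeds 40 seconds. It is not how mod_reqtimeout is implemented internally, and the function names are made up for illustration.

```python
def header_deadline(bytes_received: int, initial: int = 20,
                    maximum: int = 40, min_rate: int = 500) -> int:
    """Seconds allowed for the header phase under the simplified model."""
    return min(initial + bytes_received // min_rate, maximum)


def seconds_until_cutoff(bytes_per_second: float) -> int:
    """Step one second at a time and return when the connection would time out."""
    elapsed = 0
    while True:
        elapsed += 1
        received = int(bytes_per_second * elapsed)
        if elapsed >= header_deadline(received):
            return elapsed


if __name__ == "__main__":
    # A Slowloris-style client trickling one byte every five seconds.
    print(seconds_until_cutoff(0.2))
    # A client sending exactly at the minimum rate still hits the 40-second cap.
    print(seconds_until_cutoff(500))
```

Under this model, a trickling client never earns extra time, so it is dropped at the initial limit, and no client can hold the header phase open beyond the cap.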
Additionally, limit the maximum number of connections from a single IP using mod_evasive or firewall rules:
# iptables: reject new connections from a source IP that already holds more than 20
iptables -A INPUT -p tcp --syn --dport 80 \
  -m connlimit --connlimit-above 20 --connlimit-mask 32 \
  -j REJECT --reject-with tcp-reset
iptables -A INPUT -p tcp --syn --dport 443 \
  -m connlimit --connlimit-above 20 --connlimit-mask 32 \
  -j REJECT --reject-with tcp-reset
Nginx Is Resistant, Not Immune
Nginx uses an event-driven architecture that handles thousands of concurrent connections with a small number of worker processes. This makes it inherently resistant to Slowloris because a slow connection does not tie up a dedicated thread. However, Nginx is not completely immune. Enough slow connections will still consume file descriptors and memory. Hardening your Nginx configuration is still important:
# /etc/nginx/nginx.conf
http {
    # Close connections that send headers or bodies too slowly
    client_header_timeout 10s;
    client_body_timeout 10s;
    # Shared-memory zone for per-IP connection limits
    limit_conn_zone $binary_remote_addr zone=addr:10m;
    # Shared-memory zone for per-IP request rate limits
    limit_req_zone $binary_remote_addr zone=req:10m rate=30r/s;
    server {
        listen 443 ssl;
        # ssl_certificate and ssl_certificate_key omitted for brevity
        # Limit concurrent connections and request rate per IP
        limit_conn addr 20;
        limit_req zone=req burst=50 nodelay;
        # Aggressive keepalive limits
        keepalive_timeout 15s;
        keepalive_requests 100;
    }
}
HTTP/2 Rapid Reset: the Protocol-Level Callback
The HTTP/2 Rapid Reset attack (CVE-2023-44487) demonstrated in 2023 that protocol-layer vulnerabilities can create catastrophic amplification. The attack exploits HTTP/2 stream multiplexing: the client opens a stream, immediately sends a RST_STREAM frame to cancel it, and repeats at maximum speed. The server allocates resources for each stream before processing the reset, creating a massive asymmetry between client cost and server cost.
Although patches were deployed across major implementations in late 2023, variants of the rapid reset pattern continue to appear. Attackers now combine stream resets with header compression bombs (HPACK bombs) and priority tree manipulation to increase per-stream server-side cost. The core lesson endures: HTTP/2's efficiency features — multiplexing, header compression, server push — each introduce new attack surface that does not exist in HTTP/1.1.
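The common mitigation idea is to treat a high rate of client-initiated resets as abuse and close the connection. The sketch below shows that idea as a per-connection sliding window. The class name and thresholds are hypothetical, and real servers implement this inside their HTTP/2 stack rather than in application code.

```python
from collections import deque


class ResetRateTracker:
    """Per-connection sliding window of client-initiated stream resets."""

    def __init__(self, max_resets: int = 100, window_seconds: float = 1.0):
        # Both thresholds are illustrative assumptions, not recommended values.
        self.max_resets = max_resets
        self.window_seconds = window_seconds
        self._events = deque()

    def record_reset(self, now: float) -> bool:
        """Record one RST_STREAM; return True if the connection should be closed."""
        self._events.append(now)
        cutoff = now - self.window_seconds
        # Drop resets that have aged out of the window.
        while self._events and self._events[0] <= cutoff:
            self._events.popleft()
        return len(self._events) > self.max_resets
```

A legitimate client that cancels a few streams while navigating stays far below the threshold, while a rapid reset loop crosses it within milliseconds.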
What made rapid reset particularly dangerous was how small the botnets could be. Google recorded a peak of 398 million requests per second, and Cloudflare reported a 201 million RPS attack generated by a botnet of approximately 20,000 machines. That is roughly 10,000 RPS per bot node, achievable on consumer hardware. Compare that to a traditional volumetric attack where generating 1 Tbps requires thousands of high-bandwidth nodes or massive amplification. Protocol-level L7 attacks are simply more capital-efficient for the attacker.
Massive Distributed L7 Floods
Not all application-layer attacks rely on protocol tricks. Some simply use enormous botnets to generate legitimate-looking HTTP requests at a scale that overwhelms any application. In 2025, one documented attack reached 2.45 billion requests using 1.2 million unique IP addresses. At that scale, per-IP rate limiting is almost useless: each individual IP sends only a handful of requests per second, well within any reasonable threshold.
These distributed L7 floods are difficult to distinguish from legitimate traffic because each request looks normal in isolation. The attack signal only emerges when you analyze aggregate patterns: the total request rate across all sources, the geographic distribution of clients, the uniformity of request patterns (identical User-Agent strings, identical request paths, suspiciously even timing intervals), and the absence of typical browser behaviors like JavaScript execution, cookie handling, or asset loading.
This is where traditional WAFs struggle the most. A WAF evaluates each request independently against a ruleset. It can block known bad patterns, enforce rate limits per IP, and challenge suspicious clients. But when 1.2 million IPs each send 2 requests per second and every request looks like a legitimate GET / with valid headers, no individual rule fires. The attack is only visible at the aggregate level — and most WAFs do not operate at that level.
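The arithmetic behind that blind spot is simple enough to sketch. The per-IP limit of 30 requests per second below is an illustrative assumption, and the helper names are made up for this example.

```python
def aggregate_rps(sources: int, rps_per_source: float) -> float:
    """Total request rate the application sees across all sources."""
    return sources * rps_per_source


def trips_per_ip_limit(rps_per_source: float, per_ip_limit: float) -> bool:
    """True if a single source exceeds the per-IP rate limit."""
    return rps_per_source > per_ip_limit


SOURCES = 1_200_000      # unique IPs, from the attack described above
RPS_PER_SOURCE = 2       # requests per second from each IP
PER_IP_LIMIT = 30        # assumed per-IP threshold

if __name__ == "__main__":
    print("blocked per IP:", trips_per_ip_limit(RPS_PER_SOURCE, PER_IP_LIMIT))
    print("aggregate RPS:", aggregate_rps(SOURCES, RPS_PER_SOURCE))
```

No source comes close to the per-IP threshold, yet the application is asked to serve 2.4 million requests per second.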
Defense Strategies for the L7 Era
GraphQL-Specific Hardening
Defending a GraphQL API against abuse requires controls that are specific to the query language:
- Query depth limiting. Reject any query that exceeds a maximum nesting depth (typically 6 to 10 levels). Libraries like graphql-depth-limit for Node.js or graphql-query-complexity make this straightforward.
- Query cost analysis. Assign a computational cost to each field and resolver. Reject queries whose total estimated cost exceeds a budget. This catches alias multiplication and batch abuse that depth limiting alone misses.
- Disable introspection in production. There is almost never a legitimate reason for production clients to introspect your schema. Disable it and serve schema documentation through a separate developer portal.
- Batch size limits. If you support batched queries, enforce a hard cap (e.g., 10 operations per batch). This closes the gap between what the network layer sees and what the server actually processes.
- Persisted queries. Require clients to use pre-registered query hashes instead of sending arbitrary query strings. This eliminates the entire class of crafted query attacks because the server only executes queries you have explicitly approved.
- Timeout enforcement. Set aggressive per-resolver and per-request timeouts. A legitimate query should never need 30 seconds to resolve. Kill anything that exceeds 5 seconds and return a partial result or error.
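As a rough illustration of the depth-limiting control, the sketch below measures selection-set nesting with a small character scanner. It is a simplification and not how the libraries named above work: it does not expand fragments, which can hide additional depth, so a production check should operate on the parsed AST. The limit of 8 and the function names are assumptions.

```python
MAX_DEPTH = 8  # hypothetical limit within the 6 to 10 range


def max_depth(query: str) -> int:
    """Return the deepest selection-set nesting, ignoring strings, comments, and arguments."""
    depth = deepest = 0
    paren_depth = 0
    i, n = 0, len(query)
    while i < n:
        ch = query[i]
        if ch == "#":
            # A comment runs to the end of the line.
            while i < n and query[i] != "\n":
                i += 1
            continue
        if ch == '"':
            # Skip a string literal, honoring backslash escapes.
            i += 1
            while i < n and query[i] != '"':
                i += 2 if query[i] == "\\" else 1
            i += 1
            continue
        if ch == "(":
            paren_depth += 1
        elif ch == ")":
            paren_depth -= 1
        elif ch == "{" and paren_depth == 0:
            # Braces inside parentheses are input objects, not selection sets.
            depth += 1
            deepest = max(deepest, depth)
        elif ch == "}" and paren_depth == 0:
            depth -= 1
        i += 1
    return deepest


def check_depth(query: str, limit: int = MAX_DEPTH) -> bool:
    """True if the query stays within the allowed nesting depth."""
    return max_depth(query) <= limit
```

Running this check before execution means a deeply nested query is rejected after a linear scan of its text, not after the resolvers have started fanning out.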
Connection State Monitoring
Slowloris and its variants (R.U.D.Y., Slow POST, Slow Read) all share a common signature: they hold connections open far longer than legitimate clients do. Effective defense requires visibility into connection state across your entire infrastructure:
- Track connection duration distributions. Normal web requests complete in milliseconds to low single-digit seconds. A population of connections lingering for 30 seconds or more in the header-sending phase is a clear signal.
- Monitor connections per source IP. Legitimate browsers open 6 to 8 connections per domain. A source maintaining 50 or more concurrent connections to a single server warrants investigation.
- Correlate connection state with throughput. A connection that has been open for 20 seconds but transferred fewer than 500 bytes is almost certainly not legitimate.
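The signals above can be combined into a simple classifier. The sketch below assumes you can already export per-connection age and byte counts from your server or load balancer; the ConnSnapshot type and the function names are hypothetical, and the thresholds come from the guidance in this section.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ConnSnapshot:
    """One open connection as seen at sampling time (hypothetical shape)."""
    source_ip: str
    age_seconds: float
    bytes_received: int


def is_slow_suspect(conn: ConnSnapshot, min_age: float = 20.0,
                    max_bytes: int = 500) -> bool:
    """Open a long time but almost nothing transferred."""
    return conn.age_seconds >= min_age and conn.bytes_received < max_bytes


def suspects_by_source(conns: list) -> Counter:
    """Count suspect connections per source IP."""
    return Counter(c.source_ip for c in conns if is_slow_suspect(c))


def flagged_sources(conns: list, min_suspects: int = 50) -> list:
    """Sources holding at least `min_suspects` slow connections, sorted for stable output."""
    counts = suspects_by_source(conns)
    return sorted(ip for ip, count in counts.items() if count >= min_suspects)
```

Feeding this a periodic snapshot of open connections gives a short list of sources to rate limit or block, without touching clients that are merely slow on one or two connections.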
Aggregate Traffic Analysis
Detecting distributed L7 floods requires moving beyond per-request evaluation to aggregate pattern analysis:
- Dynamic baseline comparison. Establish rolling baselines for total RPS, request path distribution, geographic origin mix, and protocol version distribution. Alert when any metric deviates significantly from its baseline.
- Request fingerprinting. Cluster requests by their TLS fingerprint (JA3/JA4), header order, and behavioral patterns. Botnet nodes tend to share identical fingerprints even when they randomize surface-level headers.
- JavaScript challenge gates. Serve lightweight JavaScript challenges to new clients. Legitimate browsers execute them transparently. Most botnet nodes, which use raw HTTP libraries, fail. This technique has diminishing returns as botnets adopt headless browsers, but it still filters a large percentage of automated traffic.
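Dynamic baseline comparison can be sketched with an exponentially weighted mean and variance. This is one common approach, not a description of any specific product, and the smoothing factor, threshold, and warm-up length below are illustrative assumptions.

```python
import math


class RollingBaseline:
    """Exponentially weighted baseline for one metric, such as total RPS."""

    def __init__(self, alpha: float = 0.1, threshold: float = 3.0, warmup: int = 10):
        self.alpha = alpha          # smoothing factor for mean and variance
        self.threshold = threshold  # alert when deviation exceeds this many std devs
        self.warmup = warmup        # samples to observe before alerting
        self.mean = None
        self.var = 0.0
        self.count = 0

    def observe(self, value: float) -> bool:
        """Return True if `value` is anomalous relative to the baseline so far."""
        anomalous = False
        if self.mean is None:
            self.mean = value
        else:
            deviation = value - self.mean
            std = math.sqrt(self.var)
            if (self.count >= self.warmup and std > 0
                    and abs(deviation) > self.threshold * std):
                # Do not fold anomalous samples into the baseline,
                # so a sustained attack does not become the new normal.
                anomalous = True
            else:
                self.mean += self.alpha * deviation
                self.var = (1 - self.alpha) * (self.var + self.alpha * deviation ** 2)
        self.count += 1
        return anomalous
```

In practice you would keep one baseline per metric and per time-of-day bucket, since normal traffic at 3 a.m. looks nothing like normal traffic at noon.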
Where Flowtriq Fits In
Flowtriq operates at your network edge, analyzing traffic patterns per node in real time. For L7 attack detection specifically, Flowtriq provides several capabilities that complement traditional WAFs and CDN-level protection:
- Per-node connection state visibility. Flowtriq tracks connection duration, data transfer rate, and state transitions for every active connection on every monitored node. Slowloris-style attacks surface immediately as anomalous connection populations — dozens of connections stuck in the header-reading state with near-zero throughput.
- RPS anomaly detection independent of bandwidth. Flowtriq maintains dynamic baselines for requests per second separate from BPS and PPS. An L7 flood that barely registers on a bandwidth graph triggers alerts in Flowtriq because the request rate deviates from the node's established pattern.
- Cross-node correlation. When an attack targets multiple nodes simultaneously (common with distributed L7 floods), Flowtriq correlates patterns across your entire infrastructure. A 10% RPS increase on a single node might be within normal variance. The same 10% increase across 15 nodes simultaneously, from overlapping source IP sets, is an attack.
- Source IP behavioral clustering. Flowtriq groups source IPs by their connection behavior — request rate, connection duration, protocol version, and request pattern. This clustering catches coordinated botnet activity that per-IP rate limiting misses, because the signal is in the similarity between sources, not in any individual source's volume.
WAFs evaluate requests. CDNs absorb bandwidth. Flowtriq watches the patterns that emerge across your infrastructure's connection state and request flows — the aggregate view that catches what per-request inspection misses.
Key Takeaway: The shift to L7 is not a temporary trend. As volumetric defenses become commoditized, attackers will continue investing in application-layer techniques that exploit business logic, protocol semantics, and computational asymmetry. Defending against this requires visibility at both the request level (WAF) and the infrastructure level (connection state and aggregate traffic analysis). Organizations that rely solely on one layer will have blind spots that attackers will find.