The Sampling Problem

NetFlow and sFlow are sampling protocols by design. A router configured with a 1:4096 sampling rate examines one out of every 4,096 packets that traverse the interface. The rest are forwarded without inspection. The sampled packet metadata (source IP, destination IP, ports, protocol, byte count) is recorded into a flow record and exported to a collector.

This works well for traffic engineering. When you are measuring aggregate bandwidth utilization or identifying top talkers, a 1:4096 sample gives you statistically accurate results. The law of large numbers is on your side when traffic volumes are in the millions of packets per second.

But DDoS detection is not traffic engineering. You are looking for specific anomalies, often at traffic volumes too small for a 1:4096 sample to resolve with statistical confidence.

The math of what gets missed

At a 1:4096 sampling rate with a 60-second export interval:

# Attack: 50,000 PPS SYN flood targeting a single server
# Sampled packets per second: 50,000 / 4,096 = ~12.2 packets/sec
# Sampled packets per export interval: 12.2 * 60 = ~732 flow records
#
# But those 732 records are mixed with legitimate traffic samples.
# If the server normally receives 30,000 PPS of legitimate HTTPS:
# Legitimate samples per interval: (30,000 / 4,096) * 60 = ~439 records
#
# Total TCP/443 records: 732 + 439 = 1,171
# Normal TCP/443 records: 439
# Increase: 2.7x
#
# Depending on baseline variance, a 2.7x increase may or may not
# trigger an alert. On a server with variable traffic patterns
# (game server, e-commerce during sales), this is within normal range.
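The arithmetic above can be checked directly. A minimal sketch, using the same sampling rate, export interval, and traffic rates as the example:

```python
# Back-of-envelope check of the numbers above: expected flow-record
# counts at 1:4096 sampling with a 60-second export interval.
SAMPLE_RATE = 4096
EXPORT_INTERVAL_S = 60

def expected_records(pps):
    """Expected sampled packets contributing to flow records per interval."""
    return pps / SAMPLE_RATE * EXPORT_INTERVAL_S

attack = expected_records(50_000)   # SYN flood
legit = expected_records(30_000)    # normal HTTPS
ratio = (attack + legit) / legit

print(round(attack), round(legit), round(ratio, 2))  # 732 439 2.67
```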

Compare this to what the server's kernel sees: 50,000 SYN packets per second with no corresponding ACK completions, TcpExtSyncookiesSent climbing at 48,000/sec, and TcpExtListenDrops incrementing. The attack is unambiguous at the host level. It is a statistical guess at the flow level.

The Export Interval Problem

Flow records are not exported in real time. They are batched and sent on a timer (active timeout) or when the flow expires (inactive timeout). Typical configurations use a 60-second active timeout, meaning the collector receives data about traffic that happened up to 60 seconds ago.

Then the collector has to process the records. Parse. Aggregate by destination. Compare to baselines. Evaluate threshold rules. Fire alerts. With a well-tuned system, add another 5 to 15 seconds for processing.

The result is a minimum detection latency of 65 to 75 seconds in an optimized deployment. Most production environments detect in 90 to 120 seconds.

Some vendors advertise "sub-minute" detection by reducing the active timeout to 10 or 15 seconds. This is technically possible but generates 4 to 6 times more flow export traffic, which can overwhelm the collector and the router's CPU. In practice, aggressive timeouts often cause more problems than they solve.

What happens during the detection gap

During the 60 to 120 seconds between attack start and detection, there is zero visibility. No alert. No classification. No mitigation trigger. The attack is hitting the target at full force, and the only people who know about it are the users experiencing packet loss and connection timeouts.

For a 10 Gbps volumetric flood, that gap means 75 to 150 GB of attack traffic delivered before any response begins. For a targeted SYN flood against a web server, it means 60 to 120 seconds of degraded service or complete unavailability.
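The volume figures follow from a simple unit conversion, sketched here for the 10 Gbps example:

```python
# Attack volume delivered during the detection gap for the 10 Gbps
# example above: convert gigabits/sec to gigabytes over the gap window.
def gb_delivered(gbps, gap_seconds):
    return gbps / 8 * gap_seconds  # Gb/s -> GB/s, times seconds

print(gb_delivered(10, 60), gb_delivered(10, 120))  # 75.0 150.0
```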

Blind Spot #1: Pulsing Attacks

Pulsing (or burst) attacks send traffic in short, intense bursts followed by quiet periods. A common pattern: 5 seconds of 200,000 PPS, then 25 seconds of silence, repeated indefinitely.

With a 60-second export interval, the collector sees the bursts averaged over the full minute. The 30-second cycle fits twice into the interval, so the average is effectively 200,000 * 10 / 60 = ~33,333 PPS. If the baseline for that destination is 15,000 PPS, that is roughly a 2.2x increase, within the normal variance of many workloads and unlikely to trigger an alert.
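The smoothing effect is easy to reproduce. A short sketch of one export interval containing two burst cycles:

```python
# How a 60-second export interval averages away a pulsing attack.
# A 30-second cycle (5 s of 200,000 PPS, 25 s of silence) fits twice
# into one interval, so two bursts are smoothed into the average.
BURST_PPS, BURST_S, QUIET_S, EXPORT_S = 200_000, 5, 25, 60

cycle = [BURST_PPS] * BURST_S + [0] * QUIET_S
per_second = (cycle * (EXPORT_S // len(cycle)))[:EXPORT_S]

peak = max(per_second)
averaged = sum(per_second) / EXPORT_S
print(peak, round(averaged))  # 200000 33333
```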

Meanwhile, the server experiences 5 seconds of severe packet loss every 30 seconds. Users see intermittent timeouts. TCP connections reset. API calls fail sporadically. The monitoring dashboard shows "everything is fine" because the 60-second average smooths out the bursts.

A node-level agent checking every second sees the 200,000 PPS spike immediately, classifies it, and alerts. The pulsing pattern is clearly visible in per-second PPS graphs, and the team can respond before the next burst.
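As an illustration of why per-second granularity matters, here is a minimal sketch of spike detection against a rolling baseline. This is hypothetical agent logic for demonstration, not Flowtriq's actual algorithm, and the window and factor values are made up:

```python
# Flag any second whose PPS exceeds a multiple of a rolling baseline.
from collections import deque

def detect_spikes(pps_stream, window=30, factor=5.0):
    history = deque(maxlen=window)
    for second, pps in enumerate(pps_stream):
        if len(history) == window:
            baseline = sum(history) / window
            if baseline > 0 and pps > factor * baseline:
                yield second, pps
        history.append(pps)

# 30 s of ~15,000 PPS baseline traffic, then a 5-second burst.
stream = [15_000] * 30 + [200_000] * 5 + [15_000] * 25
alerts = list(detect_spikes(stream))
print(alerts[0])  # alert fires on the burst's first second
```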

Blind Spot #2: Multi-Vector Rotation

Sophisticated attackers rotate between attack vectors every 10 to 20 seconds: UDP flood for 15 seconds, then SYN flood for 15 seconds, then ICMP flood for 15 seconds, then back to UDP. Each protocol's traffic volume stays below the per-protocol threshold in the flow collector, but the server is under continuous attack.

Flow-based detection evaluates each protocol independently. If your UDP threshold is 100,000 PPS and the attacker sends UDP at 80,000 PPS before switching to TCP, the UDP threshold is never breached. The TCP threshold is never breached. The ICMP threshold is never breached. But the server is processing 80,000 PPS of garbage traffic continuously.

Node-level detection sees the aggregate PPS rate, regardless of protocol. 80,000 PPS is 80,000 PPS whether it is UDP, TCP, or ICMP. The kernel counters show the total load, and the agent's dynamic baseline catches the anomaly on the first rotation.
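The contrast between per-protocol and aggregate evaluation can be sketched in a few lines. The threshold and baseline values below are hypothetical:

```python
# Per-protocol thresholds vs aggregate load during vector rotation.
# Each 15-second leg stays under its own threshold, but the host
# absorbs 80,000 PPS continuously.
thresholds = {"udp": 100_000, "tcp": 100_000, "icmp": 100_000}
rotation = [("udp", 80_000), ("tcp", 80_000), ("icmp", 80_000)]

per_protocol_breach = any(pps > thresholds[proto] for proto, pps in rotation)
aggregate_pps = 80_000   # one vector active at a time, always 80k total
baseline_pps = 10_000    # hypothetical normal load for this host

print(per_protocol_breach, aggregate_pps / baseline_pps)  # False 8.0
```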

Blind Spot #3: Same-Port Attacks

When the attacker targets the same port that the server uses for legitimate traffic (UDP/53 against a DNS server, TCP/443 against a web server, UDP/27015 against a game server), flow-based detection has no way to distinguish attack traffic from legitimate traffic based on flow metadata alone.

Both the attack and the legitimate traffic have the same destination port. Both are the same protocol. The only differentiator in flow data is volume, and if the attacker keeps the rate just above the server's capacity but below the flow threshold, the attack is invisible.

Node-level detection uses kernel counters that expose the difference. A DNS server under a UDP flood shows UdpInErrors and UdpRcvbufErrors climbing because the attack packets overflow socket buffers. A web server under a SYN flood shows TcpExtSyncookiesSent and TcpExtListenDrops. These counters reveal server stress that flow data cannot see.
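On Linux, the UDP counters live in the Udp: rows of /proc/net/snmp (a header line followed by a value line). A hedged parsing sketch, with made-up sample values:

```python
# Parse the Udp: counter rows from /proc/net/snmp-style text.
# InErrors and RcvbufErrors are the standard Linux field names.
def parse_udp_counters(snmp_text):
    rows = [line.split() for line in snmp_text.splitlines()
            if line.startswith("Udp:")]
    header, values = rows[0][1:], rows[1][1:]
    return dict(zip(header, map(int, values)))

sample = (
    "Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors\n"
    "Udp: 123456 10 4821 98765 3990 0\n"
)
counters = parse_udp_counters(sample)
print(counters["InErrors"], counters["RcvbufErrors"])  # 4821 3990
```

In production the same function would be fed the contents of /proc/net/snmp on a timer, diffing successive reads to get per-second error rates.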

Blind Spot #4: Carpet Bombing

Carpet bombing distributes attack traffic across an entire subnet. Instead of sending 1 Gbps to a single IP, the attacker sends 10 Mbps to each of 100 IPs in a /24. No single destination IP triggers a threshold, but the aggregate traffic saturates the upstream link.

Flow-based detection evaluates per-destination. Each individual IP shows a modest traffic increase that falls within normal variance. The collector sees 100 destinations with slightly elevated traffic, which is unremarkable in a hosting environment.

Node-level detection catches carpet bombing because each affected server independently detects the anomaly. Even 10 Mbps of attack traffic to a server that normally receives 1 Mbps is a 10x deviation from baseline. When 50 servers simultaneously report anomalies, the pattern is clear, and each server's alert includes protocol classification and timing data.
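The carpet-bombing arithmetic in this section can be sketched directly. The 500 Mbps per-destination collector threshold below is a hypothetical illustration:

```python
# Modest per-destination traffic, large aggregate, and a clear
# per-node baseline deviation.
PER_DEST_THRESHOLD_MBPS = 500
attack_per_ip, normal_per_ip, num_ips = 10, 1, 100  # Mbps

observed_per_ip = attack_per_ip + normal_per_ip
collector_alerts = observed_per_ip > PER_DEST_THRESHOLD_MBPS
node_deviation = observed_per_ip / normal_per_ip
aggregate_mbps = num_ips * observed_per_ip

print(collector_alerts, node_deviation, aggregate_mbps)  # False 11.0 1100
```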

Blind Spot #5: No Post-Attack Forensics

After an attack ends, the investigation begins. What type of attack was it? Were the sources spoofed or real? Was it a botnet or amplification? Which service was targeted? What was the payload?

Flow data can answer the first and last questions approximately. It can tell you the destination IP, the peak flow rate, and the top source IPs (subject to sampling accuracy). But it cannot tell you whether sources were spoofed (no TTL data in standard flow records), what the payload contained (no packet data), or what the server-side impact was.

Node-level detection with PCAP capture provides complete forensic data: full packet headers with TTL values for spoof detection, payload bytes for botnet signature identification, precise timing for correlation with application logs, and kernel counter history showing exactly when and how the server was affected.
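One example of what the TTL data enables, as a hedged sketch: real clients reach a server over a limited set of network paths, so observed TTLs cluster into a few values, while naively spoofed floods that randomize the initial TTL show a much wider spread. The sample values are made up:

```python
# Count distinct observed TTL values; a large spread suggests spoofing.
def ttl_spread(ttls):
    return len(set(ttls))

legit_ttls = [57, 57, 118, 57, 118]             # two real paths
spoofed_ttls = [12, 200, 63, 7, 149, 88, 241]   # randomized TTLs

print(ttl_spread(legit_ttls), ttl_spread(spoofed_ttls))  # 2 7
```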

Close the detection gaps

Flowtriq detects every attack pattern that flow-based systems miss: pulsing, multi-vector, same-port, carpet bombing. Per-second kernel counter analysis with automatic classification.

Start Free Trial →

When NetFlow Still Makes Sense

Flow-based monitoring is not obsolete. It serves purposes that node-level detection does not:

  • Transit utilization monitoring across backbone links where you cannot install an agent on every router
  • Peering analysis to understand traffic patterns between autonomous systems
  • Capacity planning using aggregate traffic trends over weeks and months
  • Upstream coordination when you need to share traffic data with your transit provider or IXP

The issue is not that NetFlow is bad. The issue is that NetFlow alone is insufficient for DDoS detection. The sampling rates and export intervals that make it efficient for traffic engineering make it structurally unable to detect the targeted, sub-threshold, multi-vector attacks that define the 2026 threat landscape.

The answer is layered detection: network-level flow analysis for aggregate visibility and transit management, combined with node-level kernel counter analysis for per-second, per-server DDoS detection with protocol classification and forensics.

Add the detection layer your flow tools are missing

Deploy Flowtriq alongside your existing NetFlow/sFlow infrastructure. Per-second detection, automatic classification, PCAP capture. $9.99/node/month with a 7-day free trial.

Start your free 7-day trial →
