The Colocation DDoS Problem
Bare metal and colocation providers occupy a difficult position in the DDoS landscape. They own and operate the physical network infrastructure, but their customers control what runs on the servers. A hosting provider running virtual machines has full visibility into guest traffic and can deploy detection and mitigation tools at the hypervisor level. A colocation provider has no access to the customer's server. The customer racks their own hardware, installs their own operating system, and manages their own applications. The colo provider sees only what crosses the network boundary: switch port traffic, uplink utilization, and router-level flow data.
This lack of visibility becomes critical during a DDoS attack. When one customer's IP address receives a 20 Gbps flood, the traffic can saturate the shared uplink that serves dozens or hundreds of other customers in the same rack, row, or data center zone. Every customer sharing that uplink experiences packet loss, increased latency, and potential service outages. The attacked customer may not even notice, since their server may already be offline, but their neighbors are affected immediately. This collateral damage is the defining challenge for colocation DDoS protection.
The problem is amplified by the customer profile typical of bare metal and colocation environments. Game server operators, cryptocurrency nodes, streaming platforms, and high-frequency trading infrastructure all gravitate toward bare metal for performance reasons. Several of these workloads are also among the most frequently targeted by DDoS attacks. A colocation provider with a significant game server customer base will experience frequent attacks, and each attack threatens every other customer on the shared infrastructure.
Null Routing vs. Surgical Mitigation
When a DDoS attack hits a customer's IP, the colocation provider's first instinct is often to null route the target IP. Null routing (also called blackholing, or RTBH for remotely triggered black hole filtering) injects a BGP route that sends all traffic destined for the attacked IP to a null interface, effectively making the IP unreachable. This immediately stops the attack traffic from consuming the shared uplink. It works, and it works fast. But it has a fundamental problem: it does exactly what the attacker wanted.
By null routing the victim's IP, the provider has taken the customer offline. The DDoS attack has achieved its objective without the attacker spending a single additional packet. For the attacked customer, the experience is indistinguishable from the attack itself: their service is down. From their perspective, their own provider just participated in the denial of service. This destroys the customer relationship. A customer whose IP gets null routed during an attack will start looking for a new provider the same day.
Surgical mitigation takes a different approach. Instead of dropping all traffic to the attacked IP, the provider filters out attack traffic and allows legitimate traffic through. This can be done with inline scrubbing appliances, upstream scrubbing services, or BGP FlowSpec rules that match specific attack characteristics (source ports, packet sizes, protocol types). Surgical mitigation keeps the customer online while neutralizing the attack, preserving the customer relationship and demonstrating the provider's value.
The challenge is that surgical mitigation requires intelligence about the attack. The provider must know which traffic is malicious and which is legitimate. This requires real-time detection that identifies the attack vector, characterizes the traffic pattern, and generates filtering rules within seconds. Without this intelligence, the provider's only options are null routing (which kills the customer) or doing nothing (which kills everyone on the shared uplink).
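As a rough illustration of what "intelligence about the attack" can mean, the sketch below takes sampled packet headers for one destination IP and proposes a narrow filter only when a single protocol (and ideally a single source port) dominates the sample. The `Sample` and `FilterRule` types, the `derive_rule` function, and the 80% dominance threshold are all hypothetical choices made for this example, not part of any particular product or of BGP FlowSpec itself.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Sample:
    protocol: str  # e.g. "udp", "tcp", "icmp"
    src_port: int  # 0 when the protocol has no ports
    length: int    # packet length in bytes


@dataclass(frozen=True)
class FilterRule:
    dst_ip: str
    protocol: str
    src_port: Optional[int]  # None means "any source port"


def derive_rule(dst_ip: str, samples: Iterable[Sample],
                dominance: float = 0.8) -> Optional[FilterRule]:
    """Propose a filter only if one protocol clearly dominates the samples."""
    samples = list(samples)
    if not samples:
        return None
    proto, proto_count = Counter(s.protocol for s in samples).most_common(1)[0]
    if proto_count / len(samples) < dominance:
        # No single protocol dominates; automatic filtering is too risky.
        return None
    ports = Counter(s.src_port for s in samples if s.protocol == proto)
    port, port_count = ports.most_common(1)[0]
    # Pin the source port only when it dominates within the protocol.
    src_port = port if port_count / proto_count >= dominance else None
    return FilterRule(dst_ip, proto, src_port)
```

A rule that comes back with `src_port=None` matches every packet of that protocol, so in practice it would deserve a human check before being pushed to routers, since it may also drop legitimate traffic.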
Per-Customer Detection in Shared Infrastructure
Effective DDoS detection for colocation environments must provide per-customer visibility while operating at the network level. The provider needs to answer: which customer is being attacked, what type of attack is it, and how much traffic is involved. This requires monitoring at multiple points in the network.
Top-of-Rack Switch Monitoring
Top-of-rack (ToR) switches see all traffic entering and leaving the servers in their rack. Most modern ToR switches support sFlow or NetFlow export, providing sampled flow data that can be analyzed for anomalies. By configuring flow export on every ToR switch, the provider gets per-port traffic visibility. When a specific switch port suddenly receives 10x its normal traffic volume, the provider can immediately identify which customer and which IP is being targeted.
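The "10x its normal traffic volume" check can be sketched as a per-port baseline that is updated with an exponentially weighted moving average. This is a minimal illustration: the `PortBaseline` class, the smoothing factor, and the 1,000 PPS floor (which keeps idle ports from alerting on trivial traffic) are assumptions made for the example.

```python
class PortBaseline:
    """Tracks a smoothed packets-per-second baseline for one switch port."""

    def __init__(self, alpha: float = 0.1, multiplier: float = 10.0,
                 min_pps: float = 1000.0) -> None:
        self.alpha = alpha            # EWMA smoothing factor (assumed value)
        self.multiplier = multiplier  # "10x normal" from the text
        self.min_pps = min_pps        # ignore spikes on nearly idle ports
        self.baseline = None

    def observe(self, pps: float) -> bool:
        """Feed one rate estimate; return True if it looks anomalous."""
        if self.baseline is None:
            self.baseline = pps
            return False
        anomalous = (pps >= self.min_pps
                     and pps > self.multiplier * self.baseline)
        if not anomalous:
            # Only learn from normal traffic so an attack cannot
            # drag the baseline upward.
            self.baseline = (1 - self.alpha) * self.baseline + self.alpha * pps
        return anomalous
```

One `PortBaseline` per switch port, keyed by the port's customer assignment, is enough to turn a flagged port into a customer name.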
# Cumulus Linux: enable sFlow on all switch ports
# /etc/cumulus/switchd.conf (excerpt)
sflow.poll_interval = 10
sflow.counter_interval = 10
sflow.sampling_rate = 1024
sflow.collector = 10.0.0.50:6343

# For higher-fidelity attack detection, sample more densely on edge ports.
# 1:512 sampling catches most attacks while keeping CPU manageable.
# Use this in place of the 1:1024 setting above:
# sflow.sampling_rate = 512
The limitation of sFlow/NetFlow is the sampling rate. At 1:1024 sampling, a 10,000 PPS attack produces approximately 10 sampled packets per second. This is enough to detect the attack, but it takes several seconds of samples to build confidence that the anomaly is real. For environments where 2-second detection is critical, agent-based monitoring on customer servers (where the customer permits it) or higher sampling rates on critical switch ports provide faster detection.
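The sampling arithmetic is easy to check. The helpers below compute the expected number of sampled packets per second and how long it takes to collect a given number of samples. The figure of 30 samples used in the usage note is an arbitrary illustrative confidence threshold, not a value from any standard.

```python
def expected_samples_per_second(pps: float, sampling_rate: int) -> float:
    """Expected sampled packets per second at 1:sampling_rate sampling."""
    return pps / sampling_rate


def seconds_to_collect(samples_needed: int, pps: float,
                       sampling_rate: int) -> float:
    """Expected time to gather samples_needed samples of the attack."""
    return samples_needed / expected_samples_per_second(pps, sampling_rate)
```

For the 10,000 PPS attack in the text, `expected_samples_per_second(10_000, 1024)` is about 9.8, and collecting 30 samples takes roughly 3.1 seconds. At 1:512 the same 30 samples arrive in about 1.5 seconds, which is why denser sampling on critical ports shortens detection time.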
Agent-Based Monitoring on Customer Servers
Where customers permit agent installation, deploying a lightweight monitoring agent on each bare metal server provides the highest fidelity detection. The agent reads kernel-level network counters every second, capturing exact packets-per-second and bytes-per-second rates without sampling. This provides detection within 1 to 2 seconds of attack onset, along with protocol-level classification (UDP flood, TCP SYN flood, ICMP flood, DNS amplification, etc.).
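To make "reads kernel-level network counters every second" concrete, here is a minimal sketch for Linux that parses two snapshots of `/proc/net/dev` and derives exact receive rates. It illustrates the general technique only and makes no claim about how any particular agent is implemented. The function names are hypothetical, and counter wraparound and interface changes between snapshots are ignored.

```python
from typing import Dict, Tuple

Counters = Dict[str, Tuple[int, int]]  # interface -> (rx_bytes, rx_packets)


def parse_proc_net_dev(text: str) -> Counters:
    """Parse the contents of /proc/net/dev into per-interface RX counters."""
    counters: Counters = {}
    for line in text.splitlines()[2:]:  # first two lines are headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        # Receive columns come first: bytes, then packets.
        counters[name.strip()] = (int(fields[0]), int(fields[1]))
    return counters


def rx_rates(before: Counters, after: Counters,
             interval_s: float) -> Dict[str, Tuple[float, float]]:
    """Return interface -> (bytes per second, packets per second)."""
    rates = {}
    for iface, (bytes_1, packets_1) in after.items():
        if iface in before:
            bytes_0, packets_0 = before[iface]
            rates[iface] = ((bytes_1 - bytes_0) / interval_s,
                            (packets_1 - packets_0) / interval_s)
    return rates
```

An agent loop would read the file, sleep one second, read it again, and pass both snapshots to `rx_rates`. Because the counters are exact, no sampling error is involved.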
Flowtriq's agent is designed for exactly this deployment model. At $9.99 per node per month, a colocation provider can offer agent-based DDoS detection as an optional add-on for customers who want maximum protection. The agent runs in userspace with minimal resource consumption (under 1% CPU, under 50 MB RAM), reads only kernel counters (not packet payloads), and reports to the Flowtriq dashboard where both the provider and the customer can view detection status and incident history. For colocation providers, this solves the visibility problem without requiring access to the customer's applications or data.
Router-Level Flow Analysis
Edge routers that connect the colocation facility to upstream transit providers see all traffic entering and leaving the network. NetFlow or IPFIX export from these routers provides a global view of traffic patterns across the entire facility. This is the right vantage point for detecting large volumetric attacks that affect the upstream link, but it lacks the per-customer granularity of ToR or agent-based monitoring. Combining router-level flow analysis with ToR and agent data creates a complete picture: the router tells you how much attack traffic is entering the network, the ToR tells you which rack it is hitting, and the agent tells you exactly which server and IP are targeted and what the attack vector is.
Upstream Transit Provider Coordination
Colocation providers do not operate in isolation. Their network connects to the internet through one or more transit providers. When a DDoS attack exceeds the colocation facility's total ingress capacity, the only effective mitigation is upstream: the transit provider must filter or divert the attack traffic before it reaches the facility's handoff.
This requires pre-established relationships and procedures with each transit provider. The key elements are:
- BGP blackhole communities: Most transit providers support a blackhole BGP community that, when attached to a route advertisement, instructs the transit provider to null route traffic for that prefix at their edge. This is the upstream equivalent of RTBH. The colocation provider announces the attacked IP with the blackhole community, and the transit provider drops all traffic to that IP at its edge, before it reaches the colocation facility's links. This is fast (takes effect in seconds via BGP) but has the same limitation as local null routing: the victim IP goes offline.
- Scrubbing service activation: Some transit providers offer scrubbing services that can be activated on demand. Traffic to the attacked IP is diverted through the scrubbing center, cleaned, and returned to the colocation facility. This preserves the customer's service while filtering attack traffic. The colocation provider needs a pre-configured diversion mechanism (typically a BGP community that triggers the diversion) and must test it before relying on it during an actual attack.
- Out-of-band communication: During a large attack, the colocation facility's internet connectivity may be degraded. Email, VoIP, and web-based ticketing systems may all be affected. A pre-established out-of-band communication channel (a dedicated phone line, a mobile hotspot, or a separate ISP connection for management traffic) ensures the NOC can reach the transit provider even when the primary network is saturated.
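The escalation logic implied by these elements can be written down as a small decision function. This is a sketch under stated assumptions: the `Action` names, the `choose_action` function, and the 80% headroom factor (escalating upstream before the links are completely full) are illustrative and would need tuning for a real runbook.

```python
from enum import Enum


class Action(Enum):
    LOCAL_FILTER = "local_filter"              # handle inside the facility
    UPSTREAM_SCRUB = "upstream_scrub"          # divert via transit scrubbing
    UPSTREAM_BLACKHOLE = "upstream_blackhole"  # last resort, victim offline


def choose_action(attack_gbps: float, ingress_capacity_gbps: float,
                  scrubbing_available: bool, headroom: float = 0.8) -> Action:
    """Pick a mitigation path based on attack size and upstream options."""
    if attack_gbps < ingress_capacity_gbps * headroom:
        # The facility's own links can still carry the traffic.
        return Action.LOCAL_FILTER
    if scrubbing_available:
        # Keeps the customer online while the transit provider filters.
        return Action.UPSTREAM_SCRUB
    return Action.UPSTREAM_BLACKHOLE
```

Writing the decision down in advance matters mostly because, as noted above, the NOC may be working over a degraded out-of-band channel when it has to make the call.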
BGP Communities for Blackholing
BGP communities are the primary mechanism for automated upstream mitigation. Understanding how to use them correctly is essential for any colocation provider operating their own AS number.
# Announce a /32 blackhole route to transit provider
# Example: customer IP 203.0.113.50 is under attack
# Transit provider's blackhole community: 64500:666
# Bird configuration
protocol static blackhole_routes {
    # BIRD 2.x also needs a channel line here (for example: ipv4;)
    route 203.0.113.50/32 blackhole;
}
filter transit_export {
    if dest = RTD_BLACKHOLE then {
        bgp_community.add((64500,666));
        # Set next-hop to a well-known blackhole address
        bgp_next_hop = 192.0.2.1;
        accept;
    }
    # Normal export policy continues...
    accept;
}
# To withdraw the blackhole (after attack subsides):
# Remove the static route and Bird will withdraw the BGP announcement
The RFC 7999 well-known blackhole community 65535:666 is supported by many transit providers and IXP route servers. Using this standardized community simplifies multi-transit configurations because the same community works across providers. However, always verify with each transit provider which communities they honor, as support varies.
A critical operational detail: blackhole announcements should be host routes, /32 for IPv4 or /128 for IPv6. Announcing a /24 blackhole would take the entire customer subnet offline, which is catastrophic collateral damage. Many transit providers reject blackhole advertisements for prefixes shorter than /32 specifically to prevent this mistake, but the accepted prefix lengths vary by provider, so confirm the policy with each one.
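Automation that generates blackhole routes can enforce the host-route rule before anything reaches the router. The sketch below uses Python's standard `ipaddress` module. The `blackhole_route` helper is hypothetical, and it emits a static route line in the same form as the Bird snippet above.

```python
import ipaddress


def blackhole_route(target: str) -> str:
    """Return a Bird static route line, refusing anything but a host route."""
    # strict=True rejects inputs with host bits set, such as 203.0.113.50/24.
    net = ipaddress.ip_network(target, strict=True)
    if net.prefixlen != net.max_prefixlen:
        raise ValueError(
            f"refusing to blackhole {net}: only /{net.max_prefixlen} "
            "host routes are allowed"
        )
    return f"route {net} blackhole;"
```

Passing a bare address such as `203.0.113.50` yields a /32 route, and a bare IPv6 address yields a /128. A /24 or any other shorter prefix raises `ValueError` instead of taking a whole subnet offline.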
Deploying Detection Across the Infrastructure
A complete detection deployment for a colocation facility covers three tiers:
- Tier 1 — Edge routers: NetFlow/IPFIX export at 1:1000 sampling. Provides facility-wide visibility and detects attacks that affect upstream links. Detection latency: 5 to 15 seconds depending on sampling rate and collection interval.
- Tier 2 — Top-of-rack switches: sFlow export at 1:512 sampling. Provides per-rack and per-port visibility. Identifies which customer is being attacked. Detection latency: 3 to 8 seconds.
- Tier 3 — Customer servers (agent-based): Kernel counter monitoring every second. Provides per-server, per-IP, per-protocol detection with attack vector classification. Detection latency: 1 to 2 seconds.
Not every customer will install an agent, and that is expected. The tiered approach ensures that every attack is detected at the network level (Tiers 1 and 2) even if agent-based detection is not available. Customers who opt into agent-based monitoring get faster detection and richer classification. This creates a natural upsell path: basic detection included in the colocation service, premium detection (with agent-based monitoring, PCAP capture, and custom alerting) as a paid add-on.
DDoS Protection as a Billable Service
For colocation providers, DDoS protection is not just a cost center. It is a revenue opportunity. Customers who host DDoS-prone workloads (game servers, streaming, cryptocurrency, e-commerce) actively seek providers that offer built-in DDoS detection and mitigation. They will pay a premium for it, and they will stay longer because switching providers means losing the protection they depend on.
A typical pricing model for colocation DDoS protection tiers:
- Basic (included): Network-level detection via sFlow/NetFlow. BGP blackhole mitigation (null routing) for attacks that threaten the shared infrastructure. Email notification on detection.
- Standard ($25 to $50/server/month): Agent-based detection with per-server dashboards. Attack vector classification. Webhook and Slack/Discord alerting. Incident history and forensic data.
- Premium ($75 to $150/server/month): Everything in Standard, plus automated surgical mitigation via BGP FlowSpec. PCAP capture during incidents. Upstream scrubbing activation. Dedicated support channel during active attacks.
The economics work in the provider's favor. Using Flowtriq's multi-tenant platform, the per-server cost of agent-based detection is $9.99/node/month. Even the Standard tier priced at $25/server/month generates a healthy margin while providing genuine value to the customer. The Premium tier, which includes FlowSpec automation and PCAP, has even higher margins because the automated mitigation reduces the provider's own NOC labor during incidents.
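The margin claim can be checked with simple arithmetic. The helpers below use the $9.99 per-node cost and $25 Standard price quoted above. The 200-server fleet in the usage note is a made-up example size, and the calculation deliberately ignores NOC labor, support, and bandwidth costs.

```python
def margin_percent(price: float, cost: float) -> float:
    """Gross margin as a percentage of the price charged."""
    return round((price - cost) / price * 100, 2)


def monthly_gross_margin(price: float, cost: float, servers: int) -> float:
    """Gross margin per month across a fleet of protected servers."""
    return round((price - cost) * servers, 2)
```

At $25 per server against a $9.99 cost, `margin_percent(25.0, 9.99)` comes to about 60%, and `monthly_gross_margin(25.0, 9.99, 200)` to roughly $3,000 per month for 200 opted-in servers.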
Every colocation customer whose server gets null routed during a DDoS attack is a customer shopping for a new provider the next morning. Surgical mitigation keeps them online and keeps them paying rent.
Multi-Tenant Visibility Without Per-Customer Tooling
The operational nightmare for colocation providers has historically been the tooling overhead of per-customer DDoS detection. Building custom monitoring for each customer's traffic profile, maintaining separate dashboards, configuring individual alerting rules, and managing the infrastructure to run all of this does not scale. A provider with 500 customers cannot maintain 500 custom detection configurations.
Multi-tenant detection platforms solve this by providing per-customer visibility through a single platform. Each customer's servers are grouped as sources within the provider's account. Detection thresholds are learned automatically from each source's baseline traffic, eliminating manual configuration. Dashboards are segmented by customer, and alert routing sends notifications to the right customer contact. The provider's NOC sees everything across all customers, while each customer sees only their own infrastructure.
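The visibility and alert-routing model described here can be sketched as a small registry that maps each source to a customer. The `Tenancy` class, its method names, and the `"noc"` viewer label are hypothetical and only illustrate the segmentation idea, not any platform's actual data model.

```python
from typing import Dict, List


class Tenancy:
    """Maps monitored sources to customers and customers to alert contacts."""

    NOC = "noc"  # assumed label for the provider's own operators

    def __init__(self) -> None:
        self._owner: Dict[str, str] = {}    # source name -> customer
        self._contact: Dict[str, str] = {}  # customer -> alert contact

    def add_source(self, source: str, customer: str, contact: str) -> None:
        self._owner[source] = customer
        self._contact[customer] = contact

    def visible_to(self, viewer: str) -> List[str]:
        """The NOC sees every source; a customer sees only their own."""
        if viewer == self.NOC:
            return sorted(self._owner)
        return sorted(s for s, c in self._owner.items() if c == viewer)

    def route_alert(self, source: str) -> str:
        """Return the contact that should be notified for this source."""
        return self._contact[self._owner[source]]
```

The point of the design is that adding a customer is a data change (one `add_source` call per server) and not a new monitoring configuration, which is what lets the approach scale to hundreds of customers.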
Flowtriq's architecture is built for this multi-tenant model. The provider creates sources for each customer, deploys agents on the customer's servers (or monitors via flow data), and the platform handles baseline learning, detection, classification, and alerting per customer. The provider can optionally white-label the dashboard, offering branded DDoS monitoring as a native feature of their colocation service. This turns a cost center (DDoS mitigation labor) into a branded product that generates recurring revenue and reduces churn.