Hosting Provider DDoS Playbook: Per-Tenant Detection at Scale

The Hosting Provider DDoS Problem

Hosting providers face a unique version of the DDoS problem: multi-tenancy. When a DDoS attack targets one tenant on a shared node, every tenant on that node suffers degraded performance. When the attack is large enough to saturate the node's uplink, every tenant goes offline. The target tenant is the victim. Everyone else is collateral damage.

This creates a cascading support crisis. The targeted tenant opens a ticket about the attack. Twenty other tenants on the same node open tickets about degraded performance or connectivity loss. Your support team spends hours explaining to unrelated tenants that they were affected by an attack on someone else's service. Meanwhile, the tenants who were collateral damage are questioning whether your platform is reliable enough for their workloads.

The business impact extends beyond the immediate incident. Tenants who experience collateral damage from other tenants' attacks often leave. They do not blame the attacker. They blame you for not isolating them from the problem. "Noisy neighbor" DDoS is a leading cause of churn for hosting providers, and it is entirely preventable with proper per-tenant detection and isolation.

Per-Node Agent Deployment

The Flowtriq agent runs on each physical or virtual server node in your hosting infrastructure. One agent per node covers all tenants running on that node, whether they are VMs, containers, or bare-metal slices.

Deployment at scale

For hosting providers with hundreds or thousands of nodes, manual agent installation is not practical. Use your configuration management system to deploy the agent as part of your standard node provisioning:

# Ansible example
- name: Install Flowtriq agent
  hosts: hosting_nodes
  become: true
  tasks:
    - name: Install agent
      shell: pip install ftagent && ftagent --setup --token {{ flowtriq_token }}
      args:
        creates: /usr/local/bin/ftagent

    - name: Configure node tags
      template:
        src: ftagent.conf.j2
        dest: /etc/flowtriq/ftagent.conf
      notify: restart ftagent

  handlers:
    # Assumes the agent runs as a system service named "ftagent"
    - name: restart ftagent
      service:
        name: ftagent
        state: restarted

# ftagent.conf.j2 template
[agent]
token = {{ flowtriq_token }}
node_tags = {{ hosting_region }},{{ node_type }},{{ rack_id }}

[detection]
mode = per-ip
tenant_isolation = true

The per-ip detection mode tells the agent to track traffic baselines and detect anomalies on a per-IP basis rather than per-node aggregate. This is what enables per-tenant detection: each tenant's IP addresses are monitored independently, so an attack on one tenant does not skew the detection baselines for other tenants on the same node.
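To make the idea of independent per-IP baselines concrete, here is a minimal Python sketch. It is not Flowtriq's detection algorithm: the `PerIpDetector` class, the exponentially weighted moving average, and the 5x threshold are all assumptions chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Baseline:
    """Moving average of packets per second for one destination IP."""
    mean_pps: float = 0.0
    samples: int = 0


class PerIpDetector:
    """Hypothetical per-IP anomaly detector (illustration only)."""

    def __init__(self, alpha=0.1, threshold_multiplier=5.0, warmup_samples=10):
        self.alpha = alpha
        self.threshold_multiplier = threshold_multiplier
        self.warmup_samples = warmup_samples
        # One baseline per destination IP, created on first use.
        self.baselines = defaultdict(Baseline)

    def observe(self, dst_ip, pps):
        """Record one sample; return True if it looks anomalous for this IP."""
        b = self.baselines[dst_ip]
        anomalous = (
            b.samples >= self.warmup_samples
            and pps > b.mean_pps * self.threshold_multiplier
        )
        if not anomalous:
            # Only normal samples update the baseline, so an attack
            # does not drag the baseline upward.
            if b.samples == 0:
                b.mean_pps = float(pps)
            else:
                b.mean_pps = (1 - self.alpha) * b.mean_pps + self.alpha * pps
            b.samples += 1
        return anomalous


detector = PerIpDetector()
for _ in range(20):
    detector.observe("198.51.100.41", 1000)
    detector.observe("198.51.100.42", 1000)

attack_flagged = detector.observe("198.51.100.42", 500_000)
neighbor_flagged = detector.observe("198.51.100.41", 1000)
```

Because each IP has its own `Baseline`, the spike on 198.51.100.42 is flagged while the neighbor on the same node keeps its normal baseline.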

Resource considerations at scale

At hosting scale, the agent's resource footprint matters. On a node running 20 tenant VMs, you cannot afford an agent that consumes significant CPU or memory:

CPU: Less than 1% of a single core under normal traffic. The agent samples packet headers, not full payloads.
Memory: Base footprint of 30-50 MB, plus approximately 2 MB per active tenant IP being tracked. A node with 100 active tenant IPs uses roughly 230 to 250 MB.
Network: Less than 100 Kbps of reporting traffic to the Flowtriq API. Metrics are batched and compressed.
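The memory figure above is simple arithmetic, and a small helper makes it easy to size other node densities. This is a sketch based only on the numbers quoted here (30-50 MB base, about 2 MB per tracked IP); the function name is hypothetical.

```python
def estimate_agent_memory_mb(active_ips, base_mb=(30, 50), per_ip_mb=2):
    """Return a (low, high) memory estimate in MB for one node's agent."""
    low = base_mb[0] + per_ip_mb * active_ips
    high = base_mb[1] + per_ip_mb * active_ips
    return low, high


print(estimate_agent_memory_mb(100))  # (230, 250)
print(estimate_agent_memory_mb(20))   # (70, 90)
```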

Hypervisor vs. guest deployment: Always deploy the agent on the hypervisor or host node, not inside tenant VMs. Host-level deployment gives the agent visibility into all tenant traffic without requiring tenant cooperation, and it avoids the overhead of running an agent inside every VM.

Tenant Isolation During Attacks

Detection is only half the problem. The other half is ensuring that mitigation for one tenant does not disrupt other tenants. Flowtriq's per-IP detection enables surgical mitigation that targets only the affected tenant's traffic.

Surgical FlowSpec rules

When an attack is detected on a specific tenant IP, the auto-mitigation runbook generates a FlowSpec rule that matches only the attack traffic destined for that tenant. Other tenants on the same node, the same subnet, and the same switch are completely unaffected:

Attack detected:
  Target: 198.51.100.42 (Tenant: customer-xyz)
  Node: hosting-us-east-14
  Classification: DNS amplification
  Volume: 4.2 Gbps

FlowSpec rule:
  Match: dst 198.51.100.42/32, protocol UDP, src-port 53
  Action: discard

Impact on other tenants: None
  198.51.100.41 (Tenant: customer-abc) - unaffected
  198.51.100.43 (Tenant: customer-def) - unaffected
  198.51.100.44 (Tenant: customer-ghi) - unaffected
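The reason neighbors are unaffected is that the rule's match is a /32 combined with a protocol and source port. The following Python sketch models that match logic; `FlowSpecRule` and `rule_for_dns_amplification` are hypothetical names for illustration, not part of any Flowtriq or router API.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowSpecRule:
    """Simplified model of a FlowSpec match plus action (illustration only)."""
    dst_prefix: str   # e.g. "198.51.100.42/32"
    protocol: str     # "udp" or "tcp"
    src_port: int
    action: str       # e.g. "discard"

    def matches(self, dst_ip, protocol, src_port):
        """True only if all three match components agree with the packet."""
        return (
            ipaddress.ip_address(dst_ip) in ipaddress.ip_network(self.dst_prefix)
            and protocol == self.protocol
            and src_port == self.src_port
        )


def rule_for_dns_amplification(target_ip):
    """Build the surgical rule from the example: UDP, source port 53, one host."""
    return FlowSpecRule(f"{target_ip}/32", "udp", 53, "discard")


rule = rule_for_dns_amplification("198.51.100.42")
print(rule.matches("198.51.100.42", "udp", 53))   # True: attack traffic
print(rule.matches("198.51.100.41", "udp", 53))   # False: neighbor tenant
print(rule.matches("198.51.100.42", "tcp", 443))  # False: target's web traffic
```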

Rate limiting as an alternative to dropping

For some attack types, dropping all matching traffic is too aggressive. If the tenant runs a recursive resolver or otherwise depends on outbound DNS lookups, dropping all UDP traffic with source port 53 would block the legitimate DNS responses along with the amplification attack. In these cases, use rate-limited FlowSpec rules instead of discard rules:

FlowSpec rule (rate-limited):
  Match: dst 198.51.100.42/32, protocol UDP, src-port 53
  Action: traffic-rate 6250000 (bytes per second, equal to 50 Mbps)

Result: Attack traffic capped at 50 Mbps (from 4.2 Gbps)
        Legitimate DNS responses still flow (typically < 10 Mbps)
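The rate in a FlowSpec traffic-rate action is encoded in bytes per second in RFC 8955, while link speeds are usually quoted in bits per second, so a conversion step helps avoid an 8x error. A minimal sketch, with hypothetical helper names; check which unit your router's CLI expects before using the value:

```python
def mbps_to_bytes_per_second(mbps):
    """Convert megabits per second to bytes per second."""
    return mbps * 1_000_000 / 8


def fraction_dropped(attack_gbps, cap_mbps):
    """Share of the matched traffic that the policer discards."""
    attack_mbps = attack_gbps * 1000
    return max(0.0, 1 - cap_mbps / attack_mbps)


print(mbps_to_bytes_per_second(50))           # 6250000.0
print(round(fraction_dropped(4.2, 50), 3))    # 0.988
```

Note that a policer does not distinguish attack packets from legitimate ones within the match, so legitimate responses share the 50 Mbps budget with whatever attack traffic remains.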

Preventing collateral damage

The key architectural decision is where mitigation rules are applied. For hosting providers, FlowSpec rules should be applied at the edge router or top-of-rack switch, not at the server itself. Edge filtering drops attack traffic before it traverses your internal network, protecting the uplink capacity for all tenants.

If attack traffic reaches the server before being filtered (because your switches do not support FlowSpec), the agent can apply local iptables or nftables rules as a fallback. This does not protect the uplink, but it does protect the server's CPU and other tenants' performance on that node.
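As an illustration of what a local fallback rule could look like, the sketch below builds an nftables command line for the DNS amplification example. The table and chain names (`filter`, `forward`) are assumptions about the host's existing ruleset, the helper name is hypothetical, and the command the agent actually issues may differ.

```python
import ipaddress


def nft_drop_command(target_ip, protocol, src_port, table="filter", chain="forward"):
    """Return an nft argv list that drops matching traffic to one tenant IP.

    Assumes an existing `inet` table and chain with the given names.
    """
    addr = ipaddress.ip_address(target_ip)  # raises ValueError on bad input
    if protocol not in ("udp", "tcp"):
        raise ValueError(f"unsupported protocol: {protocol}")
    if not 0 < src_port < 65536:
        raise ValueError(f"invalid port: {src_port}")
    family = "ip" if addr.version == 4 else "ip6"
    return [
        "nft", "add", "rule", "inet", table, chain,
        family, "daddr", str(addr), protocol, "sport", str(src_port), "drop",
    ]


cmd = nft_drop_command("198.51.100.42", "udp", 53)
print(" ".join(cmd))
# nft add rule inet filter forward ip daddr 198.51.100.42 udp sport 53 drop
```

Returning an argument list rather than a shell string keeps the tenant IP out of shell interpolation if the command is later passed to `subprocess.run`.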

Auto-Mitigation Escalation

Not every attack can be handled by a single FlowSpec rule. The escalation chain for hosting providers typically has four levels:

Level 1: FlowSpec filtering (0 to 5 seconds)

The first response to any detected attack. A surgical FlowSpec rule is pushed to edge routers to filter the specific attack vector. This handles 80% of attacks without any tenant impact.

Level 2: Aggressive rate limiting (after 60 seconds)

If the attack continues after FlowSpec filtering (indicating a multi-vector attack or an attack type that evades the initial rule), the system applies broader rate limits. Traffic to the target IP is rate-limited to 2x the tenant's normal baseline, regardless of protocol or port.

Level 3: Upstream scrubbing (after 5 minutes)

For attacks large enough to threaten upstream link saturation, the system can signal an upstream scrubbing service (if configured). This diverts the target tenant's traffic through a scrubbing center that absorbs the volumetric attack and forwards only clean traffic.

Scrubbing activation:
  Trigger: attack.volume > 80% of upstream link capacity
  Action: BGP announcement to scrubbing provider
    Announce: 198.51.100.42/32 via scrubbing tunnel
    Community: scrubbing-provider:scrub
  Duration: Until attack subsides + 15 minute cooldown

Level 4: RTBH (last resort)

If the attack is so large that it threatens the entire network (upstream link saturation despite scrubbing), the final escalation is RTBH: blackholing the target tenant's IP. This sacrifices one tenant to protect all others. The tenant is notified immediately with an explanation and an estimated restoration time.

Level 4 should be rare. If you are reaching RTBH regularly, your Level 3 scrubbing capacity needs to be increased.
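The four levels can be read as a small decision function. The sketch below is one possible interpretation, not the product's runbook engine: it assumes Level 3 requires both the 5-minute mark and the 80% uplink trigger, and that Level 4 applies only when the uplink is saturated while scrubbing is already active. All names are hypothetical.

```python
from enum import IntEnum


class Level(IntEnum):
    FLOWSPEC = 1
    RATE_LIMIT = 2
    SCRUBBING = 3
    RTBH = 4


def escalation_level(elapsed_s, uplink_utilization, scrubbing_active):
    """Pick the mitigation level for an attack that is still ongoing.

    elapsed_s: seconds since detection
    uplink_utilization: attack volume as a fraction of upstream capacity
    scrubbing_active: True if traffic is already diverted to scrubbing
    """
    if scrubbing_active and uplink_utilization >= 1.0:
        return Level.RTBH          # saturated despite scrubbing
    if elapsed_s >= 300 and uplink_utilization > 0.8:
        return Level.SCRUBBING     # threatens the upstream link
    if elapsed_s >= 60:
        return Level.RATE_LIMIT    # first rule did not stop the attack
    return Level.FLOWSPEC          # first response


print(escalation_level(3, 0.2, False).name)     # FLOWSPEC
print(escalation_level(90, 0.3, False).name)    # RATE_LIMIT
print(escalation_level(400, 0.9, False).name)   # SCRUBBING
print(escalation_level(600, 1.0, True).name)    # RTBH
```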

Status Pages Per Tenant

Hosting providers need layered communication during incidents. The target tenant needs detailed information about the attack on their IP. Collateral damage tenants (if any) need information about why their service was affected. And all tenants need access to a general platform status page for ongoing visibility.

Platform-level status page

A single status page for your hosting platform with components for each region, data center, or node cluster. All tenants can subscribe. This page shows platform-wide incidents and scheduled maintenance.

Tenant-specific notifications

When an attack targets a specific tenant, that tenant receives a direct notification (email, webhook, or both) with details specific to their service. This notification is separate from the platform status page and includes information that is only relevant to the targeted tenant:

Subject: DDoS Attack Detected on Your Service (198.51.100.42)

A DDoS attack has been detected targeting your IP address 198.51.100.42.

Attack details:
  Type: DNS Amplification
  Volume: 4.2 Gbps peak
  Start time: 2026-06-07 14:22:03 UTC
  Status: Auto-mitigated (FlowSpec filtering active)

Your service should be operating normally. The attack traffic is being
filtered at our network edge. No action is required from you.

We will send a follow-up notification when the attack has fully subsided.
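If you deliver the same notification by webhook, the tenant's systems need a structured version of these details. The sketch below shows one way to derive both the email subject and a JSON payload from a single record. The `AttackEvent` fields and payload shape are assumptions for illustration, not a documented Flowtriq webhook schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class AttackEvent:
    """Hypothetical record of one detected attack."""
    target_ip: str
    attack_type: str
    peak_gbps: float
    start_time_utc: str
    status: str


def email_subject(event):
    """Subject line in the same form as the example notification."""
    return f"DDoS Attack Detected on Your Service ({event.target_ip})"


def webhook_payload(event):
    """Serialize the event as JSON with stable key order."""
    return json.dumps(asdict(event), sort_keys=True)


event = AttackEvent(
    target_ip="198.51.100.42",
    attack_type="DNS Amplification",
    peak_gbps=4.2,
    start_time_utc="2026-06-07 14:22:03",
    status="Auto-mitigated (FlowSpec filtering active)",
)
print(email_subject(event))
print(webhook_payload(event))
```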

BGP Scrubbing for Volumetric Attacks

Hosting providers who peer with upstream scrubbing services (Voxility, Path.net, NTT, etc.) can configure Flowtriq to automatically activate scrubbing when an attack exceeds local filtering capacity.

The integration works via BGP: when the attack volume crosses your configured threshold, Flowtriq announces the target prefix to the scrubbing provider's BGP session. The scrubbing provider attracts the traffic, filters out the attack, and tunnels clean traffic back to your network.

Scrubbing provider configuration:
  Provider: scrub-provider
  BGP session: 10.0.0.1 (AS 64513)
  Activation threshold: 8 Gbps per target IP
  Announcement community: 64513:100 (activate scrubbing)
  Withdrawal community: 64513:200 (deactivate)
  GRE tunnel: 10.255.0.1 -> 10.255.0.2
  Cooldown: 15 minutes after attack subsides

The entire activation and deactivation cycle is automated. Your NOC team is notified but does not need to intervene unless the scrubbing provider reports issues.
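The activation threshold and cooldown together form a hysteresis loop: announce once when the threshold is crossed, and withdraw only after the volume has stayed below it for the full cooldown. The sketch below models that loop under two assumptions: "subsides" means the volume falls below the activation threshold, and the volume figure is measured before scrubbing. The class is hypothetical and does not speak BGP; it only decides when to announce or withdraw.

```python
class ScrubbingController:
    """Decides when to announce or withdraw a prefix (illustration only)."""

    def __init__(self, threshold_gbps=8.0, cooldown_s=15 * 60):
        self.threshold_gbps = threshold_gbps
        self.cooldown_s = cooldown_s
        self.active = False
        self.below_since = None  # time the volume first dropped below threshold

    def update(self, now_s, volume_gbps):
        """Return "announce", "withdraw", or None for this sample."""
        if volume_gbps >= self.threshold_gbps:
            self.below_since = None  # attack resumed: reset the cooldown
            if not self.active:
                self.active = True
                return "announce"
            return None
        if self.active:
            if self.below_since is None:
                self.below_since = now_s
            if now_s - self.below_since >= self.cooldown_s:
                self.active = False
                self.below_since = None
                return "withdraw"
        return None


ctl = ScrubbingController()
events = [
    ctl.update(0, 9.0),     # crosses 8 Gbps
    ctl.update(10, 9.0),    # still attacking
    ctl.update(20, 1.0),    # attack subsides, cooldown starts
    ctl.update(919, 1.0),   # 899 s into the cooldown
    ctl.update(920, 1.0),   # 900 s: cooldown complete
]
print(events)  # ['announce', None, None, None, 'withdraw']
```

Resetting the timer whenever the volume rises again avoids flapping the announcement during a pulsed attack.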

Pricing at Scale

For hosting providers, the cost of DDoS detection has to fit the per-node economics of the business. A provider running 500 nodes and charging $50/month per tenant cannot justify a DDoS solution that costs $100/node/month.

Flowtriq's volume pricing is designed for hosting scale:

Volume tiers:
  1-25 nodes:     $9.99/node/month
  26-100 nodes:   $7.99/node/month
  101-500 nodes:  $5.99/node/month
  500+ nodes:     Contact sales for custom pricing

At 200 nodes ($5.99/node), your total monthly cost is $1,198. If each node hosts an average of 10 tenants at $50/month, your total tenant revenue on those nodes is $100,000/month. The DDoS detection cost is 1.2% of revenue. Compare that to the cost of a single serious DDoS incident: a major attack that causes visible collateral damage across your platform can result in dozens of tenant cancellations, each representing $600+ in annual revenue.

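One "noisy neighbor" DDoS incident that drives away 20 tenants costs about $12,000 in annual revenue (20 × $600), which is close to the roughly $14,400 annual cost of DDoS detection for the entire 200-node fleet in this example ($1,198 × 12 = $14,376). Two such incidents in a year cost more than detection does. Prevention is the only option that scales.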
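The per-node arithmetic in this section can be reproduced for other fleet sizes with a short calculator. This sketch assumes the whole fleet is billed at the rate of the tier it falls into, which matches the 200-node example; the function names are hypothetical, and fleets above 500 nodes return `None` because pricing there is custom.

```python
# (max nodes in tier, price per node per month), from the volume tiers above
TIERS = [(25, 9.99), (100, 7.99), (500, 5.99)]


def per_node_price(nodes):
    """Return the listed per-node price, or None for custom pricing."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    for max_nodes, price in TIERS:
        if nodes <= max_nodes:
            return price
    return None


def monthly_cost(nodes):
    """Total monthly detection cost, or None for custom pricing."""
    price = per_node_price(nodes)
    return None if price is None else round(nodes * price, 2)


def cost_share_of_revenue(nodes, tenants_per_node, tenant_price):
    """Detection cost as a fraction of monthly tenant revenue."""
    cost = monthly_cost(nodes)
    if cost is None:
        return None
    return cost / (nodes * tenants_per_node * tenant_price)


print(monthly_cost(200))                                 # 1198.0
print(round(cost_share_of_revenue(200, 10, 50) * 100, 1))  # 1.2
```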

Getting Started

Start with a pilot deployment on your most attack-prone nodes (typically game hosting or proxy service nodes). Deploy the agent with per-IP detection mode enabled, let it baseline for 48 to 72 hours, and build your first runbook. Once you have validated detection accuracy on the pilot nodes, roll out to the rest of your fleet via your configuration management system.

Flowtriq starts at $9.99/node/month with volume discounts starting at 26 nodes. Start your free 14-day trial or review pricing for your deployment size.
