The Bare-Metal Security Gap

When you deploy a server on AWS, Azure, or GCP, it arrives with layers of network security built in. Security groups filter traffic before it reaches the instance. VPC networking isolates your servers from other tenants. The cloud provider runs its own DDoS protection service (AWS Shield, Azure DDoS Protection, Cloud Armor) that absorbs volumetric attacks at the network edge before they ever touch your infrastructure.

Bare-metal servers typically have far less of this. A dedicated server in a colocation facility or from a bare-metal provider (Equinix Metal, OVH, Hetzner, Vultr bare metal) sits on a network with a public IP address and no abstraction layer between it and the internet. Every packet that arrives at the upstream router gets forwarded to the server's network interface. Some providers bundle basic automatic mitigation, but in many deployments there is no managed firewall service, no automatic volumetric attack absorption, and no "enable DDoS protection" checkbox in a control panel.

This gap matters more than ever because the workloads moving to bare metal are increasingly high-value. GPU clusters for AI training and inference need bare metal for direct hardware access. High-frequency trading systems need bare metal for predictable latency. Game server hosting needs bare metal for consistent performance. These are workloads where downtime costs thousands of dollars per minute, running on infrastructure with the least built-in protection.

Why Cloud Scrubbing Adds Unacceptable Latency

The conventional answer to DDoS protection is cloud scrubbing: route your traffic through a third-party scrubbing center that filters out attack traffic and forwards only clean traffic to your origin server. Services like Akamai Prolexic, Imperva, and dedicated scrubbing providers operate this way. For web applications behind a CDN, this model works well.

For bare-metal GPU workloads, cloud scrubbing introduces problems that often outweigh the protection it provides.

Latency overhead. Every packet takes a detour through the scrubbing center. Even well-connected scrubbing providers add 2-10ms of round-trip latency. For an inference API with a sub-50ms latency target, an extra 5ms consumes 10% of the latency budget, and that cost is always on, not just during attacks. Customers are paying bare-metal prices specifically for low-latency performance; adding a network detour undermines the value proposition.

Bandwidth costs. GPU workloads can generate substantial legitimate traffic. A distributed training cluster moving gradients between nodes, or an inference API returning large model outputs (generated images, long text completions), can sustain tens of gigabits per second of clean traffic. Scrubbing services typically bill by the volume of clean traffic forwarded. At $3-5 per Mbps per month, protecting a 10 Gbps (10,000 Mbps) link costs $30,000-50,000 per month in scrubbing fees alone, often exceeding the cost of the server itself.
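The arithmetic behind the latency and bandwidth figures above is simple enough to check directly. The following is a minimal sketch; the function names are illustrative, and the rates are the ones quoted in this article, not a quote from any specific provider.

```python
def latency_overhead_fraction(added_ms: float, budget_ms: float) -> float:
    """Fraction of the latency budget consumed by the scrubbing detour."""
    return added_ms / budget_ms


def monthly_scrubbing_cost(link_gbps: float, usd_per_mbps: float) -> float:
    """Monthly fee for a link billed per Mbps of clean traffic (1 Gbps = 1000 Mbps)."""
    return link_gbps * 1000 * usd_per_mbps


if __name__ == "__main__":
    # 5 ms added to a 50 ms target
    print(f"{latency_overhead_fraction(5, 50):.0%}")   # 10%
    # 10 Gbps link at $3 and $5 per Mbps per month
    low = monthly_scrubbing_cost(10, 3)
    high = monthly_scrubbing_cost(10, 5)
    print(f"${low:,.0f} to ${high:,.0f} per month")    # $30,000 to $50,000 per month
```

Plugging in your own link size and quoted rate gives a quick comparison against the monthly cost of the server itself.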

Protocol limitations. Proxy-based scrubbing needs to understand the protocol it carries. HTTP and HTTPS are well-supported. But GPU cluster traffic (NCCL over TCP/UDP, RDMA over Converged Ethernet, custom protocols for model parallelism) cannot be proxied through a scrubbing service that expects standard web traffic. You end up with protection for your API endpoint but no protection for inter-node training communication.

Cloud scrubbing was designed for a world where servers sit behind web frontends that speak HTTP. Bare-metal GPU infrastructure operates in a fundamentally different model where traffic diversity, latency sensitivity, and bandwidth volume make the scrubbing architecture a poor fit.

Agent-Based Detection: The Right Architecture for Bare Metal

If you cannot put protection in front of the server (scrubbing) or around it (cloud security groups), the remaining option is on the server itself. Agent-based detection places a lightweight monitoring process directly on each bare-metal node, observing all traffic as it arrives at the network interface.

This architecture has several advantages that align with bare-metal constraints:

  • Zero additional latency: The agent observes traffic passively. It adds no network hops, no proxy layers, no detours. Clean traffic flows at native network speed.
  • Protocol-agnostic detection: The agent monitors at the packet level, not the application level. It detects anomalies in PPS, BPS, protocol distribution, and source patterns regardless of whether the traffic is HTTP, NCCL, RDMA, or any other protocol. Every workload type is protected equally.
  • Per-node baselines: Each server builds its own traffic model based on its specific workload. A training node's baseline reflects the bursty inter-node communication pattern. An inference server's baseline reflects the steady stream of API requests. A storage node's baseline reflects large sequential data transfers. Per-node specificity eliminates the false positives that come from applying a single network-wide threshold to diverse workloads.
  • No bandwidth fees: There is no per-gigabit charge for clean traffic. The agent runs on the server regardless of traffic volume.
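To make the per-node baseline idea concrete, here is a minimal sketch of one common approach: an exponentially weighted moving average of a single metric (for example packets per second) with a deviation threshold. This is an illustration, not a description of Flowtriq's actual model; the class name `EwmaBaseline` and all parameter values are assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class EwmaBaseline:
    """Exponentially weighted mean/variance of one traffic metric on one node."""
    alpha: float = 0.1             # smoothing factor (assumed value)
    threshold_sigmas: float = 4.0  # how far above the mean counts as anomalous (assumed)
    min_rel_std: float = 0.05      # floor on the deviation, as a fraction of the mean
    warmup_samples: int = 30       # samples to learn from before flagging anything
    mean: float = 0.0
    var: float = 0.0
    count: int = 0

    def observe(self, value: float) -> bool:
        """Return True if value is an upward anomaly; otherwise learn from it."""
        self.count += 1
        if self.count == 1:
            self.mean = value
            return False
        deviation = value - self.mean
        # The floor keeps a very steady workload from flagging tiny fluctuations.
        std = max(math.sqrt(self.var), self.min_rel_std * abs(self.mean))
        anomalous = (
            self.count > self.warmup_samples
            and deviation > self.threshold_sigmas * std
        )
        if not anomalous:
            # Only normal-looking samples update the baseline, so a flood
            # does not drag the model upward.
            self.mean += self.alpha * deviation
            self.var = (1 - self.alpha) * (self.var + self.alpha * deviation ** 2)
        return anomalous


if __name__ == "__main__":
    pps = EwmaBaseline()
    for i in range(60):
        pps.observe(900 if i % 2 == 0 else 1100)  # bursty but normal traffic
    print(pps.observe(50_000))  # flood-sized sample
    print(pps.observe(1_050))   # back to normal
```

Because each node keeps its own instance per metric, a bursty training node and a steady inference server end up with different means and variances, and therefore different alert thresholds.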

Flowtriq's agent is built for this model. It runs as a single process, consuming under 1% CPU and less than 50 MB of memory. On GPU servers, it uses zero GPU resources. It installs with a single command and begins building baselines immediately.

Kernel-Level Filtering as First-Line Defense

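When the agent detects an attack, the first mitigation layer is kernel-level packet filtering using iptables or nftables. This is a fast response because it drops malicious packets inside the kernel before they reach the application, cutting the per-packet CPU cost (connection handling, socket delivery, application processing) that would otherwise take host resources away from GPU workloads. The NIC still receives every packet, so this layer reduces processing load rather than link load.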

Kernel-level filtering is effective against volumetric attacks where the attack traffic has identifiable characteristics: source IP ranges, specific protocols or ports, packet sizes, or TCP flag combinations. The agent translates detection results into filter rules automatically.

# Example: agent-generated nftables rules during a SYN flood

table inet flowtriq_mitigation {
    set blocked_sources {
        type ipv4_addr
        flags interval,timeout
        timeout 5m
    }

    chain input {
        type filter hook input priority -200; policy accept;

        # Drop packets from identified attack sources
        ip saddr @blocked_sources counter drop

        # Accept established connections first (preserves active sessions)
        ct state established,related accept

        # Rate-limit pure SYN packets (new connection attempts). The limit is
        # shared across all sources that are not already in blocked_sources.
        tcp flags & (fin|syn|rst|ack) == syn limit rate over 100/second burst 200 packets counter drop
    }
}

The priority of -200 places this chain ahead of chains registered at the default filter priority (0) on the input hook, so these rules run before typical host firewall rules. Established connections (including active batch inference sessions and training communication) are explicitly preserved.
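Filling the `blocked_sources` set from detection results could look like the sketch below. It is an illustration under assumptions, not the agent's real code: the function name `block_sources_command` is hypothetical, and it only builds an `add element` line for the table and set names used in the example above.

```python
import ipaddress


def block_sources_command(sources, table="flowtriq_mitigation", set_name="blocked_sources"):
    """Return one nft script line adding the given IPv4 sources to the set.

    Returns None when there is nothing to add.
    """
    networks = [ipaddress.ip_network(s, strict=False) for s in sources]
    ipv4 = [n for n in networks if n.version == 4]  # the set is type ipv4_addr
    if not ipv4:
        return None
    # Merge duplicate, overlapping, and adjacent prefixes into the fewest entries.
    merged = ipaddress.collapse_addresses(ipv4)
    elements = ", ".join(
        str(n.network_address) if n.prefixlen == 32 else str(n) for n in merged
    )
    return f"add element inet {table} {set_name} {{ {elements} }}"


if __name__ == "__main__":
    detected = ["198.51.100.0/25", "198.51.100.128/25", "192.0.2.7", "192.0.2.7"]
    print(block_sources_command(detected))
    # add element inet flowtriq_mitigation blocked_sources { 192.0.2.7, 198.51.100.0/24 }
```

The resulting line can be fed to `nft -f -` on standard input. Entries expire on their own because the set declares a 5 minute timeout, so the agent does not need a separate cleanup step for stale sources.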

BGP FlowSpec for Upstream Filtering

Kernel-level filtering has a capacity ceiling. If the attack volume exceeds the server's network port capacity (a 20 Gbps flood on a 10 Gbps port), packets are dropped at the switch before they reach the server's NIC, and the server is effectively offline regardless of any local filtering rules.

BGP FlowSpec solves this by pushing filter rules upstream to the router or the transit provider's edge. When the agent detects an attack exceeding a configured volume threshold, it announces FlowSpec rules via BGP that instruct the upstream router to drop matching traffic before it reaches the server's port.
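The escalation decision can be sketched as a comparison between the measured attack volume and the port capacity. The names `Tier` and `choose_tier` and the 70% escalation fraction are assumptions for illustration; the real threshold is whatever you configure.

```python
from enum import Enum


class Tier(Enum):
    NONE = "none"
    KERNEL = "kernel"      # local iptables/nftables rules on the node
    FLOWSPEC = "flowspec"  # filter rules pushed upstream via BGP FlowSpec


def choose_tier(attack_bps: float, port_capacity_bps: float,
                escalate_fraction: float = 0.7) -> Tier:
    """Pick a mitigation tier from attack volume relative to port capacity."""
    if attack_bps <= 0:
        return Tier.NONE
    # Escalate before the port saturates: once the link is full, local
    # filtering cannot help because packets never reach the NIC.
    if attack_bps >= escalate_fraction * port_capacity_bps:
        return Tier.FLOWSPEC
    return Tier.KERNEL


if __name__ == "__main__":
    ten_gbps = 10e9
    print(choose_tier(2e9, ten_gbps))   # Tier.KERNEL
    print(choose_tier(20e9, ten_gbps))  # Tier.FLOWSPEC
```

Setting the fraction below 1.0 leaves headroom for legitimate traffic, since the port carries clean and attack traffic at the same time.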

FlowSpec rules can match on source/destination prefix, protocol, port, packet length, DSCP value, and TCP flags. This granularity allows the agent to block attack traffic while preserving legitimate traffic, unlike destination-based RTBH (remotely triggered blackhole), which drops all traffic to the destination IP.
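The difference in granularity is easier to see with a small model. The sketch below is a conceptual illustration only: `Packet` and `FlowSpecRule` are hypothetical classes that mimic a subset of the match fields listed above, not the BGP wire encoding, and the addresses come from documentation ranges.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    protocol: str  # "tcp" or "udp"
    src_port: int
    dst_port: int
    length: int


@dataclass(frozen=True)
class FlowSpecRule:
    """A drop rule; fields left as None match anything."""
    dst_prefix: str
    src_prefix: Optional[str] = None
    protocol: Optional[str] = None
    src_port: Optional[int] = None
    dst_port: Optional[int] = None
    min_length: Optional[int] = None

    def matches(self, p: Packet) -> bool:
        checks = [
            ip_address(p.dst) in ip_network(self.dst_prefix),
            self.src_prefix is None or ip_address(p.src) in ip_network(self.src_prefix),
            self.protocol is None or p.protocol == self.protocol,
            self.src_port is None or p.src_port == self.src_port,
            self.dst_port is None or p.dst_port == self.dst_port,
            self.min_length is None or p.length >= self.min_length,
        ]
        return all(checks)


if __name__ == "__main__":
    victim = "203.0.113.10/32"
    # Targeted rule: UDP from source port 123 (an NTP reflection pattern).
    targeted = FlowSpecRule(dst_prefix=victim, protocol="udp", src_port=123)
    # RTBH behaves like a rule that matches on the destination alone.
    blackhole = FlowSpecRule(dst_prefix=victim)

    attack = Packet("198.51.100.5", "203.0.113.10", "udp", 123, 40000, 468)
    legit = Packet("192.0.2.20", "203.0.113.10", "tcp", 51000, 443, 1200)

    print(targeted.matches(attack), targeted.matches(legit))    # True False
    print(blackhole.matches(attack), blackhole.matches(legit))  # True True
```

The targeted rule drops the reflection traffic and lets the HTTPS request through, while the destination-only rule drops both, which is the trade-off between FlowSpec and RTBH.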

For GPU cloud providers with BGP-capable networks, FlowSpec provides a scalable mitigation path that handles attacks far larger than a single server port, up to what the upstream network can absorb, without adding latency to clean traffic. The detection happens on the node (where visibility is best), and the filtering happens at the network edge (where capacity is highest).

Deployment with Configuration Management

Bare-metal infrastructure at scale requires automated deployment. GPU cloud providers managing hundreds or thousands of nodes cannot install and configure DDoS protection manually on each server.

Flowtriq's agent is designed for configuration management integration. The agent installs from a single package (DEB, RPM, or pip), reads its configuration from a YAML file, and registers with the central dashboard via API key. This maps cleanly to Ansible playbooks, Terraform provisioners, or any other configuration management tooling.

# Ansible example: deploy Flowtriq agent to GPU nodes

- name: Install Flowtriq agent
  hosts: gpu_nodes
  become: true
  tasks:
    - name: Install agent package
      pip:
        name: ftagent
        state: latest

    - name: Deploy configuration
      template:
        src: ftagent.yml.j2
        dest: /etc/ftagent/config.yml
      notify: restart ftagent

    - name: Enable and start agent
      systemd:
        name: ftagent
        enabled: true
        state: started

  # Handler referenced by "notify" above; the play fails without it
  handlers:
    - name: restart ftagent
      systemd:
        name: ftagent
        state: restarted

New nodes provisioned through your deployment pipeline automatically receive DDoS protection. Decommissioned nodes automatically deregister. The central dashboard provides fleet-wide visibility across all nodes, with per-node drill-down for investigating individual attacks.

For providers using Terraform to manage bare-metal infrastructure through APIs (Equinix Metal, Vultr), the agent deployment can be included as a provisioning step in the Terraform configuration, ensuring every new server is protected from the moment it comes online.

Purpose-built for bare metal. Flowtriq's agent deploys on any bare-metal server with a single command, builds per-node baselines, and mitigates attacks at the kernel level and via BGP FlowSpec. No scrubbing fees, no added latency, no protocol limitations. See the GPU/AI Cloud use case or start your free trial.
