Seconds 0-15: Confirm the Attack and Identify the Target
Your monitoring fires an alert. User reports are flooding in. Before you do anything else, you need to answer two questions: which concentrator is under attack, and how many users are affected?
If you have per-node monitoring, the answer is immediate. The alert tells you exactly which server is seeing anomalous traffic. If you are working with aggregate dashboards, you need to narrow it down fast. SSH into your management network (never rely on in-band access to the attacked server) and check traffic counters on each concentrator.
# Quick triage: identify the target concentrator
# Run from management/out-of-band access

# Sample interface counters twice, one second apart, on every server;
# the difference between the two samples is the per-second rate
for host in vpn-us-01 vpn-eu-01 vpn-ap-01; do
  ssh "$host" "grep eth0 /proc/net/dev; sleep 1; grep eth0 /proc/net/dev" 2>/dev/null \
    | sed "s/^/$host: /" &
done
wait

# On the suspected target, count peers with a handshake in the last 3 minutes
ssh vpn-eu-01 'wg show wg0 latest-handshakes | awk -v now="$(date +%s)" "\$2 > now - 180 {n++} END {print n+0}"'

# Count packets arriving on the WireGuard port over a 3-second window
ssh vpn-eu-01 "timeout 3 tcpdump -i eth0 -nn -w /dev/null 'udp port 51820' 2>&1 | grep 'packets captured'"
The wg show output tells you how many peers have active handshakes, which directly maps to affected users. The tcpdump sample tells you the current packet rate. If you are seeing 500,000 packets in 3 seconds on a server that normally handles 50,000, you have your target and your confirmation.
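The comparison between the tcpdump sample and the baseline can be made explicit. The sketch below is a hypothetical helper, not part of any tool: the name `classify_sample` and the 5x threshold are assumptions chosen for illustration, and the baseline figure has to come from your own monitoring.

```shell
# Hypothetical triage helper. Arguments:
#   $1 packets counted in the sample
#   $2 sample length in seconds
#   $3 packets normally seen in a window of the same length
# The 5x multiplier is an illustrative assumption; tune it to your traffic.
classify_sample() (
    packets=$1
    seconds=$2
    baseline=$3
    if [ "$seconds" -le 0 ] || [ "$baseline" -le 0 ]; then
        echo "ERROR seconds and baseline must be positive"
        exit 1
    fi
    pps=$((packets / seconds))
    ratio=$((packets / baseline))
    if [ "$ratio" -ge 5 ]; then
        echo "ATTACK pps=$pps ratio=${ratio}x"
    else
        echo "NORMAL pps=$pps ratio=${ratio}x"
    fi
)
```

With the figures used above, `classify_sample 500000 3 50000` reports roughly 166,666 packets per second at 10 times the baseline.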
Seconds 15-30: Deploy Emergency nftables Rules
With the target confirmed, your first mitigation action is to reduce the processing load on the concentrator. The goal is not to block the attack completely (you may not be able to at the server level if the link is saturated), but to ensure that whatever bandwidth is available is used for legitimate tunnel traffic rather than attack processing.
# Emergency nftables rules for WireGuard under attack
# Deploy immediately - refine later
nft add table inet emergency
nft add chain inet emergency filter \
'{ type filter hook input priority -200; policy accept; }'
# Priority 1: Accept transport data from established peers
# (type 4 packets - these are your active user tunnels)
# The message type is the first byte of the UDP payload: 64 bits
# into the transport header, after the 8-byte UDP header
nft add rule inet emergency filter \
udp dport 51820 @th,64,8 4 accept
# Priority 2: Hard rate limit new initiations globally
# Allows 50/sec total - enough for legitimate reconnects
nft add rule inet emergency filter \
udp dport 51820 @th,64,8 1 limit rate 50/second accept
# Priority 3: Drop everything else to WireGuard port
nft add rule inet emergency filter \
udp dport 51820 drop
These rules are deliberately aggressive. A global rate limit of 50 initiations per second will cause some legitimate clients to fail their reconnection attempts temporarily, but it will keep the server running for the users who are already connected. You can tune the rate limit upward once the attack subsides.
For OpenVPN servers, the approach is similar but targets TCP or UDP on the OpenVPN port, rate-limiting new connections while preserving established ones using conntrack state matching.
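A minimal sketch of that approach for an OpenVPN server, assuming the default UDP port 1194. The table name `emergency_ovpn`, the function name, the `NFT` and `OVPN_PORT` variables, and the 50/second limit are illustrative assumptions. Setting `NFT=echo` prints the commands instead of applying them, so the rules can be reviewed before deployment.

```shell
# Emergency rules for an OpenVPN listener (assumed: UDP port 1194).
# NFT=echo gives a dry run that prints each command instead of applying it.
deploy_openvpn_emergency() (
    nft_cmd=${NFT:-nft}
    port=${OVPN_PORT:-1194}
    $nft_cmd add table inet emergency_ovpn
    $nft_cmd add chain inet emergency_ovpn filter \
        '{ type filter hook input priority -200; policy accept; }'
    # Keep flows for which conntrack has already seen traffic in both directions
    $nft_cmd add rule inet emergency_ovpn filter \
        udp dport "$port" ct state established accept
    # Rate limit packets that would open a new flow
    $nft_cmd add rule inet emergency_ovpn filter \
        udp dport "$port" ct state new limit rate 50/second accept
    # Drop everything else aimed at the OpenVPN port
    $nft_cmd add rule inet emergency_ovpn filter \
        udp dport "$port" drop
)
```

Run `NFT=echo deploy_openvpn_emergency` to preview the five commands, then `deploy_openvpn_emergency` as root to apply them. For a TCP listener the same structure applies with `tcp dport` in place of `udp dport`.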
Always deploy nftables rules via the management interface, not the VPN interface. If the attack has saturated the link, your SSH session through the VPN IP will be unresponsive. Out-of-band management access is not optional for VPN infrastructure.
Seconds 30-60: Evaluate Whether Upstream Intervention Is Needed
If the emergency nftables rules stabilize the server and user tunnels resume normal operation, the on-server mitigation is sufficient. Monitor packet rates and user connectivity for the next few minutes to confirm.
If the attack volume exceeds the server's link capacity (the link is saturated and packets are being dropped at the NIC or upstream router before they even reach your nftables rules), on-server mitigation is not enough. You need upstream help.
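One way to tell the two cases apart is to compare the inbound rate with the link's capacity. The sketch below assumes a Linux host that exposes byte counters under `/sys/class/net/<interface>/statistics/`; the function name is hypothetical, and treating sustained utilization near 100% as saturation is an assumption. The counter only sees traffic that reached the NIC, so a link saturated upstream shows up as a rate pinned at the link's ceiling.

```shell
# Hypothetical helper: percent of link capacity used by inbound traffic.
#   $1 rx_bytes counter at the start of the interval
#   $2 rx_bytes counter at the end of the interval
#   $3 interval length in seconds
#   $4 link capacity in Mbit/s
link_utilization() (
    rx_before=$1
    rx_after=$2
    interval=$3
    link_mbps=$4
    bits_per_sec=$(( (rx_after - rx_before) * 8 / interval ))
    # percent = bits_per_sec / (link_mbps * 1,000,000) * 100
    echo $(( bits_per_sec / (link_mbps * 10000) ))
)

# Example of gathering the inputs on the concentrator (interface name assumed):
#   a=$(cat /sys/class/net/eth0/statistics/rx_bytes); sleep 1
#   b=$(cat /sys/class/net/eth0/statistics/rx_bytes)
#   link_utilization "$a" "$b" 1 1000
```

A result that stays in the high 90s while users still cannot connect suggests the bottleneck is the link itself, which is the point at which on-server rules stop helping.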
Your options, in order of preference:
- Request upstream ACL filtering: Contact your hosting provider or upstream transit provider and request they apply ACL rules to filter the attack traffic before it reaches your port. Provide them with the attack characteristics: target IP, target port, and any source IP patterns you have identified. Most transit providers can deploy ACLs within minutes if you have a pre-established escalation path.
- Activate a BGP remotely triggered black hole (RTBH): If your provider supports it and you have pre-configured BGP communities, announce the attacked IP as a host route tagged with the provider's blackhole community. This drops all traffic to that IP at the provider's edge, stopping the attack but also stopping legitimate traffic. Users must reconnect to a different server.
- Migrate users to a backup concentrator: If you have a standby concentrator on a different IP (and ideally a different provider), update your DNS or API endpoint to direct clients to the backup server. This is the fastest way to restore service, though users will experience a brief disconnection during the migration.
Null routing should be your last resort because it means the attacker wins for the targeted IP. But if the alternative is a saturated link that takes out other services sharing the same upstream connection, it may be the right call. The key is having the backup concentrator ready so users can migrate immediately.
User Communication During the Attack
While you are mitigating the attack, your users are experiencing disconnections. How you communicate during the incident significantly impacts churn. Users who understand what is happening and see that you are responding are far more likely to stay than users who are left in the dark.
Prepare template communications in advance so you are not writing status updates under pressure:
- Initial acknowledgment (within 2 minutes): "We are aware of connectivity issues affecting [region/server]. Our team is actively mitigating. Users may experience brief disconnections. If your client does not reconnect within 60 seconds, please switch to [backup server]."
- Mitigation update (within 10 minutes): "Traffic has been stabilized on [server]. Users should now be able to reconnect. If you are still experiencing issues, please restart your VPN client."
- Resolution notice (after attack ends): "The connectivity issue has been fully resolved. The disruption was caused by a volumetric network attack that has been mitigated. No user data was compromised."
Publish these on your status page and push them through any in-app notification channels. Never mention specific attack details publicly (volume, duration, or source) that could help the attacker calibrate future attempts.
Post-Attack Analysis
Once the attack has ended and service is stable, the work is not over. Post-attack analysis turns a reactive incident into a proactive defense improvement. Within 24 hours, you should document:
- Timeline: When did the attack start? When was it detected? When was mitigation deployed? When did service recover? Every gap in this timeline is a gap in your detection or response capability.
- Attack characteristics: Peak volume (PPS and BPS), protocol/port targeted, source IP diversity, attack duration, any vector rotation patterns.
- User impact: How many users were disconnected? For how long? How many reconnected automatically versus needing manual intervention?
- Detection gap: How long between the attack starting and the first alert? If this was more than 30 seconds, your monitoring needs improvement.
- Mitigation effectiveness: Did the nftables rules work? Was upstream intervention required? How long did it take to deploy?
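The timeline questions reduce to three intervals. A minimal sketch, assuming the four timestamps were recorded as Unix epoch seconds; the function name and output format are hypothetical, and the 30-second figure is the detection target stated in the list above.

```shell
# Hypothetical helper: derive the gaps in an incident timeline.
#   $1 attack start   $2 first alert   $3 mitigation deployed   $4 service recovered
# All arguments are Unix epoch seconds.
timeline_gaps() (
    start=$1
    detected=$2
    mitigated=$3
    recovered=$4
    detect_gap=$((detected - start))
    echo "detection_gap=${detect_gap}s"
    echo "mitigation_gap=$((mitigated - detected))s"
    echo "recovery_gap=$((recovered - mitigated))s"
    if [ "$detect_gap" -gt 30 ]; then
        echo "detection gap exceeds 30s: review monitoring"
    fi
)
```

Timestamps in other formats can be converted first, for example with `date -d "2024-01-01 12:00:00" +%s` on systems with GNU date.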
Review the PCAP data if you captured it. Look at source IP diversity, packet size distribution, and message type breakdown. This data tells you whether this was a simple volumetric flood or a more targeted attack exploiting VPN protocol specifics. It also helps you tune your nftables rules and detection thresholds for next time.
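A sketch of that review using tcpdump, assuming an IPv4 capture of the WireGuard port saved to a file. Both function names are hypothetical. The `udp[8]` filter expression reads the first byte of the UDP payload, which carries the WireGuard message type (1 initiation, 2 response, 3 cookie reply, 4 transport data); this indexing does not cover IPv6 packets.

```shell
# Count distinct source IPs from tcpdump text output read on stdin.
# Expects lines such as:
#   12:00:00.000001 IP 198.51.100.7.40000 > 203.0.113.5.51820: UDP, length 148
count_sources() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "IP") {
                src = $(i + 1)
                sub(/\.[0-9]+$/, "", src)   # strip the trailing source port
                seen[src] = 1
            }
        }
    }
    END { n = 0; for (s in seen) n++; print n }'
}

# Print a packet count per WireGuard message type for a capture file.
# Assumes the WireGuard port 51820 used elsewhere in this playbook.
type_breakdown() {
    pcap=$1
    for t in 1 2 3 4; do
        n=$(tcpdump -nn -r "$pcap" "udp dst port 51820 and udp[8] = $t" 2>/dev/null \
            | wc -l | tr -d ' ')
        echo "type $t: $n"
    done
}
```

With a capture named `attack.pcap` (a placeholder), `tcpdump -nn -r attack.pcap 'udp dst port 51820' | count_sources` gives the source diversity and `type_breakdown attack.pcap` gives the message type split. A capture dominated by type 1 from many sources points at a handshake flood rather than a generic volumetric one.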
Setting Up Automated Detection for Next Time
The playbook above assumes manual detection and manual response. That works once. For sustained protection, you need automated detection that alerts in seconds and can deploy mitigation rules without waiting for a human to SSH in.
What automated detection needs to cover for VPN infrastructure:
- Per-concentrator traffic baselines: Each server needs its own baseline for packet rate, byte rate, and connection rate on VPN ports. An attack on one server should not be averaged out by normal traffic on other servers.
- Protocol-aware classification: The system should distinguish between WireGuard transport data, handshake initiations, and invalid packets. A spike in initiations without a corresponding increase in transport data is an attack signal, not an organic surge in user traffic.
- Automated nftables deployment: When an attack is detected, the system should be able to deploy rate-limiting rules at the kernel level within seconds, not minutes.
- Upstream escalation triggers: If on-server mitigation is insufficient (link saturation detected), the system should automatically open a ticket with the upstream provider or trigger a pre-configured BGP community announcement.
- User impact correlation: The detection system should correlate attack traffic with active tunnel counts. A drop in active WireGuard peers concurrent with a traffic spike confirms user impact and elevates the severity of the alert.
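The protocol-aware and user-impact signals in the list above can be combined into a simple per-concentrator check. This is a sketch under stated assumptions, not a description of how any particular product works: the function name, the 5x initiation multiplier, the 1.5x transport ceiling, and the 20% peer-drop threshold are all illustrative values.

```shell
# Hypothetical severity check for one concentrator. Arguments are per-second
# rates and peer counts: current value first, then that server's own baseline.
#   $1 initiations now    $2 initiations baseline
#   $3 transport now      $4 transport baseline
#   $5 active peers now   $6 active peers baseline
assess_concentrator() (
    init_now=$1; init_base=$2
    data_now=$3; data_base=$4
    peers_now=$5; peers_base=$6
    severity="ok"
    # Initiations up at least 5x while transport data stays under 1.5x baseline
    if [ "$init_now" -ge $((init_base * 5)) ] &&
       [ $((data_now * 2)) -lt $((data_base * 3)) ]; then
        severity="attack"
        # Active peers down more than 20% confirms user impact
        if [ $((peers_now * 5)) -lt $((peers_base * 4)) ]; then
            severity="attack-with-user-impact"
        fi
    fi
    echo "$severity"
)
```

Because every argument pair is a single server's own current value and baseline, an attack on one concentrator is not averaged away by normal traffic on the others.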