Business Continuity
Availability Commitments
Flowtriq maintains a 99.9% monthly uptime SLA for its web dashboard and REST API. The full SLA, including credit calculation and exclusions, is published at flowtriq.com/legal.
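To make the 99.9% figure concrete, the short sketch below converts a monthly uptime percentage into a downtime budget. It is plain arithmetic only: the function name `downtime_budget` is illustrative, and the calculation ignores the exclusions and measurement rules defined in the published SLA, which remain authoritative.

```python
from datetime import timedelta


def downtime_budget(uptime_pct: float, days_in_month: int) -> timedelta:
    """Maximum downtime in a month that still meets the given uptime percentage."""
    total = timedelta(days=days_in_month)
    return total * (1 - uptime_pct / 100)


# Print the budget for the possible month lengths at 99.9% uptime.
for days in (28, 30, 31):
    minutes = downtime_budget(99.9, days).total_seconds() / 60
    print(f"{days}-day month: {minutes:.1f} minutes of allowable downtime")
```

For a 30-day month, 99.9% uptime corresponds to 43.2 minutes of allowable downtime.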
Real-time platform status and incident history are available at flowtriq.com/status. Users can subscribe to status updates by email or webhook from the status page.
Infrastructure Redundancy
| Layer | Redundancy Measure |
|---|---|
| Network edge | Cloudflare's global anycast network provides redundant routing to the Flowtriq origin. With more than 300 points of presence, Cloudflare automatically routes requests around network failures. |
| DDoS protection | Cloudflare provides unmetered DDoS mitigation at the network edge, protecting the Flowtriq platform against the same classes of attack it helps customers detect on their own infrastructure. |
| Origin servers | Flowtriq's application servers are located in Canada. Infrastructure is configured to support maintenance and updates without requiring full service downtime. |
| Database | The Flowtriq database is backed up regularly. Backup integrity is verified periodically. Recovery from backup is tested as part of business continuity practices. |
| Agent communication | ftagent implements retry logic and queuing for metric transmission. If the Flowtriq API is temporarily unavailable, the agent continues monitoring locally and resumes transmission when connectivity is restored. Attack detection and local mitigation are not dependent on cloud connectivity. |
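The agent behavior in the last row follows a common store-and-forward pattern. The sketch below illustrates that pattern in general terms; it is not ftagent's actual implementation. The class name `BufferedSender`, the `send` callable, the queue size, and the backoff values are all assumptions made for illustration.

```python
from collections import deque
from typing import Callable, Deque


class BufferedSender:
    """Queue metrics locally and forward them in order when the API is reachable."""

    def __init__(self, send: Callable[[dict], None], max_queued: int = 1000):
        self._send = send  # hypothetical transport; raises ConnectionError on failure
        # Bounded queue: when full, the oldest metric is dropped first.
        self._queue: Deque[dict] = deque(maxlen=max_queued)
        self.failures = 0  # consecutive failed send attempts

    def submit(self, metric: dict) -> None:
        """Always enqueue first, so local collection never blocks on the network."""
        self._queue.append(metric)
        self.flush()

    def flush(self) -> int:
        """Try to drain the queue in order; stop at the first failure."""
        sent = 0
        while self._queue:
            try:
                self._send(self._queue[0])
            except ConnectionError:
                self.failures += 1
                break
            self._queue.popleft()  # remove only after a confirmed send
            self.failures = 0
            sent += 1
        return sent

    def pending(self) -> int:
        """Number of metrics waiting for transmission."""
        return len(self._queue)

    def next_retry_delay(self, base: float = 1.0, cap: float = 60.0) -> float:
        """Exponential backoff in seconds, based on consecutive failures."""
        return min(cap, base * 2 ** self.failures)
```

A metric is removed from the queue only after `send` returns without raising, so an outage delays delivery rather than losing data, up to the queue bound. Detection logic would run before `submit` and therefore does not depend on the transport.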
Maintenance Windows
Flowtriq's maintenance window feature allows customers to schedule planned downtime on their monitored servers without generating false alerts. When a maintenance window is active for a node:
- Alert suppression: notifications for that node are paused for the duration of the window
- Monitoring continues: ftagent keeps running and collecting metrics without interruption
- Audit logged: maintenance window creation and completion are recorded in the tamper-evident audit log
- Automatic resumption: alerting resumes automatically when the maintenance window expires, with no manual action required
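The suppression and resumption behavior listed above can be sketched as a simple time-range check. This is a minimal illustration under stated assumptions, not Flowtriq's implementation: `MaintenanceWindow`, `is_suppressed`, and `handle_alert` are hypothetical names, and audit logging is omitted.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class MaintenanceWindow:
    """Hypothetical record of a scheduled window for one monitored node."""
    node_id: str
    start: datetime  # inclusive
    end: datetime    # exclusive: alerting resumes at this instant


def is_suppressed(node_id: str, at: datetime,
                  windows: Iterable[MaintenanceWindow]) -> bool:
    """True if any window for this node covers the given time."""
    return any(w.node_id == node_id and w.start <= at < w.end for w in windows)


def handle_alert(node_id: str, at: datetime,
                 windows: Iterable[MaintenanceWindow],
                 recorded: List[Tuple[str, datetime]],
                 notified: List[Tuple[str, datetime]]) -> None:
    """Record every event; notify only when the node is outside a window."""
    recorded.append((node_id, at))  # monitoring continues regardless
    if not is_suppressed(node_id, at, windows):
        notified.append((node_id, at))


# Example: a one-hour window on a node with the illustrative ID "node-a".
windows = [
    MaintenanceWindow(
        node_id="node-a",
        start=datetime(2025, 1, 10, 2, 0, tzinfo=timezone.utc),
        end=datetime(2025, 1, 10, 3, 0, tzinfo=timezone.utc),
    )
]
```

Because suppression is evaluated against the current time on each alert, no separate step is needed to re-enable alerting when the window ends.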
Flowtriq communicates its own planned maintenance periods via the status page at flowtriq.com/status, with at least 24 hours' notice for non-emergency maintenance.
Disaster Recovery
| Scenario | Recovery Approach |
|---|---|
| Platform unavailability (DDoS against Flowtriq) | Cloudflare edge network absorbs attack traffic. Origin servers remain protected behind Cloudflare's anycast network. ftagent continues operating locally during any platform unavailability — detection and local mitigation are not interrupted. |
| Database corruption or failure | Recovery from the most recent verified backup. The Recovery Point Objective (RPO), the maximum window of data that may be lost, is targeted at 24 hours or less. The Recovery Time Objective (RTO), the maximum time to restore service, is targeted at 4 hours for critical services. |
| Application server failure | Service restoration from the most recent known-good deployment. Status page updated within 30 minutes of a service-affecting incident being declared. |
| Third-party processor outage (SendGrid, Stripe) | Email alert delivery degrades gracefully: alerts are queued and retried. Billing is handled by Stripe independently; platform access is not immediately affected by Stripe outages. Webhook and direct integrations (Discord, Slack, PagerDuty) are unaffected by email provider outages. |
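The RPO and RTO targets in the table can be checked against the timestamps of a recovery exercise. The sketch below shows one way to express that check, using the 24-hour and 4-hour targets stated above. The function names and example timestamps are illustrative assumptions, not part of any Flowtriq tooling.

```python
from datetime import datetime, timedelta, timezone

RPO_TARGET = timedelta(hours=24)  # maximum acceptable data-loss window
RTO_TARGET = timedelta(hours=4)   # maximum acceptable time to restore


def rpo_met(last_verified_backup: datetime, failure_time: datetime,
            rpo: timedelta = RPO_TARGET) -> bool:
    """Data written after the last verified backup is lost on restore."""
    return failure_time - last_verified_backup <= rpo


def rto_met(failure_time: datetime, restored_time: datetime,
            rto: timedelta = RTO_TARGET) -> bool:
    """Time from failure until service is restored."""
    return restored_time - failure_time <= rto


# Illustrative timestamps for a recovery test (not real incident data).
backup = datetime(2025, 3, 1, 1, 0, tzinfo=timezone.utc)
failure = datetime(2025, 3, 1, 19, 0, tzinfo=timezone.utc)   # 18 h after backup
restored = datetime(2025, 3, 1, 22, 30, tzinfo=timezone.utc)  # 3.5 h after failure

print("RPO met:", rpo_met(backup, failure))
print("RTO met:", rto_met(failure, restored))
```

In this example both targets are met: the backup is 18 hours old at the time of failure, and service is restored 3.5 hours later.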
Status & Communication
- Status page: flowtriq.com/status — real-time platform status, incident history, and planned maintenance notices
- Status subscriptions: Subscribe by email or webhook from the status page to receive automatic notifications on incidents and maintenance
- Incident communication: All platform incidents are posted to the status page within 2 hours of declaration (within 30 minutes for service-affecting incidents), with updates posted throughout the incident lifecycle
- Post-incident reports: Published on the status page following resolution of significant incidents