Cloudflare Outage: June 12, 2025 – What Happened and What Comes Next
On June 12, 2025, a Cloudflare outage disrupted multiple services globally. Lasting 2 hours and 28 minutes, the incident affected key products such as Workers KV, Access, WARP, Gateway, Images, Stream, Workers AI, Turnstile, and parts of the Cloudflare Dashboard.
While the root cause was a failure in the third-party cloud storage provider backing Workers KV, Cloudflare acknowledged its responsibility for the architecture and dependencies it builds upon.
No user data was lost, and core services like DNS, caching, WAF, and Magic Transit continued to operate normally. However, services built on Workers KV suffered significantly.
What Caused the Outage?
Workers KV is a foundational component across many Cloudflare services. While it is designed to be coreless and distributed, it still depends on a central data store for synchronization and consistency.
The failure originated in this central storage backend, a third-party cloud provider. Uncached ("cold") reads and writes to the KV store began to fail: cached data was served normally, but anything requiring a backend fetch returned 503 or 500 errors.
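To make the failure mode concrete, below is a minimal sketch of a Worker reading from KV. The binding name CONFIG_KV and the key are hypothetical, not taken from the incident report; the point is that a read satisfied from the edge cache returned data, while a read that had to reach the central backend threw, surfacing to callers as a 503:

```typescript
// Hypothetical Worker illustrating the observed failure mode: cached KV
// reads kept working, while "cold" reads needing the central backend failed.
interface Env {
  CONFIG_KV: KVNamespace; // illustrative binding name
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    try {
      // cacheTtl lets the edge answer repeat reads from cache; during the
      // incident, reads served this way continued to succeed.
      const value = await env.CONFIG_KV.get("tenant-config", { cacheTtl: 300 });
      if (value === null) {
        return new Response("not found", { status: 404 });
      }
      return new Response(value);
    } catch {
      // A cold read that had to reach the central store failed here,
      // which callers saw as 503/500 responses.
      return new Response("KV backend unavailable", { status: 503 });
    }
  },
} satisfies ExportedHandler<Env>;
```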
Services Impacted
Here’s a breakdown of affected services and the specific issues they encountered:
- Workers KV: 90% of uncached requests failed. Cached content served normally.
- Access: 100% failure in identity-based logins. SCIM updates failed. Service-token based access remained unaffected.
- Gateway: DNS mostly unaffected except for identity-based DoH queries. Proxy and TLS decryption services failed.
- WARP: New device registration and session authentication failed. Existing sessions with active tokens were unaffected.
- Dashboard: Standard, Google, and SSO logins failed due to dependencies on Turnstile and KV. The v4 API remained functional.
- Turnstile & Challenges: The siteverify API returned errors. Kill switches prevented user-facing failure, but valid tokens could be reused during the incident (see the validation sketch after this list).
- Images: Uploads failed. Delivery degraded slightly.
- Stream: 90%+ error rate. Live services failed. Uploads were unaffected.
- Realtime: TURN and SFU services degraded. Near-complete failure for TURN.
- Workers AI & AutoRAG: All AI inference and document processing tasks failed.
- Browser Isolation: Existing and new sessions failed due to dependencies on Access and Gateway.
- Pages & Workers Assets: Pages builds failed completely (100%). Workers Assets delivery saw a minor spike in errors (0.06%).
- Durable Objects & D1: Shared backend failure caused up to 22% error rates.
- Queues & Notifications: Messaging halted due to KV mapping failures.
- AI Gateway: 97% of requests failed during the peak of the outage.
- CDN: Some traffic rerouting failed in regions like São Paulo, Atlanta, and Philadelphia, leading to latency and 503/499 errors.
- Zaraz: 100% failure. Config updates during the incident were lost for one user.
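The Turnstile entry above highlights a policy decision every integrator faces. The sketch below validates a token against Turnstile's documented siteverify endpoint; the fail-open choice at the end is an illustrative assumption, mirroring the trade-off the kill switches made during this incident: availability at the cost of possible token reuse.

```typescript
// Server-side Turnstile validation. The siteverify endpoint and request
// fields are Turnstile's documented API; the outage policy below is an
// illustrative assumption, not Cloudflare's guidance.
const SITEVERIFY_URL = "https://challenges.cloudflare.com/turnstile/v0/siteverify";

async function verifyTurnstileToken(
  token: string,
  secret: string,
): Promise<"valid" | "invalid" | "unavailable"> {
  const body = new URLSearchParams({ secret, response: token });
  let res: Response;
  try {
    res = await fetch(SITEVERIFY_URL, { method: "POST", body });
  } catch {
    return "unavailable"; // network failure reaching siteverify
  }
  if (!res.ok) {
    return "unavailable"; // e.g. the errors seen during the incident
  }
  const result = (await res.json()) as { success: boolean };
  return result.success ? "valid" : "invalid";
}

// The caller decides the outage policy: failing open keeps logins working
// but, as during this incident, means a captured token could be reused.
async function isHuman(token: string, secret: string): Promise<boolean> {
  const verdict = await verifyTurnstileToken(token, secret);
  return verdict === "valid" || verdict === "unavailable";
}
```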
Incident Timeline (UTC)
- 17:52 – Device registration fails in WARP. Incident begins.
- 18:05 – Error rates spike across multiple services.
- 18:06 – Root cause traced to Workers KV. Incident escalated.
- 18:21 – Incident severity upgraded to P0 (highest priority).
- 18:43 – Access team begins re-architecting to avoid KV.
- 19:09 – Gateway team starts removing KV dependencies.
- 19:32 – Load-shedding begins to protect remaining KV capacity (see the sketch after this timeline).
- 20:23 – KV storage backend begins recovering.
- 20:28 – Incident ends. Monitoring continues.
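Cloudflare has not published the internals of the 19:32 load-shedding step, but the general technique is straightforward: reject a configurable share of lower-priority requests before they reach the strained backend. A minimal sketch, with a hypothetical shed ratio and priority rule:

```typescript
// Generic load-shedding gate of the kind described at 19:32: reject a
// configurable fraction of lower-priority requests so a recovering backend
// is not overwhelmed. The ratio and priority rule here are hypothetical.
interface ShedConfig {
  shedRatio: number; // 0 = serve everything, 1 = shed all non-critical load
}

function shouldShed(request: Request, config: ShedConfig): boolean {
  // Example priority rule: never shed requests that keep existing sessions
  // alive; shed a random share of everything else.
  const critical = new URL(request.url).pathname.startsWith("/session/refresh");
  if (critical) return false;
  return Math.random() < config.shedRatio;
}

export default {
  async fetch(request: Request): Promise<Response> {
    if (shouldShed(request, { shedRatio: 0.7 })) {
      // 429 with Retry-After tells well-behaved clients to back off.
      return new Response("shed to protect backend", {
        status: 429,
        headers: { "Retry-After": "30" },
      });
    }
    return new Response("served");
  },
};
```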
Remediation Measures and Next Steps
Cloudflare is accelerating several infrastructure updates to avoid such outages in the future:
- Decoupling from single-provider dependencies: Infrastructure is being re-architected to reduce reliance on external vendors.
- KV migration to Cloudflare R2: Improves data resilience and meets compliance needs by moving the backing store onto Cloudflare's own infrastructure.
- Progressive namespace reactivation tooling: Helps gradually restore services without overwhelming the backend.
- Blast radius mitigation: Making each product resilient to failures of underlying components like Workers KV (a sketch of one such fallback pattern follows this list).
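For the blast-radius item, one common pattern is to treat KV as an optimization rather than a hard dependency: keep a fallback copy of critical values in the edge cache and serve it, possibly stale, when KV fails. A minimal sketch, again with a hypothetical binding and key scheme, not Cloudflare's actual remediation code:

```typescript
// Sketch of one blast-radius pattern: serve a stale edge-cached copy when
// the authoritative KV read fails, instead of failing the whole request.
interface Env {
  CONFIG_KV: KVNamespace; // illustrative binding name
}

export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    const cacheKey = new Request("https://fallback.internal/tenant-config");
    try {
      const fresh = await env.CONFIG_KV.get("tenant-config");
      if (fresh !== null) {
        // Keep a fallback copy at the edge for future KV failures.
        ctx.waitUntil(
          caches.default.put(
            cacheKey,
            new Response(fresh, { headers: { "Cache-Control": "max-age=86400" } }),
          ),
        );
        return new Response(fresh);
      }
    } catch {
      // KV backend unreachable: fall through to the stale copy.
    }
    const stale = await caches.default.match(cacheKey);
    return stale ?? new Response("config unavailable", { status: 503 });
  },
} satisfies ExportedHandler<Env>;
```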
Cloudflare emphasized that internal teams are auditing service dependencies and pushing for greater infrastructure independence and fault tolerance. The transition to more robust architecture is ongoing and a top priority.
Transparency and Accountability
Cloudflare has taken full responsibility for the outage, acknowledging that even though the immediate failure was caused by a third-party provider, the architectural decisions that created this dependency rest with Cloudflare.
They have committed to publishing further technical insights and will continue to update stakeholders through their blog and incident reports.