How we built Cloudflare's data platform and an AI agent on top of it
2026-05-28
Here’s how we built Town Lake, Cloudflare's unified analytics platform, alongside Skipper, an internal AI agent running on top of it....
2026-05-18
In recent weeks, we pointed Mythos and other security-focused LLMs at live code across critical parts of our infrastructure. We share what we observed, the models’ strengths and weaknesses, and what the work around them needs to look like before any of it can scale....
2026-05-14
When a partitioning change to our petabyte-scale ClickHouse cluster caused critical billing jobs to stall, standard metrics showed no obvious errors. This post explores how we identified severe lock contention in ClickHouse's query planner and built upstream patches to fix it....
2026-04-22
Panics in Rust Workers were historically fatal, poisoning the entire instance. By collaborating upstream on the wasm‑bindgen project, Rust Workers now support resilient critical error recovery, including panic unwinding using WebAssembly Exception Handling....
2026-03-23
Cloudflare's Gen 13 servers introduce AMD EPYC™ Turin 9965 processors and a transition to 100 GbE networking to meet growing traffic demands. In this technical deep dive, we explain the engineering rationale behind each major component selection....
March 23, 2026
Cloudflare’s Gen 13 servers double our compute throughput by rethinking the balance between cache and cores. Moving to high-core-count AMD EPYC™ Turin CPUs, we traded large L3 cache for raw compute density. By running our new Rust-based FL2 stack, we completely mitigated the lat...
February 27, 2026
We serve 7.6 billion challenges daily. Here’s how we used research, AAA accessibility standards, and a unified architecture to redesign the Internet’s most-seen user interface....
February 13, 2026
ecdysis is a Rust library enabling zero-downtime upgrades for network services. After five years protecting millions of connections at Cloudflare, it’s now open source....
November 13, 2025
We explore the fundamentals of Saltstack and how we use it at Cloudflare. We also explain how we built the infrastructure to reduce release delays due to Salt failures on the edge by over 5%. ...
September 26, 2025
We’ve replaced the original core system in Cloudflare, built on NGINX, with a new modular Rust-based proxy. ...
September 26, 2025
We reduced Cloudflare Workers cold starts by 10x by optimistically routing to servers with already-loaded Workers. Learn how we did it here....
September 25, 2025
We are further hardening Cloudflare Workers with the latest software and hardware features. We use defense-in-depth, including V8 sandboxes and the CPU's memory protection keys to keep your data safe....
July 23, 2025
Faced with a data-ingestion challenge at a massive scale, Cloudflare's Business Intelligence team built a new framework called Jetflow....
October 22, 2024
Cloudflare's Vectorize is now generally available, offering faster responses, lower pricing, a free tier, and supporting up to 5 million vectors....
October 09, 2024
We realized that we needed a way to automatically heal our platform from an operations perspective, so we designed and built a workflow orchestration platform to provide these self-healing capabilities ...
June 03, 2024
Recently, Cloudflare's Observability team migrated our existing syslog-ng backed logging infrastructure to one backed by OpenTelemetry Collectors. In this post, we detail the process we followed and the difficulties we faced along the way...