Discovered by Google Project Zero researchers, the problem was mitigated within an hour and fixed in under seven hours

Feb 24, 2017 10:59 GMT

A Cloudflare bug uncovered by Google researchers caused large websites to leak people's private session keys, along with personal information, straight into strangers' browsers.

The bug, discovered by Google's Project Zero, caused Cloudflare reverse proxies to dump uninitialized memory into the pages they served, which is why "Cloudbleed" seems a fitting name.

"It looked like that if an html page hosted behind cloudflare had a specific combination of unbalanced tags, the proxy would intersperse pages of uninitialized memory into the output (kinda like heartbleed, but cloudflare specific and worse for reasons I'll explain later). My working theory was that this was related to their 'ScrapeShield' feature which parses and obfuscates html - but because reverse proxies are shared between customers, it would affect *all* Cloudflare customers," the blog post signed by vulnerability researcher Tavis Ormandy reads.

After investigating the issue more closely, the researchers collected a few live samples full of encryption keys, cookies, passwords, chunks of POST data, and even HTTPS requests that other users had made to other major Cloudflare-hosted sites.

Quick contact, quick results

Cloudflare was contacted immediately after Google's researchers figured out what was happening. "After I explained the situation, cloudflare quickly reproduced the problem, told me they had convened an incident and had an initial mitigation in place within an hour," Ormandy adds, indicating just how quickly such problems can be fixed if notifications arrive at the right time.

In fact, Cloudflare says that the industry-standard time allowed to deploy a fix for a bug like this is usually three months, but it rolled out the complete fix globally in under seven hours, with an initial mitigation in place after 47 minutes.

About 2 million websites on the Cloudflare network may have been affected. However, the service says that customer SSL private keys were not leaked and that they always terminated SSL connections through an isolated instance of NGINX that was not affected by the bug.

"We quickly identified the problem and turned off three minor Cloudflare features (email obfuscation, Server-side Excludes and Automatic HTTPS Rewrites) that were all using the same HTML parser chain that was causing the leakage. At that point it was no longer possible for memory to be returned in an HTTP response," Cloudflare writes.

How did this happen?

So what happened? Well, the company explains that in order to modify the HTML of a page, it needs to read and parse the HTML to find the elements that need changing, a job for which it used a parser written with Ragel. A single .rl file contains the HTML parser used for all on-the-fly HTML modifications performed by Cloudflare.

Then, about a year ago, Cloudflare decided to transition to a new parser, named cf-html, which works correctly with HTML5 and is faster and easier to maintain.

This new parser was used for the Automatic HTTPS Rewrites feature, and Cloudflare has been slowly migrating functionality that uses the old Ragel parser to cf-html.

The bug that caused the problem is in Cloudflare's use of Ragel, not in Ragel itself, the company admitted.

"It turned out that the underlying bug that caused the memory leak had been present in our Ragel-based parser for many years but no memory was leaked because of the way the internal NGINX buffers were used. Introducing cf-html subtly changed the buffering which enabled the leakage even though there were no problems in cf-html itself," Cloudflare explains.

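Cloudflare's statement above does not show the faulty code. As a rough illustration only, the Python sketch below simulates the kind of overrun described in Cloudflare's incident report, where the end of the buffer was tested with an equality check that a pointer could step over. The memory layout, the emit_page function and the rule that an '=' moves the pointer forward by two are all hypothetical, chosen to keep the example small; this is not Cloudflare's parser.

```python
# Simulated process memory: the page being parsed, followed by bytes
# left over from an unrelated request (both values are made up).
PAGE = b"<p>hi</p><script type="
LEFTOVER = b"password:hunter2"
MEMORY = PAGE + LEFTOVER
PE = len(PAGE)  # index one past the end of the page buffer


def emit_page(memory: bytes, pe: int, safe: bool) -> bytes:
    """Copy bytes to the output until the end-of-buffer check fires."""
    out = bytearray()
    p = 0
    while p < len(memory):  # hard stop so the simulation always terminates
        byte = memory[p]
        out.append(byte)
        # Hypothetical parser quirk: '=' advances the pointer by two, not one.
        p += 2 if byte == ord("=") else 1
        # The safe check uses >=; the unsafe one uses ==, which a pointer
        # that jumps past pe never satisfies.
        at_end = (p >= pe) if safe else (p == pe)
        if at_end:
            break
    return bytes(out)


if __name__ == "__main__":
    print(emit_page(MEMORY, PE, safe=True))
    print(emit_page(MEMORY, PE, safe=False))
```

With safe=True the output is exactly the page. With safe=False the pointer skips over the end marker and the output also carries bytes from LEFTOVER, which stand in for another user's data.
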
The features causing the problem were disabled. The global kill switch for Email Obfuscation was activated 47 minutes after the problem was reported to Cloudflare, and the one for Automatic HTTPS Rewrites 3 hours later. The third feature, Server-Side Excludes, did not have a global kill switch, so the company had to implement one before it could be deployed.
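
For readers unfamiliar with the idea, a global kill switch can be pictured as a flag that is checked before a feature is allowed to touch a page. The Python sketch below is a simplified assumption of how such a gate might look; the feature names follow the article, but the transforms, the FEATURES table and the process function are hypothetical and do not reflect Cloudflare's actual implementation.

```python
from typing import Callable, Dict, Set


def obfuscate_emails(html: str) -> str:
    # Toy stand-in for email obfuscation.
    return html.replace("@", " [at] ")


def rewrite_to_https(html: str) -> str:
    # Toy stand-in for automatic HTTPS rewrites.
    return html.replace("http://", "https://")


# Hypothetical registry of HTML-modifying features, applied in order.
FEATURES: Dict[str, Callable[[str], str]] = {
    "email_obfuscation": obfuscate_emails,
    "automatic_https_rewrites": rewrite_to_https,
}


def process(html: str, killed: Set[str]) -> str:
    """Apply every feature whose kill switch has not been activated."""
    for name, transform in FEATURES.items():
        if name in killed:
            continue  # kill switch active: the feature never sees the page
        html = transform(html)
    return html


if __name__ == "__main__":
    page = '<a href="http://example.com">bob@example.com</a>'
    print(process(page, set()))
    print(process(page, {"email_obfuscation", "automatic_https_rewrites"}))
```

With both switches activated, the page passes through unmodified, which is the effect Cloudflare needed while the parser bug was still present.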