A "diff" algorithm that offers ten times bsdiff's performance

Jul 16, 2009 12:15 GMT  ·  By
Google introduces Courgette, a "diff" algorithm that offers ten times bsdiff's performance
   Google introduces Courgette, a "diff" algorithm that offers ten times bsdiff's performance

Google wants to keep its software up to date so most of its applications – Chrome, Picasa, etc. – have an automated update system. This can prove very convenient for the users but because of the company's fast development cycle it can mean that they regularly have to install new versions, which can get a little annoying. As such, Google engineers have come up with a solution that offers the best compromise with a new compression technology called Courgette that offers greatly reduced downloads for Chrome.

Having an updated piece of software is always a great idea as it generally means less security worries, faster programs and better features. But the hassle of constantly keeping an application updated, searching for newer versions and installing them if they are available, means that most users get the latest versions infrequently. This is where automatic updates come in, removing their need to look for new versions by themselves. But this has its own disadvantages as it means that users have to install the new version every time it's available, wasting time with the download and the reinstall.

This is a problem especially for Google as its development cycles and versioning systems are mostly different from the rest of the software developers and have more to do with the search giant's background as a web company. But while frequent small updates aren't a problem when updating software on your own servers, that changes when dealing with millions of installs around the world. However, the developers behind Google Chrome have come up with a solution.

“On the stable channel these are mainly security bug fixes, but the updates are more adventurous and numerous on developer channel,” Stephen Adams, Google software engineer, wrote. “It is an anathema to us to push out a whole new 10MB update to give you a ten line security fix. We want smaller updates because it narrows the window of vulnerability. If the update is a tenth of the size, we can push ten times as many per unit of bandwidth.”

The new compression algorithm, called Courgette, allows downloads to be several times smaller than before. This is accomplished using a “diff” system that, instead of requiring that full versions of the updated files be downloaded, like now, only needs a small file with the “differences” between the current and the updated file.

Chrome has been using bsdiff to accomplish this up till now but the developers believed they could do better and have come up with an algorithm that, in their case, is producing files ten times smaller than bsdiff. While the full download would be 10,385,920 bytes the bsdiff one is only 704,512 bytes. But, by using Courgette, Google got it to just 78,848 bytes. And because the technology is open source many developers could benefit from the improved algorithm, opening the door for faster and more frequent updates, meaning safer software.