Thanks to open source technology and the Amazon cloud

Nov 25, 2011 14:26 GMT

There was a time when open source technology was the poor man's solution. It enabled startups to get going, but real companies paid for their software.

That time is long gone. The generation of companies built on open source technology changed the mindset. Now only companies stuck in the past are not using open source technology in one form or another.

In fact, this becomes more apparent with each new generation of companies. For example, Google relied heavily on low-level open source software such as Linux and Apache in its early days.

But it found itself in a position where the type of software it needed, software that scaled the way it wanted, simply did not exist. So it set about building its own. To date, Google's large infrastructure services are some of the best in the world, but they are very closely guarded.

Google does open source some technologies and is a big contributor in some areas, but its core software remains closed.

However, by the time Facebook came about, open source software was in a much better state, so the company was able to use more "off the shelf" components than Google.

As such, Facebook became a much bigger contributor to open source projects designed to handle huge amounts of data. Its Cassandra project, now run by the Apache Software Foundation, is just one example.

Another prominent example of successful open source big data software is the popular Hadoop, developed in large part at Yahoo. In fact, Facebook dropped Cassandra for a new system built on top of Hadoop.

As open source technology evolved, the cloud became a real thing too. Commoditized computing power became a reality thanks in large part to Amazon Web Services.

Put these two things together and you have companies like Netflix, the single biggest source of internet traffic in the US, running entirely on Amazon's cloud and relying on open source technology such as Apache Cassandra.

Netflix is switching over to Cassandra for its data storage needs and recently ran a scalability test on AWS, the same platform it uses in production.

In its testing, Netflix went from a cluster of 48 instances to a maximum of 288. It used the built-in stress test, with several other AWS instances posing as clients. Unsurprisingly, Cassandra scaled linearly as usage grew, proving its suitability for Netflix's needs.
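
For readers who want to see what "scaled linearly" means in practice, here is a minimal Python sketch that compares observed speedup with the ideal speedup for a larger cluster. Only the cluster sizes (48 and 288) come from the test described above. The function name `scaling_efficiency` and the throughput figures are hypothetical placeholders chosen for illustration, not Netflix's measurements.

```python
def scaling_efficiency(base_nodes, base_throughput, nodes, throughput):
    """Return observed speedup divided by ideal (linear) speedup.

    A value of 1.0 means perfectly linear scaling; lower values mean
    each added node contributes less than the baseline nodes did.
    """
    if min(base_nodes, nodes) <= 0 or base_throughput <= 0:
        raise ValueError("node counts and baseline throughput must be positive")
    ideal_speedup = nodes / base_nodes
    observed_speedup = throughput / base_throughput
    return observed_speedup / ideal_speedup


# Cluster sizes are taken from the article.
BASE_NODES, MAX_NODES = 48, 288

# Made-up throughput numbers (operations per second), for illustration only.
hypothetical_base_ops = 100_000
hypothetical_max_ops = 600_000

efficiency = scaling_efficiency(
    BASE_NODES, hypothetical_base_ops, MAX_NODES, hypothetical_max_ops
)
print(f"ideal speedup: {MAX_NODES / BASE_NODES:.1f}x, efficiency: {efficiency:.2f}")
```

Going from 48 to 288 instances is a sixfold increase, so a perfectly linear system would handle six times the baseline load. An efficiency close to 1.0 is what a linear scaling claim amounts to.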