It will serve as a platform to run the entire Internet

Feb 6, 2008 16:35 GMT

Thinking big seems pretty outdated these days; to keep up with the trend, you must go giant. Intel's two-billion-transistor processor and Sun's 16-core chip are evidence of that. However, nothing compares to IBM's initiative, which is neither big nor huge: it's merely insane.

IBM is reportedly working on a megascale computing system that will be able to host the entire Internet and run it like an application. The system is based on a refurbished and tweaked version of the "notorious" BlueGene/L supercomputer, the world's most powerful machine at the time of writing. The company's engineers are working on "convincing" the stubborn monolith to run a Linux distribution with the most popular web software: Apache, MySQL and Ruby on Rails.

According to the same sources, IBM thinks that both large SMP (symmetric multiprocessing) systems and clusters are fit for hardcore computing. However, enterprises tend to prefer clusters because they are cheaper and easier to maintain. Clusters also let users spread the investment over a long period of time rather than paying all at once.

Even so, such hardware configurations are neither cheap nor energy-efficient: they require a lot of space and plenty of power. IBM wants to consolidate miscellaneous web tasks onto a single BlueGene-class machine, thus lowering the costs.

"We hypothesize that for a large class of web-scale workloads the Blue Gene/P platform is an order of magnitude more efficient to purchase and operate than the commodity clusters in use today," the IBM researchers wrote.

IBM has successfully run these applications on BlueGene machines as part of the Kittyhawk project. The SPECjbb benchmark showed that the performance of Java and the LAMP (Linux, Apache, MySQL, Perl/Python) software matches that of today's clusters.

The BlueGene/P system can integrate thousands of low-power PowerPC processors arranged in card, midplane and server rack configurations. In theory, the largest configuration can reach 16,384 racks with 67.1 million cores and 32 petabytes of memory.
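
For readers who want to check the arithmetic behind those maximums, here is a rough sketch. The rack count comes from the figures above; the per-rack values (1,024 nodes per rack, four cores and 2 GiB of memory per node) are assumptions based on commonly cited Blue Gene/P specifications and are not stated in the article.

```python
# Back-of-the-envelope check of the maximum Blue Gene/P configuration.
# ASSUMPTIONS (not from the article): packaging figures per rack and node.
NODES_PER_RACK = 1024      # assumed: 32 node cards x 32 compute nodes
CORES_PER_NODE = 4         # assumed: quad-core PowerPC chip per node
MEMORY_PER_NODE_GIB = 2    # assumed: 2 GiB of memory per node

MAX_RACKS = 16_384         # rack count quoted in the article


def system_totals(racks: int) -> tuple[int, int]:
    """Return (total cores, total memory in GiB) for a given rack count."""
    nodes = racks * NODES_PER_RACK
    return nodes * CORES_PER_NODE, nodes * MEMORY_PER_NODE_GIB


cores, memory_gib = system_totals(MAX_RACKS)
memory_pib = memory_gib / 1024**2  # convert GiB to PiB

print(f"{cores:,} cores, {memory_pib:g} PiB of memory")
```

Under those assumptions the totals line up with the quoted 67.1 million cores and 32 petabytes; a different memory option per node would change the memory figure accordingly.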

"The key fact to note is that the nodes themselves can be viewed for the most part as general purpose computers, with processors, memory and external IO," they wrote. "The one major exception to this is that the cores have been extended with an enhanced floating point unit to enable super-computing workloads."

Up until now, BlueGene machines have typically run a single, seemingly endless job across the whole system rather than many separate processes.