Based on the memetracking implementation for the Drupal CMS platform

Jul 30, 2008 14:41 GMT  ·  By

News sites have always been popular, and at some point many web developers have felt the need to start one for a particular group of people or field of interest, for a local community or for their college campus.

The problem with creating a successful news website is content maintenance. In addition to good programming skills, one needs a lot of free time because, once the website is finally ready for launch development-wise, someone has to constantly add content. Or do they?

Apparently not. Content can be indexed from external sources, grouped, ranked, categorized, rated and published to the end user automatically using memetracking, a Web 2.0 technique implemented by sites like Google News, Digg, Technorati, Techmeme and others. It works by searching given sources for information (news articles, press releases, blog pages, etc.), a step called aggregation, and then determining the popularity or similarity of the collected items and grouping them accordingly, using algorithms such as the vector space model.
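To make the idea more concrete, here is a minimal Python sketch of similarity-based grouping with the vector space model. It is not the code used by any of the sites or modules mentioned in this article. The function names, the TF-IDF weighting, the greedy grouping strategy and the 0.3 threshold are all assumptions chosen for illustration, and group size stands in as a crude popularity signal.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Turn each document into a sparse {term: weight} TF-IDF vector."""
    token_lists = [tokenize(doc) for doc in docs]
    n = len(docs)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        # Smoothed IDF, so that very common terms keep a small positive weight.
        vectors.append(
            {t: count * (math.log((1 + n) / (1 + df[t])) + 1.0)
             for t, count in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_stories(docs, threshold=0.3):
    """Greedily group documents by similarity to each group's first member.

    Returns lists of document indices, largest group first. The threshold
    is a hypothetical value that would need tuning on real data.
    """
    vectors = tfidf_vectors(docs)
    groups = []
    for i, vec in enumerate(vectors):
        for group in groups:
            if cosine(vec, vectors[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            # No existing group is similar enough, so start a new one.
            groups.append([i])
    groups.sort(key=len, reverse=True)
    return groups


headlines = [
    "Local football team wins championship game",
    "Drupal releases new memetracker module for news sites",
    "New memetracker module released for Drupal news sites",
]
groups = group_stories(headlines)
```

In this toy run the two Drupal headlines share most of their vocabulary and end up in one group, which is ranked ahead of the unrelated football story. A real memetracker would also have to fetch the feeds, weigh links and recency, and handle far noisier text.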

While memetracking is nothing new, it usually proves too much to handle for the common web developer wishing to implement it, as it has a steep learning curve. Fortunately, easy-to-deploy platforms have been adopting memetracking technology for a while now, and some implementations have already been released. One of them is the Memetracker module for the popular Drupal CMS platform. This module, along with a Machine Learning API module, is being developed by Kyle Mathews, a graduate student at Brigham Young University, as part of Google Summer of Code. Even though the modules are still in the alpha stage of development and obviously not ready for production sites, they look very promising and could be an amazing addition to an already very popular CMS platform.

There have been other implementations of memetracking technology with Drupal, like Michael Imbeault's Eureka! Science News project. Even though it is a successful and much-appreciated effort, the development process was far from easy, as Michael describes in his article.

Drupal developers who want to get a head start with these modules or even contribute can already get a functional alpha version from the project's website. The Machine Learning API module provides the learning algorithms that the Memetracker module will use to automatically sort, group and filter the content it gets from external or internal sources.

The Memetracker module, which requires the Machine Learning API module, is supposed to smooth over many of the rough edges Michael Imbeault encountered while working on his Eureka! project, making the development of a human editor-free website much easier for the common web developer. Of course, it will still need a lot of tweaking and fixing to get it working the way you want, but you will have more time to do that, or to take care of other development issues, instead of spending it manually gathering, inputting and sorting information.

This technology is very flexible and can be applied wherever the most important or popular content needs to be aggregated and published automatically. As Kyle Mathews puts it, "a memetracker is a smart aggregator."