Technology limitations and solutions

Jul 24, 2008 11:33 GMT

Search engines are constantly competing for supremacy, and returning the most accurate results for a query means making full use of the newest technologies. Yahoo!, whose approach to semantic search has been rather cautious, explains why the method still has some technical limitations.

"The current generation of search engines is severely limited in its understanding of the user's intent-and the web's content-and consequently in matching the needs for information with the vast supply of resources on the web." says Peter Mika, Researcher at Yahoo! Research, Barcelona, focusing on Semantic Web technologies in an article for DevX.com.

Because search engines still make little use of language processing, some queries are hard to handle. For example, the secondary meaning of a term is rarely taken into account in a search, which makes it very difficult to retrieve precise results. Moreover, multimedia items are sometimes impossible to retrieve, as they are usually described by only a few tags or sentences that crawlers struggle to pick up. Furthermore, when users do not know the precise word for what they are looking for, describing the object in the query is usually of little help.

Natural language processing (which disambiguates some keywords using the context offered by a database) and the Semantic Web (which uses metadata supplied by publishers) are the two technologies used in semantic search, but both still have flaws. Even so, Yahoo!'s semantic search effort is described as a pioneer, all the more so because it is an open platform that professionals can use as well.
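To illustrate what disambiguating a keyword from its context can look like, here is a minimal sketch in Python. It is not Yahoo!'s method: the `SENSES` table, the `disambiguate` function and the word lists are all hypothetical, and the scoring is a simple word-overlap count chosen only for illustration.

```python
# Hypothetical sense inventory: each sense of an ambiguous keyword is
# described by a few context words. A real system would draw these from
# a much larger database.
SENSES = {
    "jaguar": {
        "animal": {"cat", "jungle", "prey", "habitat", "wildlife"},
        "car": {"dealer", "engine", "price", "sedan", "lease"},
    },
}


def disambiguate(keyword, query):
    """Return the sense whose context words overlap most with the query.

    Returns None when the keyword is unknown or no context word matches.
    """
    senses = SENSES.get(keyword.lower())
    if not senses:
        return None
    # Every query word except the keyword itself counts as context.
    context = {w for w in query.lower().split() if w != keyword.lower()}
    scores = {sense: len(words & context) for sense, words in senses.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

Under these assumptions, `disambiguate("jaguar", "jaguar engine price")` picks the `car` sense, while a bare `jaguar` query returns `None` because there is no context to work with, which is the kind of query the article says is hard to handle.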

"In SearchMonkey, Resource Description Framework made it possible to separate syntax and vocabulary in that publishers are free to use any vocabulary, opening up the system to support the long tail of web content." explains Mika as to why Yahoo!'s search open platform is most useful. Publishers are advised to use Yahoo!'s technologies for their applications because their metadata is allowed to work locally and it is not integrated in the centralized context.