IT buff discovers that the inventor now works at Google

Jan 17, 2012 18:11 GMT

A description of the Siri technology given by Shawn Carolan of Menlo Ventures, an investor in Siri Inc., piqued the interest of one tech journalist, who says the approach resembles one used by a company called Excite.

In his interview with Bloomberg, Carolan explains that Siri takes all words as “one big block” and maps those “strings of words” across a group of 10 possible domains of expertise.

This is where Robert Cringely, an information technology buff and technology journalist with a mixed track record, steps in and explains just how similar Siri’s approach is to Excite’s:

Here’s how the ArchiText (later Excite) search engine worked. Every query was stripped to its significant words — subjects, objects, verbs and adjectives — then each query became a vector in a multidimensional space with each unique word being a dimension. “How do space rockets stay in orbit when they are flying through space?” would become a vector string one unit long for each of those words but two units long for the word “space.”  This bit of semantic DNA was then mapped against an index of millions of web pages that had all been similarly converted to multidimensional vectors. It was quick, scalable, concentrated the processing load on the indexing where it didn’t bog down retrieval, and could reliably return pages like “Why satellites fall from the sky” that might answer the question even though none of the same words were used.
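The mechanics Cringely describes can be illustrated with a rough sketch in Python. This is only an illustration under stated assumptions, not Excite's actual code: the stop-word list, the function names and the use of cosine similarity for the "mapping" step are all hypothetical choices, since the description above does not say how vectors were compared.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list. The description only says queries were
# "stripped to significant words"; this particular set is an assumption.
STOP_WORDS = {"how", "do", "in", "when", "they", "are", "through",
              "the", "a", "to", "of", "is", "from", "why"}


def to_vector(text):
    """Turn text into a sparse vector: one dimension per unique word,
    with length along that dimension equal to the word's count."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors.
    (Assumed comparison measure; the source does not name one.)"""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(pages):
    """Do the heavy work up front: convert every page to a vector once."""
    return {title: to_vector(body) for title, body in pages.items()}


def search(query, index):
    """Score the query vector against each indexed page, best match first."""
    q = to_vector(query)
    return sorted(((cosine(q, vec), title) for title, vec in index.items()),
                  reverse=True)
```

Run on Cringely's example question, `to_vector` keeps "space", "rockets", "stay", "orbit" and "flying", with "space" counted twice. Note that a plain word-count comparison like this one can only match pages that share at least one word with the query. Returning a page such as “Why satellites fall from the sky” with no words in common would need some further step that the description above does not spell out.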

Cringely thinks Excite’s approach “sounds darned similar” to the methodology laid out by Carolan in his interview with Bloomberg.

He also believes that Excite’s patents, “while nearing the end of their lives, could turn out to be very valuable to, say, a Google trying to compete with Siri on Android or even to an Apple trying to defend Siri from competitors.”

And it just so happens that the original inventor, Graham Spencer, now works at Google.