Algorithm beats humans with a 28-22 score in 50-round battle

Feb 29, 2016 02:35 GMT  ·  By

A team of two Google employees and a researcher from RWTH Aachen University in Germany has put together an algorithm that can analyze an image and estimate where it was taken, often more accurately than a human can.

To develop PlaNet (this is the name of their project), the researchers started by splitting the globe into 26,000 grid tiles, with a higher concentration of tiles in cities, where more photos are generally taken, and with fewer tiles in wild areas like oceans, forests, deserts, or Arctic regions.
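To make the idea of density-dependent tiles concrete, here is a minimal sketch of adaptive partitioning. It is not the researchers' code: it assumes a plain latitude/longitude quadtree, and the names `Tile`, `partition`, `max_photos`, and `max_depth` are hypothetical. A tile is split into four whenever it holds more photos than a chosen limit, so crowded areas end up with many small tiles and empty areas keep a few large ones.

```python
from dataclasses import dataclass


@dataclass
class Tile:
    south: float
    west: float
    north: float
    east: float


def contains(tile, lat, lon):
    # Half-open bounds so a photo never falls into two sibling tiles.
    # (A point at exactly lat=90 or lon=180 is not covered by this sketch.)
    return tile.south <= lat < tile.north and tile.west <= lon < tile.east


def split(tile):
    # Cut the tile into four equal quadrants.
    mid_lat = (tile.south + tile.north) / 2
    mid_lon = (tile.west + tile.east) / 2
    return [
        Tile(tile.south, tile.west, mid_lat, mid_lon),
        Tile(tile.south, mid_lon, mid_lat, tile.east),
        Tile(mid_lat, tile.west, tile.north, mid_lon),
        Tile(mid_lat, mid_lon, tile.north, tile.east),
    ]


def partition(photos, max_photos=2, max_depth=10):
    """Return (tile, photo_count) leaves; photos is a list of (lat, lon)."""
    root = Tile(-90.0, -180.0, 90.0, 180.0)
    leaves = []
    stack = [(root, photos, 0)]
    while stack:
        tile, pts, depth = stack.pop()
        # Stop splitting once the tile is sparse enough or small enough.
        if len(pts) <= max_photos or depth >= max_depth:
            leaves.append((tile, len(pts)))
            continue
        for child in split(tile):
            inside = [p for p in pts if contains(child, *p)]
            stack.append((child, inside, depth + 1))
    return leaves
```

Calling `partition` on a list of photo coordinates clustered around one city plus a single photo in the open ocean would leave the ocean photo in a very large tile and spread the city photos over much smaller ones.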

Researchers trained and tuned PlaNet with 125 million images

The researchers then took over 91 million Flickr images that contained geolocation data and fed them into PlaNet, for the sole purpose of training the algorithm to distinguish subtle clues unique to each grid tile.

After training PlaNet, the researchers ran another 34 million images through the system to measure its accuracy and make small adjustments, an important step in improving the results.

During the final stage of their research, the scientists took another 2.3 million images but stripped their geolocation EXIF data before feeding them into PlaNet.

The end result? PlaNet placed 3.6% of the photos at street-level accuracy and 10.1% at city-level accuracy. At the country level, the share of correctly placed photos grew to 28.4%, and at the continent level it reached 48.0%.
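Figures like these are typically computed by measuring the distance between each predicted and true location and counting how many predictions fall within a given radius. The sketch below illustrates that calculation. The radii in `THRESHOLDS_KM` are assumed values chosen for illustration (the article does not state them), and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0

# Assumed radii for each level; not taken from the article.
THRESHOLDS_KM = {"street": 1, "city": 25, "country": 750, "continent": 2500}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def accuracy_by_level(predicted, actual, thresholds=THRESHOLDS_KM):
    """Fraction of predictions whose error is within each level's radius."""
    errors = [haversine_km(*p, *a) for p, a in zip(predicted, actual)]
    return {
        level: sum(e <= radius for e in errors) / len(errors)
        for level, radius in thresholds.items()
    }
```

Because each radius contains the smaller ones, the fractions can only grow from street to continent level, which matches the pattern in the reported numbers.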

All of this while using 377 MB of RAM, unlike similar geo-localization tools that guzzle entire TBs of memory.

PlaNet beats humans in an image geolocation game

But the researchers took their study one step further to see how the algorithm fared against humans. For this, the team used GeoGuessr, an online game in which players are shown a Google Street View image, asked to guess where it was taken, and then place a pin on a map.

While the common assumption is that the human brain always trumps neural networks because it can draw on many more cues, the PlaNet algorithm managed to edge us out in a 50-round battle, winning 28 to 22.

Researchers say that PlaNet's median localization error was 1131.7 km while the median human localization error was 2320.75 km, more than double PlaNet's value.
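With the reported values, the ratio works out to 2320.75 / 1131.7 ≈ 2.05. As a rough sketch of how such a match could be scored, the snippet below counts round wins and computes each side's median error. It assumes a round goes to whichever side has the smaller distance error (the article does not spell out the scoring rule), and the function name and sample distances are made up for illustration.

```python
from statistics import median


def score_match(model_errors_km, human_errors_km):
    """Count round wins and compute the median error for each side.

    Assumption: the smaller distance error wins the round; ties go to no one.
    """
    rounds = list(zip(model_errors_km, human_errors_km))
    return {
        "model_wins": sum(m < h for m, h in rounds),
        "human_wins": sum(h < m for m, h in rounds),
        "model_median_km": median(model_errors_km),
        "human_median_km": median(human_errors_km),
    }


# Hypothetical per-round errors in km, not data from the study.
example = score_match(
    [100, 2000, 500, 50, 3000],
    [400, 1500, 900, 700, 2500],
)
```

The median is a sensible summary here because a single wildly wrong guess, for example one on the wrong continent, would distort an average far more than it shifts the middle value.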

For more technical details, check out the PlaNet - Photo Geolocation with Convolutional Neural Networks research paper.