The system could be used to identify and match audio files

Nov 10, 2011 07:58 GMT  ·  By
The algorithms driving Imagine Research's MediaMined software differentiate between instruments, voices and other sounds without needing keywords, and index sound files to allow for sound-similarity searches

A group of researchers from the San Francisco, California-based Imagine Research has just released MediaMined, a new artificial intelligence system that is capable of understanding and indexing sound.

The tool may be used for finding and matching audio files that have not been previously labeled. Audio engineers with the company are extremely proud of their AI, and say that the system can be improved further and has the potential to reach even more applications.

The US National Science Foundation's (NSF) Small Business Innovation Research program supported the company through two research grants. MediaMined will have significant applications inside recording studios, and is bound to become popular among musicians themselves.

Once activated, the AI tool is capable of browsing large sets of tracks and recordings, discovering and cataloging all inputs, and then labeling them for easy reference. The best part is that it does so based on specifications entered by the users.

“MediaMined adds a set of ears to cloud computing. It allows computers to index, understand and search sound – as a result, we have made millions of media files searchable,” explains Imagine Research founder and CEO, Jay LeBoeuf.

“It acts as a virtual studio engineer. If your software detects male vocals, then it would also respond by labeling the tracks and acting as intelligent studio assistant – this allows musicians and audio engineers to concentrate on the creative process rather than the mundane steps of configuring hardware and software,” he adds.

The company official says that the system may also be used for searching for a particular category of sound. He gives the example of a special effects studio specialist searching for an explosion-like sound. At present, such searches are done through text queries against available sound banks.

By using MediaMined, the engineer will be able to find a category of sounds that is grouped not by name, text context or other metadata, but by the actual sounds themselves. This will help them differentiate between bombs, huge blasts, nuclear blasts, apocalyptic explosions and similar sounds.

“MediaMined is capable of grouping those sounds together--you would give us an example of what you are looking for (the sound of an explosion) and we are able to return things that sound like an explosion,” LeBoeuf explains.
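
The query-by-example idea LeBoeuf describes can be illustrated with a minimal Python sketch. This is not Imagine Research's method, which the company has not detailed here; it assumes each sound file has already been reduced to a short numeric feature vector, and the file names, vector values and function names are all hypothetical. Files are ranked by cosine similarity to the example sound.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a silent/empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(query, library, top_k=2):
    """Rank library entries by similarity to the query vector."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in library.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scored[:top_k]]


# Hypothetical, hand-made feature vectors; a real system would compute
# these from the audio signal itself.
library = {
    "bomb_blast.wav": [0.9, 0.8, 0.1],
    "distant_explosion.wav": [0.8, 0.9, 0.2],
    "male_vocal.wav": [0.1, 0.2, 0.9],
    "acoustic_guitar.wav": [0.2, 0.1, 0.7],
}

# The "example of what you are looking for": an explosion-like sound.
query = [0.85, 0.85, 0.15]
results = most_similar(query, library)
print(results)
```

With these made-up vectors, the two explosion-like files rank above the vocal and guitar recordings, even though no keyword or file name was consulted.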

According to officials at the NSF, this degree of freedom in conducting searches is what led them to finance the technology in the first place. They explain that other methods of indexing sounds do exist, but they are far more limited than the new system.

“The software enables users to go beyond finding unique objects, allowing similarity searches--free of the burden of keywords – that generate previously hidden connections and potentially present entirely new applications,” NSF program director Errol Arkilic concludes.