Surprisingly, without CUDA, Kepler’s compute power is not easy to find

May 21, 2012 23:41 GMT

AMD’s top single-GPU video card, the Radeon HD 7970, reportedly shows considerably better OpenCL performance than Nvidia’s cards in both single- and double-precision floating point computing.

We got this impression the first time we saw GPU compute benchmarks of AMD’s GCN architecture, but we thought it was down to some AMD-specific estimate or to a lack of the necessary optimization on Nvidia’s side.

In practice, after all the fuss about Nvidia’s Fermi being architected specifically for GPU compute, we found an architecture from AMD that performed much better without any vendor-specific API optimizations of the kind Nvidia has with CUDA.

Don’t get us wrong: we know the value of CUDA and we greatly appreciate Nvidia’s software and compiler work, and we still believe AMD is not as invested in software as it needs to be.

The thing is that we were simply surprised by AMD’s level of OpenCL performance, given that the company doesn’t have even a single simple application to show off its GPU compute power, the way Intel has with Quick Sync.

In the end, the experts at vr-zone.com tested AMD’s and Nvidia’s latest generation of cards in Sandra 2012 SP2 (OpenCL) and essentially confirmed our initial impressions.

AMD’s GCN really does offer much better GPU compute performance than Nvidia’s Fermi and Kepler architectures where OpenCL is concerned.

In single-precision floating point performance, AMD’s Radeon HD 7970 card scores 42% higher than Nvidia’s Kepler and 155% higher than Fermi.

In double-precision floating point performance, AMD’s Radeon HD 7970 card scores 478% higher than Nvidia’s Kepler and 293% higher than Fermi.
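To put those percentages in perspective, here is a minimal Python sketch that converts each "N% higher" figure into a multiplicative factor and derives what the numbers imply about Kepler relative to Fermi. It assumes that "N% higher" means the HD 7970’s score is (1 + N/100) times the rival’s score, and that the same Kepler and Fermi cards sit behind both rows. The function and variable names are ours, for illustration only.

```python
def speedup_factor(percent_higher: float) -> float:
    """Convert an 'N% higher' claim into a multiplicative factor (assumed meaning)."""
    return 1.0 + percent_higher / 100.0


# Leads quoted in the article for the Radeon HD 7970 (Sandra 2012 SP2, OpenCL).
QUOTED_LEADS = {
    "single": {"Kepler": 42.0, "Fermi": 155.0},
    "double": {"Kepler": 478.0, "Fermi": 293.0},
}


def implied_kepler_vs_fermi(precision: str) -> float:
    """Kepler score divided by Fermi score, derived from the HD 7970's two leads.

    (HD 7970 / Fermi) / (HD 7970 / Kepler) = Kepler / Fermi
    """
    leads = QUOTED_LEADS[precision]
    return speedup_factor(leads["Fermi"]) / speedup_factor(leads["Kepler"])


if __name__ == "__main__":
    for precision, leads in QUOTED_LEADS.items():
        for rival, pct in leads.items():
            factor = speedup_factor(pct)
            print(f"{precision}-precision: HD 7970 = {factor:.2f}x {rival}")
        ratio = implied_kepler_vs_fermi(precision)
        print(f"{precision}-precision: implied Kepler/Fermi = {ratio:.2f}")
```

Under those assumptions, a 478% lead means the HD 7970 scores almost six times as high as Kepler in double precision. The quoted figures also imply that Kepler is roughly 1.8 times as fast as Fermi in single precision but only about 0.68 times as fast in double precision.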

CUDA optimizations are quite hard to integrate into a high-performance server application, so raw GPU compute power is often preferable.

Therefore, we don’t understand why AMD doesn’t have a FireStream card based on the Tahiti GPU to compete with Nvidia’s Tesla.
