Nvidia’s way is the CUDA way, and the fact is that a lot of coding and optimization work is needed to get the full performance out of Nvidia’s Tesla cards. HPC clients might see Intel’s easier Xeon Phi programming as a way to reduce the cost of the software work that needs to be done.
Make sure you check out the first parts of our GPU compute analysis.
On the other hand, HPC clients really care about performance. We have a hard time deciding whether the savings on software development are more important than the end performance of the installation.
We’re inclined to believe that, in the HPC or supercomputing world, money is usually not an issue and, more importantly, the cost of software porting and optimization is small compared with the total cost of the hardware and implementation.
Considering that we’re talking about tens of thousands of dollars’ worth of man-hours spent on coding and optimization, the client paying for the server might give Intel’s Xeon Phi a thought if the performance were the same.
However, the performance is not going to be the same. If Nvidia achieves its targets with the GK110 GPU, the DP FP64 performance will be almost twice what Intel’s Xeon Phi brings to the table.
Some might wonder what the point of going for Kepler now is. Why not wait for Xeon Phi or Tesla K20?
The answer is that, if you want your supercomputer ready at the end of this year, you can safely go with Nvidia’s Tesla K10, which is based on the new Kepler architecture.
Sure, there is more CUDA programming to do, but you’ll have your server ready much earlier than if you had waited for Xeon Phi or Tesla K20.
Having the final installation ready faster is only one of the advantages Tesla K10 offers.
The second advantage is that, if you’ve ported your applications to CUDA and already optimized them for the Kepler architecture, you can simply swap the Tesla K10 cards for the K20 models when they hit the market.
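To illustrate why such a swap can be painless, here is a minimal CUDA sketch, not taken from any Nvidia sample, of a double-precision kernel that queries the installed card at run time instead of hard-coding its size. The kernel name daxpy, the array size and the "four blocks per multiprocessor" launch heuristic are our own assumptions for illustration; real applications would tune these per card.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Grid-stride loop: correct for any grid size, so the same kernel runs
// unchanged on GPUs with different numbers of multiprocessors.
__global__ void daxpy(int n, double a, const double *x, double *y) {
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += blockDim.x * gridDim.x) {
        y[i] = a * x[i] + y[i];
    }
}

int main() {
    // Ask the runtime what card is installed rather than assuming one.
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        fprintf(stderr, "No CUDA device found\n");
        return 1;
    }
    printf("%s: %d multiprocessors, compute capability %d.%d\n",
           prop.name, prop.multiProcessorCount, prop.major, prop.minor);

    const int n = 1 << 20;                 // illustrative problem size
    const size_t bytes = n * sizeof(double);
    std::vector<double> hx(n, 1.0), hy(n, 2.0);

    double *dx = nullptr, *dy = nullptr;
    cudaMalloc(&dx, bytes);
    cudaMalloc(&dy, bytes);
    cudaMemcpy(dx, hx.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy.data(), bytes, cudaMemcpyHostToDevice);

    // Assumed heuristic: scale the grid with the multiprocessor count.
    const int threads = 256;
    const int blocks = prop.multiProcessorCount * 4;
    daxpy<<<blocks, threads>>>(n, 2.0, dx, dy);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(hy.data(), dy, bytes, cudaMemcpyDeviceToHost);
    printf("y[0] = %f\n", hy[0]);

    cudaFree(dx);
    cudaFree(dy);
    return 0;
}
```

Because nothing here depends on a particular card, the same source should run on a K10 today and on a K20 later, although getting the most out of GK110 may still call for recompiling for the newer architecture and retuning.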
Once this upgrade is finalized, your supercomputer will likely have several times the DP FP64 raw computing power of the initial Kepler K10 installation, and almost twice the raw power of a similar Xeon Phi installation.
There is nothing Intel can do this year or the next that would allow it to achieve a doubling of Xeon Phi’s DP FP64 performance and, from a pure performance standpoint, Nvidia’s GK110 is a definite winner.
Once we factor in AMD’s GCN, we’ll clearly see why Nvidia’s Tesla is being squeezed hard in the HPC market, but this will follow in the sixth part of our GPU compute analysis.