The core of AMD's strategy for the processor industry is HSA, Heterogeneous System Architecture, and Kaveri is the apex of that plan, at least until the next generation of APUs shows up. At present, hUMA is one of the most important cornerstones of AMD's technology.
As one might have guessed, hUMA is, like HSA, an acronym. It means Heterogeneous Uniform Memory Access.
In a conventional computer, the CPU decides how tasks are carried out. That means that, even if a GPU can run loads of parallel number-crunching tasks, that potential cannot be fully tapped.
The CPU has to go through the operating system and memory first, to see how tasks can be divided, then it allocates said memory and, again through the OS, gives the GPU tasks to do.
This severely limits how much can be offloaded to GPU cores, and adds some lag to all computations.
In Kaveri, hUMA eliminates that limitation by giving the GPU cores direct access to the entire system memory, which can be up to 32 GB.
Thus, the x86 CPU cores (in the two dual-core Steamroller modules) and the GPU collaborate at a hardware level, allowing processes to interact equally. This step is called Heterogeneous Queuing.
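To make the difference concrete, here is a purely conceptual sketch in Python. It is not AMD's or HSA's actual programming interface. Plain byte buffers stand in for system and GPU memory, and the function names `offload_with_copies` and `offload_shared` are hypothetical, made up for illustration. The first function models the traditional path, where data is staged into a separate buffer and copied back. The second models the hUMA idea, where the "GPU" works on the very same memory.

```python
# Conceptual model only: plain Python buffers stand in for system and GPU memory.
# None of these names correspond to a real AMD or HSA API.

def offload_with_copies(system_mem: bytearray) -> int:
    """Traditional path: stage data into a separate 'GPU' buffer, then copy back."""
    gpu_mem = bytearray(system_mem)          # copy 1: system memory -> GPU buffer
    for i in range(len(gpu_mem)):            # stand-in "GPU kernel": double each byte
        gpu_mem[i] = (gpu_mem[i] * 2) % 256
    system_mem[:] = gpu_mem                  # copy 2: GPU buffer -> system memory
    return 2 * len(system_mem)               # bytes moved purely for staging


def offload_shared(system_mem: bytearray) -> int:
    """hUMA-style path: the 'GPU' works through a view of the same memory."""
    shared = memoryview(system_mem)          # no copy, same underlying bytes
    for i in range(len(shared)):             # same stand-in kernel, done in place
        shared[i] = (shared[i] * 2) % 256
    return 0                                 # no staging traffic at all


data_a = bytearray(range(8))
data_b = bytearray(range(8))
copied_a = offload_with_copies(data_a)
copied_b = offload_shared(data_b)
print(list(data_a) == list(data_b), copied_a, copied_b)
```

Both paths produce the same result, but only the first one moves data back and forth. On real hardware, that staging traffic is part of the overhead that shared memory access is meant to remove.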
AMD ran PCMark (system), 3DMark (graphics) and Basemark CL (compute) benchmarks, comparing Kaveri to Richland and to Intel's Core i5-4670K Haswell CPU. Kaveri was up to 115% better than the former and up to 187% better than Intel's chip.
However, the details of the system configurations were not released, so we can't know how well real-world results will reflect these benchmarks.
The Richland/Kaveri systems may have otherwise been the same, but we're not sure what hardware the Core i5 chip got. It's not like it could use the same motherboard after all.
Anyway, what really surprised us (though we'll stick to cautious optimism, for now) was the comparison between the AMD A10-7850K and the Intel i5-4670K in Adobe Photoshop Creative Cloud and LibreOffice (a popular office suite).
Kaveri did 2.3 times better in the former (Smart Sharpen elapsed time, measured in seconds) and four times better in the latter (elapsed time, in ms, for 21 stock calculations and graph plotting), all because the GPU could contribute directly and without restriction.
Thanks to hUMA, various game features should run much better. Examples include TressFX Hair and Depth of Field in Tomb Raider, and Ambient Occlusion in Far Cry 3 and BioShock Infinite.