ARM's first step into heterogeneous multi-core architectures

Oct 20, 2011 08:44 GMT  ·  By

ARM has recently unveiled a new low-power core, dubbed the Cortex-A7, that is intended both for entry-level mobile devices and as a companion core for the company's more powerful Cortex A15 processors, with the goal of creating more energy-efficient devices.

Unlike ARM's contemporary architectures, which use an out-of-order (OoO) design, the Cortex A7 takes a step back: it is a simple in-order core capable of issuing up to two instructions in parallel.

ARM has taken this decision as it wanted to simplify the design of the A7 so that it could reduce both power consumption and the complexity of the chips using it.

As expected, moving to an in-order architecture does have a few drawbacks as far as performance is concerned, but as AnandTech has found out, ARM believes that the A7 will still deliver better performance per clock and better overall performance than the Cortex A8 architecture.

Many of these improvements come from the inclusion of a more modern branch prediction engine as well as from a shallower pipeline, which reduces the number of cycles lost each time a branch is mispredicted.
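To make the link between pipeline depth and branch handling more concrete, here is a rough back-of-the-envelope sketch in Python. The function name `effective_cpi` and every number in it are assumptions chosen for illustration only; none of them are figures published by ARM.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, penalty_cycles):
    """Average cycles per instruction once misprediction stalls are included.

    Simplified model: each mispredicted branch costs a fixed number of
    cycles, roughly tied to how deep the pipeline is.
    """
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


# Hypothetical workload: 20% of instructions are branches, 5% are mispredicted.
# The penalties (13 vs. 8 cycles) are illustrative, not ARM data.
deeper_pipeline = effective_cpi(1.0, 0.20, 0.05, penalty_cycles=13)
shallower_pipeline = effective_cpi(1.0, 0.20, 0.05, penalty_cycles=8)

print(f"deeper pipeline:    {deeper_pipeline:.2f} cycles per instruction")
print(f"shallower pipeline: {shallower_pipeline:.2f} cycles per instruction")
```

Under these assumed numbers, the shorter pipeline loses fewer cycles per mispredicted branch, and a better predictor would lower `mispredict_rate` on top of that.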

These changes are also paired with better prefetching algorithms as well as with a very low-latency L2 cache (10 cycles).

Despite the improvements brought by ARM to its new core, the biggest news for this architecture is its 100% compatibility with the Cortex A15 ISA, including virtualization instructions, integer divide support and 40-bit memory addressing.

What this means is that all the code that was designed for the A15 will also be able to run on the Cortex A7, only a tad slower.

This is extremely big news for ARM as it enables its partners to build SoC devices using a heterogeneous multi-core architecture that pairs Cortex A7 and Cortex A15 cores.

Such a processor design will behave similarly to Nvidia's Kal-El SoC, as it will use the low-power Cortex A7 core for running simple tasks that don't require that much computing power, while more demanding applications will use the A15 cores.

ARM’s own power management firmware determines which core cluster to activate depending on performance states requested by the operating system, and cache coherency is guaranteed via the CCI-400 interconnect.

This so-called big.LITTLE configuration should be completely transparent to the OS, and ARM claims that the switch between the core clusters can be done in about 20 microseconds.
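The cluster-switching idea can be pictured with a small Python sketch. This is a toy model, not ARM's firmware: the `ClusterSwitcher` class, its `threshold` parameter and the integer performance-state levels are all hypothetical, and only the roughly 20-microsecond switch time comes from ARM's claim.

```python
from dataclasses import dataclass

# Approximate switch time claimed by ARM; everything else here is assumed.
SWITCH_LATENCY_US = 20


@dataclass
class ClusterSwitcher:
    threshold: int       # hypothetical level above which the A15 cluster is used
    active: str = "A7"   # start on the low-power cluster
    switches: int = 0    # number of cluster migrations so far

    def request(self, perf_state: int) -> str:
        """Pick a cluster for the performance state requested by the OS."""
        target = "A15" if perf_state > self.threshold else "A7"
        if target != self.active:
            self.active = target
            self.switches += 1
        return self.active

    def switch_overhead_us(self) -> int:
        """Total time spent migrating between clusters, in microseconds."""
        return self.switches * SWITCH_LATENCY_US


switcher = ClusterSwitcher(threshold=3)
history = [switcher.request(level) for level in [1, 2, 5, 4, 1]]
print(history, switcher.switch_overhead_us())
```

The OS in this sketch only asks for a performance level and never names a cluster, which is the sense in which the arrangement is meant to be transparent to it.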

ARM hasn't revealed any information regarding the release date of the first devices to use the Cortex A7, but chips based on it can pack between one and four processing cores and will be fabricated using the 40nm process.
