NVIDIA has released white papers on Kal-El that announce a fifth Cortex-A9 “companion” core, designed specifically to handle less demanding tasks and so minimize the power consumed by active standby processes. The companion core’s maximum operating frequency is capped at 500MHz, which gives it higher performance per watt on menial tasks and leaves the four main cores free for the demanding work they do best. Project Kal-El also includes a brand-new 12-core GPU that delivers up to 3x the graphics performance of Tegra 2. The Variable SMP architecture is completely OS transparent, which means that operating systems and applications don’t need to be redesigned to take advantage of the fifth core.
Graphics chipmaker NVIDIA said its next-generation Kal-El Tegra chip, built on the Variable Symmetric Multiprocessing (vSMP) architecture, will have five cores rather than four. vSMP not only minimizes power consumption during active standby states, but also delivers quad-core performance benefits while keeping dynamic power consumption within the thermal budgets required for mobile devices. NVIDIA has optimized its new chip for both high performance and low power, two goals that are often contradictory, and in a white paper it reveals how it has designed a quad-core chip with a fifth companion core to achieve this seemingly impossible goal. The root of the problem lies in the process used to build computer chips. Every chip leaks some amount of power; this is a natural phenomenon. Leakage is the electric current that is consumed even while the chip is idle. It cannot be avoided, which matters especially on mobile devices, and the semiconductor industry has developed techniques to reduce it. But chips optimized for low leakage in the idle state tend to consume more power than non-optimized chips during intense workloads. What people want, of course, is a mobile device that consumes a minimal amount of power at all times.
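To make the trade-off concrete, here is a minimal toy model in Python. It is only a sketch: the two core types, the linear dynamic-power term and every number in it are made up for illustration and do not come from NVIDIA’s paper. It simply shows how a core with low leakage but costlier switching wins at low clock speeds and loses at high ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreProcess:
    """Toy description of a CPU core built on a given transistor process."""
    name: str
    leakage_mw: float          # static power burned whenever the core is powered
    dynamic_mw_per_mhz: float  # switching power per MHz (assumed linear here)


def total_power_mw(core: CoreProcess, freq_mhz: float) -> float:
    """Static leakage plus frequency-dependent dynamic power."""
    return core.leakage_mw + core.dynamic_mw_per_mhz * freq_mhz


# HYPOTHETICAL figures, chosen only so that the two lines cross.
LOW_LEAKAGE = CoreProcess("low-leakage", leakage_mw=5.0, dynamic_mw_per_mhz=0.5)
FAST = CoreProcess("fast", leakage_mw=105.0, dynamic_mw_per_mhz=0.25)


def cheaper_core(freq_mhz: float) -> CoreProcess:
    """Return whichever toy core draws less total power at this clock."""
    return min((LOW_LEAKAGE, FAST), key=lambda c: total_power_mw(c, freq_mhz))


def crossover_mhz(a: CoreProcess, b: CoreProcess) -> float:
    """Clock speed at which both cores draw the same total power."""
    return (b.leakage_mw - a.leakage_mw) / (a.dynamic_mw_per_mhz - b.dynamic_mw_per_mhz)


if __name__ == "__main__":
    for freq in (100, 1000):
        print(freq, "MHz ->", cheaper_core(freq).name)
    print("crossover at", crossover_mhz(LOW_LEAKAGE, FAST), "MHz")
```

With these invented values the low-leakage core is cheaper below the crossover and the fast core is cheaper above it, which is the tension the paragraph above describes. Real chips also scale voltage with frequency, so the actual curves are not straight lines.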
The semiconductor reality is not going away, so NVIDIA took a rather interesting approach to the problem: Tegra 3 uses a fifth companion core to handle all the low-power tasks, such as running the operating system in sleep mode, checking for emails and notifications, and keeping the system alive while you read a book or play media files. While the companion core is working, the normal high-performance cores can be shut down. Because the companion core is optimized for low power, NVIDIA doesn’t want it to handle heavy workloads, or it would start consuming too much. To prevent that, its frequency is limited to a range of 0 to 0.5GHz. Whenever the companion core is overwhelmed by work, one or several high-performance cores wake up and take over. This is NVIDIA’s definition and implementation of Variable Symmetric Multiprocessing (vSMP), which it has patented. NVIDIA says that the operating system (Android 3.0, aka Honeycomb) assumes that all CPU cores in the chip are identical, which is not true in this case. Special management therefore had to be devised at both the hardware level and the software level to make this heterogeneous group of cores completely transparent to the OS.
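The switching behaviour described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not NVIDIA’s actual management logic: the idea of expressing workload as “demand in MHz”, the `MAIN_MAX_MHZ` value and the function name `select_cores` are all hypothetical. Only the 500MHz companion cap, the four main cores and the rule that the companion core and the main cores do not run at the same time come from NVIDIA’s description.

```python
import math
from dataclasses import dataclass

COMPANION_MAX_MHZ = 500  # companion core cap, as stated in the article
MAIN_CORE_COUNT = 4      # four high-performance cores
MAIN_MAX_MHZ = 1000      # HYPOTHETICAL per-core clock, not given in the article


@dataclass(frozen=True)
class CoreState:
    companion_on: bool
    main_cores_on: int


def select_cores(demand_mhz: float) -> CoreState:
    """Choose which cores to power for a workload.

    `demand_mhz` is a made-up metric: the aggregate clock the workload
    would need. Light work stays on the companion core; anything above
    its cap is handed to one or more main cores, and the companion core
    is switched off so the two groups never run together.
    """
    if demand_mhz < 0:
        raise ValueError("demand_mhz must be non-negative")
    if demand_mhz <= COMPANION_MAX_MHZ:
        return CoreState(companion_on=True, main_cores_on=0)
    needed = math.ceil(demand_mhz / MAIN_MAX_MHZ)
    return CoreState(companion_on=False, main_cores_on=min(needed, MAIN_CORE_COUNT))


if __name__ == "__main__":
    for demand in (50, 500, 501, 2500, 10000):
        print(demand, "->", select_cores(demand))
```

Because the decision is taken below the operating system in this sketch, the OS would only ever see a set of identical-looking cores, which is the transparency the article refers to.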
Cores are switched on and off based on a real-time analysis of the workload, as the diagram above shows. The only limitation seems to be that the companion core cannot be active while cores 1-4 are. NVIDIA says that not allowing the companion core and the high-performance cores to run at the same time simplifies cache memory management and avoids performance penalties that would have hindered the high-performance cores. Making this transparent to the OS is important for many reasons; for end users, it means that OS updates don’t have to wait for NVIDIA to tweak some code. Tegra 3 shows power benefits even when compared to the current-generation Tegra 2 processor. According to NVIDIA, that is true during the sleep state (LP0), media playback and even gaming. NVIDIA also provides performance-per-watt comparisons with other high-profile chips on the market, such as the OMAP4 and the Qualcomm QC8660. For these comparisons NVIDIA uses CoreMark, a well-known benchmark that is very multi-core friendly (performance is more or less expected to scale with the number of cores). The design is quite innovative: while embedding CPU cores with different capabilities is not new, this is the first time it has been used in this way in a high-end mobile processor. Below you will find more details in pictures: