Thursday, August 26, 2010

Thermal Design Power (TDP)




To understand power management, it's important to fully appreciate the ways designers deal with average and peak power. Most of this article will focus on how average power can be reduced, but there are also some interesting power management techniques to handle the case of peak power consumption. TDP is a measure of how much power needs to be dissipated by the cooling solution when the CPU is running the maximum software workload that would be expected in normal operating conditions. (With specialized test code, a CPU could generate even more heat.)
More specifically, CPU manufacturers calculate TDP as the amount of heat that must be transferred away from the processor die to keep the transistor junction temperature (Tj) below the maximum at which the device is guaranteed to operate. Tj is usually 100 degrees Celsius or lower, but things are actually much more complicated: some vendors specify a die "case" temperature as low as 70 degrees Celsius in order to reach high clock rates. That's why some desktop heat sinks are enormous.
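As a rough illustration of this budget, the thermal resistance the cooling solution must achieve can be estimated as (Tj max - ambient temperature) / TDP. The sketch below assumes a simple steady-state model with a single junction-to-ambient resistance; the function names and the sample numbers (a 35 degree ambient, a 30 watt TDP) are illustrative assumptions, not vendor figures.

```python
def max_thermal_resistance(tj_max_c, ambient_c, tdp_watts):
    """Largest junction-to-ambient thermal resistance (deg C per watt)
    that keeps the die at or below tj_max_c while dissipating tdp_watts.

    Assumes a steady-state, single-resistance thermal model.
    """
    if tdp_watts <= 0:
        raise ValueError("TDP must be positive")
    if tj_max_c <= ambient_c:
        raise ValueError("junction limit must exceed ambient temperature")
    return (tj_max_c - ambient_c) / tdp_watts


def junction_temperature(ambient_c, power_watts, theta_ja):
    """Steady-state junction temperature for a given power and thermal resistance."""
    return ambient_c + power_watts * theta_ja


if __name__ == "__main__":
    # Hypothetical numbers: 100 C junction limit, 35 C ambient, 30 W TDP.
    theta = max_thermal_resistance(100.0, 35.0, 30.0)
    print(f"Required thermal resistance: {theta:.2f} C/W")
```

With these assumed numbers, dropping the temperature limit from 100 to 70 degrees cuts the allowable thermal resistance from about 2.17 to about 1.17 degrees per watt, which hints at why a lower temperature target calls for a much bigger heat sink.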

How that heat gets removed is part of the system thermal design and can be accomplished by heat sinks, fans, and air vents. In a mobile device, a large portion of the heat is conductively transferred through the system chassis, and then onto your lap, highlighting one of the limitations of using a CPU that has a high TDP value. Many laptops use CPUs with a TDP of 30 watts or more. These are easily identified by the fans in the case and the short amount of time you'd actually want the machine on your lap. Note that multicore CPUs make this problem even worse, since the TDP and cooling solutions are based on all cores running simultaneously.

CPUs Protect Themselves from Killer Heat

Before the CPU die can exceed the maximum junction temperature, on-chip thermal sensors signal special circuitry to lower the temperature. Over the years, CPUs have incorporated several mechanisms for measuring and controlling temperature. An on-chip thermal diode allows an external analog-to-digital converter to monitor temperature. Basically, the diode's forward voltage changes as the chip heats up, so the system microcontroller can measure the voltage difference and take action to lower the temperature.
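One common way to read such a diode, offered here as an assumption rather than a description of any particular system, is to drive it with two known currents and measure the difference in forward voltage. From the standard diode equation, that difference is proportional to absolute temperature: delta V = n (kT/q) ln(I_high / I_low). The function names, the current ratio, and the ideality factor below are illustrative.

```python
import math

BOLTZMANN = 1.380649e-23           # J/K
ELECTRON_CHARGE = 1.602176634e-19  # C
KELVIN_OFFSET = 273.15


def delta_v_for_temperature(temp_c, current_ratio, ideality=1.0):
    """Forward-voltage difference (volts) expected at temp_c when the diode
    is driven at two currents whose ratio is current_ratio."""
    if current_ratio <= 1.0:
        raise ValueError("current_ratio must be greater than 1")
    kelvin = temp_c + KELVIN_OFFSET
    return ideality * BOLTZMANN * kelvin / ELECTRON_CHARGE * math.log(current_ratio)


def diode_temperature_c(delta_v, current_ratio, ideality=1.0):
    """Die temperature (deg C) recovered from a measured voltage difference."""
    if current_ratio <= 1.0:
        raise ValueError("current_ratio must be greater than 1")
    kelvin = delta_v * ELECTRON_CHARGE / (
        ideality * BOLTZMANN * math.log(current_ratio)
    )
    return kelvin - KELVIN_OFFSET


if __name__ == "__main__":
    # Hypothetical reading: a 10:1 current ratio on a die at 85 C.
    dv = delta_v_for_temperature(85.0, 10.0)
    print(f"delta V = {dv * 1000:.2f} mV -> {diode_temperature_c(dv, 10.0):.1f} C")
```

Measuring a difference at two currents, rather than a single voltage, cancels the diode's process-dependent saturation current, which is one reason the technique suits a sensor that sits outside the CPU.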
System vendors program the microcontroller with temperature control algorithms to speed up fans, throttle the CPU, and so on. In some designs, the CPU runs its own BIOS code to control temperature. However, CPU designers worried about chip damage if the external microcontroller were to fail. Also, some thermal spikes happen so rapidly that the die could exceed its maximum temperature before the system could respond. For these reasons, additional on-chip temperature sensors were added that directly control digital logic to automatically reduce CPU performance and temperature. If for some reason the CPU temperature keeps rising, it eventually reaches a critical condition, and hardware signals the power supply to shut down completely.
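The escalation described here (speed up the fans, then throttle, then shut down) can be sketched as a simple threshold policy. This is only an illustration: the threshold values, the `ThermalAction` names, and the `thermal_action` function are hypothetical, and real controllers add hysteresis and vendor-specific tuning.

```python
from enum import Enum


class ThermalAction(Enum):
    NORMAL = "normal"
    FAN_SPEED_UP = "fan_speed_up"
    THROTTLE = "throttle"
    SHUTDOWN = "shutdown"


def thermal_action(temp_c, fan_on_c=70.0, throttle_c=90.0, critical_c=105.0):
    """Pick the most severe action whose threshold temp_c has reached.

    All three thresholds are made-up defaults for illustration only.
    """
    if not (fan_on_c < throttle_c < critical_c):
        raise ValueError("thresholds must be strictly increasing")
    if temp_c >= critical_c:
        return ThermalAction.SHUTDOWN
    if temp_c >= throttle_c:
        return ThermalAction.THROTTLE
    if temp_c >= fan_on_c:
        return ThermalAction.FAN_SPEED_UP
    return ThermalAction.NORMAL


if __name__ == "__main__":
    for reading in (55.0, 75.0, 95.0, 110.0):
        print(reading, thermal_action(reading).value)
```

Checking the critical threshold first means a sudden spike goes straight to shutdown instead of stepping through the milder responses.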

Sometimes you'll see references to Thermal Monitor 1 (TM1) and Thermal Monitor 2 (TM2). These are mechanisms used by the CPU to quickly reduce performance and get an accompanying drop in power consumption. TM1 is the older technology and simply inserts idle cycles to effectively halve the pipeline frequency, even though the clock signal continues to run at the same frequency. This is a dramatic drop in performance for only a linear drop in power consumption.

TM2 uses dynamic voltage scaling (DVS) techniques to reduce the clock frequency and then signals the external voltage regulator to shift to a lower voltage. The power supply voltage won't drop instantaneously because of capacitance. However, voltage reduction has the biggest impact on temperature, since power varies with the square of voltage. We'll talk more about dynamic voltage scaling, since it is a key power management technique that helps reduce average power consumption. CPU vendors differ in the algorithms they use to throttle clock rate and voltage to keep the die below maximum temperature.
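To make the difference between the two approaches concrete, the sketch below uses the usual approximation that dynamic power is proportional to frequency times voltage squared. It ignores leakage (static) power, and the function name and the 0.5 and 0.8 scale factors are illustrative assumptions rather than values from any specific CPU.

```python
def relative_dynamic_power(freq_scale, voltage_scale=1.0):
    """Dynamic power relative to full speed, using P ~ f * V^2.

    freq_scale and voltage_scale are fractions of the nominal values.
    Leakage power is deliberately ignored in this simplified model.
    """
    if freq_scale < 0 or voltage_scale < 0:
        raise ValueError("scale factors must be non-negative")
    return freq_scale * voltage_scale ** 2


if __name__ == "__main__":
    # TM1-style: effective frequency halved, voltage unchanged.
    tm1 = relative_dynamic_power(0.5)
    # TM2-style: frequency halved and voltage lowered to a hypothetical 80%.
    tm2 = relative_dynamic_power(0.5, 0.8)
    print(f"TM1-style throttling: {tm1:.2f} of full power")
    print(f"TM2-style throttling: {tm2:.2f} of full power")
```

Under these assumptions, the same halving of performance leaves half the dynamic power when only the effective frequency changes, but about a third when the voltage also drops to 80 percent.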
