Power Management Features in Intel Processors
Shimin Chen

Intel Labs Pittsburgh
UPitt CS 3150, Guest Lecture, February 24, 2010

Power Management
  - Many components in a computer system:
      - CPU(s)
      - DRAM memory
      - Hard drives
      - Graphics card
      - Monitor
      - Network card
  [Figure: PC system with Intel Core i7]
  - System-wide power management actions are based on the power management features of individual components
  - Our focus: CPUs

Why CPU Power Management?
  - Save power
      - For mobile devices: longer battery life
      - For servers: lower operational cost
      - More environmentally friendly
  - Thermal management (less obvious but very important)
      - Higher power -> more heat -> higher temperature
      - Maximum operating temperature: beyond this temperature, transistors may not operate correctly. Then one sees weird bugs, or even system crashes.
      - Running a CPU at too high a temperature shortens its life.

Many Terms When Reading About CPU Power Management
  - P-states, C-states
  - ACPI
  - Enhanced Intel SpeedStep
  - Dynamic frequency and voltage scaling
  - Halt state
  - Idle state
  - Suspend

Two Perspectives
  - Hardware perspective
      - Bottom-up description
      - Hardware mechanisms
      - E.g., Intel processor manuals take this approach
  - ACPI standard perspective
      - ACPI: Advanced Configuration and Power Interface
      - Top-down description
      - Defines programming APIs and functionalities
  - Confusion often arises because
      - The same concept may be represented with different terms
      - The two descriptions do not exactly match

The Description in This Talk
  - Combined approach:
      - Provide a high-level overview of ACPI
      - Describe the hardware mechanisms and their relationships to ACPI
  - I hope that this can give you a structured view of CPU power management, and clarify the aforementioned terms and their relationships

Outline
  - Introduction
  - ACPI Overview
  - Enhanced Intel SpeedStep Technology (P-States)
  - Low-Power Idle States (C-States)
  - Multi-core considerations
  - Summary

What Is ACPI?
  - ACPI (Advanced Configuration and Power Interface)
      - Standard interface specification
      - OS can perform power management using this API
      - Hardware and software drivers support this API
  - Mapping from CPU mechanisms to ACPI is provided by BIOS and software drivers
  [Figure: layered diagram labelled Applications, ACPI, OS Power Management, Software drivers, Hardware (CPU, BIOS, etc.)]

ACPI State Hierarchy (1/3)
  - Global system states (G-states)
      - G0: Working
      - G1: Sleeping (e.g., suspend, hibernate)
      - G2: Soft off (e.g., powered down but can be restarted by interrupts from input devices)
      - G3: Mechanical off
  - Lower number means higher power

ACPI State Hierarchy (2/3)
  - Global system states (G-states)
      - G0: Working
          - Processor power states (C-states)
              - C0: normal execution
              - C1: idle
              - C2: lower power but longer resume latency than C1
              - C3: lower power but longer resume latency than C2
      - G1: Sleeping (e.g., suspend, hibernate)
          - Sleep states (S-states): S0, S1, S2, S3 (suspend), S4 (hibernate)
      - G2: Soft off (S5)
      - G3: Mechanical off
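As a rough illustration of the hierarchy so far, the G-states and C-states can be modelled as ordered enumerations. This is only a sketch: the class and function names are hypothetical and are not the data structures that the ACPI specification itself defines.

```python
from enum import IntEnum


class GState(IntEnum):
    """Global system states; a lower number means higher power."""
    G0_WORKING = 0
    G1_SLEEPING = 1
    G2_SOFT_OFF = 2
    G3_MECHANICAL_OFF = 3


class CState(IntEnum):
    """Processor power states, which only exist while the system is in G0."""
    C0 = 0  # normal execution
    C1 = 1  # idle
    C2 = 2  # lower power, longer resume latency than C1
    C3 = 3  # lower power, longer resume latency than C2


def uses_less_power(a, b):
    """Return True if state `a` draws less power than state `b`."""
    if type(a) is not type(b):
        raise TypeError("can only compare states of the same kind")
    return a > b  # higher number means lower power


def valid_combination(g, c):
    """C-states are substates of G0; outside G0 no C-state applies."""
    if g is GState.G0_WORKING:
        return c is not None
    return c is None
```

For example, `uses_less_power(CState.C3, CState.C1)` is true, and a C-state paired with `GState.G1_SLEEPING` is rejected because the processor states sit under G0 only.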

ACPI State Hierarchy (3/3)
  - G0: Working
      - Processor power states (C-states)
          - C0: normal execution
              - Performance states (P-states)
                  - P0: highest performance, highest power
                  - P1, ..., Pn
          - C1, C2, C3
  - G1: Sleeping (e.g., suspend, hibernate)
      - Sleep states (S-states): S0, S1, S2, S3, S4
  - G2: Soft off (S5)
  - G3: Mechanical off

Supporting ACPI States
  - ACPI defines data structures to track the states and functions to operate on the states
  - CPUs implement mechanisms to support these states
  - BIOS and software drivers hide the differences between CPU implementations to support the ACPI-defined data structures and functions

Enhanced Intel SpeedStep Technology (EIST)
  - Enhanced Intel SpeedStep == dynamic frequency and voltage scaling
  - An operating point (frequency, voltage) == P-state
  - Note that the CPU is in normal operation, executing instructions (C0)

Why Dynamic Frequency and Voltage Scaling?
  - Physics: lower voltage -> slower transistor switching speed -> longer latency of CPU operations -> lower frequency
  - Larger power savings if reducing frequency and voltage at the same time:
        P = C V^2 F
    P: power; C: capacitance; V: voltage; F: frequency

Example: Intel Pentium M at 1.6GHz
  [Figure not preserved. Source: Ref[4]]
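To make the P = C V^2 F relation concrete, the sketch below compares scaling frequency alone with scaling frequency and voltage together. The two operating points are made-up round numbers chosen for easy arithmetic; they are not the Pentium M values from Ref[4].

```python
def relative_dynamic_power(v_ratio, f_ratio):
    """Return P_new / P_old for P = C * V**2 * F, with C unchanged."""
    return (v_ratio ** 2) * f_ratio


# Hypothetical operating points (assumed for illustration only).
HIGH = {"freq_ghz": 1.6, "volts": 1.5}
LOW = {"freq_ghz": 0.8, "volts": 1.0}

f_ratio = LOW["freq_ghz"] / HIGH["freq_ghz"]  # 0.5
v_ratio = LOW["volts"] / HIGH["volts"]        # about 0.667

# Halving the frequency alone halves the dynamic power.
freq_only = relative_dynamic_power(1.0, f_ratio)

# Lowering the voltage as well multiplies in the square of the voltage ratio.
freq_and_voltage = relative_dynamic_power(v_ratio, f_ratio)

print(f"frequency only:        {freq_only:.2f} of original power")
print(f"frequency and voltage: {freq_and_voltage:.2f} of original power")
```

Under these assumed numbers, halving the frequency keeps 50% of the dynamic power, while also dropping the voltage by a third brings it down to roughly 22%, which is why the two are scaled together.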

Power vs. Core Voltage of Intel Pentium M at 1.6GHz
  [Figure not preserved. Source: Ref[4]]

Hardware Mechanisms
  [Figure: block diagram with labels: Select voltage, Processor components, Vcc, Frequency multiplier, Voltage regulator, Clock]

Enhanced SpeedStep vs. Legacy SpeedStep
  - Enhanced:
      - Support is mainly in the CPU itself as opposed to in chipsets
      - Faster transition time (e.g., 10us, down from 250us, for the Intel Pentium M processor)

How to Control EIST in Software?
  - Is EIST available?
      - CPUID instruction, ECX feature bit 07
  - Enable EIST (in OS kernel)
      - Set special register IA32_MISC_ENABLE bit 16
  - Change operating point (in OS kernel)
      - Write operating point ID to special register IA32_PERF_CTL
      - This ID is processor model specific

EIST Availability
  - Enhanced Intel SpeedStep Technology is available in:
      - Pentium M processor
      - Pentium 4
      - Intel Xeon
      - Intel Core Solo
      - Intel Core Duo
      - Intel Atom
      - Intel Core2 Duo
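The bit positions named on the "How to Control EIST in Software?" slide can be sketched as plain bit manipulation on register values. This is only an illustration: the helper names are hypothetical, the CPUID leaf number and MSR addresses are taken from the Intel manuals rather than from the slides, and actually executing CPUID or reading and writing MSRs needs native (and, for MSRs, kernel-level) code that is not shown here.

```python
# Assumptions not stated in the slides: leaf number and MSR addresses
# as documented in the Intel 64 and IA-32 manuals.
CPUID_FEATURE_LEAF = 0x1   # CPUID leaf whose ECX carries the EIST flag
IA32_MISC_ENABLE = 0x1A0   # MSR address
IA32_PERF_CTL = 0x199      # MSR address; its payload is model specific

EIST_CPUID_ECX_BIT = 7     # "ECX feature bit 07" from the slide
EIST_ENABLE_BIT = 16       # "IA32_MISC_ENABLE bit 16" from the slide


def eist_supported(cpuid1_ecx):
    """Check the EIST feature flag in the ECX value returned by CPUID leaf 1."""
    return bool((cpuid1_ecx >> EIST_CPUID_ECX_BIT) & 1)


def eist_enabled(misc_enable):
    """Check whether the enable bit is set in an IA32_MISC_ENABLE value."""
    return bool((misc_enable >> EIST_ENABLE_BIT) & 1)


def with_eist_enabled(misc_enable):
    """Return the IA32_MISC_ENABLE value with the EIST enable bit set.

    All other bits are preserved, as a read-modify-write would require.
    """
    return misc_enable | (1 << EIST_ENABLE_BIT)
```

No helper is given for IA32_PERF_CTL because, as the slide notes, the operating point ID written there is processor model specific.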

Low-Power Idle States
  - These are the idle C-states: C1, ...
  - The CPU is not executing instructions in these C-states
  - Power saving mechanisms:
      - Stop clock signal
      - Flush and shut down cache
      - Turn off cores

C-State in Intel Core i7 Processor
  - Core C0 State
      - The normal operating state of a core where code is being executed.
  - Core C1/C1E State
      - The core halts; it processes cache coherence snoops.
      - C1E: if possible, reduce voltage and frequency to the lowest levels
  - Core C3 State
      - The core flushes the contents of its L1 instruction cache, L1 data cache, and L2 cache to the shared L3 cache, while maintaining its architectural state. All core clocks are stopped at this point. No snoops.
  - C2 not defined. The C-states are processor model specific.

C-State in Intel Core i7 Processor (continued)
  - Core C6 State
      - Before entering core C6, the core saves its architectural state to a dedicated on-chip SRAM. Once complete, the core has its voltage reduced to zero volts.

C-State Transition
  - The hlt or mwait instruction triggers the transition to lower power states
  - Interrupts (among others) trigger the transition to C0

C-State Availability
  - C0 is always available
  - The low-power idle C-states are processor model specific
  - Described in the processor data sheet.
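Besides the data sheet, a running Linux system reports the idle states its kernel has registered for each CPU through the cpuidle sysfs directory. The sketch below reads that directory. It is an illustration that goes beyond the slides: it assumes a Linux system with cpuidle enabled, the function name is hypothetical, and the names the kernel reports need not match the hardware C-state names exactly.

```python
from pathlib import Path


def list_idle_states(cpuidle_dir="/sys/devices/system/cpu/cpu0/cpuidle"):
    """Return [(name, exit_latency_us), ...] for one CPU, shallowest first.

    Returns an empty list when the directory does not exist, for example
    on a non-Linux system or in a virtual machine without cpuidle.
    """
    base = Path(cpuidle_dir)
    if not base.is_dir():
        return []
    states = []
    # Entries are named state0, state1, ...; sort them numerically.
    for state in sorted(base.glob("state[0-9]*"), key=lambda p: int(p.name[5:])):
        name = (state / "name").read_text().strip()
        latency_us = int((state / "latency").read_text())
        states.append((name, latency_us))
    return states


if __name__ == "__main__":
    for name, latency_us in list_idle_states():
        print(f"{name}: exit latency {latency_us} us")
```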

Outline
  - Introduction
  - ACPI Overview
  - Enhanced Intel SpeedStep Technology (P-States)
  - Low-Power Idle States (C-States)
  - Multi-core considerations
      - P-States
      - C-States
      - Intel Turbo Boost Technology
  - Summary

Multi-core Chip
  [Figure: 4-core CPU (Nehalem)]
  - Question: can we set the individual cores' P-states and C-states?

P-State: Enhanced Intel SpeedStep Technology
  - Dynamic frequency and voltage scaling
  - Current Intel processors use the same frequency and voltage for all the cores
  - Therefore, it is impossible to actually run different cores at different P-states.
  - Processor P-state = MIN (core desired P-states)
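The MIN rule can be sketched in a few lines. The function name is hypothetical, and P-states are represented by their index, so 0 stands for P0, the highest-performance state.

```python
def processor_p_state(core_requests):
    """Resolve per-core P-state requests into the single shared P-state.

    Each request is a P-state index (0 = P0 = highest performance).
    Because all cores share one frequency and voltage, the processor
    runs at the lowest-numbered state that any core asked for.
    """
    if not core_requests:
        raise ValueError("need at least one core request")
    return min(core_requests)


# One busy core asking for P0 keeps the whole chip at P0.
print(processor_p_state([3, 3, 0, 3]))
# Only when every core asks for P2 does the chip actually reach P2.
print(processor_p_state([2, 2, 2, 2]))
```

The first call shows why an uneven load is costly: three nearly idle cores still run at the operating point that the single busy core requested.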

C-State: Low-Power Idle States
  - The actions are:
      - Halting the execution
      - Flushing cache
      - Stopping clock
  - These actions can be performed on individual cores
  - Different cores can have different C-states

How about C1E?
  - C1E is C1 + the lowest-frequency P-state
  - Therefore, C1E is only used when all the cores are in C1E.

How about C-State for Hyper-Threading?
  - There can be two hardware threads per core
  - Each thread may use the mwait instruction to specify the desired C-state
  - However, the C-state action cannot be performed for individual threads
  - Core C-state = MIN (thread C-states)

General Optimization Guideline
  - In general, it is better to use the cores evenly
      - Distribute computations so that the cores have similar utilization
      - Then all the cores can go into the same P-state
      - The processor can actually go into that P-state
  - For single-threaded applications, there is a new Intel processor feature
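The two resolution rules from the C1E and Hyper-Threading slides above can be sketched in the same style. The function names are hypothetical, and C-states are represented by their number (0 for C0, 1 for C1, 3 for C3, 6 for C6).

```python
def core_c_state(thread_requests):
    """Resolve the C-states requested by a core's hardware threads.

    The core can only go as deep as its shallowest thread request,
    because the idle action applies to the whole core.
    """
    if not thread_requests:
        raise ValueError("need at least one thread request")
    return min(thread_requests)


def c1e_in_effect(cores_in_c1e):
    """C1E lowers the shared operating point, so every core must be in C1E."""
    return bool(cores_in_c1e) and all(cores_in_c1e)


# One thread asks for C6 while its sibling asks for C1: the core stays in C1.
print(core_c_state([6, 1]))
# A single core that is not in C1E blocks C1E for the whole processor.
print(c1e_in_effect([True, True, False, True]))
```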

Intel Turbo Boost Technology
  - Basic idea:
      - Processor frequency is fundamentally limited by the operating temperature
      - If there is headroom in operating temperature, one can increase the processor frequency to achieve higher performance
  - Intel Turbo Boost Technology:
      - All but one core are in C3/C6
      - Automatically increase frequency given temperature and other constraints

Summary
  - ACPI defines a standard interface for operating systems to utilize hardware power features
      - Supported by most OSes, e.g., Linux, Windows
      - CPUs, BIOS, and software drivers combine to support the ACPI interface
  - Intel processor power features:
      - Enhanced Intel SpeedStep Technology: P-State
      - Low-power idle states: C-State
      - Intel Turbo Boost Technology: not in ACPI standard

References
  1. http://www.acpi.info
  2. Intel 64 and IA-32 Architectures Software Developer's Manual. Volume 3A: System Programming Guide. Order Number: 253668-033US. December 2009. Chapter 14.
  3. Intel 64 and IA-32 Architectures Optimization Reference Manual. Order Number: 248966-020. November 2009. Chapter 11.
  4. Enhanced Intel SpeedStep Technology for the Intel Pentium M Processor. Order Number: 301170-001. March 2004.
  5. Intel Core i7-800 and i5-700 Desktop Processor Series, Datasheet Volume 1. September 2009. Chapter 4.

Thank you!

Backup

Summary: ACPI State Hierarchy
  - G0: Working
      - Processor power states (C-states)
          - C0: normal execution
              - Performance states (P-states): Enhanced Intel SpeedStep Technology
          - Other C-states: model-specific low-power idle states
  - G1: Sleeping (e.g., suspend, hibernate)
      - Sleep states (S-states): S0, S1, S2, S3, S4
  - G2: Soft off (S5)
  - G3: Mechanical off

Clock Duty Cycle Modulation
  - Some Intel processors support an additional mechanism to reduce power consumption: clock duty cycle modulation.

Use C-State to Reduce Power
  - OS can monitor activity level (e.g., every 100ms) and determine the desired C-state
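One way such a policy could look is sketched below: after each monitoring window, pick the deepest idle state whose exit latency is small compared with a typical idle period. Everything here is an assumption made for illustration, including the function name, the state table, the latency values, and the 10% overhead threshold; it does not describe how any particular OS chooses C-states.

```python
# Hypothetical table: (state name, exit latency in microseconds), shallow to deep.
IDLE_STATES = [("C1", 2), ("C3", 50), ("C6", 200)]


def choose_c_state(busy_ms, wakeups=1, window_ms=100.0,
                   states=IDLE_STATES, max_overhead=0.1):
    """Pick a desired C-state from activity observed over one window.

    busy_ms:      time the CPU spent executing during the window
    wakeups:      number of separate idle periods seen in the window
    max_overhead: largest share of an idle period the exit latency may take
    """
    idle_us = max(window_ms - busy_ms, 0.0) * 1000.0
    typical_idle_us = idle_us / max(wakeups, 1)
    chosen = "C0"  # stay in normal execution if no idle state pays off
    for name, exit_latency_us in states:
        if exit_latency_us <= max_overhead * typical_idle_us:
            chosen = name  # states are ordered, so the last match is the deepest
    return chosen


print(choose_c_state(busy_ms=10))                # long idle period
print(choose_c_state(busy_ms=50, wakeups=1000))  # many short idle periods
```

With these assumed numbers, a mostly idle window with one long idle period selects the deepest state, while a window broken up by a thousand wakeups only justifies C1, because the longer resume latency of C3 or C6 would dominate each short idle period.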
