Showing posts with label Processor specific. Show all posts

Tuesday, January 5, 2010

C-States

C1
  • The internal CPU clock signal is stopped
  • The BIU and APIC are still fed by the internal clock generator, which allows the CPU to temporarily exit the HLT state
  • Because the CPU temporarily “leaves” the HALT state, that temporary state is called the “Stop Clock Snoop State”, “HALT/Grant Snoop State” or “Snoop State”
Intel C1E
  • Identical to C1, except that it also reduces the internal CPU voltage
  • If it is enabled in the BIOS, the CPU enters this mode instead of the “traditional” C1 state when it executes the HLT instruction
  • Also called the “Extended Halt/Stop Grant Snoop State”
AMD C1E
  • Works just like C3 (shuts down all CPU clocks, both internal and external)
  • The processor enters the C1E state when this option is enabled in the BIOS AND all CPU cores have entered the regular C1 (HLT) state
  • The difference between C1E and C3 is how the CPU enters the Sleep state:
  • In the traditional C3 state, the CPU must be put into that state explicitly, usually by a command from the OS
  • In C1E, the CPU enters the Sleep state automatically when all cores are in the HLT (C1) state
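The AMD C1E entry rule above can be written as a small predicate. This is only an illustrative sketch: the function name, its parameters and the list-of-strings representation of core states are hypothetical and do not correspond to any real firmware or OS interface.

```python
def amd_enters_c1e(c1e_enabled_in_bios, core_states):
    """Return True when the rule above says the processor drops into C1E.

    core_states is a list such as ["C1", "C0"], one entry per core
    (a hypothetical representation chosen for this sketch).
    """
    # Every core must be halted (C1); an empty list never qualifies.
    all_cores_halted = bool(core_states) and all(s == "C1" for s in core_states)
    return bool(c1e_enabled_in_bios) and all_cores_halted


print(amd_enters_c1e(True, ["C1", "C1"]))   # both cores halted
print(amd_enters_c1e(True, ["C1", "C0"]))   # one core still running
```

No OS command appears anywhere in the predicate, which is the point of contrast with the traditional C3 state.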
C2
  • Intel - Introduced by adding one extra pin to the CPU, called “STPCLK” (Stop Clock)
  • When this pin is asserted, the CPU core clock is cut
  • AMD - The C2 state is entered by simply reading a register in the ACPI circuit, which is physically located on the chipset
  • Both C1 and C2 cut the CPU core clock; the difference is in how the CPU gets there
  • C1 is activated by software (the HLT instruction), while C2 is activated by hardware (a signal sent to the CPU's “STPCLK” pin)
  • Two modes
o    Stop Grant
§  As explained above: the CPU core clock is stopped, but the clock generator chip (also known as the PLL) is still active and generating the external bus reference clock, i.e. the CPU external clock
o    Stop Clock
§  Here the external clock generator chip itself is turned off, which saves more energy
§  Current CPUs do not offer Stop Clock mode within the C2 state; it is available in the C3 Deep Sleep state instead


C2E
  • Similar to C2, but also reduces the CPU voltage in addition to stopping the CPU internal clock
C3
  • Also known as the “Sleep” state
  • Both the BIU and APIC clocks are cut off, which means the CPU cannot respond to important requests coming from the CPU external bus or to interrupts
  • C3 implementation on Intel
o    The CPU has an extra pin called SLP (or DPSLP, depending on the CPU model), which must be asserted while the CPU is in the C2 state in order to switch it into the C3 state
o    So the STPCLK pin must be asserted first, and then the SLP pin
o    The Deep Sleep state is reached by simply cutting the “external clock signal”


  • C3 implementation on AMD
o    Entered by simply reading a register (P_LVL3, in the Processor Control Block, P_BLK) in the ACPI circuit, which is physically located on the chipset
  • Modes
o    Sleep – as explained above
o    Deep Sleep (the pin is DPSLP instead of SLP) – reached by simply cutting the “external clock signal”
o    AltVid – allows the CPU voltage to be reduced while the CPU is in C3 mode


Note: AMD C1E and the C3 Sleep state (not the Deep Sleep state) are similar
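The ordering on the Intel path (STPCLK first, then SLP, then cutting the external clock) can be sketched as a simple ordered table. This is a simplified model of the description above, not a description of real pin timing; the table and function names are hypothetical.

```python
# Idle states on the Intel path described above, in the order they are
# reached, paired with the action that moves the CPU into each one.
INTEL_IDLE_SEQUENCE = [
    ("C0", None),                               # running, nothing asserted
    ("C2 (Stop Grant)", "assert STPCLK"),
    ("C3 (Sleep)", "assert SLP"),
    ("C3 (Deep Sleep)", "cut external clock"),
]


def steps_to_reach(target):
    """List the actions that must be applied, in order, to reach `target`."""
    steps = []
    for state, action in INTEL_IDLE_SEQUENCE:
        if action is not None:
            steps.append(action)
        if state == target:
            return steps
    raise ValueError(f"unknown state: {target}")


print(steps_to_reach("C3 (Sleep)"))   # ['assert STPCLK', 'assert SLP']
```

Because each deeper state includes all the actions of the shallower ones, the model makes it explicit that C3 cannot be entered without passing through C2.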


C4
  • Also known as the Deeper Sleep state
  • In C3 all clock signals inside the CPU are already stopped, so no further power can be saved by manipulating the CPU clock signals. The next step in reducing CPU idle power is to reduce the CPU voltage (Power = Voltage × Current, P = V × I)
  • Intel – C4 is reached from C3, i.e. the CPU must first enter C3 and only then can it reduce its internal voltage
C4E
  • C4, with the CPU voltage reduced even further after the L2 memory cache has been disabled (some call this C5, which is not the real name of this mode)
C6
  • Also known as Deep Power Down
  • When the CPU enters this state, it saves its entire architectural state inside a special static RAM (Intel) or DRAM, which is fed from an independent power source
  • This allows the CPU internal voltage to be lowered to any value, *including 0V*
  • When the CPU is woken up, it reloads the previous state of all internal units from that special RAM (waking the CPU from this state takes a lot longer)
  • Notice that there is only one voltage line for the entire CPU (the only component with a different voltage source is the static RAM or DRAM mentioned above, where the entire architectural state is stored), so lowering or turning off the CPU voltage is all-or-nothing: when the CPU goes into C6 mode, it has to be turned off entirely
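The save, power-down and restore sequence of C6 can be pictured with a toy model. Everything here is hypothetical and purely illustrative: the class, the two stand-in registers and the voltage values are assumptions, not real hardware behaviour.

```python
class ToyCore:
    """Toy model of a core entering and leaving C6 (not real hardware)."""

    def __init__(self):
        self.registers = {"EAX": 0, "EIP": 0}   # stand-in for architectural state
        self.core_voltage = 1.0                 # arbitrary illustrative value
        self._save_area = None                  # models the separately powered RAM

    def enter_c6(self):
        self._save_area = dict(self.registers)  # save the state first
        self.registers = None                   # the state is lost with the power
        self.core_voltage = 0.0                 # voltage can now drop to 0 V

    def wake(self):
        self.core_voltage = 1.0
        self.registers = dict(self._save_area)  # reload the saved state


core = ToyCore()
core.registers["EAX"] = 0x1234
core.enter_c6()
core.wake()
print(hex(core.registers["EAX"]))   # 0x1234
```

The save area survives only because it is modelled as separate from the core's own storage, mirroring the independent power source mentioned above.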

Thursday, June 4, 2009

CPU Registers

CPU registers are classified into five categories, as follows:
  1. Segment registers
  2. Pointer registers
  3. General Purpose registers
  4. Index registers
  5. Flags register
1. Segment registers
  • Segments are special areas defined in a program for containing the code, the data and the stack; a segment's starting address is 20 bits wide
  • A segment begins on a paragraph boundary, that is, at a location evenly divisible by 16
  • Segment registers are 16 bits wide and contain the starting address of the segment without its rightmost hex digit (reason: since segments start on a paragraph boundary, that digit is always zero, and the designers decided it would be unnecessary to store it in the segment register)
  • The offset is 16 bits wide (and is held in the pointer registers described later)
  • Segments are further classified into code, data, stack and extra, corresponding to the CS, DS, SS and ES registers; FS and GS are additional extra segment registers
2. Pointer registers
  • The pointer registers are the 32-bit EIP, ESP and EBP; their rightmost 16 bits are IP, SP and BP respectively (16 bits wide, matching the offset size mentioned above)
  • IP register is associated with CS register (as CS:IP => Segment:Offset)
Example:
    Segment address in CS          39B40h
    Offset address in IP         +  0514h
                                 --------
    Address of next instruction    3A054h
  • SP register is associated with SS register (as SS:SP => Segment:Offset)
Example:
    Segment address in SS          39B40h
    Offset address in SP         +  0514h
                                 --------
    Address in stack               3A054h
  • BP facilitates referencing parameters, which are data and addresses that a program passes via the stack. Processor combines the address in SS with the offset in BP. BP can also be combined with DI and with SI as a base register for special addressing.
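The segment:offset arithmetic used in the CS:IP and SS:SP examples above can be checked with a few lines of code. This is a minimal sketch: the function name is hypothetical, and wrapping the result to 20 bits is an assumption matching the 20-bit address width mentioned earlier.

```python
def physical_address(segment, offset):
    """Combine a 16-bit segment register value with a 16-bit offset."""
    # Shifting left by 4 bits restores the hex zero that the segment
    # register does not store (segments start on 16-byte boundaries).
    return ((segment << 4) + offset) & 0xFFFFF   # assumed 20-bit wrap


# CS holds 39B4h, so the segment address is 39B40h; IP holds 0514h.
print(f"{physical_address(0x39B4, 0x0514):05X}h")   # prints 3A054h
```

The same function covers SS:SP, since only the register pairing changes and not the arithmetic.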
3. General Purpose registers
  • The 32-bit general purpose registers are EAX, EBX, ECX and EDX; their rightmost 16 bits are AX, BX, CX and DX
  • AX - primary accumulator - used for operations involving input/output and most arithmetic; some instructions are more efficient when they use AX rather than another register
  • BX - base register - the only general purpose register that can be used as an index to extend addressing; it can also be combined with DI or SI as a base register for special addressing
  • CX - count register - may contain a value that controls the number of times a loop is repeated, or the number of bits to shift left or right
  • DX - data register - sometimes works together with AX for operations that involve large values
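How the 16-bit names relate to wider values can be shown with plain bit masking. This is a sketch with hypothetical helper names; the AH/AL byte halves of AX are not discussed above and are included here as an assumption about standard x86 register naming.

```python
def low16(reg32):
    """Return the rightmost 16 bits, e.g. AX from EAX."""
    return reg32 & 0xFFFF


def split_bytes(reg16):
    """Split a 16-bit register into (high byte, low byte), e.g. (AH, AL)."""
    return (reg16 >> 8) & 0xFF, reg16 & 0xFF


def dx_ax(dx, ax):
    """Combine DX (high word) and AX (low word) into one 32-bit value."""
    return ((dx & 0xFFFF) << 16) | (ax & 0xFFFF)


eax = 0x12345678
ax = low16(eax)
ah, al = split_bytes(ax)
print(f"AX={ax:04X} AH={ah:02X} AL={al:02X}")   # AX=5678 AH=56 AL=78
print(f"DX:AX={dx_ax(0x0001, 0x86A0):08X}")     # DX:AX=000186A0
```

The DX:AX pairing is how a value too large for one 16-bit register (here 100000 decimal) is held across two of them.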
4. Index registers
  • SI (source index) - may be required for some string (character) handling operations; in this context, SI is associated with the DS register (as DS:SI)
  • DI (destination index) - is required for some string operations; in this context, DI is associated with the ES register (as ES:DI)
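The DS:SI and ES:DI pairing can be illustrated by modelling a byte-by-byte string copy. This is a toy sketch, not an emulator: the function name and the bytearray standing in for memory are hypothetical, and it assumes the copy runs forward (both index registers increment).

```python
def copy_string(memory, ds, si, es, di, count):
    """Copy `count` bytes from DS:SI to ES:DI; return the final (SI, DI)."""
    for _ in range(count):
        src = (ds << 4) + si          # source address from DS:SI
        dst = (es << 4) + di          # destination address from ES:DI
        memory[dst] = memory[src]
        si = (si + 1) & 0xFFFF        # index registers are 16 bits wide
        di = (di + 1) & 0xFFFF
    return si, di


memory = bytearray(0x1000)
memory[0x100:0x105] = b"HELLO"        # source string at DS:SI = 0010h:0000h
si, di = copy_string(memory, 0x0010, 0, 0x0020, 0, 5)
print(bytes(memory[0x200:0x205]), si, di)   # b'HELLO' 5 5
```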
5. Flags register
  • 32 bits wide
  • OF (overflow), IF (interrupt), TF (trap), SF (sign), ZF (zero), AF (auxiliary carry), PF (parity) and CF (carry)
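Each of the flags listed above occupies one bit of the flags register, so a raw value can be decoded with bit tests. A minimal sketch, using the standard x86 bit positions for these flags; the table and function names are hypothetical, and only the flags named above are included.

```python
# Bit position of each flag within the flags register.
FLAG_BITS = {
    "CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7,
    "TF": 8, "IF": 9, "OF": 11,
}


def flags_set(flags_value):
    """Return the names of the flags that are set in `flags_value`."""
    return [name for name, bit in FLAG_BITS.items() if flags_value & (1 << bit)]


print(flags_set(0x0241))   # ['CF', 'ZF', 'IF']
```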



Execution unit and Bus Interface Unit

The processor is partitioned into two logical units:
1. Execution unit (EU) - to execute instructions
2. Bus Interface Unit (BIU) - to deliver instructions and data to EU

Execution unit:
  • maintains the CPU status and control flags
  • manipulates the general registers and instruction operands (registers and data paths are 16 bits wide)
  • has no connection to the "outside world"
  • obtains instructions from the instruction queue maintained by the BIU
  • when an instruction requires access to memory or to a peripheral device, the EU requests the BIU to obtain or store the data

Bus Interface unit:
  • performs all bus operations for the EU
  • data is transferred between the CPU and memory/IO devices upon demand from the EU
  • during periods when the EU is busy executing instructions, the BIU "looks ahead" and fetches more instructions from memory
  • these instructions are stored in an internal RAM array called the "instruction stream queue", from which the EU takes instructions to execute
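The look-ahead behaviour can be pictured with a toy simulation in which the BIU fills a queue while the EU is busy. This is a sketch under stated assumptions: the queue capacity, the one-fetch-per-cycle rate and the fixed EU busy time are all invented for illustration, and the names are hypothetical.

```python
from collections import deque

QUEUE_CAPACITY = 6   # assumed queue size for this sketch


def run(program, eu_busy_cycles=2):
    """Toy model: the BIU fetches one instruction per cycle while the EU works."""
    queue = deque()          # the instruction stream queue kept by the BIU
    next_fetch = 0           # index of the next instruction to fetch
    executed = []
    while len(executed) < len(program):
        if queue:
            executed.append(queue.popleft())   # EU takes the next instruction
            busy = eu_busy_cycles
        else:
            busy = 1                           # EU waits one cycle for a fetch
        for _ in range(busy):                  # BIU looks ahead meanwhile
            if next_fetch < len(program) and len(queue) < QUEUE_CAPACITY:
                queue.append(program[next_fetch])
                next_fetch += 1
    return executed


print(run(["MOV", "ADD", "JMP"]))   # ['MOV', 'ADD', 'JMP']
```

Instructions always leave the queue in program order; the benefit of the look-ahead is that the EU rarely finds the queue empty.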

Processor history