Thursday, June 4, 2009

How the EXE is loaded into main memory for execution - The system Program Loader

Once BIOS hands control over to the OS, you may then request execution of a program.

NOTE: The program can be either .COM or .EXE. A .COM program is useful as a small utility program or as a resident program (one that is installed in memory and is available while other programs run). In real mode, an .EXE program consists of separate code, data and stack segments and is the method used for more serious programs.

When you launch an .EXE program, for example by double-clicking it (that is, when you ask the system to load an .EXE program from disk into memory for execution), the System Program Loader performs the following steps:
  1. Accesses the .EXE program from disk
  2. Constructs a 256-byte (100H) Program Segment Prefix (PSP) on a paragraph boundary in available internal memory (NOTE: the PSP is a data structure used in DOS systems to store the state of a program)
  3. Stores the program in memory immediately following the PSP
  4. Loads the address of the PSP into the DS and ES registers
  5. Loads the address of the code segment into the CS register and sets the IP register to the offset of the first instruction (usually zero) in the code segment
  6. Loads the address of the stack into the SS register and sets the SP register to the size of the stack
  7. Transfers control to the program for execution, beginning usually with the first instruction in the code segment
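
The loader steps above can be sketched as a small Python model. This is only an illustration under stated assumptions: the PSP segment 1000h and the stack values are arbitrary, and the parameter names (code_seg_rel, stack_seg_rel and so on) are hypothetical, not real DOS structures.

```python
from dataclasses import dataclass

PSP_SIZE = 0x100   # 256-byte PSP (step 2)
PARAGRAPH = 16     # a paragraph is 16 bytes

@dataclass
class Registers:
    cs: int
    ip: int
    ss: int
    sp: int
    ds: int
    es: int

def load_exe(psp_segment, code_seg_rel, entry_ip, stack_seg_rel, stack_size):
    """Return the initial registers for a program whose PSP sits at psp_segment.

    code_seg_rel and stack_seg_rel are segment values relative to the start
    of the program image (hypothetical names used only in this sketch).
    """
    # The program image is stored immediately after the PSP (step 3),
    # which is 0x100 bytes = 0x10 paragraphs further on.
    image_segment = psp_segment + PSP_SIZE // PARAGRAPH
    return Registers(
        cs=image_segment + code_seg_rel, ip=entry_ip,      # step 5
        ss=image_segment + stack_seg_rel, sp=stack_size,   # step 6
        ds=psp_segment, es=psp_segment,                    # step 4
    )

regs = load_exe(psp_segment=0x1000, code_seg_rel=0x0000, entry_ip=0x0000,
                stack_seg_rel=0x0020, stack_size=0x0100)
print(f"DS=ES={regs.ds:04X}  CS:IP={regs.cs:04X}:{regs.ip:04X}  "
      f"SS:SP={regs.ss:04X}:{regs.sp:04X}")
```

Note how DS and ES point at the PSP rather than at the program's data segment, which is why real-mode .EXE programs usually reload DS themselves as one of their first instructions.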

The BIOS Boot Process

1. Turning on the computer's power causes the processor to enter a reset state, clear all memory locations to zero, perform a parity check of memory, and set the CS register to segment address FFFFh and the IP register to zero.
2. The first instruction to execute is therefore at the address formed by the CS:IP pair, FFFF0H, which is the entry point to the BIOS in ROM.
3. The BIOS routine at FFFF0H checks the various ports to identify and initialize the devices that are attached to the computer, and provides services that are used for reading from and writing to those devices.
4. BIOS then establishes two data areas -
  • IVT (Interrupt Vector Table): Begins in low memory at location 0 and contains 256 4-byte addresses in segment:offset form (both the BIOS and the OS use the IVT for the interrupts that occur)
  • BIOS data Areas: Beginning at location 400H, largely concerned with the status of attached devices
5. BIOS next determines whether a disk containing the system files is present and, if so, it accesses the bootstrap loader from the disk
6. This bootstrap program (BSP) loads the system files from the disk into memory and transfers control to them (the system files contain device drivers and other hardware-specific code, which initializes internal system tables and the system's portion of the IVT)
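
As a rough illustration of the layout described in the steps above, the following Python sketch simulates the first 1 KB of memory and stores one interrupt vector in it. The handler address F000:1234 and the helper names set_vector and get_vector are made up for the example; only the 4-byte entry size, the 256 entries and the 400H start of the BIOS data area come from the text.

```python
import struct

IVT_BASE = 0x000      # the IVT begins at location 0
IVT_ENTRIES = 256     # 256 entries of 4 bytes each
BDA_BASE = 0x400      # the BIOS data area begins right after the IVT

def set_vector(memory, vector, segment, offset):
    """Store a handler address: offset word first, then segment word."""
    struct.pack_into("<HH", memory, IVT_BASE + vector * 4, offset, segment)

def get_vector(memory, vector):
    """Return (segment, offset) of the handler for an interrupt number."""
    offset, segment = struct.unpack_from("<HH", memory, IVT_BASE + vector * 4)
    return segment, offset

low_memory = bytearray(BDA_BASE)   # simulated first 1 KB of RAM
set_vector(low_memory, 0x10, segment=0xF000, offset=0x1234)  # made-up handler

seg, off = get_vector(low_memory, 0x10)
print(f"INT 10h handler at {seg:04X}:{off:04X}")   # INT 10h handler at F000:1234
print(IVT_ENTRIES * 4 == BDA_BASE)                 # True
print(hex((0xFFFF << 4) + 0x0000))                 # 0xffff0, the reset entry point
```

The second line of output shows why the BIOS data area starts at 400H: 256 entries of 4 bytes fill exactly the first 1024 bytes.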

NOTE: When a user program requests an I/O service from the OS, the OS passes the request to the BIOS, which in turn accesses the requested device. Sometimes a program makes requests directly to the BIOS, for example for keyboard and screen services. At other times, a program can bypass both the OS and the BIOS and access a device directly.

CPU Registers

CPU registers are classified into five categories as follows
  1. Segment registers
  2. Pointer registers
  3. General Purpose registers
  4. Index registers
  5. Flags register
1. Segment registers
  • Segments are special areas defined in a program for containing the code, the data and the stack; the starting address of a segment is 20 bits wide
  • A segment begins on a paragraph boundary, that is, at a location evenly divisible by 16
  • Segment registers are 16 bits wide and contain the upper 16 bits of the segment's starting address (Reason: since segments start on a paragraph boundary, the rightmost hex digit of the address is always zero, so the designers decided that it would be unnecessary to store it in the segment register)
  • The offset is 16 bits wide (and is specified in the pointer registers described later)
  • Further classified into code, data, stack and extra segment registers: CS, DS, SS and ES, plus the additional extra segment registers FS and GS
2. Pointer registers
  • Pointer registers are the 32-bit EIP, ESP and EBP; their rightmost 16 bits are IP, SP and BP respectively (the 16-bit offsets mentioned above)
  • IP register is associated with CS register (as CS:IP => Segment:Offset)
Example - Segment address (CS = 39B4h)     39B40h
          Offset address in IP            + 0514h
                                          -------
          Address of next instruction      3A054h
                                          -------
  • SP register is associated with SS register (as SS:SP => Segment:Offset)
Example - Segment address (SS = 39B4h)     39B40h
          Offset address in SP            + 0514h
                                          -------
          Address in stack                 3A054h
                                          -------
  • BP facilitates referencing parameters, which are data and addresses that a program passes via the stack. Processor combines the address in SS with the offset in BP. BP can also be combined with DI and with SI as a base register for special addressing.
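
The segment:offset arithmetic in the examples above can be checked with a few lines of Python. This is a minimal sketch: the function name linear_address and the +4 displacement used for the BP case are illustrative choices, not anything defined by the processor.

```python
def linear_address(segment_register, offset):
    """Shift the 16-bit segment register left by 4 bits (restoring the
    dropped zero hex digit) and add the 16-bit offset."""
    return (segment_register << 4) + offset

# CS:IP example from the text: CS holds 39B4h, IP holds 0514h.
cs, ip = 0x39B4, 0x0514
print(f"{linear_address(cs, ip):05X}")        # 3A054

# BP-relative access to a stack parameter: SS supplies the segment.
ss, bp = 0x39B4, 0x0514
print(f"{linear_address(ss, bp + 4):05X}")    # 3A058

# Different segment:offset pairs can name the same memory location.
print(linear_address(0x3A05, 0x0004) == linear_address(cs, ip))   # True
```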
3. General Purpose registers
  • The general purpose registers are 32 bits wide (EAX, EBX, ECX and EDX); their rightmost 16 bits are AX, BX, CX and DX
  • AX - primary accumulator - used for operations involving input/output and most arithmetic; some instructions are more efficient when they use it instead of another register
  • BX - base register - the only general purpose register that can be used as an index to extend addressing; it can also be combined with DI or SI as a base register for special addressing
  • CX - count register - may contain a value to control the number of times a loop is repeated, or a value to shift bits left or right
  • DX - data register - sometimes works together with AX in operations that involve large values
4. Index registers
  • SI (source index) - required for some string (character) handling operations - in this context, SI is associated with the DS register (as DS:SI)
  • DI (destination index) - is required for some string operations - in this context, DI is associated with ES register
5. Flags register
  • 32 bits wide (EFLAGS); the rightmost 16 bits form the FLAGS register
  • Common flags: OF (overflow), DF (direction), IF (interrupt), TF (trap), SF (sign), ZF (zero), AF (auxiliary carry), PF (parity) and CF (carry)
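
To make the flags register more concrete, here is a small Python sketch that decodes a flags value into flag names. The bit positions are the standard x86 ones; the sample value 0246h is just an arbitrary test input, and DF (direction) is included for completeness.

```python
# Bit position of each flag inside the (E)FLAGS register.
FLAG_BITS = {
    "CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7,
    "TF": 8, "IF": 9, "DF": 10, "OF": 11,
}

def decode_flags(value):
    """Return the names of the flags that are set in a flags value."""
    return [name for name, bit in FLAG_BITS.items() if value & (1 << bit)]

# 0246h has bits 1, 2, 6 and 9 set (bit 1 is a reserved bit, not a flag).
print(decode_flags(0x0246))   # ['PF', 'ZF', 'IF']
```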



Execution unit and Bus Interface Unit

Processor is partitioned into two logical units
1. Execution unit (EU) - to execute instructions
2. Bus Interface Unit (BIU) - to deliver instructions and data to EU

Execution unit:
  • Maintains CPU status and control flags
  • manipulates general registers and instruction operands (registers and data paths are 16 bits wide)
  • has no connection to the "outside world"
  • obtains instructions from the instruction queue maintained by the BIU
  • when an instruction requires access to memory or to a peripheral device, EU requests the BIU to obtain or store the data

Bus Interface unit:
  • performs all bus operations for EU
  • data is transferred between the CPU and memory/IO devices upon demand from the EU
  • during periods when the EU is busy executing instructions, the BIU "looks ahead" and fetches more instructions from memory
  • these instructions are stored in an internal RAM array called the "instruction stream queue", from which the EU takes instructions to execute
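
The cooperation between the two units can be pictured with a toy Python simulation. It is a loose sketch, not a cycle-accurate model: the class names are invented, every instruction is treated as a single byte, and the queue capacity of 6 bytes is an assumption taken from the 8086.

```python
from collections import deque

QUEUE_CAPACITY = 6   # assumption: the 8086 instruction queue holds 6 bytes

class BusInterfaceUnit:
    def __init__(self, memory):
        self.memory = memory
        self.fetch_pointer = 0
        self.queue = deque()

    def prefetch(self):
        """Fetch one byte ahead if the queue has room and memory remains."""
        if len(self.queue) < QUEUE_CAPACITY and self.fetch_pointer < len(self.memory):
            self.queue.append(self.memory[self.fetch_pointer])
            self.fetch_pointer += 1

class ExecutionUnit:
    def __init__(self, biu):
        self.biu = biu
        self.executed = []

    def step(self):
        """Take the next instruction byte from the BIU queue, if any."""
        if self.biu.queue:
            self.executed.append(self.biu.queue.popleft())

program = bytes([0x90] * 10)   # ten one-byte NOP instructions
biu = BusInterfaceUnit(program)
eu = ExecutionUnit(biu)

for _ in range(3):             # the EU is busy, so the BIU looks ahead
    biu.prefetch()
queued_before = len(biu.queue)

while len(eu.executed) < len(program):
    eu.step()                  # the EU never touches memory itself
    biu.prefetch()

print(queued_before, len(eu.executed))   # 3 10
```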

Processor history



Thursday, May 14, 2009

How the Linux Kernel Boots

1. The computer is powered on (at this point the RAM chips contain random data and no OS is running)
2. Special Hardware circuit raises the logical value of RESET pin of the CPU
3. On RESET, some registers of the processor (including cs and eip) are set to fixed values and the code found at physical address 0xffff fff0 is executed (Note this address is mapped by the hardware to a certain read-only, persistent memory chip called ROM)

The set of programs stored in ROM is traditionally called the Basic Input/Output System (BIOS). It includes several interrupt-driven low-level procedures used to handle the hardware devices that make up the computer. (Some OSs, such as MS-DOS, rely on the BIOS to implement most system calls)

4. Linux is forced to use the BIOS in the bootstrapping phase to retrieve the kernel image from the disk or from some other external device. The BIOS bootstrap procedure does the following:
  • Executes the POST (Power-On Self-Test) to establish which devices are present and whether they are working properly (recent computers make use of ACPI for this)
  • Initializes the hardware devices so that they all operate without conflicts on the IRQ lines and I/O ports
  • Searches for an OS to boot. Depending on the BIOS settings, the procedure may try to access the first sector (boot sector) of every floppy disk, hard disk and CD-ROM in the system (the order can be changed)
  • Once a valid device is found, it copies the contents of its first sector into RAM starting from physical address 0x0000 7c00, then jumps to that address and executes the code just loaded (do read the description below!)
NOTE: The boot loader is the program invoked by the BIOS to load the image of an operating system kernel into RAM. Booting from a floppy disk is as simple as loading the instructions in its first sector into RAM (these instructions then copy all the remaining sectors containing the kernel image into RAM). Booting from a hard disk is done differently: the first sector of the hard disk is the MBR (Master Boot Record), which holds the partition table plus a small program that loads the first sector of the partition containing the OS to be started.
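
The MBR layout mentioned in the note can be sketched in Python. The offsets used here are the conventional PC ones (partition table at byte 446, four 16-byte entries, 55h AAh signature in the last two bytes, 80h marking the active partition); the function names and the synthetic sector are made up for the example.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446    # four 16-byte entries follow the boot code
BOOT_SIGNATURE = b"\x55\xaa"    # last two bytes of a bootable sector

def is_bootable_sector(sector):
    """Check the size and the boot signature of a sector."""
    return len(sector) == SECTOR_SIZE and sector[510:512] == BOOT_SIGNATURE

def active_partition(sector):
    """Return (index, start_lba) of the first active partition, or None."""
    for index in range(4):
        start = PARTITION_TABLE_OFFSET + 16 * index
        entry = sector[start:start + 16]
        status = entry[0]
        start_lba = struct.unpack_from("<I", entry, 8)[0]
        if status == 0x80:      # 0x80 marks the active (bootable) partition
            return index, start_lba
    return None

# Build a synthetic MBR: the second entry is active and starts at sector 2048.
mbr = bytearray(SECTOR_SIZE)
entry_offset = PARTITION_TABLE_OFFSET + 16 * 1
mbr[entry_offset] = 0x80
struct.pack_into("<I", mbr, entry_offset + 8, 2048)
mbr[510:512] = BOOT_SIGNATURE

print(is_bootable_sector(mbr), active_partition(mbr))   # True (1, 2048)
```

On a real machine the small program in the MBR performs the equivalent of active_partition() and then loads the first sector of that partition.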

5. A two-stage boot loader is required to boot a Linux kernel from disk. Two common choices are:
  • LILO (LInux LOader)
  • GRand Unified Bootloader (GRUB) [More advanced than LILO as it recognizes several disk-based filesystems and is thus capable of reading portions of the boot program from files]
LILO may be installed either on the MBR (replacing the small program that loads the boot sector of the active partition) or in the boot sector of every disk partition. In both cases, LILO is executed at boot time, and the user may choose which OS to load.
So what does the LILO boot loader actually do?
  • Invokes a BIOS procedure to display a "Loading" message
  • Invokes BIOS procedure to load an initial portion of kernel image from disk: the first 512 bytes of the kernel image are put in RAM at address 0x0009 0000, while the code of the *setup()* function (discussed below) is put in RAM starting from address 0x0009 0200
  • Invokes a BIOS procedure to load the rest of the kernel image from disk and puts the image in RAM starting from either low address 0x0001 0000 (for small kernel images compiled with "make zImage" => "loaded low" kernel image) or high address 0x0010 0000 (for big kernel images compiled with "make bzImage" => "loaded high" kernel image)
  • Jumps to the setup() code (this assembly language function is placed by the linker at offset 0x200 of the kernel image file. Hence, LILO can easily locate this and copy it into RAM, starting from physical address 0x0009 0200 as mentioned above)
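
The addresses LILO uses can be collected into a small Python sketch. The constants come from the steps above; the helper names and the conversion to real-mode segment:offset form are only illustrative, and the sketch deliberately limits itself to addresses below 1 MB.

```python
BOOT_SECTOR_ADDR = 0x00090000    # first 512 bytes of the kernel image
SETUP_ADDR = 0x00090200          # setup() code
SETUP_FILE_OFFSET = 0x200        # offset of setup() inside the image file
LOAD_ADDRESSES = {
    "zImage": 0x00010000,        # "loaded low" kernel image
    "bzImage": 0x00100000,       # "loaded high" kernel image
}

def kernel_load_address(image_type):
    """Return where the rest of the kernel image is placed in RAM."""
    return LOAD_ADDRESSES[image_type]

def to_segment_offset(addr):
    """Express a physical address below 1 MB as a real-mode segment:offset pair."""
    if addr >= 0x100000:
        raise ValueError("address is above the 1 MB range handled by this sketch")
    return addr >> 4, addr & 0xF

seg, off = to_segment_offset(SETUP_ADDR)
print(f"setup() at {seg:04X}:{off:04X}")                  # setup() at 9020:0000
print(SETUP_ADDR - BOOT_SECTOR_ADDR == SETUP_FILE_OFFSET)  # True
print(hex(kernel_load_address("bzImage")))                 # 0x100000
```

The second line of output shows why LILO can locate setup() so easily: the offset inside the file and the offset from the boot sector's address in RAM are the same 0x200 bytes.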
6. This setup() code initializes the hardware devices in the computer and sets up the environment for the execution of the kernel program. (A question arises here: weren't the devices already initialized by the BIOS? They were, of course, but Linux does not rely on that; it reinitializes the devices in its own manner to enhance portability and robustness.) setup() does the following:
  • in ACPI compliant systems, invokes BIOS routine that builds a table in RAM describing the layout of the systems' physical memory
  • sets the keyboard repeat delay and rate (when the user keeps a key pressed past a certain amount of time, the keyboard device sends the corresponding keycode over and over to the CPU)
  • initializes video adapter card
  • reinitializes the disk controller and determines the hard disk parameters
  • checks for an IBM micro channel bus (MCA) and PS/2 pointing device (bus mouse)
  • checks for APM BIOS support (Advanced Power Management BIOS)
  • If the kernel image was "loaded low" (at 0x0001 0000), the function moves it to physical address 0x0000 1000. Nothing changes for a "loaded high" kernel. (This step is necessary because, to be able to store the kernel image on a floppy disk and to reduce the booting time, the kernel image stored on disk is compressed, and the decompression routine needs some free space to use as a temporary buffer following the kernel image in RAM.) I am not entirely clear on this step either!
  • Sets the A20 pin located on the 8042 keyboard controller. The A20 pin is a hack introduced in 80286-based systems to make physical addresses compatible with those of the ancient 8088 microprocessors. Unfortunately, the A20 pin must be properly set before switching to Protected Mode; otherwise the 21st bit of every physical address will always be regarded as zero by the CPU. Setting this pin is a messy operation
  • Sets up a provisional IDT (Interrupt Descriptor Table) and a provisional GDT (Global Descriptor Table)
  • Resets the floating-point unit (FPU)
  • Reprograms the PICs (Programmable Interrupt Controllers) to mask all interrupts except IRQ2, which is the cascading interrupt between the two PICs
  • Switches the CPU from Real Mode to Protected mode by setting PE bit in CR0 status register. (NOTE: The PG bit in CR0 register is cleared, so paging is still disabled. Linear address is considered as Physical address)
  • Jumps to the startup_32() assembly language function (yes, I know there are two startup_32() functions; the one referred to here is in the arch/i386/boot/compressed/head.S file)
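
Two of the setup() steps above, the A20 pin and the CR0 mode bits, can be illustrated with simple bit arithmetic in Python. This is only a model of the effect, not of how the hardware is programmed: the function name bus_address and the starting CR0 value are invented for the sketch, while the bit positions (address bit 20, PE as bit 0, PG as bit 31) are the standard ones.

```python
A20_MASK = ~(1 << 20)    # clears the 21st address bit (bit number 20)
CR0_PE = 1 << 0          # Protection Enable bit
CR0_PG = 1 << 31         # Paging bit

def bus_address(addr, a20_enabled):
    """Return the address that reaches memory, depending on the A20 pin."""
    return addr if a20_enabled else addr & A20_MASK

# With A20 disabled, the 1 MB mark wraps around to address 0.
print(hex(bus_address(0x00100000, a20_enabled=False)))   # 0x0
print(hex(bus_address(0x00100000, a20_enabled=True)))    # 0x100000

# Switching to Protected Mode sets PE and leaves PG cleared.
cr0 = 0x00000010         # made-up starting value with PE and PG clear
cr0 |= CR0_PE
print(bool(cr0 & CR0_PE), bool(cr0 & CR0_PG))            # True False
```

The first line of output shows why the pin must be set before a "loaded high" kernel at 0x0010 0000 can be reached.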
7. After setup() terminates, the first startup_32() function has been moved either to 0x0010 0000 or to 0x0000 1000, depending on whether the kernel image was loaded high or low in RAM. Let's see what this startup_32() has to do:
  • initializes segmentation registers and a provisional stack
  • clears all bits in eflags register
  • fills the area of uninitialized data of kernel identified by _edata and _end symbols with zeros
  • invokes the decompress_kernel() function to decompress the kernel image. (The message "Uncompressing Linux..." is shown first and, once the image is decompressed, "OK, booting the kernel." is shown.) If the kernel image was loaded low, the decompressed kernel is placed at physical address 0x0010 0000. Otherwise it is placed in a temporary buffer located after the compressed image, and the decompressed image is then moved into its final position, which starts at physical address 0x0010 0000
  • Jumps to physical address 0x0010 0000
8. The decompressed kernel image begins with another startup_32() (this time in arch/i386/kernel/head.S; having the same name for both functions causes no problems besides confusing us). This second startup_32() sets up the execution environment for the first Linux process (process id = 0). Here is what else it does:
  • initializes segmentation registers with their final values
  • fills the 'bss' segment of the kernel with zeros
  • initializes the provisional kernel Page Tables contained in swapper_pg_dir and pg0 to identically map the linear addresses to the same physical addresses (more explanation required; will add it soon)
  • stores the address of the Page Global Directory in the cr3 register and enables paging by setting the PG bit in the cr0 register
  • Sets up the Kernel Mode stack for process 0
  • again clears all bits in eflags register
  • invokes setup_idt() to fill IDT with null interrupt handlers
  • puts the system parameters obtained from the BIOS and the parameters passed to the operating system into the first page frame (will explain this soon too)
  • identifies the model of the processor
  • loads the gdtr and idtr registers with addresses of GDT and IDT tables
  • jumps to our one and only "start_kernel()" function which completes the initialization of Linux Kernel
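
The "identity mapping" step in the list above can be pictured with a small Python model of two-level 32-bit paging (10-bit directory index, 10-bit table index, 12-bit offset). This is a sketch under stated assumptions, not kernel code: the dictionary stands in for swapper_pg_dir, the function names are invented, and the 8 MB size of the mapped region is an assumption made for illustration.

```python
PAGE_SIZE = 4096
ENTRIES_PER_TABLE = 1024

def build_identity_tables(size_bytes):
    """Return {directory_index: page_table} identity-mapping [0, size_bytes)."""
    page_directory = {}
    for page in range(size_bytes // PAGE_SIZE):
        dir_index, table_index = divmod(page, ENTRIES_PER_TABLE)
        table = page_directory.setdefault(dir_index, [None] * ENTRIES_PER_TABLE)
        table[table_index] = page      # frame number equals page number
    return page_directory

def translate(page_directory, linear):
    """Walk the two-level tables and return the physical address."""
    dir_index = linear >> 22
    table_index = (linear >> 12) & 0x3FF
    offset = linear & 0xFFF
    table = page_directory.get(dir_index)
    if table is None or table[table_index] is None:
        raise LookupError(f"page fault at {linear:#010x}")
    return table[table_index] * PAGE_SIZE + offset

tables = build_identity_tables(8 * 1024 * 1024)   # assumption: first 8 MB
print(hex(translate(tables, 0x00100000)))          # 0x100000
print(sorted(tables))                              # [0, 1]
```

With such a mapping, enabling the PG bit does not change which memory an address refers to, so the code that is already running keeps working when paging is switched on.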
9. Nearly every kernel component is initialized by start_kernel(). Here are just a few of the calls it makes:
  • sched_init() - to initialize scheduler
  • build_all_zonelists() - to initialize memory zones
  • page_alloc_init() & mem_init() - to initialize Buddy system allocators
  • trap_init()  & init_IRQ() - to initialize IDT for final time
  • softirq_init() - to initialize TASKLET_SOFTIRQ and HI_SOFTIRQ
  • time_init() - to initialize system date and time
  • kmem_cache_init() - to initialize slab allocator
  • calibrate_delay() - to determine the speed of the CPU clock
  • kernel_thread() - to create the kernel thread for process 1; in turn, this kernel thread creates other kernel threads and executes /sbin/init program
10. After the init program and many kernel threads have started, the familiar login prompt finally appears on the console (or on the graphical screen, if the X Window System is launched at startup), telling the user that the Linux kernel is up and running

Once in protected mode, Linux does not use BIOS any longer, but it provides its own device driver for every hardware device on the computer (Moreover, BIOS procedures must be executed in real mode, so they cannot share functions even if that would be beneficial)

Wednesday, May 13, 2009

Compiling Linux Kernel

If you want to add or remove some of the symbols exported by the kernel, open
/usr/src/linux/fs/proc/proc_misc.c
and make the changes you want. To remove a symbol, comment out its EXPORT_SYMBOL line.

Now to compile the kernel, do
1. make
2. make modules_install
3. make install
in /usr/src/linux directory.

Frequently used commands

1. To find a file in current directory
find . -name "something.h"

2. To find the files (in the current directory) that contain some text
grep -wrn "sometext" .
-r is for recursive; -n prints the line numbers; -w matches "whole" words only

3. To find the dependencies of an exe on .so files
ldd exe-name

4. To remove the debug symbols from an exe
strip --strip-debug exe-name or
strip -g exe-name or
strip -d exe-name

6. To list down exported symbols from kernel
grep "__ksymtab_proc_root" /proc/kallsyms

NOTE: Symbols prefixed with "__ksymtab" are the symbols exported from the kernel.

7. To build the binary and source packages from a spec file (older rpm releases accepted rpm -ba)
rpmbuild -ba "spec"


8. To find the Linux distribution and its release number (for example 11.1, 10.2 or 10.3), use
lsb_release -a
NOTE: lsb stands for Linux Standard Base


9. To find the size of the files (in friendly format) in a directory, go inside that directory
du -ch *

Symbols exported by Linux Kernel

To see the list of kernel exported symbols, use
  • vim /boot/System.map or
  • cat /proc/kallsyms | more
NOTE: Only the symbols that have a "__ksymtab" entry are exported, and only those can be used by loadable kernel modules. Eg. __ksymtab_proc_root_kcore
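
The filtering described above can also be done with a short Python script. This is a minimal sketch: the function name exported_symbols is invented, and the sample text (including its addresses) is made up to imitate the "address type name" layout of /proc/kallsyms.

```python
def exported_symbols(kallsyms_text, prefix="__ksymtab_"):
    """Return the names of exported symbols found in kallsyms-style text."""
    names = []
    for line in kallsyms_text.splitlines():
        fields = line.split()          # address, type, name, [module]
        if len(fields) >= 3 and fields[2].startswith(prefix):
            names.append(fields[2][len(prefix):])
    return names

# Made-up sample in the same layout as /proc/kallsyms.
sample = """\
c0100000 T _text
c03a1b2c r __ksymtab_proc_root
c03a1b34 r __ksymtab_proc_root_kcore
c0123456 t do_something_static
"""

print(exported_symbols(sample))   # ['proc_root', 'proc_root_kcore']
```

On a real system the same function could be fed the contents of /proc/kallsyms or /boot/System.map instead of the sample string.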