Thursday, May 14, 2009

How the Linux Kernel Boots

1. Computer is powered ON (at this point the RAM chips contain random data and no OS is running)
2. A special hardware circuit raises the logical value of the RESET pin of the CPU
3. On RESET, some registers of the processor (including cs and eip) are set to fixed values and the code found at physical address 0xffff fff0 is executed (Note: this address is mapped by the hardware to a read-only, persistent memory chip called ROM)
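
To see where 0xffff fff0 comes from, here is a small sketch of the address arithmetic. It assumes the reset values documented for 80386 and later processors (a hidden cs base of 0xffff0000 and eip = 0x0000fff0); the function names are my own and only for illustration.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Ordinary real-mode translation: physical = segment * 16 + offset."""
    return (segment << 4) + offset


# Assumed reset state of an 80386+ CPU: the hidden base of cs is not
# segment * 16 yet, it is preset so that the first fetch hits the ROM.
RESET_CS_BASE = 0xFFFF0000
RESET_EIP = 0x0000FFF0


def reset_vector() -> int:
    """Physical address of the first instruction executed after RESET."""
    return RESET_CS_BASE + RESET_EIP


if __name__ == "__main__":
    print(hex(reset_vector()))
    # The same selector:offset pair under the ordinary real-mode rule
    print(hex(real_mode_address(0xF000, 0xFFF0)))
```

The second print shows what the same cs:ip pair would give under the ordinary real-mode rule, which is why the preset base matters.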

The set of programs stored in ROM is traditionally called the Basic Input/Output System (BIOS). It includes several interrupt-driven low-level procedures used to handle the hardware devices that make up the computer. (Some OSs, such as MS-DOS, rely on the BIOS to implement most system calls)

4. Linux is forced to use the BIOS in the bootstrapping phase to retrieve the kernel image from the disk or from some other external device. The BIOS bootstrap procedure does the following:
  • Executes POST (Power-On Self-Test) to establish which devices are present and whether they are working properly (recent computers make use of ACPI for this)
  • Initializes the hardware devices to make sure all of them operate without conflicts on the IRQ lines and I/O ports
  • Searches for an OS to boot. Depending on the BIOS setting, the procedure may try to access the first sector (boot sector) of every floppy disk, hard disk and CD-ROM in the system (the order can be changed)
  • Once a valid boot device is found, it copies the contents of its first sector into RAM starting from physical address 0x0000 7c00, then jumps to that address and executes the code just loaded (be sure to read the note below!)
NOTE: The boot loader is the program invoked by the BIOS to load the image of an operating system kernel into RAM. Booting from a floppy disk is as simple as loading the instructions in its first sector into RAM. (These instructions then copy all the remaining sectors containing the kernel image into RAM.) Booting from a hard disk is done differently. The first sector of a hard disk is the MBR (the partition table plus a small program which loads the first sector of the partition containing the OS to be started)
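
As a rough illustration of what "partition table + a small program" means, here is a sketch that inspects a 512-byte first sector. It assumes the classic PC MBR layout (four 16-byte partition entries starting at byte 446, and the 0x55 0xaa signature in the last two bytes). The demo sector is fabricated in memory, and all function names are hypothetical.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446   # the boot code occupies the bytes before this
PARTITION_ENTRY_SIZE = 16
BOOT_SIGNATURE = b"\x55\xaa"   # last two bytes of a bootable first sector


def has_boot_signature(sector: bytes) -> bool:
    """True if the sector is 512 bytes long and ends with 0x55 0xaa."""
    return len(sector) == SECTOR_SIZE and sector[510:512] == BOOT_SIGNATURE


def parse_partitions(sector: bytes) -> list:
    """Return the non-empty entries of the four-slot partition table."""
    partitions = []
    for index in range(4):
        start = PARTITION_TABLE_OFFSET + index * PARTITION_ENTRY_SIZE
        raw = sector[start:start + PARTITION_ENTRY_SIZE]
        # status, 3 CHS bytes, type, 3 CHS bytes, first LBA, sector count
        status, ptype, first_lba, count = struct.unpack("<B3xB3xII", raw)
        if ptype != 0:  # type 0 marks an unused slot
            partitions.append({
                "index": index,
                "active": status == 0x80,
                "type": ptype,
                "first_lba": first_lba,
                "sectors": count,
            })
    return partitions


def make_demo_sector() -> bytes:
    """Build a made-up MBR with one active partition (demo data only)."""
    sector = bytearray(SECTOR_SIZE)
    sector[PARTITION_TABLE_OFFSET] = 0x80        # active flag
    sector[PARTITION_TABLE_OFFSET + 4] = 0x83    # partition type byte
    struct.pack_into("<II", sector, PARTITION_TABLE_OFFSET + 8, 2048, 204800)
    sector[510:512] = BOOT_SIGNATURE
    return bytes(sector)


if __name__ == "__main__":
    demo = make_demo_sector()
    print(has_boot_signature(demo))
    for entry in parse_partitions(demo):
        print(entry)
```

The small program in a real MBR does essentially this lookup: it finds the active entry and loads the first sector of that partition.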

5. A two-stage boot loader is required to boot a Linux kernel from disk. The two common ones are:
  • LILO (LInux LOader)
  • GRand Unified Bootloader (GRUB) [More advanced than LILO as it recognizes several disk-based filesystems and is thus capable of reading portions of the boot program from files]
LILO may be installed either on the MBR (replacing the small program that loads the boot sector of the active partition) or in the boot sector of every disk partition. In both cases, LILO is executed at boot time, and the user may choose which OS to load.
Now what does the LILO boot loader actually do...
  • Invokes a BIOS procedure to display a "Loading" message
  • Invokes BIOS procedure to load an initial portion of kernel image from disk: the first 512 bytes of the kernel image are put in RAM at address 0x0009 0000, while the code of the *setup()* function (discussed below) is put in RAM starting from address 0x0009 0200
  • Invokes a BIOS procedure to load the rest of the kernel image from disk and puts the image in RAM starting from either low address 0x0001 0000 (for small kernel images compiled with "make zImage" => "loaded low" kernel image) or high address 0x0010 0000 (for big kernel images compiled with "make bzImage" => "loaded high" kernel image)
  • Jumps to the setup() code (this assembly language function is placed by the linker at offset 0x200 of the kernel image file. Hence, LILO can easily locate this and copy it into RAM, starting from physical address 0x0009 0200 as mentioned above)
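
The addresses in this step can be summed up in a few lines of arithmetic. This is only a sketch built from the numbers quoted above; the helper names are made up.

```python
BOOT_SECTOR_ADDR = 0x00090000   # first 512 bytes of the kernel image
SETUP_OFFSET = 0x200            # offset of setup() inside the image file
LOW_LOAD_ADDR = 0x00010000      # "loaded low" kernel (make zImage)
HIGH_LOAD_ADDR = 0x00100000     # "loaded high" kernel (make bzImage)


def load_plan(big_kernel: bool) -> dict:
    """Where each part of the kernel image ends up in RAM."""
    return {
        "boot_sector": BOOT_SECTOR_ADDR,
        "setup": BOOT_SECTOR_ADDR + SETUP_OFFSET,
        "kernel": HIGH_LOAD_ADDR if big_kernel else LOW_LOAD_ADDR,
    }


def low_kernel_room() -> int:
    """Bytes between the low load address and the boot sector copy."""
    return BOOT_SECTOR_ADDR - LOW_LOAD_ADDR


if __name__ == "__main__":
    for big in (False, True):
        plan = load_plan(big)
        print({name: hex(addr) for name, addr in plan.items()})
    print(low_kernel_room() // 1024, "KiB available below 0x0009 0000")
```

The last line shows why big images have to go high: a low-loaded kernel must fit in the gap between 0x0001 0000 and 0x0009 0000.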
6. This setup() code initializes the hardware devices in the computer and sets up the environment for the execution of the kernel program. (A question arises here: haven't the devices already been initialized by the BIOS? Of course they have, but Linux does not rely on that. It reinitializes the devices in its own manner to enhance portability and robustness.) setup() does the following:
  • in ACPI compliant systems, invokes a BIOS routine that builds a table in RAM describing the layout of the system's physical memory
  • sets the keyboard repeat delay and rate (when the user keeps a key pressed past a certain amount of time, the keyboard device sends the corresponding keycode over and over to the CPU)
  • initializes the video adapter card
  • reinitializes the disk controller and determines the hard disk parameters
  • checks for an IBM Micro Channel bus (MCA) and a PS/2 pointing device (bus mouse)
  • checks for APM BIOS support (Advanced Power Management BIOS)
  • If the kernel was "loaded low" (at 0x0001 0000), the function moves it to physical address 0x0000 1000. A "loaded high" kernel is not moved. (This step is necessary because, to be able to store the kernel image on a floppy disk and to reduce the booting time, the kernel image stored on disk is compressed, and the decompression routine needs some free space to use as a temporary buffer following the kernel image in RAM.) I am not fully clear on this step myself!
  • Sets the A20 pin located on the 8042 keyboard controller. The A20 pin is a hack introduced in 80286-based systems to make physical addresses compatible with those of the ancient 8088 microprocessors. Unfortunately, the A20 pin must be properly set before switching to Protected Mode, otherwise the 21st bit of every physical address will always be regarded as zero by the CPU. Setting this pin is a messy operation
  • Sets up a provisional IDT (Interrupt Descriptor Table) and a provisional GDT (Global Descriptor Table)
  • Resets the floating-point unit (FPU)
  • Reprograms the PICs (Programmable Interrupt Controllers) to mask all interrupts except IRQ2, which is the cascading interrupt between the two PICs
  • Switches the CPU from Real Mode to Protected Mode by setting the PE bit in the cr0 control register. (NOTE: The PG bit in cr0 is cleared, so paging is still disabled and every linear address is treated as a physical address)
  • Jumps to the "startup_32()" assembly language function (Yes, I know there are two startup_32 functions. The one we are referring to here is in the arch/i386/boot/compressed/head.S file)
7. After the "setup()" function terminates, "startup_32()" has been moved either to 0x0010 0000 or to 0x0000 1000, depending on whether the kernel image was loaded high or low in RAM. Let's see what this "startup_32()" has to do
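
Two of the steps above are plain bit manipulation, so here is a toy model of them. It assumes the standard x86 positions of the cr0 flags (PE is bit 0, PG is bit 31) and models the A20 line as a mask on address bit 20. Nothing here touches real hardware, and the names are invented for the sketch.

```python
A20_BIT = 1 << 20    # the 21st address bit (bits are counted from 0)
CR0_PE = 1 << 0      # Protection Enable
CR0_PG = 1 << 31     # Paging


def physical_address(addr: int, a20_enabled: bool) -> int:
    """With A20 disabled, bit 20 of every address is forced to zero."""
    return addr if a20_enabled else addr & ~A20_BIT


def enter_protected_mode(cr0: int) -> int:
    """Set PE and leave every other bit (including PG) untouched."""
    return cr0 | CR0_PE


def paging_enabled(cr0: int) -> bool:
    return bool(cr0 & CR0_PG)


if __name__ == "__main__":
    # 0x0010 0000 is where a "loaded high" kernel lives
    print(hex(physical_address(0x00100000, a20_enabled=False)))
    print(hex(physical_address(0x00100000, a20_enabled=True)))
    cr0 = enter_protected_mode(0x00000010)   # example starting value
    print(hex(cr0), paging_enabled(cr0))
```

With A20 disabled, the address 0x0010 0000 wraps around to 0, which is why the pin has to be set before the kernel can use memory above 1 MB.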
  • initializes segmentation registers and a provisional stack
  • clears all bits in eflags register
  • fills the area of uninitialized data of kernel identified by _edata and _end symbols with zeros
  • invokes the decompress_kernel() function to decompress the kernel image. (The message "Uncompressing Linux..." is shown first and, once decompression is done, "OK, booting the kernel.") If the kernel image was loaded low, the decompressed kernel is placed at physical address 0x0010 0000. Otherwise, it is placed in a temporary buffer located after the compressed image, and the decompressed image is then moved into its final position, which starts at physical address 0x0010 0000
  • Jumps to physical address 0x0010 0000
8. The decompressed kernel image begins with another startup_32() (this time in the arch/i386/kernel/head.S file; using the same name for both functions does not create any problems besides confusing us). This second "startup_32()" sets up the execution environment for the first Linux process (process id = 0). Here is what else it does:
  • initializes segmentation registers with their final values
  • fills the 'bss' segment of the kernel with zeros
  • initializes the provisional kernel Page Tables contained in swapper_pg_dir and pg0 to identically map the linear addresses to the same physical addresses (more explanation needed here; I will add it soon)
  • stores the address of the Page Global Directory in the cr3 register and enables paging by setting the PG bit in the cr0 register
  • Sets up the Kernel Mode stack for process 0
  • again clears all bits in eflags register
  • invokes setup_idt() to fill IDT with null interrupt handlers
  • puts the system parameters obtained from the BIOS and the parameters passed to the operating system into the first page frame (I will explain this soon as well)
  • identifies the model of the processor
  • loads the gdtr and idtr registers with addresses of GDT and IDT tables
  • jumps to our one and only "start_kernel()" function which completes the initialization of Linux Kernel
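
Since the identity mapping was left unexplained above, here is a toy model of it. It assumes classic two-level i386 paging (4 KB pages, 1024 entries per table, so one page table covers 4 MB) and only covers the identity part of the mapping. The flag value and function names are illustrative, and this is not how the kernel's assembly code is written.

```python
PAGE_SIZE = 4096
ENTRIES_PER_TABLE = 1024
PRESENT_RW = 0x3     # present + read/write bits of a page table entry


def split_linear(addr: int) -> tuple:
    """Split a 32-bit linear address into (directory, table, offset)."""
    return addr >> 22, (addr >> 12) & 0x3FF, addr & 0xFFF


def build_identity_table() -> list:
    """A pg0-like table: entry i points at page frame i (first 4 MB)."""
    return [(i * PAGE_SIZE) | PRESENT_RW for i in range(ENTRIES_PER_TABLE)]


def translate(page_table: list, addr: int) -> int:
    """Look up addr in a single page table and rebuild the physical address."""
    _, table_index, offset = split_linear(addr)
    entry = page_table[table_index]
    return (entry & ~0xFFF) | offset   # drop the flag bits, add the offset


if __name__ == "__main__":
    pg0 = build_identity_table()
    for linear in (0x00001000, 0x00100000, 0x003FFFFF):
        print(hex(linear), "->", hex(translate(pg0, linear)))
```

Every address comes back unchanged, which is the whole point: the code that turns paging on keeps running at the same addresses before and after the PG bit is set.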
9. Nearly every kernel component is initialized by start_kernel(). Mentioned below are just a few of the functions it calls
  • sched_init() - to initialize scheduler
  • build_all_zonelists() - to initialize memory zones
  • page_alloc_init() & mem_init() - to initialize the Buddy system allocator
  • trap_init()  & init_IRQ() - to initialize IDT for final time
  • softirq_init() - to initialize TASKLET_SOFTIRQ and HI_SOFTIRQ
  • time_init() - to initialize system date and time
  • kmem_cache_init() - to initialize slab allocator
  • calibrate_delay() - to determine the speed of the CPU clock
  • kernel_thread() - to create the kernel thread for process 1; in turn, this kernel thread creates other kernel threads and executes /sbin/init program
10. Once the init program and the many kernel threads have started, the familiar login prompt finally appears on the console (or on the graphical screen, if the X Window System is launched at startup), telling the user that the Linux kernel is up and running

Once in protected mode, Linux does not use the BIOS any longer, but provides its own device driver for every hardware device on the computer (Moreover, BIOS procedures must be executed in real mode, so the two cannot share functions even if that would be beneficial)

Wednesday, May 13, 2009

Compiling Linux Kernel

If you want to add or remove some of the symbols exported by the kernel, go to
/usr/src/linux/fs/proc/proc_misc.c
and make the changes you want. To remove a symbol, comment out the EXPORT_SYMBOL line for that symbol

Now to compile the kernel, do
1. make
2. make modules_install
3. make install
in /usr/src/linux directory.

Frequently used commands

1. To find a file in current directory
find . -name "something.h"

2. To find a file (in current directory) with some text in it
grep "sometext" . -wrn
-r is for recursive; -n is to print the line numbers; -w for "whole" words only

3. To find the dependencies of an exe on .so files
ldd exe-name

4. To remove the debug symbols from an exe
strip --strip-debug exe-name or
strip -g exe-name or
strip -d exe-name

6. To list down exported symbols from kernel
cat /proc/kallsyms | grep "__ksymtab_proc_root"

NOTE: "__ksymtab" prefixed symbols are the symbols exported from the kernel.
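
The same filtering can be done in a few lines of Python. This sketch assumes each line of /proc/kallsyms has the form "address type name" (optionally followed by a module name). The sample lines are invented for the demo, and the function name is my own.

```python
def exported_symbols(lines, prefix="__ksymtab_"):
    """Return the names of the symbols that have a __ksymtab_ entry."""
    names = []
    for line in lines:
        fields = line.split()
        # fields: address, type, name, and optionally [module]
        if len(fields) >= 3 and fields[2].startswith(prefix):
            names.append(fields[2][len(prefix):])
    return names


# Made-up sample lines in the /proc/kallsyms format
SAMPLE = [
    "c0123456 T proc_root_init",
    "c0345678 r __ksymtab_proc_root",
    "c0345680 r __ksymtab_proc_root_kcore",
    "c0456789 t some_static_helper",
]

if __name__ == "__main__":
    print(exported_symbols(SAMPLE))
```

To run it on a live system, pass an open file instead of the sample list, for example exported_symbols(open("/proc/kallsyms")).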

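7. To build the binary and source packages from a spec file
rpm -ba "spec"
NOTE: -ba means "build all". Newer RPM versions moved this option to the rpmbuild command (rpmbuild -ba "spec")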


8. To find the name and version of the Linux distribution (e.g. 11.1, 10.2 or 10.3), use
lsb_release -a
NOTE: lsb stands for Linux Standard Base


9. To find the size of the files in a directory (in human-friendly format, with a grand total), go inside that directory and run
du -ch *

Symbols exported by Linux Kernel

To see the list of kernel exported symbols, use
  • vim /boot/System.map or
  • cat /proc/kallsyms | more
NOTE: Only the symbols that have a "__ksymtab" prefixed entry are exported, so only those can be used from outside the core kernel (for example, by kernel modules). Eg. __ksymtab_proc_root_kcore

Memory mapped and IO mapped devices

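Memory Mapped Devices: Device registers are mapped into the physical address space, so they are accessed with ordinary memory read/write instructions.
IO Mapped Devices: Device registers live in a separate I/O address space. Special instructions are required to access them (on x86 the in and out instructions, which C libraries wrap in functions such as inport/outport).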