The Linux v0.01 Bootloader Explained

Manohar Vanga



NOTE: This tutorial is a work in progress. I am updating it as I rediscover the links I originally used, and I intend to add links to those resources. I will also add further notes on the source code as I learn more.

This tutorial tries to explain how the bootloader in Linux v0.01 works. When I first wanted to start writing an operating system of my own for learning purposes, I tried finding out how the original bootloader of Linux worked. After exhaustive searching, I found the information distributed in vague bits all over the internet. I finally sat down and wrote these notes outlining the working of the bootloader. This is an HTML version to help out others who find themselves in a similar situation.


The BIOS

The BIOS is the firmware in the ROM of a PC. When the PC is powered up, the BIOS is the first program that runs. All other programs must be loaded into RAM first. The BIOS contains the following parts:

- POST (Power On Self Test). The running counter (in older machines) that counts the kilobytes of main memory is the most visible part of the POST.

- The Setup Menu, which lets you set some parameters and adjust the real-time clock. Most modern BIOS versions let you set the boot order, that is, the sequence of devices the BIOS checks for booting. These can be the first floppy disk, the first hard disk, CD-ROM and possibly other disks as well. The first device in the list is tried first. Older BIOSes have only one boot order: floppy, then hard drive. Such a BIOS tries to boot from the floppy first, and if there is no diskette in the drive it tries to boot from the hard drive.

- The boot sector loader. This loads the first 512 bytes from the boot disk into RAM and jumps to it.

- The BIOS interrupts. These are simple device drivers that programs can use to access the screen, the keyboard and disks. Boot loaders rely on them; most operating systems do not (the Linux kernel does not use BIOS interrupts once it has been started). MS-DOS does.

As far as boot loading facilities are concerned, the PC BIOS is very primitive compared to the firmware of other computer systems. The only thing it knows about disks is how to load the first 512-byte sector.

- The first sector of a diskette is loaded at address 0000:7C00. The last two bytes of the sector are checked for the values 0x55 and 0xAA, as a rough sanity check. If these are OK, the BIOS jumps to the address 0000:7C00.

 

- Booting from a hard disk is very similar to booting from a diskette. The first sector of a hard disk (often known as the Master Boot Record, or MBR) is loaded at 0000:7C00 and the BIOS then jumps to it. The MBR program first moves itself to an address other than 0000:7C00. It then loads the boot sector of a partition to address 0000:7C00 and jumps to that! You can read more about the MBR at http://www.dewassoc.com/kbase/hard_drives/master_boot_record.htm

 

- Modern BIOS versions can treat a certain file on a CD-ROM as a diskette image. They pretend to boot from a diskette by loading the first 512 bytes of the file to 0000:7C00 and jumping to it. Every later attempt to access that diskette through the BIOS routines is redirected to the image file on the CD-ROM. Some other ways to boot from a CD-ROM may also be supported (with an emulated hard disk or with no disk emulation at all).
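To make the signature check concrete, here is a minimal sketch in Python. It is my own illustration and not part of any BIOS or of the Linux source. The names `is_bootable` and `make_boot_sector` are hypothetical, and the sketch only models the two-byte check described above, assuming the sector is available as a `bytes` object.

```python
SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"  # expected at offsets 510 and 511 of the sector


def is_bootable(sector: bytes) -> bool:
    """Return True if a 512-byte sector ends with the bytes 0x55, 0xAA."""
    return len(sector) == SECTOR_SIZE and sector[510:512] == BOOT_SIGNATURE


def make_boot_sector(code: bytes) -> bytes:
    """Pad boot code with zeros to 510 bytes and append the signature."""
    if len(code) > SECTOR_SIZE - 2:
        raise ValueError("boot code must fit in 510 bytes")
    return code.ljust(SECTOR_SIZE - 2, b"\x00") + BOOT_SIGNATURE
```

For instance, `is_bootable(make_boot_sector(b"\xeb\xfe"))` holds, while an all-zero sector such as `bytes(512)` fails the check. This also explains why a boot sector has only 510 bytes available for code and data.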

 

When the boot sector is loaded, the CPU is in real mode. For those who are unfamiliar with the 80x86 architecture: real mode is very limited compared to 32-bit protected mode (in which Linux runs). For example, data outside a 64K segment can only be accessed if you change a segment register, and data outside the first 1MB of address space (which contains 640kB of main memory) cannot be accessed at all. As gcc does not know about real mode, programs compiled with it can only be run in real mode with some tricks and with severe memory size restrictions. This is the reason why most boot loaders (except GRUB) are written in assembly. It's just so much easier!
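The segment arithmetic behind addresses like 0000:7C00 can be sketched in a few lines of Python. `linear_address` is a hypothetical helper of my own. It assumes the usual real-mode rule that the linear address is the segment value times 16 plus the offset, and it does not model the wraparound at 1MB that happens when the A20 line is disabled.

```python
def linear_address(segment: int, offset: int) -> int:
    """Translate a real-mode segment:offset pair into a linear address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset are 16-bit values")
    return (segment << 4) + offset


# Two different segment:offset pairs can name the same byte.
same_byte = linear_address(0x0000, 0x7C00) == linear_address(0x07C0, 0x0000)

# One segment register value covers exactly 64K of address space.
segment_span = linear_address(0x1000, 0xFFFF) - linear_address(0x1000, 0x0000) + 1
```

Here `same_byte` is true because both pairs give 0x7C00, and `segment_span` is 0x10000 (64K). To reach anything beyond that window, the segment register itself has to change, which is the restriction described above.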



The Bootloader

Let's get right into the code. Below is the complete source of the Linux v0.01 bootloader. I have commented nearly every instruction to explain what it does. I will later add links to resources on the various techniques and information Linus Torvalds used when he wrote this bootloader.


;HEAVILY COMMENTED BY MANOHAR VANGA
;I realize that this is a very newbie approach to commenting where every line has a comment. But I find it invaluable for learning purposes.
;I'm unfortunately also a stickler for details...Anyway, I hope this helps someone out there :-)


;
;                            boot.s
;
;boot.s is loaded at 0x7c00 by the bios-startup routines, and moves itself
;out of the way to address 0x90000, and jumps there.
;
;It then loads the system at 0x10000, using BIOS interrupts. Thereafter
;it disables all interrupts, moves the system down to 0x0000, changes
;to protected mode, and calls the start of system. System then must
;RE-initialize the protected mode in it's own tables, and enable
;interrupts as needed.
;
;NOTE! currently system is at most 8*65536 bytes long. This should be no
;problem, even in the future. I want to keep it simple. This 512 kB
;kernel size should be enough - in fact more would mean we'd have to move
;not just these start-up routines, but also do something about the cache-
;memory (block IO devices). The area left over in the lower 640 kB is meant
;for these. No other memory is assumed to be "physical", ie all memory
;over 1Mb is demand-paging. All addresses under 1Mb are guaranteed to match
;their physical addresses.
;
;NOTE1 abouve is no longer valid in it's entirety. cache-memory is allocated
;above the 1Mb mark as well as below. Otherwise it is mainly correct.
;
;NOTE 2! The boot disk type must be set at compile-time, by setting
;the following equ. Having the boot-up procedure hunt for the right
;disk type is severe brain-damage.
;The loader has been made as simple as possible (had to, to get it
;in 512 bytes with the code to move to protected mode), and continuos
;read errors will result in a unbreakable loop. Reboot by hand. It
;loads pretty fast by getting whole sectors at a time whenever possible.

;1.44Mb disks:
sectors = 18
;1.2Mb disks:
;sectors = 15
;720kB disks:
;sectors = 9

.globl begtext, begdata, begbss, endtext, enddata, endbss
.text
begtext:
.data
begdata:
.bss
begbss:
.text

BOOTSEG = 0x07c0
INITSEG = 0x9000
SYSSEG  = 0x1000              ;system loaded at 0x10000 (65536).
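ENDSEG    = SYSSEG + SYSSIZE  ;SYSSIZE is not defined in this file; it must be supplied when the file is assembled (it is added to a segment value, so it counts 16-byte units)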

entry start
start:
    mov    ax,#BOOTSEG        ;Store 07c0 in ax
    mov    ds,ax              ;ds = 0x07c0: source segment for the copy (where the BIOS loaded us)
    mov    ax,#INITSEG        ;store the target location (0x9000) into es
    mov    es,ax              ;    so that we can copy from ds:si to es:di
    mov    cx,#256            ;Set the count value in cx (256 words = 512 bytes = Size of bootsector code)
    sub    si,si              ;clear si (points to start point of ds) for the rep command
    sub    di,di              ;clear di (points to start point of es) for the rep command
    rep    movw               ;Move cx no. of words from ds:si to es:di
    jmpi    go,INITSEG        ;Jump to 0x9000:go (the go tag inside the just-copied block starting at 0x9000)
go:    mov    ax,cs           ;This is reached only after copying (thus it lies at 0x9000:go when accessed)
    mov    ds,ax              ;set all the segment registers to point to current location
    mov    es,ax
    mov    ss,ax
    mov    sp,#0x400          ;arbitrary value >>512 as we need the stack to start after the bootsector code(512 bytes long)

    mov    ah,#0x03           ;int 0x10 function 0x03: read cursor position and shape
    xor    bh,bh              ;BH = video page to query (page 0)
    int    0x10               ;Returns CH = cursor start line, CL = cursor end line, DH = row, DL = column.
                              ;    CX and DX are overwritten, so save them first if they hold anything important.
                              ;    DX is left untouched below, so the string is written at the current cursor position.
    
    mov    cx,#24             ;Size of the first message printed (24 bytes long. see below...)
    mov    bx,#0x0007         ;page 0, attribute 7 (normal)
    mov    bp,#msg1           ;Address of the first message printed. (Given by tag msg1: below)
    mov    ax,#0x1301         ;write string, move cursor
    int    0x10

;ok, we've written the message, now
;we want to load the system (at 0x10000)

    mov    ax,#SYSSEG         ;Location where we want to load the system 0x1000
    mov    es,ax              ;    segment of 0x10000
    call    read_it           ;Read the system image from the floppy into memory starting at SYSSEG (0x10000)
    call    kill_motor        ;Function to kill the floppy drive motor and continue.

;if the read went well we get current cursor position ans save it for
;posterity.

    mov    ah,#0x03           ;read cursor pos (shown earlier too)
    xor    bh,bh              ;clear bh for the call
    int    0x10               ;Returns row and column of the cursor in DX
    mov    [510],dx           ;Save it in a known place: offset 510 (decimal) of segment 0x9000, i.e. linear 0x901FE.
                              ;    (The original comment says con_init fetches it from "0x90510"; the offset is decimal 510.)
        
;now we want to move to protected mode ...

    cli                       ;no interrupts allowed !

;first we move the system to it's rightful place

    mov    ax,#0x0000
    cld                       ;'direction'=0, movs moves forward
do_move:
    mov    es,ax              ;destination segment
    add    ax,#0x1000         ;Add 0x1000 to ax
    cmp    ax,#0x9000         ;    and compare with 0x9000 to see if we have copied everything
    jz    end_move            ;If we have we jump to end_move
    mov    ds,ax              ;Source segment = destination segment + 0x1000 (0x1000 on the first pass)
    sub    di,di              ;clear di
    sub    si,si              ;    and si
    mov     cx,#0x8000        ;0x8000 words = 64 kB, i.e. one whole segment per pass
    rep    movsw              ;Copy cx words from ds:si to es:di
    j    do_move              ;Loop: 8 passes move 0x10000-0x8FFFF down to 0x00000-0x7FFFF, then the cmp above exits to end_move

;then we load the segment descriptors

end_move:                

    mov    ax,cs              ;right, forgot this at first. didn't work :-)
    mov    ds,ax
    lidt    idt_48            ;load idt with 0,0
    lgdt    gdt_48            ;load gdt with whatever appropriate. See below for the values of these.

;that was painless, now we enable A20

    call    empty_8042
    mov    al,#0xD1           ;Command 0xD1: "write output port" of the 8042 keyboard controller
    out    #0x64,al           ;0x64 is the 8042 command port
    call    empty_8042        ;Wait until the 8042 input buffer is empty again
    mov    al,#0xDF           ;A20 on
    out    #0x60,al           ;Write the new output port value via data port 0x60; this enables the A20 line
    call    empty_8042        ;Wait once more for the 8042 to accept the byte

;well, that went ok, I hope. Now we have to reprogram the interrupts :-(
;we put them right after the intel-reserved hardware interrupts, at
;int 0x20-0x2F. There they won't mess up anything. Sadly IBM really
;messed this up with the original PC, and they haven't been able to
;rectify it afterwards. Thus the bios puts interrupts at 0x08-0x0f,
;which is used for the internal hardware interrupts as well. We just
;have to reprogram the 8259's, and it isn't fun.

    mov    al,#0x11           ;initialization sequence
    out    #0x20,al           ;send it to 8259A-1
    .word    0x00eb,0x00eb    ;jmp $+2, jmp $+2
    out    #0xA0,al           ;and to 8259A-2
    .word    0x00eb,0x00eb
    mov    al,#0x20           ;start of hardware int's (0x20)
    out    #0x21,al
    .word    0x00eb,0x00eb
    mov    al,#0x28           ;start of hardware int's 2 (0x28)
    out    #0xA1,al
    .word    0x00eb,0x00eb
    mov    al,#0x04           ;8259-1 is master
    out    #0x21,al
    .word    0x00eb,0x00eb
    mov    al,#0x02           ;8259-2 is slave
    out    #0xA1,al
    .word    0x00eb,0x00eb
    mov    al,#0x01           ;8086 mode for both
    out    #0x21,al
    .word    0x00eb,0x00eb
    out    #0xA1,al
    .word    0x00eb,0x00eb
    mov    al,#0xFF           ;mask off all interrupts for now
    out    #0x21,al
    .word    0x00eb,0x00eb
    out    #0xA1,al

;well, that certainly wasn't fun :-(. Hopefully it works, and we don't
;need no steenking BIOS anyway (except for the initial loading :-).
;The BIOS-routine wants lots of unnecessary data, and it's less
;"interesting" anyway. This is how REAL programmers do it.
;



;Well, now's the time to actually move into protected mode. To make
;things as simple as possible, we do no register set-up or anything,
;we let the gnu-compiled 32-bit programs do that. We just jump to
;absolute address 0x00000, in 32-bit protected mode.

    mov    ax,#0x0001         ;protected mode (PE) bit
    lmsw    ax                ;This is it! (Load machine status word)
    jmpi    0,8               ;jmp offset 0 of segment 8 (cs)

    
;This routine checks that the keyboard command queue is empty
;No timeout is used - if this hangs there is something wrong with
;the machine, and we probably couldn't proceed anyway.
empty_8042:
    .word    0x00eb,0x00eb
    in    al,#0x64            ;8042 status port
    test    al,#2             ;is input buffer full?
    jnz    empty_8042         ;yes - loop
    ret

    
;This routine loads the system at address 0x10000, making sure
;no 64kB boundaries are crossed. We try to load it as fast as
;possible, loading whole tracks whenever we can.
;
;in:    es - starting address segment (normally 0x1000)
;
;This routine has to be recompiled to fit another drive type,
;just change the "sectors" variable at the start of the file
;(originally 18, for a 1.44Mb drive)
;

;Variables that track the read position on the floppy
sread:    .word 1             ;sectors read of current track
head:    .word 0              ;current head
track:    .word 0             ;current track



read_it:                      ;read_it function
    mov ax,es                 ;copy es into ax (should be 0x1000)
    test ax,#0x0fff           ;TEST performs a bitwise AND of its operands but, unlike AND, discards the result and only sets the flags.
die:    jne die               ;es must be at 64kB boundary. If not then go into an infinite loop!
    xor bx,bx                 ;bx is starting address within segment. Set it to zero.
rp_read:
    mov ax,es                 ;store es starting location into ax register
    cmp ax,#ENDSEG            ;have we loaded all yet? check if ax is below ENDSEG expected address
    jb ok1_read               ;if less than (not read all yet) do ok1_read
    ret                       ;return
    
    
    
ok1_read:
    mov ax,#sectors           ;Move no. of sectors per track to ax
    sub ax,sread              ;Subtract the sectors already read on this track: ax = sectors left on the track
    mov cx,ax                 ;Copy the "to be read" number of sectors to cx
    shl cx,#9                 ;Multiply the no. of sectors by 512 to get the no. of bytes to read.
    add cx,bx                 ;Add the current offset bx. A carry means the read would run past the end of the 64kB segment.
    jnc ok2_read              ;No carry: the remaining sectors fit in the segment, so read them all
    je ok2_read               ;Carry but result is zero: the read ends exactly on the 64kB boundary, which is also fine
    xor ax,ax                 ;Otherwise the read would cross the boundary. Clear ax,
    sub ax,bx                 ;    subtract bx to get 0x10000 - bx = bytes left in the segment,
    shr ax,#9                 ;    and divide by 512 to get the no. of sectors that still fit
ok2_read:
    call read_track           ;Read ax sectors from the current track to es:bx
    mov cx,ax                 ;cx = no. of sectors just read
    add ax,sread              ;ax = total no. of sectors read on this track so far
    cmp ax,#sectors           ;Have we reached the end of the track?
    jne ok3_read              ;No: stay on the same track and head, continue at ok3_read
    mov ax,#1                 ;Yes: the track is done on this head. Compute the other head:
    sub ax,head               ;    ax = 1 - head
    jne ok4_read              ;If head was 0, ax = 1: same track, switch to head 1
    inc track                 ;If head was 1, ax = 0: both sides are done, so go to the next track with head 0
ok4_read:
    mov head,ax               ;move the new head value from ax into the head variable
    xor ax,ax                 ;clear ax (no sectors read yet on the new track or head) and continue with ok3_read
ok3_read:
    mov sread,ax              ;Store the no. of sectors read so far on the current track into sread
    shl cx,#9                 ;Multiply the no. of sectors just read by 512 to get the no. of bytes
    add bx,cx                 ;Advance the offset bx by that many bytes
    jnc rp_read               ;No carry: there is still room in this segment, keep reading
    mov ax,es                 ;Carry: the segment is full. Store es in ax,
    add ax,#0x1000            ;    advance it by 0x1000 (64kB)
    mov es,ax                 ;    and store the new segment back into es
    xor bx,bx                 ;clear bx to zero (points to beginning of the new es)
    jmp rp_read               ;jump back to rp_read

read_track:
    push ax                   ;Push the registers onto stack
    push bx
    push cx
    push dx
    mov dx,track              ;Move track no. into dx (Initially 0)
    mov cx,sread              ;Move no. of sectors already read into cx (initially one: the boot sector)
    inc cx                    ;cl = first sector to read. Sector numbers start at 1, so it is sread + 1.

    mov ch,dl                 ;ch = track (cylinder) number
    mov dx,head               ;Stores current head number into dx (initially 0). required value goes into dl
    mov dh,dl                 ;moves dl into dh (as we need the head number in dh)
    mov dl,#0                 ;    and then sets dl to zero (drive 0, the first floppy)
    and dx,#0x0100            ;Keep only bit 0 of the head number; dl stays 0
    mov ah,#2                 ;Function number to read sectors from drive. al still holds the sector count set by the caller.
    int 0x13                  ;Call int 13h; the data is read to es:bx

    jc bad_rt                 ;If int 0x13 function 0x02 encounters an error it sets the carry flag.
    pop dx                    ;Pop previously pushed registers from stack
    pop cx
    pop bx
    pop ax
    ret                       ;Return to ok2_read
bad_rt:    mov ax,#0          ;Function no. 0 resets disk drive. Disk drive number is stored in DL
    mov dx,#0                 ;Disk drive 0. ie. Floppy.
    int 0x13                  ;Reset it!
    pop dx                    ;Pop all pushed registers (in function read_track)
    pop cx
    pop bx
    pop ax
    jmp read_track            ;Try reading from disk AGAIN by jumping back to read_track

;
;This procedure turns off the floppy drive motor, so
;that we enter the kernel in a known state, and
;don't have to worry about it later.
;
kill_motor:
    push dx                   ;Save value in dx. We are about to modify it.
    mov dx,#0x3f2             ;Port 0x3f2 is the floppy controller's digital output register
    mov al,#0                 ;Value 0 clears the motor-enable bits (and the rest of the register)
    outb                      ;Outputs value in al to port stored in dx. Tells the floppy drive controller to stop the motor.
    pop dx                    ;Retrieve old dx value
    ret                       ;return

gdt:                          ;The GDT itself. Its address is stored in the gdt_48 pointer below.
    .word    0,0,0,0          ;dummy

    .word    0x07FF           ;8Mb - limit=2047 (2048*4096=8Mb)
    .word    0x0000           ;base address=0
    .word    0x9A00           ;code read/exec
    .word    0x00C0           ;granularity=4096, 386

    .word    0x07FF           ;8Mb - limit=2047 (2048*4096=8Mb)
    .word    0x0000           ;base address=0
    .word    0x9200           ;data read/write
    .word    0x00C0           ;granularity=4096, 386

    
;IDT pointer (limit, then base). This is what the lidt instruction loads.
idt_48:                    
    .word    0                ;idt limit=0
    .word    0,0              ;idt base=0L

    
;GDT pointer (limit, then base). This is what the lgdt instruction loads.
gdt_48:
    .word    0x800            ;gdt limit=2048, 256 GDT entries
    .word    gdt,0x9          ;gdt base = 0X9xxxx. This incorporates the above structure (gdt)
    
msg1:                              ;The first message to be printed. TOTAL 24 BYTES LONG!
    .byte 13,10                    ;a \r (carriage return) and a \n (line feed)
    .ascii "Loading system ..."    ;followed by a message
    .byte 13,10,13,10              ;followed by \r \n \r \n

.text
endtext:
.data
enddata:
.bss
endbss:
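
The two non-dummy GDT entries in the listing are easier to follow once the four words of each entry are unpacked into their fields. The following is a minimal sketch in Python, assuming the standard 80386 segment descriptor layout. The function names `decode_descriptor` and `selector_index` and the dictionary keys are my own and do not come from the Linux source.

```python
def decode_descriptor(w0: int, w1: int, w2: int, w3: int) -> dict:
    """Decode a descriptor given as four 16-bit words, in listing order."""
    limit = w0 | ((w3 & 0x000F) << 16)          # limit bits 0-15 and 16-19
    base = w1 | ((w2 & 0x00FF) << 16) | ((w3 & 0xFF00) << 16)  # base bits 0-31
    access = (w2 >> 8) & 0xFF                   # access byte (0x9A or 0x92 here)
    flags = (w3 >> 4) & 0x0F                    # granularity, default size, ...
    page_granular = bool(flags & 0x8)           # limit counted in 4096-byte units
    return {
        "base": base,
        "limit": limit,
        "size_bytes": (limit + 1) * (4096 if page_granular else 1),
        "present": bool(access & 0x80),
        "dpl": (access >> 5) & 0x3,             # privilege level, 0 = kernel
        "executable": bool(access & 0x08),      # set for code, clear for data
        "default_32bit": bool(flags & 0x4),
    }


def selector_index(selector: int) -> int:
    """Return the descriptor table index held in bits 3-15 of a selector."""
    return selector >> 3


# The code and data entries exactly as they appear in the listing.
code_segment = decode_descriptor(0x07FF, 0x0000, 0x9A00, 0x00C0)
data_segment = decode_descriptor(0x07FF, 0x0000, 0x9200, 0x00C0)
```

Both entries decode to base 0 and a size of 8MB, and they differ only in the executable bit. `selector_index(8)` gives 1, so the `jmpi 0,8` in the listing selects the first entry after the dummy, which is the code segment.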