Tuesday, April 20, 2010

Where to Get the Kernel, and the Nature of the Kernel

Where to get the Kernel ? :

The kernel source code can be downloaded from http://www.kernel.org/ . The kernel source is typically installed in the /usr/src/linux/ directory. Don't use this source tree for development, because the kernel version against which the C library was compiled is often linked to this tree. Instead, do your kernel development in your home directory as a normal user, and use root only to install the new kernel.

Searching the web for "how to compile/install kernel" turns up plenty of information on building and installing, so I am skipping that here.

Kernel source tree :

When you untar the kernel source, the root of the source tree contains a number of directories. The following table lists them with a short description of each.


Directory        Description

arch             Architecture-specific source
crypto           Crypto API
Documentation    Kernel source documentation
drivers          Device drivers
fs               The VFS and the individual file systems
include          Kernel headers
init             Kernel boot and initialization
ipc              Interprocess communication code
kernel           Core subsystems, such as the scheduler
lib              Helper routines
mm               Memory management subsystem and the VM
net              Networking subsystem
scripts          Scripts used to build the kernel
security         Linux Security Module
sound            Sound subsystem
usr              Early user-space code (called initramfs)


The Different nature of the kernel :

The kernel differs from user-space applications in several ways. The differences lie not so much in the programming itself as in the nature of the environment. The most important differences are as follows:

1. The kernel doesn't have access to the C library.
2. The kernel is developed in GNU C.
3. The kernel lacks the memory protection that user-space enjoys.
4. The kernel can't easily use floating point.
5. The kernel has a small, fixed-size stack.
6. Because the kernel has asynchronous interrupts, is preemptive, and supports SMP (symmetric multiprocessing), synchronization and concurrency are major concerns within the kernel.
7. Portability is important.

We briefly look at each concern ....

No Lib C :

Unlike user-space applications, the kernel is not linked against the C library (or any other library, for that matter). There are multiple reasons for this, but the most important are speed and size. The full C library, or even a decent subset of it, is too large and too inefficient for the kernel.

One of the most notable missing functions is printf(). The kernel provides printk() instead, which works much like printf(). In addition, printk() lets you specify a priority flag (log level), which syslogd uses to decide where to show the message.

ex : printk (KERN_INFO "This is an informational message\n");   (there is no comma after KERN_INFO)
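In 2.6-series kernels the log level is itself a string literal of the form "<n>", so the compiler simply joins it to the format string. The following is a minimal user-space sketch of that mechanism, not kernel code. The names DEMO_KERN_ERR, DEMO_KERN_INFO, demo_log_level() and demo_log_text() are hypothetical stand-ins invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's log-level macros (assumed "<n>" form). */
#define DEMO_KERN_ERR  "<3>"
#define DEMO_KERN_INFO "<6>"

/* Return the level encoded at the start of msg, or -1 if there is none. */
static int demo_log_level(const char *msg)
{
    assert(msg != NULL);
    if (msg[0] == '<' && msg[1] >= '0' && msg[1] <= '7' && msg[2] == '>')
        return msg[1] - '0';
    return -1;
}

/* Return the message text with any level prefix skipped. */
static const char *demo_log_text(const char *msg)
{
    return demo_log_level(msg) >= 0 ? msg + 3 : msg;
}
```

Calling demo_log_level(DEMO_KERN_INFO "hello\n") sees the single string "<6>hello\n", which is why a comma between the level and the message would be a mistake.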

GNU C :

Kernel developers use both ISO C99 and GNU C, that is, the language extensions available in gcc.

Inline Functions :

The body of an inline function is inserted at each call site instead of being called. This removes the overhead of the function call (register saving) and return (register restoring), and it allows potentially more optimization because the compiler can optimize the caller and the called function together. The disadvantage is that code size and memory footprint increase, because the function body is copied into every caller. Inline functions are therefore used for small, time-critical functions. If a function is large and called from more than one place, or is not time critical, inlining it doesn't make sense.

Ex : static inline void dog (unsigned long int arg)

The function declaration must precede any usage, or else the compiler cannot make the function inline. Inline functions are usually placed in header files. Because they are marked static, no exported function is created. If an inline function is used by only one file, it can instead be placed near the top of that file.
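As a small illustration of the points above, here is a sketch of a static inline helper defined before its first use. The names demo_min() and demo_clamp() are hypothetical and were chosen only for this example.

```c
#include <assert.h>

/* Hypothetical helper: small and called often, so a reasonable inline candidate. */
static inline unsigned long demo_min(unsigned long a, unsigned long b)
{
    return a < b ? a : b;
}

/* The definition above precedes this use, so the compiler may expand it in place. */
static unsigned long demo_clamp(unsigned long value, unsigned long lo, unsigned long hi)
{
    assert(lo <= hi);
    return demo_min(hi, value < lo ? lo : value);
}
```

Whether the call is actually expanded in place is still up to the compiler; inline is a request, not a guarantee.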

Inline Assembly :

The gcc C compiler enables embedding assembly instructions in otherwise normal C functions. The asm () directive is used for inline assembly code.
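A minimal sketch of the syntax, assuming a gcc-compatible compiler: the assembly string below is empty, so it emits no instructions and builds on any architecture, but the "memory" clobber stops the compiler from moving memory accesses across it. The names demo_barrier() and demo_publish() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Empty assembly statement with a "memory" clobber: a compiler-level barrier. */
#define demo_barrier() __asm__ __volatile__("" : : : "memory")

/* Store a value, then force the compiler to treat memory as changed. */
static int demo_publish(int *slot, int value)
{
    assert(slot != NULL);
    *slot = value;
    demo_barrier();   /* the compiler may not move the store past this point */
    return *slot;
}
```

Real uses of asm () contain architecture-specific instructions, which is why such code belongs in the architecture-dependent parts of the tree.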

No Memory Protection :

When a user application attempts an illegal memory access, the kernel traps the error, sends SIGSEGV, and kills the process. When the kernel itself attempts an illegal memory access, the results are less controlled. Memory violations in the kernel result in an oops, which is a major kernel error. It should go without saying that you must not access memory illegally, such as by dereferencing a NULL pointer, but within the kernel the stakes are much higher.

No easy use of a floating point :

When a user-space application uses floating point, the kernel manages the transition from integer to floating point mode.

Unlike user-space applications, the kernel doesn't have the luxury of seamless floating point support, because it can't easily trap itself. Using floating point inside the kernel requires manually saving and restoring the floating point registers. The best approach is to avoid it: no floating point in the kernel.
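One common way to avoid floating point is to scale values and stay in integer arithmetic. The following is only a sketch of that idea, with a hypothetical helper name demo_percent_x100(); it is not taken from the kernel source.

```c
#include <assert.h>

/* Hypothetical helper: "used" as a percentage of "total", returned in
 * hundredths of a percent, using integer arithmetic only. */
static unsigned long demo_percent_x100(unsigned long used, unsigned long total)
{
    assert(total != 0);
    /* Widen before multiplying so the intermediate product does not overflow. */
    return (unsigned long)(((unsigned long long)used * 10000ULL) / total);
}
```

For example, one third comes out as 3333, meaning 33.33 percent, without a single floating point instruction.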



Synchronization :

The kernel is susceptible to race conditions. Unlike a single-threaded user-space application, a number of properties of the kernel allow concurrent access to shared resources and thus require synchronization to prevent races.

Specifically,
1. Linux is a preemptive multitasking OS. Processes are scheduled and rescheduled at the whim of the scheduler, so the kernel must synchronize between them.
2. Linux supports multiprocessing. Therefore, without proper protection, kernel code executing on two or more processors can access the same resource at the same time.
3. In Linux, interrupts are asynchronous. Therefore, without proper protection, an interrupt can occur in the midst of accessing a resource, and the interrupt handler can then access the same resource.
4. The Linux kernel is preemptive. Therefore, without proper protection, kernel code can be preempted in favor of other code that then accesses the same resource.

Typical solutions to race conditions include spinlocks and semaphores.
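To give a feel for what a spinlock does, here is a minimal user-space sketch built on C11 atomics. It is not the kernel's spinlock API, which is different and does more; the names struct demo_spinlock, DEMO_SPINLOCK_INIT, demo_spin_trylock(), demo_spin_lock() and demo_spin_unlock() are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space lock; the kernel's own spinlock API is different. */
struct demo_spinlock {
    atomic_flag locked;
};

#define DEMO_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once: returns true if we acquired the lock. */
static bool demo_spin_trylock(struct demo_spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire);
}

/* Busy-wait ("spin") until the lock is free, then take it. */
static void demo_spin_lock(struct demo_spinlock *l)
{
    while (!demo_spin_trylock(l))
        ; /* spin */
}

/* Release the lock so that another thread's test-and-set can succeed. */
static void demo_spin_unlock(struct demo_spinlock *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

A caller wraps the access to the shared resource between demo_spin_lock() and demo_spin_unlock(). Because a waiter burns CPU time while spinning, locks of this kind suit short critical sections.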

Portability :

Linux is a portable operating system. This means that architecture-independent C code must correctly compile and run on a wide range of systems, and that architecture-dependent code must be properly segregated in system-specific directories in the kernel source tree.
A handful of rules go a long way: remain endian neutral, be 64-bit clean, do not assume the word or page size, and so on.
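As an example of staying endian neutral, the sketch below decodes a 32-bit little-endian value one byte at a time instead of casting the buffer to an integer pointer. The function name demo_read_le32() is hypothetical; the kernel has its own byte-order helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 32-bit little-endian value byte by byte. Because it never casts the
 * buffer to uint32_t *, the result is the same on big- and little-endian hosts. */
static uint32_t demo_read_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}
```

The fixed-width type uint32_t also avoids assuming anything about the size of int or long, which helps with the "64-bit clean" rule.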


Hopefully I will write about process management in the kernel in the next post.

Monday, April 19, 2010

Before we begin ... !!!

Well, Here we go....

Before we look at how the kernel actually works and behaves, we should briefly cover the history of how Linux started and what the Linux kernel is.


A Brief History :

After three decades of use, Unix is still regarded as one of the most powerful systems in existence. Unix was developed by Dennis Ritchie and Ken Thompson in 1969 at Bell Laboratories. It grew out of Multics, a failed multiuser operating system project in which Bell Labs had been involved. In 1969, Thompson and Ritchie implemented a file system that evolved into the Unix operating system. Unix was distributed with its source code, which led to further development outside the organization. Several characteristics explain its strength. Unix is simple: unlike other OSes, it has only a few hundred system calls and clear design goals. In Unix, everything is a file, which simplifies the manipulation of data and devices through a common set of system calls. The Unix kernel is implemented in C, which gives Unix amazing portability and makes it accessible to a wide range of developers. Unix also provides simple yet robust interprocess communication.

What is Linux ? :

Linux was developed by Linus Torvalds in 1991.

When Linus was a student, he used Minix, a simple Unix-like system intended as a teaching aid. He was discouraged from making and distributing changes to the Minix source code because of the Minix license, and also by the design decisions made by the Minix author.

Linux is a clone of Unix, but it is not Unix. Linux borrows most of its ideas from Unix and implements the Unix API, but it does not use the Unix source code. Linux has deviated from Unix in places, but it has not abandoned Unix's basic design.

Linux is not a commercial product; it is a collaborative project developed by many developers over the Internet. Linus remains the creator and maintainer of Linux.

A basic Linux system consists of the kernel, the C library, a tool chain, and basic utilities such as the login process and a shell.

What is Kernel ? :

Operating System and Kernel :

Technically speaking, the operating system is the part of the system responsible for basic use and administration. This includes the kernel, device drivers, the boot loader, the command shell or other user interface, and basic file and system utilities. The term system, in turn, refers to the operating system and all the applications running on top of it.

The kernel is the innermost part of the operating system. It is the core software that provides basic services for all other parts of the system, manages the hardware, and distributes system resources. The kernel is sometimes called the supervisor, the core, or the internals of the OS. Typical components of a kernel are interrupt handlers to service interrupt requests, a scheduler to share processor time among processes, a memory management system to manage process address spaces, and system services such as networking and interprocess communication.

On modern systems with protected memory management units, the kernel typically runs in an elevated system state compared with normal user applications. This includes a protected memory space and full access to the hardware. This system state and memory space are collectively referred to as kernel-space.

User applications execute in user-space. They have access to only a subset of the machine's resources, so they cannot perform certain system functions or access the hardware directly. When executing kernel code, the system is in kernel-space executing in kernel mode, as opposed to normal application execution in user-space executing in user mode.

Applications running on the system communicate with the kernel via system calls. An application typically calls functions in the C library, and the library functions in turn invoke system calls. The C library also provides features that the kernel itself does not have.

Example : printf () formats the output and eventually calls the write () system call, whereas a function such as strcpy () uses no system call at all. When an application executes a system call, the kernel is said to be executing on behalf of the application. The application is said to be executing a system call in kernel-space, and the kernel is running in process context.
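The split between library work and the system call can be sketched in user space as follows. This is a deliberately simplified, hypothetical printf-like function named demo_fdprintf(); real C libraries also buffer output, which the sketch ignores.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the POSIX declarations of write() etc. */
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical, simplified printf: format in user space, then one write() call. */
static ssize_t demo_fdprintf(int fd, const char *fmt, ...)
{
    char buf[256];
    va_list ap;
    int len;

    assert(fmt != NULL);

    va_start(ap, fmt);
    len = vsnprintf(buf, sizeof(buf), fmt, ap);   /* pure library work, no system call */
    va_end(ap);

    if (len < 0)
        return -1;
    if ((size_t)len >= sizeof(buf))
        len = (int)sizeof(buf) - 1;               /* output was truncated to fit buf */

    return write(fd, buf, (size_t)len);           /* the actual system call */
}
```

The formatting step runs entirely in user-space. Only the final write () crosses into the kernel, which then runs in process context on behalf of the caller.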

The kernel also manages the system's hardware. Nearly all architectures provide the concept of interrupts: when the hardware wants to communicate with the system, it issues an interrupt, which asynchronously interrupts the kernel. Each interrupt is identified by a number, and the kernel uses this number to execute the specific interrupt handler that processes and responds to the interrupt.

To provide synchronization, the kernel can disable all interrupts or just one specific interrupt number. In many OSes, including Linux, interrupt handlers run in a special context called interrupt context, which is not associated with any process. This context exists solely to let an interrupt handler respond quickly to an interrupt and then exit.

We can generalize that each processor is doing one of three things at any given moment:

1. In kernel-space, in process context, executing on behalf of a specific process.
2. In kernel-space, in interrupt context, not associated with any process, handling an interrupt.
3. In user-space, executing user code in a process.

When idle, the kernel is executing an idle process in process context in the kernel.

Linux vs Classic Unix Kernels

The Unix kernel is typically a monolithic static binary, i.e. it exists as a single large executable image that runs in a single address space. Unix systems typically require a paged memory management unit; this hardware enables the system to enforce memory protection and to provide a unique virtual address space to each process.

Monolithic Kernel vs Micro Kernel Designs :

Monolithic Kernels:

    Monolithic kernels are implemented entirely as a single large process running in a single address space. Such kernels typically exist on disk as a single static binary. All kernel services exist and execute in that one large address space. Communication within the kernel is trivial because everything runs in kernel mode in the same address space: the kernel can invoke functions directly.

Micro Kernels:

    Microkernels are not implemented as a single large process. Instead, the functionality is broken down into separate processes, usually called servers. Only the servers that require such capabilities run in a privileged execution mode; the rest run in user-space. All the servers are kept separate and run in different address spaces, so direct function invocation, as in monolithic kernels, is not possible. Communication in microkernels is instead handled by message passing, an interprocess communication (IPC) mechanism. The separation of the servers prevents a failure in one server from bringing down another.


Linux is a monolithic kernel; that is, it runs in a single address space entirely in kernel mode. Linux nevertheless borrows much of the good from microkernels: a modular design with kernel preemption, support for kernel threads, and the capability to dynamically load kernel modules. At the same time, everything runs in kernel mode with direct function invocation rather than message passing. Linux is modular, threaded, and the kernel itself is schedulable.

Differences between Linux and Unix kernel variants :

1. Linux supports dynamic loading of kernel modules. Although the Linux kernel is monolithic, it is capable of dynamically loading and unloading kernel code on demand.
2. Linux has symmetric multiprocessor (SMP) support. Some commercial Unix versions support SMP, but most traditional Unix kernels do not.
3. The Linux kernel is preemptive. Unlike traditional Unix kernels, the Linux kernel is capable of preempting a task even while it is running in the kernel. Some commercial Unix kernels are preemptive, but most traditional Unix kernels are not.
4. Linux takes an interesting approach to thread support: it doesn't differentiate between threads and processes. To the kernel, all processes are the same; some just happen to share resources.
5. Linux provides an object-oriented device model with device classes, hot-pluggable events, and a user-space device file system.
6. Linux ignores some Unix features that its developers consider poorly designed, such as STREAMS, as well as standards they consider brain dead.
7. Linux is free in every sense.


Linux Kernel Versions :

The Linux kernel comes in two flavors:

1. Stable
2. Development.

A stable release can be deployed in a production environment. A development release is for developers: it contains new features and new ideas. It should not be deployed in a production environment, because the new features have not yet been well tested.







The kernel version notation :

    2 . 6 . 0
    |   |   |___ Patch level
    |   |_______ Minor version
    |___________ Major version
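Splitting such a version string into its three fields can be sketched in C as follows. The names struct demo_kver and demo_parse_kver() are hypothetical, and the sketch handles only the plain three-part form shown above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper type for a three-part 2.6-style version. */
struct demo_kver {
    int major;
    int minor;
    int patch;
};

/* Parse "major.minor.patch"; returns 0 on success, -1 on malformed input. */
static int demo_parse_kver(const char *s, struct demo_kver *out)
{
    assert(s != NULL && out != NULL);
    if (sscanf(s, "%d.%d.%d", &out->major, &out->minor, &out->patch) != 3)
        return -1;
    return 0;
}
```

For "2.6.0" this yields major 2, minor 6 and patch level 0. A string with fewer than three numeric fields is rejected.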



We will discuss process management in the next post. I will also post about where to get the kernel source, the layout of the kernel source tree, and how to compile it.

By the way, feel free to ask questions and to share your feedback and suggestions about the post ... !!!

Thursday, April 15, 2010

About this blog

Hi,
Welcome to this blog. It covers the Linux kernel, its internals, and how it is developed. You will find information about :

What is Linux Kernel ?
Process Management
Process Scheduling
System Calls
Interrupts & Interrupt handlers
Bottom halves
Kernel Synchronization methods
Timers & Time management
Memory management
etc.

The basic idea behind this blog is to explain what the kernel is and how it works. When I searched for this on the net, I found very little information, so I thought of sharing what I learn with everyone. I will also share the links and books that I followed. All the details I discuss here relate to the 2.6 series kernel.