Kernel architecture overview

The architecture of the Manux kernel differs markedly from that of most kernels. This page looks at it in more detail.

A modular monospace kernel

The kernel is described as having a "modular monospace" architecture. Since this term was coined specifically for it, its meaning deserves some clarification.

The Linux kernel is modular monolithic: it consists of a central monolith plus dynamically loaded modules (at least in the most common configuration; it can also be compiled as a pure monolith), all executing in a single, unified address space.

The Manux kernel also executes in a unified address space. However, it is composed entirely of modules and has no central monolith. Additionally, at present, no module can be removed.

To be more specific, the kernel elements fall into three categories:

The amorce is the first element to be loaded. Technically, it is an ELF executable in multiboot format, to which the boot loader transfers control as soon as it has finished its own work. From the kernel's point of view, it is the portion of code tasked with initializing the hardware and the modules.

The permanent modules are the real components of the kernel. They split the tasks among themselves and communicate through direct function calls.

Finally, the ephemodules are elements meant to perform a single, well-defined task and then be unloaded. Such a task can be checking a module's memory for corruption, testing a module early in its development, reorganizing memory to make room for a newly injected kernel module, and so on. The kernel always has exactly one ephemodule in memory. Initially, the loaded ephemodule is "blanc" ("white" in French), which does nothing.
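
The "exactly one ephemodule" rule can be pictured as a single slot that is never empty. The following is only a minimal sketch, assuming that an ephemodule can be reduced to an entry function; the names ephemodule_courant, charge_ephemodule(), execute_ephemodule() and verif_demo are hypothetical and do not come from the Manux source, where loading an ephemodule involves real memory rather than a function pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: an ephemodule is reduced to a name and an entry point. */
struct ephemodule {
        const char *nom;
        int (*entree)(void);    /* status code; 0 means "nothing to report" */
};

/* "blanc" does nothing: it is the placeholder that keeps the slot occupied. */
static int blanc_entree(void) { return 0; }
static const struct ephemodule BLANC = { "blanc", blanc_entree };

/* The single slot: it always holds exactly one ephemodule. */
static struct ephemodule ephemodule_courant = { "blanc", blanc_entree };

/* Replace the current ephemodule; the previous one is implicitly unloaded. */
static void charge_ephemodule(const struct ephemodule *nouveau)
{
        assert(nouveau != NULL && nouveau->entree != NULL);
        ephemodule_courant = *nouveau;
}

/* Run the current ephemodule once, then put "blanc" back in the slot. */
static int execute_ephemodule(void)
{
        int statut = ephemodule_courant.entree();
        charge_ephemodule(&BLANC);
        return statut;
}

/* Hypothetical one-shot task that reports the value 42. */
static int verif_demo_entree(void) { return 42; }
static const struct ephemodule VERIF_DEMO = { "verif_demo", verif_demo_entree };
```

Calling charge_ephemodule(&VERIF_DEMO) followed by execute_ephemodule() runs the task once and leaves "blanc" in the slot again, so the slot is never observed empty.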

The division into modules brings the following advantages:

The modules' architecture

Modules are composed of two parts: the code and the data. The code resides in a statically linked ELF binary and is written using the "refs without defs" method: its data are referenced but never defined. In principle, such code should fail to build. In practice, the address of the data is injected at link time (file scripts_ld/<module>.lds), which works very well, but means that a module cannot be executed until an external piece of code (the amorce) has allocated and initialized its data structures.

As for the data, they consist of two contiguous portions: the bordereau (a French word with no good English equivalent) and the heap. The heap is the memory area in which dynamic memory allocations take place, through the function Alloue_mem(). The bordereau is the memory structure that contains or references all of the module's data.

For example, here is the structure of the bordereau of the pipe module, tubes:
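
To make the "refs without defs" idea concrete, here is a small sketch of what such a reference looks like in C. The structure bordereau_demo, the symbol Bordereau_demo and the function incremente() are hypothetical, and the content of the real scripts_ld/<module>.lds files is not shown in this page. So that the sketch builds as an ordinary program, it supplies the definition itself in a clearly marked stand-in; in a real build of this kind that definition would be absent and the linker would provide the address.

```c
#include <assert.h>

/* Hypothetical data block; not a structure from the Manux source. */
struct bordereau_demo {
        int Compteur;
};

/* The "ref": module code only ever sees this declaration. */
extern struct bordereau_demo Bordereau_demo;

/* Module code: uses the data without knowing where it lives. */
static int incremente(void)
{
        assert(Bordereau_demo.Compteur >= 0);
        return ++Bordereau_demo.Compteur;
}

/*
 * Stand-in for the link step. With "refs without defs", this definition
 * would NOT exist in any C file; the linker script would assign the symbol
 * an address instead (GNU ld accepts assignments of the form
 * `symbol = expression;`). It is defined here only so that the sketch
 * links on a hosted system.
 */
struct bordereau_demo Bordereau_demo = { 0 };
```

The point of the split is that incremente() compiles identically whether the storage comes from a C definition or from an address chosen at link time.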

struct bordereau_tubes {
        struct en_tete_bordereau etb;
        struct tube tubes[NB_MAX_TUBES];
} __attribute__((packed));

The header (en_tete_bordereau) contains the information required to perform memory allocations in the heap, and the tube structures contain the data required to handle the pipes.

The advantages of this architecture are as follows:
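
As an illustration of how a header can carry the allocation state of a heap placed right after the bordereau, here is a minimal sketch. It is not the Manux implementation: the fields taille_tas and utilise, the structure module_demo and the function alloue_demo() are hypothetical, the real layout of en_tete_bordereau and the real behaviour of Alloue_mem() are not described in this page, and a simple bump allocator is used as a stand-in.

```c
#include <assert.h>
#include <stddef.h>

#define TAILLE_TAS_DEMO 256

/* Hypothetical header: just enough state for a bump allocator. */
struct en_tete_demo {
        size_t taille_tas;      /* total heap size in bytes */
        size_t utilise;         /* bytes already handed out */
};

/* Bordereau and heap laid out contiguously, as the text describes. */
struct module_demo {
        struct en_tete_demo etb;
        int donnees[4];                         /* the module's fixed data */
        unsigned char tas[TAILLE_TAS_DEMO];     /* heap, right after it */
};

static struct module_demo Module_demo = { { TAILLE_TAS_DEMO, 0 }, { 0 }, { 0 } };

/* Allocate `taille` bytes from the module's own heap; NULL when exhausted. */
static void *alloue_demo(struct module_demo *m, size_t taille)
{
        /* Round up to a multiple of 8 bytes, counted from the heap start. */
        size_t arrondi = (taille + 7u) & ~(size_t)7u;

        assert(m != NULL);
        if (arrondi < taille || arrondi > m->etb.taille_tas - m->etb.utilise)
                return NULL;

        void *bloc = &m->tas[m->etb.utilise];
        m->etb.utilise += arrondi;
        return bloc;
}
```

Because all allocator state lives in the header, a module's entire memory (bordereau plus heap) is one contiguous block that can be inspected or moved as a unit.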

Associated coding conventions

In the kernel's code (and its data), the following coding conventions are applied:

(This list is incomplete; it mentions only the elements relevant to the kernel architecture.)

Benefits of these conventions

The functions declared as "exported" are the module's interfaces. Modules communicate through them. Of course, this raises a problem: when gcc compiles their code, it gives them an arbitrary address within the module. It would have been possible simply to tell each module the addresses of the interfaces of all the others at link time, but doing so would make the compilation of a module dependent on that of the others, and would cause severe difficulties during dynamic patching.

To solve this problem, a small plank of code is placed at the beginning of each module; its job is to expose the interfaces at known positions (this is the file inclus/liens/<module>.s; see the source code for more details). The position of every entry in this plank is an easily computable constant, which makes each module independent of the version of the others.

As an example, here is the disassembly of the beginning of the gmm module:

Disassembly of section code:

f14000d4 <code>:
	...
f1400100:	e9 fb 04 00 00       	jmp    0xf1400600
f1400105:	90                   	nop
f1400106:	0f 0b                	ud2    
f1400108:	e9 83 05 00 00       	jmp    0xf1400690
f140010d:	90                   	nop
f140010e:	0f 0b                	ud2    
f1400110:	e9 db 05 00 00       	jmp    0xf14006f0
f1400115:	90                   	nop
f1400116:	0f 0b                	ud2    
	...
f1400120:	55                   	push   %ebp
f1400121:	57                   	push   %edi
f1400122:	bf 01 00 00 00       	mov    $0x1,%edi
f1400127:	56                   	push   %esi
f1400128:	53                   	push   %ebx

The initial code plank runs from the beginning up to and including 0xf1400117. It is followed by an 8-byte hole added by the linker (0xf1400118 to 0xf140011f), and then by the code of the module as compiled by gcc. At 0xf1400100 lies the interface Alloue(), which is interface number 0 of this module; it transfers control to the alloue() function of the C code, located in our case at 0xf1400600. Two other interfaces follow (Libere() and Realloue()), followed by the binary corresponding to the file noyau/gmm/main.c.

During a recompilation, the addresses of the C functions change, but not those of the initial plank. Thus, each module is compiled independently of the others, and it is possible to alter each module dynamically and separately.

Additionally, the coding conventions make it easier for the programmer to identify the nature of each element. As an example, from the p3p module:
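
The arithmetic behind this listing can be checked with a few lines of C. This is a sketch based only on the disassembly above: each entry there occupies 8 bytes (a 5-byte jmp rel32, a 1-byte nop and a 2-byte ud2) and the first entry sits at 0xf1400100. Whether other modules or builds use the same stride and start address is an assumption, and the names adresse_interface() and cible_jmp() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Values read from the gmm listing above; they are specific to that build. */
#define DEBUT_PLANCHE 0xf1400100u   /* address of interface number 0 */
#define TAILLE_ENTREE 8u            /* jmp rel32 (5) + nop (1) + ud2 (2) */

/* Address of interface number n in the plank. */
static uint32_t adresse_interface(uint32_t n)
{
        return DEBUT_PLANCHE + n * TAILLE_ENTREE;
}

/*
 * Decode the target of an x86 `jmp rel32` (opcode 0xe9) located at `adresse`.
 * The 32-bit displacement is little-endian and is relative to the address
 * of the instruction that follows the jump.
 */
static uint32_t cible_jmp(uint32_t adresse, const uint8_t octets[5])
{
        assert(octets[0] == 0xe9);
        uint32_t rel = (uint32_t)octets[1]
                     | (uint32_t)octets[2] << 8
                     | (uint32_t)octets[3] << 16
                     | (uint32_t)octets[4] << 24;
        return adresse + 5u + rel;  /* wraps modulo 2^32 */
}
```

Feeding the bytes e9 fb 04 00 00 and the address of interface 0 to cible_jmp() gives 0xf1400600, the address of alloue() quoted above.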
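
The same property can be simulated in portable C. Note the substitution: the sketch below uses a table of function pointers in place of the assembly jump stubs, so it only illustrates the principle (callers bind to a fixed slot number, never to an implementation address). All names in it (Planche_demo, appelle(), remplace(), double_v1(), double_v2()) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*interface_t)(int);

/* Two "builds" of the same hypothetical function, at different addresses. */
static int double_v1(int x) { return x + x; }
static int double_v2(int x) { return 2 * x; }   /* recompiled version */

/* Stand-in for the plank: slot numbers are fixed, targets are not. */
enum { INTERFACE_DOUBLE = 0, NB_INTERFACES = 1 };
static interface_t Planche_demo[NB_INTERFACES] = { double_v1 };

/* Caller side: knows only the slot number, never the target address. */
static int appelle(size_t numero, int argument)
{
        assert(numero < (size_t)NB_INTERFACES && Planche_demo[numero] != NULL);
        return Planche_demo[numero](argument);
}

/* "Patch" the module: the slot keeps its position, only its target changes. */
static void remplace(size_t numero, interface_t nouvelle)
{
        assert(numero < (size_t)NB_INTERFACES && nouvelle != NULL);
        Planche_demo[numero] = nouvelle;
}
```

After remplace(INTERFACE_DOUBLE, double_v2), every existing call to appelle(INTERFACE_DOUBLE, ...) reaches the new code without the caller being rebuilt.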

        Inodes[j] = Alloue_mem(sizeof(struct inode_p3p), 0);
        verifie((Inodes[j] != NULL), RETOURNE, NULL, 0);

        Num_inode_courante = ((j+1) % P3P_NB_MAX_INODES);

Here, we see that Inodes is a structure from the bordereau of this module, j a local variable, Alloue_mem() a function from another module (actually, this is a macro hiding the function Alloue(), but this doesn't matter); verifie() is a local function (again, it's a macro, but it calls no external function per se); NULL, RETOURNE and P3P_NB_MAX_INODES are preprocessor definitions, and Num_inode_courante is a variable stored in the bordereau.
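
The reading above can be turned into a small classifier. The rules in this sketch are inferred solely from this one example (initial capital for bordereau data and exported interfaces, all lowercase for local variables and functions, all uppercase for preprocessor definitions); they are an assumption, not the project's official list of conventions, and the names nature_identifiant() and enum nature are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum nature { LOCAL, MODULE, PREPROCESSEUR, INCONNU };

/* Guess the nature of an identifier from the case of its letters. */
static enum nature nature_identifiant(const char *nom)
{
        int minuscule = 0, majuscule = 0;

        assert(nom != NULL);
        for (const char *p = nom; *p != '\0'; p++) {
                if (islower((unsigned char)*p))
                        minuscule = 1;
                if (isupper((unsigned char)*p))
                        majuscule = 1;
        }
        if (majuscule && !minuscule)
                return PREPROCESSEUR;   /* e.g. NULL, RETOURNE */
        if (isupper((unsigned char)nom[0]))
                return MODULE;          /* bordereau data or exported interface */
        if (islower((unsigned char)nom[0]))
                return LOCAL;           /* local variable or local function */
        return INCONNU;
}
```

Applied to the identifiers of the excerpt, this classifies Inodes, Alloue_mem and Num_inode_courante as module-level, j and verifie as local, and NULL, RETOURNE and P3P_NB_MAX_INODES as preprocessor definitions.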
