Userspace architecture overview

Manux has its own userspace architecture, which is unlike any other. Let's see what it looks like.

Layout on the hard drive

First, let's have a look at the file layout on the hard drive. For comparison, here's what the root directory looks like on a classic Linux distribution:

ecolbus@linux:~$ ls -F /
bin/    etc/             lib/         opt/   sbin/     tmp/      vmlinuz.old@
boot/   home/            lost+found/  proc/  selinux/  usr/
cdrom/  initrd.img@      media/       root/  srv/      var/
dev/    initrd.img.old@  mnt/         run/   sys/      vmlinuz@

Now let's mount a Manux root partition, and see what its root looks like :

root@linux:~# mount /mnt/manux-ro 
root@linux:~# ls -F /mnt/manux-ro/
lost+found/  packages/

As you might realize, that's not exactly what a traditional UNIX root is supposed to look like.

Now, let's explain. "lost+found" is a filesystem-related directory that has to sit at the root of the filesystem. "packages", which will actually correspond to /packages on the running system, is the directory that contains all its packages; however, in Manux, every file is part of a package. So, yes, "/packages" contains the entire operating system. Finally, "/" itself is a special directory, always unreachable (unless you decide to mount the partition somewhere else), that is used by some algorithms inside the virtual filesystem (kernel module: sfv).

Ok, so, next, what is in "/packages" ?

root@linux:~# ls -F /mnt/manux-ro/packages/
core/  local/  std/

Here, we only have 3 directories, because we are running version 0.0.1. Each of these corresponds to a distribution, or a distributor (for example, if King Arthur were to package his grail-searching software and distribute it, it would go in "/packages/round_table"), except for the special name "local", which contains the packages specific to the machine's users, like their homedirs.

As for the other two, "core" contains, well, the core of the operating system, whereas "std" contains the initial distribution, with software so common (like the glibc or bash) that it is likely to be used unmodified by other distributions.

Packages organization

Now, what does it look like inside these directories? Let's try a few commands.

root@linux:~# ls -F /mnt/manux-ro/packages/std/0.0.1/vim/7.3
alias/  etc/  misc/  userdirs/  vim/
root@linux:~# ls -F /mnt/manux-ro/packages/std/0.0.1/vim/7.3/vim
lnc/  lncb/  meta/  root/  userdirs

Now things are getting interesting. First, let's note that all packages are located in similarly organised trees in /packages: "distrib/distrib_version/software/software_version/software_portion". At this level there are two mandatory directories, "root" and "meta", respectively the package's content and metadata, and two optional ones, "lncb" and "lnc", that relate to the way any executables in the package are launched. Additionally, we see a curious file named "userdirs":

root@linux:~# env -i stat /mnt/manux-ro/packages/std/0.0.1/vim/7.3/vim/userdirs
  File: `/mnt/manux-ro/packages/std/0.0.1/vim/7.3/vim/userdirs'
  Size: 4096      	Blocks: 8          IO Block: 4096   weird file
Device: 807h/2055d	Inode: 195906      Links: 2
Access: (0777/?rwxrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2013-07-06 15:34:38.000000000 +0200
Modify: 2013-07-06 15:34:38.000000000 +0200
Change: 2013-07-06 15:34:38.000000000 +0200
 Birth: -

This is actually a directory hardlink, but we'll have a look at this later.
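As an aside, the five-level naming scheme given earlier ("distrib/distrib_version/software/software_version/software_portion") is regular enough to be split mechanically. The Python sketch below is only an illustration of that scheme: the names PackagePath and parse_package_path are hypothetical and are not part of Manux.

```python
from typing import NamedTuple


class PackagePath(NamedTuple):
    """Hypothetical record for the five naming levels of a package."""
    distrib: str
    distrib_version: str
    software: str
    software_version: str
    portion: str
    inner: str  # path below the package directory, "" if there is none


def parse_package_path(path: str) -> PackagePath:
    """Split a path located under /packages into its naming components."""
    parts = [p for p in path.split("/") if p]
    if len(parts) < 6 or parts[0] != "packages":
        raise ValueError(f"not a package path: {path!r}")
    distrib, distrib_version, software, software_version, portion = parts[1:6]
    return PackagePath(distrib, distrib_version, software,
                       software_version, portion, "/".join(parts[6:]))


p = parse_package_path("/packages/std/0.0.1/vim/7.3/vim/root/usr/bin/vim")
print(p.software, p.software_version, p.portion, p.inner)
```

With the path used in this document, "inner" starts with "root", one of the two mandatory directories of the package.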

What does it look like in "root"? Well, let's see:

root@linux:~# ls -F /mnt/manux-ro/packages/std/0.0.1/vim/7.3/vim/root/
etc/  lib/  tmp/  usr/

Now, this looks like a standard root directory!

In Manux, all userspace programs are chrooted, and their packages double as their root directories. Note, however, that this directory looks so complete only because we are inspecting an already installed operating system. In the "root" directory, the package itself only provides the content of /usr/bin:

root@linux:~# ls -F /mnt/manux-ro/packages/std/0.0.1/vim/7.3/vim/root/usr/bin/
vim*

(Well, strictly speaking, it also provides an empty /etc/ld.so.cache, but this is unimportant.)

What about "meta"?

root@linux:~# ls -F /mnt/manux-ro/packages/std/0.0.1/vim/7.3/vim/meta/
crypto/  deps  files  identity  install*  origins  userdirs

This is the data required for the packaging system, whose description will be the subject of another document.

Launchers

What about "lncb" and "lnc"?

root@linux:~# ls -l /mnt/manux-ro/packages/std/0.0.1/vim/7.3/vim/lnc{b,}
/mnt/manux-ro/packages/std/0.0.1/vim/7.3/vim/lnc:
total 16
-rwxr-xr-t 2 root root 12784 juil.  6 15:34 vim

/mnt/manux-ro/packages/std/0.0.1/vim/7.3/vim/lncb:
total 4
-rw-r--r-- 1 root root 390 janv.  1  1970 vim.lncb

As mentioned earlier, these directories are related to the executables, so in the vim package, it's not surprising to find only vim-related files. The "lncb" directory contains a package-provided file specifying how to launch vim ("lnc" stands for "launcher", the "b" indicates the binary non-executable format), whereas "lnc" contains the actual launcher created from the lncb file during the package's installation. (Why such a complicated scheme, you ask? Well, we need to handle the case where the packager is actually hostile. The lncb binary format is very limited, but guarantees that the launchers generated from it will remain harmless.)

But let's pay attention to the file "lnc/vim". It is actually an executable :

root@linux:~# file /mnt/manux-ro/packages/std/0.0.1/vim/7.3/vim/lnc/vim 
/mnt/manux-ro/packages/std/0.0.1/vim/7.3/vim/lnc/vim: sticky ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, stripped

But it is far smaller than the true vim executable. Also, it is statically linked, has two hardlinks, and has its sticky bit set. What does this mean?

As we said, this file's task is to launch vim when called. Since all processes are chrooted, if the user calls vim, their shell will need to call the launcher from its own chroot. This is possible because the launcher is hardlinked inside the user's root directory, hence the second hardlink. If other applications had a dependency on vim, or if there were more users on this machine, there would be more hardlinks, but this isn't the case.

But there's an issue: which C library is available inside the caller's chroot? The answer is: we can't know beforehand, none is guaranteed, and additionally, the calling package might be hostile, so it isn't possible to rely on its libraries. That's why all launchers are statically linked.

Finally, how is this binary supposed to find the true vim binary when called from an arbitrary external chroot? Well, that's where the sticky bit comes into play.
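Before moving on, note that the properties observed on lnc/vim (regular file, sticky bit, link count) are all ordinary stat(2) data. Here is a minimal Python sketch that decodes them; describe_launcher is a hypothetical helper name, and the values fed to it are simply those of the "ls -l" listing above.

```python
import stat


def describe_launcher(st_mode: int, st_nlink: int) -> dict:
    """Decode the stat(2) fields that matter for a launcher file."""
    return {
        "regular_file": stat.S_ISREG(st_mode),
        "sticky": bool(st_mode & stat.S_ISVTX),
        # One name is the file in "lnc"; every other name is a hardlink
        # placed in some caller's root directory.
        "extra_hardlinks": max(st_nlink - 1, 0),
    }


# Values matching the listing of lnc/vim: mode -rwxr-xr-t, 2 links.
mode = stat.S_IFREG | 0o1755
info = describe_launcher(mode, 2)
print(stat.filemode(mode), info)
```

On a real file, the two arguments would come from os.stat(path).st_mode and os.stat(path).st_nlink.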

The sticky bit

The meaning of the sticky bit on regular files is unspecified, and in practice, it is unused, so I decided to reuse it in my filesystem. In ext2l, the sticky bit on a regular file informs the user that this file has a rootlink, a new kind of link, akin to a hardlink, that asymmetrically links a regular file with a directory. Only a process that executes this regular file can make use of the rootlink, and when it does, its chroot changes, and the targeted directory becomes its new root (and its new current directory). In the standard case, the targeted directory is the package's "/root", but vim happens to be an exception - we'll see why later. In the meantime, let's see what happens when somebody calls the more standard /bin/cat:

  1. The application (in this case the user's shell) classically forks and looks for cat in its PATH; it finds it in /bin/cat;
  2. The application calls execve(2) on the executable it found under /bin/cat, unaware that this is actually only cat's launcher;
  3. The launcher performs some analysis, then calls xchroot(2) with the flag XCHROOT_USE_ROOTLINK;
  4. The kernel honors the request and moves this process from the caller's chroot into the directory called "/root" in the cat package (both as its new root and as its new current directory, as xchroot(2) implies a chdir(2));
  5. The launcher calls execve(2) on the true cat binary.

Thus, thanks to the rootlink, an application was able to call another one without either of them ever having access to the other's chroot.
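The five steps above can be replayed on a toy model. The Python sketch below does not call the real xchroot(2), which only exists on Manux; it merely tracks which directory a process sees as its root. Everything in it is an assumption made for illustration: the on-disk paths, the inode numbers, and the location of the true cat binary inside its package.

```python
from dataclasses import dataclass

# Hypothetical placeholder paths, not taken from a real Manux system.
USER_ROOT = "/packages/local/example/root"
CAT_ROOT = "/packages/example/1.0/cat/1.0/cat/root"


@dataclass
class Process:
    root: str    # on-disk directory this process sees as "/"
    cwd: str     # current directory, as seen from inside the chroot
    inode: int   # inode of the file currently being executed (0: none yet)


# Toy filesystem: (chroot, path inside it) -> inode number.
FILES = {
    (USER_ROOT, "/bin/cat"): 42,  # hardlink to cat's launcher
    (CAT_ROOT, "/bin/cat"): 43,   # the true cat binary (location assumed)
}
# Rootlinks: inode of a sticky regular file -> directory it targets.
ROOTLINKS = {42: CAT_ROOT}


def execve(proc: Process, path: str) -> None:
    """Replace the executed file; root and cwd are left untouched."""
    proc.inode = FILES[(proc.root, path)]


def xchroot_use_rootlink(proc: Process) -> None:
    """Follow the rootlink of the file being executed, if it has one."""
    if proc.inode not in ROOTLINKS:
        raise PermissionError("the executed file carries no rootlink")
    proc.root = ROOTLINKS[proc.inode]
    proc.cwd = "/"  # xchroot(2) implies chdir(2)


child = Process(root=USER_ROOT, cwd="/tmp", inode=0)  # step 1: forked shell
execve(child, "/bin/cat")     # step 2: this is really the launcher
xchroot_use_rootlink(child)   # steps 3 and 4
execve(child, "/bin/cat")     # step 5: now the true binary
print(child)
```

Note that in this model the same path, /bin/cat, names two different files depending on the root of the process resolving it, and that a process executing a file without a rootlink cannot make the jump at all.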

Userdirs

Let's not end this discussion without explaining vim's specificity. What separates vim from cat, technically, is that vim has user-specific configuration files (~/.vim*). Handling these requires a far more complicated scheme.

First, let's explain the term. A userdir is a directory that contains files specific to a combination {user, version of an application}. While classical operating systems keep this information in the user's homedir, simply doing this wouldn't work on Manux, because the programs have no access to this homedir. So we need a special place for it.

Let's look again at our initial commands :

root@linux:~# ls -F /mnt/manux-ro/packages/std/0.0.1/vim/7.3
alias/  etc/  misc/  userdirs/  vim/
root@linux:~# ls -F /mnt/manux-ro/packages/std/0.0.1/vim/7.3/vim
lnc/  lncb/  meta/  root/  userdirs

Oh, yes, here's the userdir. But why two of them?

Well, remember, the first command simply listed the packages that are present in /packages/std/0.0.1/vim/7.3 . So the first one is an independent package - its raison d'être is to allow the application packager to provide default configuration files, and to store these userdirs.

And the second? Well, remember what I told you about it? Yes, it's a directory hardlink, and it targets the directory "root" in the first one.

Why such a complicated scheme, you may ask? Well, for starters, there's nothing specifying that vim 7.3 has to use vim 7.3's userdirs - if the configuration had remained compatible, and if such a package existed, it could use vim 7.2's userdirs (how the userdir is found is the job of the packaging system). So the first one isn't actually usable.

Plus, as it happens, if a piece of software has userdirs, its launchers are given a rootlink towards the immediate parent of the "/root" directory. But at this point, theoretical explanations become quite complicated, so I'll simply detail the operations that happen when someone starts vim:

  1. The calling application (here the user's shell) classically forks and looks for vim in its PATH; it finds it in /usr/bin/vim;
  2. The application calls execve(2) on the executable it found under /usr/bin/vim, unaware that this is actually only vim's launcher;
  3. The launcher performs some analysis, then calls xchroot(2) with the flag XCHROOT_USE_ROOTLINK;
  4. The kernel honors the request and moves this process from the caller's chroot into the directory called "/" in vim's package (that is, to /packages/std/0.0.1/vim/7.3/vim, instead of the classically expected /packages/std/0.0.1/vim/7.3/vim/root);
  5. The launcher calls getuid(2), to identify its user;
  6. It then looks for the directory /userdirs/<uid>;
  7. If not found, the launcher forks and calls /userdirs/bin/mkuserdir to create it; mkuserdir in turn copies the model stored in /userdirs/0;
  8. Then, the launcher modifies its environment so that the HOME variable points towards this directory, and performs some additional analysis;
  9. Then it calls xchroot(2) without XCHROOT_USE_ROOTLINK, and with a new root in /root;
  10. And, finally, it calls execve(2) on the true vim binary.

Thus, the HOME environment variable has been altered so that vim finds the user's configuration files in it, without being able to look into anybody else's configuration. This particular architectural choice means that the variable HOME can change outside the user's control; this is a small price to pay for the operating system's security.
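Steps 5 to 8 of the list above amount to "find or create the userdir, then point HOME at it". Here is a minimal Python sketch of that logic, run against a temporary directory standing in for the package directory. It is only an approximation: prepare_userdir is a hypothetical name, shutil.copytree stands in for forking /userdirs/bin/mkuserdir, the ".vimrc" model file is invented for the demonstration, and how HOME stays reachable after the final xchroot(2) is not covered.

```python
import os
import shutil
import tempfile


def prepare_userdir(package_dir: str, uid: int, env: dict) -> dict:
    """Return a copy of env whose HOME points at the userdir of uid.

    package_dir plays the role of the launcher's root after the first
    xchroot(2), so "/userdirs" is package_dir/userdirs on disk.
    """
    userdirs = os.path.join(package_dir, "userdirs")
    target = os.path.join(userdirs, str(uid))
    if not os.path.isdir(target):
        # Stand-in for mkuserdir, which copies the model stored in /userdirs/0.
        shutil.copytree(os.path.join(userdirs, "0"), target)
    new_env = dict(env)
    new_env["HOME"] = "/userdirs/" + str(uid)  # path as seen by the launcher
    return new_env


with tempfile.TemporaryDirectory() as pkg:
    model = os.path.join(pkg, "userdirs", "0")
    os.makedirs(model)
    with open(os.path.join(model, ".vimrc"), "w") as f:  # invented model file
        f.write("syntax on\n")
    env = prepare_userdir(pkg, os.getuid(), {"HOME": "/somewhere/else"})
    print(env["HOME"], os.listdir(os.path.join(pkg, "userdirs", str(os.getuid()))))
```

A second call for the same uid finds the directory already present and leaves its content alone, which is what lets user changes persist between runs.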

The triple-dot partition

Ha! You really thought we were done? Well, look at the previous examples: don't you see that there's still a problem?

Let me explain. If the user decides to launch "cat", then everything will be fine, indeed. But what if they type "cat myfile.txt"?

Yes, look carefully. The launcher will chroot itself, then call cat, which will look for a file named "myfile.txt" in its current directory... But it has been chrooted (and, under Manux, xchroot(2) implies chdir(2)), so the file won't be there. So, how is this supposed to work?

Well, I'm going to explain, but first : this can't be shown using a stopped Manux system. So let's boot it, type our login/password, and have a look at our homedir.

ecolbus@manux:~$ pwd
/.../home/ecolbus

Ha! There's something new, here! It's not /home/ecolbus, but /.../home/ecolbus . Well, let's try other things...

ecolbus@manux:~$ touch file
ecolbus@manux:~$ ls -l file
-rw-r--r-- 1 ecolbus ecolbus 0 20 juil. 01:33 file
ecolbus@manux:~$ ls -l /.../home/ecolbus/file
-rw-r--r-- 1 ecolbus ecolbus 0 20 juil. 01:33 /.../home/ecolbus/file

So far, so good...

ecolbus@manux:~$ ls -l /bin/cat
-rwxr-xr-t 4 sys sys 5264 13 juil. 15:05 /.../??/cat

Wait, what? The file's name has changed?

Well, yes, it has. Now, time for explanations.

/... is a special filesystem, one that has no real existence, like /proc (its kernel module is p3p). Unlike /proc, /... is always reachable, independently of your chroot. Also, its content depends on the process looking at it, and whether it is visible in / or not is unspecified (it currently never is).

This filesystem can only contain directories (and symlinks), but it is possible to tell it to establish associations between the directory entries and "real" files (see xchroot(2)). For example, one can tell it "make /.../abcd/efgh correspond to my file /etc/hostname", and then the new chroot will contain an entry /.../abcd/efgh corresponding to the previous /etc/hostname.
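To fix ideas, here is a toy Python model of such a per-process /... view: a set of directories, plus a table of entries associated with "real" files. It says nothing about how p3p is actually implemented; the class TripleDot and its methods are hypothetical names, and the only behaviour taken from this document is that an entry under /... can be made to correspond to an existing file.

```python
class TripleDot:
    """Toy per-process view of /... (not the real p3p data structures)."""

    def __init__(self):
        self.dirs = {"/..."}   # directories visible in this view
        self.bound = {}        # entry under /... -> real file it stands for

    def bind(self, entry: str, real_file: str) -> None:
        """Make `entry` correspond to `real_file`, creating parent dirs."""
        if not entry.startswith("/.../"):
            raise ValueError("entries must live under /...")
        parts = entry.split("/")[2:]       # components after "/..."
        path = "/..."
        for name in parts[:-1]:            # every component but the last
            path += "/" + name
            self.dirs.add(path)
        self.bound[entry] = real_file

    def resolve(self, entry: str):
        """Return the real file behind an entry, or None if unbound."""
        return self.bound.get(entry)


view = TripleDot()
view.bind("/.../abcd/efgh", "/etc/hostname")
print(sorted(view.dirs), view.resolve("/.../abcd/efgh"))
```

Because each process would own its own view object, two processes can see entirely different contents under the same /... name, which matches the remark that its content depends on the process looking at it.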

What does this have to do with us? Well, as we said, there's a problem if somebody tries to type "cat myfile.txt". Remember when I said the launcher "performed some analysis"? Here comes its other function: it analyses its command line, separating what is a file from what isn't, then passes all the required files into the chroot of the new command, in its /... directory.

(For technical reasons, this has to be done in the same syscall as the chroot, so this is again xchroot(2)'s task. This time, instead of the flags, the parameters used are the entries.)

Well, that's nice, but once it has determined which files have to be passed, how does the launcher choose their future paths in /... ?

Answer : whenever it can keep the old path, it does; whenever it can't or is unsure, it chooses a new one that ends with the same name and is guaranteed not to collide with any other.

(I won't go into the details of the techniques used to ensure this uniqueness here (though this is perhaps the operating system's most interesting algorithm, a Cantor diagonalization on 7 bits per byte; see the source if you're interested); but let's state one obvious fact: there cannot be a /... in the original package's root (dot-only filenames are reserved in ext2l), so all the launcher has to ensure is that it never generates a collision between the paths it creates.)
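The rule stated above (keep the old path when possible, otherwise pick a fresh one ending with the same name) can be sketched in a few lines of Python. Be warned that this is not the launcher's algorithm: the diagonalization is replaced here by a plain counter, the function name choose_passed_path is hypothetical, and collisions between a file name and a directory prefix are ignored.

```python
import itertools
import posixpath


def choose_passed_path(path: str, taken: set) -> str:
    """Pick the path under /... that a passed file will get.

    `taken` holds the paths already handed out for this launch and is
    updated in place.
    """
    path = posixpath.normpath(path)
    if path.startswith("/.../") and path not in taken:
        chosen = path                       # already in /... : keep it
    else:
        name = posixpath.basename(path)     # the new path keeps the name
        for n in itertools.count():         # counter instead of diagonalization
            chosen = f"/.../{n}/{name}"
            if chosen not in taken:
                break
    taken.add(chosen)
    return chosen


taken = set()
kept = choose_passed_path("/.../home/ecolbus/file", taken)
first = choose_passed_path("/bin/cat", taken)
second = choose_passed_path("/usr/bin/cat", taken)
print(kept, first, second)
```

The two "cat" files end up under different generated directories, so both keep their final name without colliding.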

So this is why the home directory has been deliberately placed in /... : because this way, we can be sure that all its files can be passed to any program without a need to rewrite their paths. The user already sees them in /... , so the launcher just keeps their names.

So, in more detail, here's what happens whenever somebody types "cat myfile.txt":

  1. The application (in this case the user's shell) classically forks and looks for cat in its PATH; it finds it in /bin/cat;
  2. The application calls execve(2) on the executable it found under /bin/cat, unaware that this is actually only cat's launcher;
  3. The launcher parses the command line, and realizes that myfile.txt corresponds to a file;
  4. It then looks at the local directory, and notices it's /.../home/username, well inside /... ; from which it deduces that it can safely pass the file as /.../home/username/myfile.txt ;
  5. It builds the xchroot entry required for this, then calls xchroot(2) with the flag XCHROOT_USE_ROOTLINK;
  6. The kernel honors the request and moves this process from the caller's chroot into the directory called "/root" in the cat package; simultaneously, it creates a series of entries in the /... of this process, named /.../home , /.../home/username, and /.../home/username/myfile.txt , the last one associated with the previous file myfile.txt;
  7. The launcher then changes its directory into the copy of the previous current directory, which happens to be /.../home/username;
  8. The launcher calls execve(2) on the true cat binary;
  9. cat analyses its command line, notices it's supposed to open the file myfile.txt in its current directory, tries to do so, succeeds, and performs its task.

Files outside /...

And what about files located outside of /... ? Well, this command gave us a good example of it:

ecolbus@manux:~$ ls -l /bin/cat
-rwxr-xr-t 4 sys sys 5264 13 juil. 15:05 /.../??/cat

In this case, I asked /bin/ls to operate on /bin/cat, which is in my root directory, not in my /... . The launcher of ls noticed the problem, but had only one option : choose a new file path, located in /... , so that /bin/ls could access it from its chroot. Its algorithm gave it a name composed of two unreadable characters (that ls replaced with '?'), and, after doing the call to xchroot(2), the launcher altered ls's command line so that it was told to operate on the new name, not the old one.
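The last operation described above, altering the command line, is the simplest part and can be sketched as follows in Python. The function name rewrite_argv is hypothetical, and "XX" is only a printable placeholder for the two unreadable characters the real algorithm produced.

```python
def rewrite_argv(argv: list, renamed: dict) -> list:
    """Replace every argument that was given a new path under /... ."""
    return [renamed.get(arg, arg) for arg in argv]


# Mapping decided by the launcher before calling xchroot(2); placeholder name.
renamed = {"/bin/cat": "/.../XX/cat"}
print(rewrite_argv(["ls", "-l", "/bin/cat"], renamed))
```

Since ls only ever receives the rewritten argument, the new name is what it prints back, which is exactly what the transcript shows.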

As it happens, "ls" is one of the very few commands that let us see this mechanism. This also means that Manux behaves in ways that are slightly different from traditional operating systems, and that some incompatibilities can result from it. This is an undeniable fact, but this is only a consequence of its security - another small price for it.

Conclusion

This concludes our overview of the Manux userspace architecture. Not that this is all there is, but the rest is mostly arcane knowledge (like how start_session works, or how the system is bootstrapped) that is both far less useful and simply based on the principles laid out here. At this point, if you want to understand it better, there is no better way than trying it.
