
<!--#include "header.html" -->

<h1>The Firmware Linux build process</h1>

<p>FWL builds a cross-compiler and then uses it to build a minimal system
containing a native compiler, BusyBox and uClibc.  Then it runs this minimal
system under an emulator (QEMU) to natively build the final system.  Finally it
packages the resulting system (kernel, initramfs, and root filesystem) into
a single file that can boot and run (on x86 by using a modified version of LILO).</p>

<p>Firmware Linux builds in stages:</p>

<h2>Stage 1: Build a cross-compiler.</h2>

<p>The first stage builds a cross-compiler,
which runs on the host system and produces binaries that run on the target
system.  (See my <a href=/writing/docs/cross-compiling.html>Introduction to
cross compiling</a> if you're unfamiliar with this.)</p>

<p>We have to cross-compile even if the host and target system are both
x86, because the host and target probably use different C libraries.  If the host has
glibc and the target uses uClibc, then the (dynamically linked) target binaries
we produce won't run on the host.  This is what distinguishes cross-compiling
from native compiling: different processors are just one reason the binaries
might not run.  Of course, as long as we've got the cross-compiling support
anyway, we might as well support building for x86_64, arm, mips, or ppc
targets...</p>

<p>Building a cross-compiler toolchain requires four packages.  The bulk of
it is binutils, gcc, and uClibc, but building those requires header files from
the Linux kernel which describe the target system.</p>
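<p>In outline, the build order looks something like this.  (The target tuple,
paths, and configure options here are illustrative; the real build scripts
pass more options and deal with a gcc/libc chicken-and-egg problem this
sketch glosses over.)</p>

<blockquote><pre>
TARGET=armv4l-unknown-linux    # example target tuple
CROSS=/path/to/cross-compiler  # where the cross toolchain gets installed

# 1) Kernel headers describing the target (see "make headers_install" below).
# 2) binutils: the cross assembler and linker.
cd binutils-*; ./configure --target=$TARGET --prefix=$CROSS; make; make install
# 3) gcc: runs on the host, produces $TARGET binaries.
cd ../gcc-*; ./configure --target=$TARGET --prefix=$CROSS; make; make install
# 4) uClibc: the target's C library, built with the new cross tools.
</pre></blockquote>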

<h2>Stage 2: Use the cross-compiler to build a native build environment
for the target.</h2>

<p>Because cross-compiling is persnickety and difficult, we do as little of
it as possible.  Instead we use the cross-compiler to generate the smallest
possible native build environment for the target, and then run the rest of the
build in that environment, under an emulator.</p>

<p>The emulator we use is QEMU.  The minimal build environment powerful enough
to boot and compile a complete Linux system requires seven packages: the Linux
kernel, binutils, gcc, uClibc, BusyBox, make, and bash.  It's packaged
using the <a href=http://www.linuxfromscratch.org>Linux From Scratch</a>
/tools directory approach, staying out of the way so the minimal build
environment doesn't get mixed into the final system.</p>
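<p>A minimal sketch of the /tools trick (options illustrative): each
intermediate package is configured to install under /tools, and the
intermediate system's $PATH looks there first, so none of it winds up mixed
into the final system's /bin and /usr:</p>

<blockquote><pre>
# Cross-compile each intermediate package to install under /tools:
./configure --host=$TARGET --prefix=/tools; make; make install
# Inside the intermediate system, find those tools first:
export PATH=/tools/bin:/bin:/usr/bin
</pre></blockquote>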

<h2>Stage 3: Run the target's native build environment under an emulator to
build the final system.</h2>

<p>Running a native build under QEMU is much slower than cross-compiling,
but it's a lot easier and more reliable.</p>

<p>A trick to accelerate the build is to use distcc to call out to the
cross-compiler, feeding the results back into the emulator through the virtual
network.  (This is still a TODO item.)</p>
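<p>Since this isn't implemented yet, the following is just a sketch of the
idea (the addresses assume QEMU's user-mode network, where the host shows up
as 10.0.2.2):</p>

<blockquote><pre>
# On the host: export the cross-compiler via distcc.
distccd --daemon --allow 10.0.2.0/24
# Inside the emulator: hand compiles out to the host's cross-compiler.
export DISTCC_HOSTS=10.0.2.2
make CC="distcc gcc"
</pre></blockquote>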

<p>The actual build run under stage 3 can be a fairly straightforward
<a href=http://www.linuxfromscratch.org>Linux From Scratch</a> approach,
or another source-based Linux build system like Gentoo.</p>

<h2>Stage 4: Package the system into a firmware file.</h2>

<p>The reason for the name Firmware Linux is that the entire operating system
(kernel, initramfs, and read-only squashfs root filesystem) is glued together
into a single file.  A modified version of LILO is included which can boot and
run this file on x86.</p>

<hr>

<h1>Evolution of the Firmware Linux build process.</h1>

<h2>The basic theory</h2>

<p>The Linux From Scratch approach is to build a minimal intermediate system
with just enough packages to be able to compile stuff, chroot into that, and
build the final system from there.  This isolates the host from the target,
which means you should be able to build under a wide variety of distributions.
It also means the final system is built with a known set of tools, so you get
a consistent result.</p>

<p>A minimal build environment consists of a C library, a compiler, and BusyBox.
So in theory you just need three packages:</p>

<ul>
  <li>A C library (uClibc)</li>
  <li>A toolchain (tcc)</li>
  <li>BusyBox</li>
</ul>

<p>Unfortunately, that doesn't work yet.</p>

<h2>Some differences between theory and reality.</h2>

<h3>Environmental dependencies.</h3>

<p>Environmental dependencies are things that need to be installed before you
can build or run a given package.  Lots of packages depend on things like zlib,
SDL, texinfo, and all sorts of other strange things.  (The GnuCash project
stalled years ago after it released a version with so many environmental
dependencies it was impossible to build or install.  Environmental dependencies
have a complexity cost, and are thus something to be minimized.)</p>

<p>A good build system will scan its environment to figure out what it has
available, and disable functionality that depends on stuff that isn't
available.  (This is generally done with autoconf, which is disgusting but
suffers from a lack of alternatives.)  That way, the complexity cost is
optional: you can build a minimal version of the package if that's all you
need.</p>

<p>A really good build system can be told that the environment
it's building in and the environment the result will run in are different,
so just because it finds zlib on the build system doesn't mean that the
target system will have zlib installed on it.  (And even if it does, it may not
be the same version.  This is one of the big things that makes cross-compiling
such a pain.  One big reason for statically linking programs is to eliminate
this kind of environmental dependency.)</p>

<p>The Firmware Linux build process is structured the way it is to eliminate
as many environmental dependencies as possible.  Some are unavoidable (such as
C libraries needing kernel headers or gcc needing binutils), but the
intermediate system is the minimal fully functional Linux development
environment I currently know how to build, and then we switch into that and
work our way back up from there by building more packages in the new
environment.</p>

<h3>Resolving environmental dependencies.</h3>

<p><b>To build uClibc you need kernel headers</b> identifying the syscalls and
such it can make to the OS.  Way back when you could use the kernel headers
straight out of the Linux kernel 2.4 tarball and they'd work fine, but sometime
during 2.5 the kernel developers decided that exporting a sane API to userspace
wasn't the kernel's job, and stopped doing it.</p>

<p>The 0.8x series of Firmware Linux used
<a href=http://ep09.pld-linux.org/~mmazur/linux-libc-headers/>kernel
headers manually cleaned up by Mariusz Mazur</a>, but after the 2.6.12 kernel
he had an attack of real life and fell too far behind to catch up again.</p>

<p>The current practice is to use the Linux kernel's "make headers_install"
target, created by David Woodhouse.  This runs various scripts against the
kernel headers to sanitize them for use by userspace.  This was merged in
2.6.18-rc1, and was more or less debugged by 2.6.19.  So we can use the Linux
kernel tarball as a source of headers again.</p>
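<p>For example (the architecture and destination path are placeholders):</p>

<blockquote><pre>
# From the top of the kernel source tree, install sanitized headers
# for the target architecture into the new toolchain:
make headers_install ARCH=arm INSTALL_HDR_PATH=/path/to/cross/usr
</pre></blockquote>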

<p>Another problem is that the busybox shell situation is a mess with four
implementations that share little or no code (depending on how they're
configured).  The first question when trying to fix them is "which of the four
do you fix?", and I'm just not going there.  So until bbsh goes in we
<b>substitute bash</b>.</p>

<p>Finally, <b>most packages expect gcc</b>.  The tcc project isn't a drop-in
gcc replacement yet, and doesn't include a "make" program.  Most importantly,
tcc development appears stalled because Fabrice Bellard's other major project
(qemu) is taking up all his time these days.  In 2004 Fabrice
<a href=http://fabrice.bellard.free.fr/tcc/tccboot.html>built a modified Linux
kernel with tcc</a>, and
<a href=http://fabrice.bellard.free.fr/tcc/tccboot_readme.html>listed</a>
what needed to be upgraded in TCC to build an unmodified kernel, but
since then he hardly seems to have touched tcc.  Hopefully, someday he'll get
back to it and put out a 1.0 release of tcc that's a drop-in gcc replacement.
(And if he does, I'll add a make implementation to toybox so we don't need
to use any of the gnu toolchain).  But in the meantime the only open source
compiler that can build a complete Linux system is still the gnu compiler.</p>

<p>The gnu compiler actually consists of three packages <b>(binutils, gcc, and
make)</b>, which is why it's generally called the gnu "toolchain".  (The split
between binutils and gcc is for purely historical reasons, and you have
to match the right versions with each other or things break.)</p>

<p>This means that to compile a minimal build environment, you need seven
packages, and to actually run the result we use an eighth package (QEMU).</p>

<p>This can actually be made to work.  The next question is how?</p>

<h2>Additional complications</h2>

<h3>Cross-compiling and avoiding root access</h2>

<p>The first problem is that we're cross-compiling.  We can't help it.
You're cross-compiling any time you create target binaries that won't run on
the host system.  Even when both the host and target are on the same processor,
if they're sufficiently different that one can't run the other's binaries, then
you're cross-compiling.  In our case, the host is usually running both a
different C library and an older kernel version than the target, even when
it's the same processor.</p>

<p>The second problem is that we want to avoid requiring root access to build
Firmware Linux.  If the build can run as a normal user, it's a lot more
portable and a lot less likely to muck up the host system if something goes
wrong.  This means we can't modify the host's / directory (making anything
that requires absolute paths problematic).  We also can't mknod, chown, chgrp,
mount (for --bind, loopback, tmpfs)...</p>

<p>In addition, the gnu toolchain (gcc/binutils) is chock-full of hardwired
assumptions, such as what C library it's linking binaries against, where to look
for #included headers, where to look for libraries, the absolute path the
compiler is installed at...  Silliest of all, it assumes that if the host and
target use the same processor, you're not cross-compiling (even if they have
a different C library and a different kernel, and even if you ./configure it
for cross-compiling it switches that back off because it knows better than
you do).  This makes it very brittle, and it also tends to leak its assumptions
into the programs it builds.  New versions may someday fix this, but for now we
have to hit it on the head repeatedly with a metal bar to get anything remotely
useful out of it, and run it in a separate filesystem (chroot environment) so
it can't reach out and grab the wrong headers or wrong libraries despite
everything we've told it.</p>

<p>The absolute paths problem affects target binaries because all dynamically
linked apps expect their shared library loader to live at an absolute path
(in this case /lib/ld-uClibc.so.0).  This directory is only writeable by root,
and even if we could install it there polluting the host like that is just
ugly.</p>

<p>The Firmware Linux build has to assume it's cross-compiling because the host
is generally running glibc, and the target is running uClibc, so the libraries
the target binaries need aren't installed on the host.  Even if they're
statically linked (which also mitigates the absolute paths problem somewhat),
the target often has a newer kernel than the host, so the set of syscalls
uClibc makes (thinking it's talking to the new kernel, since that's the ABI
described by the kernel headers it was built against) may not be entirely
understood by the old kernel, leading to segfaults.  (One of the reasons glibc
is larger than uClibc is it checks the kernel to see if it supports things
like long filenames or 32-bit device nodes before trying to use them.  uClibc
should always work on a newer kernel than the one it was built to expect, but
not necessarily an older one.)</p>

<h2>Ways to make it all work</h2>

<h3>Cross compiling vs native compiling under emulation</h3>

<p>Cross compiling is a pain.  There are a lot of ways to get it to sort of
kinda work for certain versions of certain packages built on certain versions
of certain distributions.  But making it reliable or generally applicable is
hard to do.</p>

<p>I wrote an <a href=/writing/docs/cross-compiling.html>introduction
to cross-compiling</a> which explains the terminology, plusses and minuses,
and why you might want to do it.  Keep in mind that I wrote that for a company
that specializes in cross-compiling.  Personally, I consider cross-compiling
a necessary evil to be minimized, and that's how Firmware Linux is designed.
We cross-compile just enough stuff to get a working native build environment
for the new platform, which we then run under emulation.</p>

<h3>Which emulator?</h3>

<p>The emulator Firmware Linux 0.8x used was User Mode Linux (here's a
<a href=http://www.landley.net/writing/docs/UML.html>UML mini-howto</a> I wrote
while getting this to work).  Since we already need the linux-kernel source
tarball anyway, building User Mode Linux from it was convenient and minimized
the number of packages we needed to build the minimal system.</p>

<p>The first stage of the build compiled a UML kernel and ran the rest of the
build under that, using UML's hostfs to mount the parent's root filesystem as
the root filesystem for the new UML kernel.  This solved both the kernel
version and the root access problems.  The UML kernel was the new version, and
supported all the new syscalls and ioctls and such that the uClibc was built to
expect, translating them to calls to the host system's C library as necessary.
Processes running under User Mode Linux had root access (at least as far as UML
was concerned), and although they couldn't write to the hostfs mounted root
partition, they could create an ext2 image file, loopback mount it, --bind
mount in directories from the hostfs partition to get the apps they needed,
and chroot into it.  Which is what the build did.</p>
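<p>Roughly (sizes and paths are illustrative):</p>

<blockquote><pre>
# Inside UML, which thinks we're root:
dd if=/dev/zero of=image.ext2 bs=1M count=256  # create an empty image file
mke2fs -F image.ext2                           # put a filesystem on it
mount -o loop image.ext2 /mnt                  # loopback mount it
mount --bind /tools /mnt/tools                 # borrow apps via hostfs
chroot /mnt /tools/bin/sh                      # and build in there
</pre></blockquote>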

<p>Current Firmware Linux has switched to a different emulator, QEMU, because
as long as we're cross-compiling anyway we might as well have the
ability to cross-compile for non-x86 targets.  We still build a new kernel
to run the uClibc binaries with the new kernel ABI, we just build a bootable
kernel and run it under QEMU.</p>

<p>The main difference with QEMU is a sharper dividing line between the host
system and the emulated target.  Under UML we could switch to the emulated
system early and still run host binaries (via the hostfs mount).  This meant
we could be much more relaxed about cross compiling, because we had one
environment that ran both types of binaries.  But this doesn't work if we're
building an ARM, PPC, or x86-64 system on an x86 host.</p>

<p>Instead, we need to sequence more carefully.  We build a cross-compiler,
use that to cross-compile a minimal intermediate system from the seven packages
listed earlier, and build a kernel and QEMU.  Then we run the kernel under QEMU
with the new intermediate system, and have it build the rest natively.</p>
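<p>Launching the emulated native build boils down to something like this
(the exact target, console device, and disk image name vary):</p>

<blockquote><pre>
# Boot the cross-compiled kernel with the intermediate system as its
# root filesystem, console on the terminal rather than a graphics window:
qemu-system-arm -nographic -kernel zImage -hda intermediate.ext2 \
    -append "root=/dev/hda console=ttyAMA0"
</pre></blockquote>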

<p>It's possible to use other emulators instead of QEMU, and I have a todo
item to look at armulator from uClinux.  (I looked at another nommu system
simulator at Ottawa Linux Symposium, but after resolving the third unnecessary
environmental dependency and still not being able to get it to finish compiling
yet, I gave up.  Armulator may be a patch against an obsolete version of gdb,
but I could at least get it to build.)</p>

<h1>Packaging</h1>

<p>The single file packaging combines a linux kernel, initramfs, squashfs
partition, and cryptographic signature.</p>

<p>In Linux 2.6, the kernel and initramfs are already combined into a single
file.  At the start of this file is either the obsolete floppy boot sector
(just a stub in 2.6), or an ELF header which has 12 used bytes followed by 8
unused bytes.  Either way, we can generally use the 4 bytes starting at offset
12 to store the original length of the kernel image, then append a squashfs
root partition to the file, followed by a whole-file cryptographic
signature.</p>
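<p>A sketch of the packaging step (tool names and the byte-poking detail are
illustrative; the real script also has to care about endianness and the
ELF-versus-bzImage header difference):</p>

<blockquote><pre>
SIZE=$(stat -c %s linux.bin)       # original kernel+initramfs length
cp linux.bin firmware.bin
mksquashfs rootfs/ root.sqf        # build the read-only root filesystem
cat root.sqf >> firmware.bin       # append it to the kernel image
# Poke $SIZE (as 4 little-endian bytes, produced however you like) into
# offset 12 without truncating the file:
printf "$SIZE_AS_4_BYTES" | dd of=firmware.bin bs=1 seek=12 conv=notrunc
# ...then append the whole-file cryptographic signature.
</pre></blockquote>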

<p>Loading an ELF kernel (such as User Mode Linux or a non-x86 ELF kernel)
is controlled by the ELF segments, so the appended data is ignored.
(Note: don't strip the file or the appended data will be lost.)  Loading an x86
bzImage kernel requires a modified boot loader that can be told the original
size of the kernel, rather than querying the current file length (which would
be too long).  Hence the patch to Lilo allowing a "length=xxx" argument in the
config file.</p>
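<p>A hypothetical lilo.conf stanza for the patched LILO (the exact spelling
comes from the patch, but the idea is):</p>

<blockquote><pre>
image = /firmware.bin
    label  = firmware
    length = 1234567   # original kernel size; LILO stops reading here
</pre></blockquote>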

<p>Upon boot, the kernel runs the initramfs code which finds the firmware
file.  In the case of User Mode Linux, the symlink /proc/self/exe points
to the path of the file.  A bootable kernel needs a command line argument
of the form firmware=device:/path/to/file (it can look up the device in
/sys/block and create a temporary device node to mount it with; this is
in expectation of dynamic major/minor happening sooner or later).
Once the file is found, /dev/loop0 is bound to it with an offset (losetup -o,
with a value extracted from the 4 bytes stored at offset 12 in the file), and
the resulting squashfs is used as the new root partition.</p>
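<p>Conceptually, the initramfs does something like this (paths illustrative;
od's byte-order assumption happens to match the host here):</p>

<blockquote><pre>
FILE=/path/to/firmware.bin              # found via firmware= or /proc/self/exe
OFFSET=$(od -An -tu4 -j12 -N4 $FILE)    # kernel length stored at offset 12
losetup -o $OFFSET /dev/loop0 $FILE     # loop device starts at the squashfs
mount -t squashfs /dev/loop0 /newroot   # that's the new root partition
</pre></blockquote>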

<p>The cryptographic signature can be verified on boot, but more importantly
it can be verified when upgrading the firmware.  New firmware images can
be installed beside old firmware, and LILO can be updated with boot options
for both firmware, with a default pointing to the _old_ firmware.  The
lilo -R option sets the command line for the next boot only, and that can
be used to boot into the new firmware.  The new firmware can run whatever
self-diagnostic is desired before permanently changing the default.  If the
new firmware doesn't boot (or fails its diagnostic), power cycle the machine
and the old firmware comes up.  (Note that grub does not have an equivalent
to LILO's -R option, which means that if the new firmware doesn't run,
you have a brick.)</p>
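<p>The upgrade sequence, in shell terms:</p>

<blockquote><pre>
# lilo.conf has entries for both images, with default= the old one.
lilo                   # reinstall the boot map with both entries
lilo -R newfirmware    # use the new entry on the next boot ONLY
reboot
# If the new image panics or fails its self-test, a power cycle falls
# back to the old default.  Once it passes, change default= and rerun lilo.
</pre></blockquote>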

<h2>Filesystem Layout</h2>

<p>Firmware Linux's directory hierarchy is a bit idiosyncratic: some redundant
directories have been merged, with symlinks at the standard locations pointing
to the new ones.  On the bright side, this makes it easy to
make the root partition read-only.</p>

<h3>Simplifying the $PATH.</h3>

<p>The set "bin->usr/bin, sbin->usr/sbin, lib->usr/lib" all serve to consolidate
all the executables under /usr.  This has a bunch of nice effects: making a
a read-only run-from-CD filesystem easier to do, allowing du /usr to show
the whole system size, allowing everything outside of there to be mounted
noexec, and of course having just one place to look for everything.  (Normal
executables are in /usr/bin.  Root only executables are in /usr/sbin.
Libraries are in /usr/lib.)</p>
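<p>Setting that up in a new root directory is just (a sketch):</p>

<blockquote><pre>
mkdir -p usr/bin usr/sbin usr/lib
ln -s usr/bin bin      # /bin -> /usr/bin
ln -s usr/sbin sbin    # /sbin -> /usr/sbin
ln -s usr/lib lib      # /lib -> /usr/lib
</pre></blockquote>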

<p>For those of you wondering why /bin and /usr/bin were split in the first
place, the answer is that Ken Thompson and Dennis Ritchie ran out
of space on the original 2.5 megabyte RK-05 disk pack their root partition
lived on in 1971, and leaked the OS into their second RK-05 disk pack, where
the user home directories lived.  When they got more disk space, they created
a new directory (/home) and moved all the user home directories there.</p>

<p>The real reason we kept it is tradition.  The excuse is that the root
partition contains early boot stuff and /usr may get mounted later, but these
days we use initial ramdisks (initrd and initramfs) to handle that sort of
thing.  The version skew issues of actually trying to mix and match different
versions of /lib/libc.so.* living on a local hard drive with a /usr/bin/*
from the network mount are not pretty.</p>

<p>I.E. the separation is just a historical relic, and I've consolidated it in
the name of simplicity.</p>

<p>On a related note, there's no reason for "/opt".  After the original Unix
leaked into /usr, Unix shipped out into the world in semi-standardized forms
(Version 7, System III, the Berkeley Software Distribution...) and sites that
installed these wanted places to add their own packages to the system without
mixing their additions in with the base system.  So they created "/usr/local"
and put a third instance of bin/sbin/lib and so on under there.  Then
Linux distributors wanted a place to install optional packages, and they had
/bin, /usr/bin, and /usr/local/bin to choose from, but the problem with each
of those is that they were already in use and thus might be cluttered by who
knows what.  So a new directory was created, /opt, for "optional" packages
like firefox or open office.</p>

<p>It's only a matter of time before somebody suggests /opt/local, and I'm
not humoring this.  Executables for everybody go in /usr/bin, ones usable
only by root go in /usr/sbin.  There's no /usr/local or /opt.  /bin and
/sbin are symlinks to the corresponding /usr directories, but there's no
reason to put them in the $PATH.</p>

<h3>Consolidating writeable directories.</h3>

<p>All the editable stuff has been moved under "var", starting with symlinking
tmp->var/tmp.  Although /tmp is much less useful these days than it used to
be, some things (like X) still love to stick things like named pipes in there.
Long ago in the days of little hard drive space and even less ram, people made
extensive use of temporary files and they threw them in /tmp because ~home
had an ironclad quota.  These days, putting anything in /tmp with a predictable
filename is a security issue (symlink attacks: you can be tricked into
overwriting any file you have access to).  Most temporary files for things
like the printer or email migrated to /var/spool (where there are
persistent subdirectories with known ownership and permissions) or in the
user's home directory under something like "~/.kde".</p>

<p>The theoretical difference between /tmp and /var/tmp is that the contents
of /tmp should be deleted by the system init scripts on every
reboot, but the contents of /var/tmp may be preserved across reboots.  Except
there's no guarantee that the contents of any temp directory won't be deleted.
So any program that actually depends on the contents of /var/tmp being
preserved across a reboot is obviously broken, and there's no reason not to
just symlink them together.</p>

<p>(In case it hasn't become apparent yet, there's 30 years of accumulated cruft
in the standards, covering a lot of cases that don't apply outside of
supercomputing centers where 500 people share accounts on a mainframe that
has a dedicated support staff.  They serve no purpose on a laptop, let alone
an embedded system.)</p>

<p>The corner case is /etc, which can be writeable (we symlink it to
var/etc) or a read-only part of the / partition.   It's really a question of
whether you want to update configuration information and user accounts in a
running system, or whether that stuff should be fixed before deploying.
We're doing some cleanup, but leaving /etc writeable (as a symlink to
/var/etc).  Firmware Linux symlinks /etc/mtab->/proc/mounts, which
is required by modern stuff like shared subtrees.  If you want a read-only
/etc, use "find /etc -type f | xargs ls -lt" to see what gets updated on the
live system.  Some specific cases are that /etc/adjtime was moved to /var
by LSB and /etc/resolv.conf should be a symlink somewhere writeable.</p>
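<p>In symlink terms (a sketch, run in the new root):</p>

<blockquote><pre>
ln -s var/etc etc                  # writeable /etc lives under /var
ln -s /proc/mounts /var/etc/mtab   # so /etc/mtab tracks the kernel
</pre></blockquote>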

<h3>The resulting mount points</h3>

<p>The result of all this is that a running system can have / be mounted read
only (with /usr living under that), /var can be ramfs or tmpfs with a tarball
extracted to initialize it on boot, /dev can be ramfs/tmpfs managed by udev or
mdev (with /dev/pts as devpts under that: note that /dev/shm naturally inherits
/dev's tmpfs and some things like User Mode Linux get upset if /dev/shm is
mounted noexec), /proc can be procfs, /sys can be sysfs.  Optionally, /home
can be an actual writeable filesystem on a hard drive or the network.</p>
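<p>In init-script terms, that layout comes out to roughly this (the /var
tarball path and the mdev-versus-udev choice are illustrative):</p>

<blockquote><pre>
mount -o remount,ro /                        # root (and /usr) read-only
mount -t tmpfs tmpfs /var                    # writeable space
tar xf /usr/share/var-template.tar -C /var   # initialize /var's contents
mount -t tmpfs tmpfs /dev && mdev -s         # not noexec: /dev/shm inherits this
mount -t devpts devpts /dev/pts
mount -t proc proc /proc
mount -t sysfs sysfs /sys
</pre></blockquote>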

<p>Remember to put root's home directory somewhere writeable (I.E. /root
should move to either /var/root or /home/root; change the passwd entry to do
this), and life is good.</p>

<--#include "footer.html" -->