changeset 7:f8c588578fa1

Finish shuffling old website material into new website.
author Rob Landley <rob@landley.net>
date Tue, 28 Nov 2006 16:08:42 -0500
parents e039588b3189
children 0068264ad65a
files www/build-process.html www/design.html www/index.html
diffstat 3 files changed, 308 insertions(+), 451 deletions(-)
--- a/www/build-process.html	Mon Nov 27 19:39:22 2006 -0500
+++ b/www/build-process.html	Tue Nov 28 16:08:42 2006 -0500
@@ -1,12 +1,13 @@
 <html>
 <title>The Firmware Linux build process</title>
 
-<p>FWL builds a cross-compiler and then uses it to build a minimal system with
-a native compiler, BusyBox and uClibc.  Then it runs this minimal system
-under an emulator (QEMU) and natively builds the final system.  It then
+<h1>Executive summary</h1>
+
+<p>FWL builds a cross-compiler and then uses it to build a minimal system
+containing a native compiler, BusyBox and uClibc.  Then it runs this minimal
+system under an emulator (QEMU) to natively build the final system.  Finally it
 packages the resulting system (kernel, initramfs, and root filesystem) into
-a single file that can boot and run (using a modified version of LILO on
-x86).</p>
+a single file that can boot and run (on x86 by using a modified version of LILO).</p>
 
 <p>Firmware Linux builds in stages:</p>
 
@@ -34,13 +35,13 @@
 for the target.</h2>
 
 <p>Because cross-compiling is persnickety and difficult, we do as little of
-it as possible.  We use the cross-compiler to generate a native build
-environment for the target, and then run the rest of the build under an
-emulator.</p>
+it as possible.  Instead we use the cross-compiler to generate the smallest
+possible native build environment for the target, and then run the rest of the
+build in that environment, under an emulator.</p>
 
-<p>The minimal build environment you can boot into and build a complete Linux
-system under is the Linux kernel, binutils, gcc, uClibc, BusyBox, make, and
-bash.  The emulator we use to run this is QEMU, so we build that too.</p>
+<p>The emulator we use is QEMU.  The minimal build environment powerful enough
+to boot and compile a complete Linux system requires seven packages: the Linux
+kernel, binutils, gcc, uClibc, BusyBox, make, and bash.</p>
 
 <h2>Stage 3: Run the target's native build environment under an emulator to
 build the final system.</h2>
@@ -50,12 +51,11 @@
 
 <p>A trick to accelerate the build is to use distcc to call out to the
 cross-compiler, feeding the results back into the emulator through the virtual
-network.  This is still a TODO item.</p>
+network.  (This is still a TODO item.)</p>
 
 <p>Stage 3 is a fairly straightforward
 <a href=http://www.linuxfromscratch.org>Linux From Scratch</a> approach,
-except that we use BusyBox and uClibc in place of a couple dozen other
-packages.</p>
+except that we use BusyBox and uClibc instead of the gnu packages.</p>
 
 <h2>Stage 4: Package the system into a firmware file.</h2>
 
@@ -64,5 +64,223 @@
 into a single file.  A modified version of LILO is included which can boot and
 run this file on x86.</p>
 
-</body>
-</html>
+<hr>
+
+<h1>Evolution of the Firmware Linux build process.</h1>
+
+<h2>The basic theory</h2>
+
+<p>The Linux From Scratch approach is to build a minimal intermediate system
+with just enough packages to be able to compile stuff, chroot into that, and
+build the final system from there.  This isolates the host from the target,
+which means you should be able to build under a wide variety of distributions.
+It also means the final system is built with a known set of tools, so you get
+a consistent result.</p>
+
+<p>A minimal build environment consists of a C library, a compiler, and BusyBox.
+So in theory you just need three packages:</p>
+
+<ul>
+  <li>A C library (uClibc)</li>
+  <li>A toolchain (tcc)</li>
+  <li>BusyBox</li>
+</ul>
+
+<p>Unfortunately, that doesn't work yet.</p>
+
+<h2>Some differences between theory and reality.</h2>
+
+<h3>Environmental dependencies.</h3>
+
+<p>Environmental dependencies are things that need to be installed before you
+can build or run a given package.  Lots of packages depend on things like zlib,
+SDL, texinfo, and all sorts of other strange things.  (The GnuCash project
+stalled years ago after it released a version with so many environmental
+dependencies it was impossible to build or install.  Environmental dependencies
+have a complexity cost, and are thus something to be minimized.)</p>
+
+<p>A good build system will scan its environment to figure out what it has
+available, and disable functionality that depends on stuff that isn't
+available.  (This is generally done with autoconf, which is disgusting but
+suffers from a lack of alternatives.)  That way, the complexity cost is
+optional: you can build a minimal version of the package if that's all you
+need.</p>
+
+<p>A really good build system can be told that the environment
+it's building in and the environment the result will run in are different,
+so just because it finds zlib on the build system doesn't mean that the
+target system will have zlib installed on it.  (And even if it does, it may not
+be the same version.  This is one of the big things that makes cross-compiling
+such a pain.  One big reason for statically linking programs is to eliminate
+this kind of environmental dependency.)</p>
+
+<p>The Firmware Linux build process is structured the way it is to eliminate
+as many environmental dependencies as possible.  Some are unavoidable (such as
+C libraries needing kernel headers or gcc needing binutils), but the
+intermediate system is the minimal fully functional Linux development
+environment I currently know how to build, and then we switch into that and
+work our way back up from there by building more packages in the new
+environment.</p>
+
+<h3>Resolving environmental dependencies.</h3>
+
+<p><b>To build uClibc you need kernel headers</b> identifying the syscalls and
+such it can make to the OS.  Way back when, you could use the kernel headers
+straight out of the Linux kernel 2.4 tarball and they'd work fine, but sometime
+during 2.5 the kernel developers decided that exporting a sane API to userspace
+wasn't the kernel's job, and stopped doing it.</p>
+
+<p>The 0.8x series of Firmware Linux used
+<a href=http://ep09.pld-linux.org/~mmazur/linux-libc-headers/>kernel
+headers manually cleaned up by Mariusz Mazur</a>, but after the 2.6.12 kernel
+he had an attack of real life and fell too far behind to catch up again.</p>
+
+<p>The current practice is to use the Linux kernel's "make headers_install"
+target, created by David Woodhouse.  This runs various scripts against the
+kernel headers to sanitize them for use by userspace.  This was merged in
+2.6.18-rc1, and was more or less debugged by 2.6.19.  So we can use the Linux
+Kernel tarball as a source of headers again.</p>
+
+<p>Another problem is that the busybox shell situation is a mess with four
+implementations that share little or no code (depending on how they're
+configured).  The first question when trying to fix them is "which of the four
+do you fix?", and I'm just not going there.  So until bbsh goes in we
+<b>substitute bash</b>.</p>
+
+<p>Finally, <b>most packages expect gcc</b>.  The tcc project isn't a drop-in
+gcc replacement yet, and doesn't include a "make" program.  Most importantly,
+tcc development appears stalled because Fabrice Bellard's other major project
+(qemu) is taking up all his time these days.  In 2004 Fabrice
+<a href=http://fabrice.bellard.free.fr/tcc/tccboot.html>built a modified Linux
+kernel with tcc</a>, and
+<a href=http://fabrice.bellard.free.fr/tcc/tccboot_readme.html>listed</a>
+what needed to be upgraded in TCC to build an unmodified kernel, but
+since then he hardly seems to have touched tcc.  Hopefully, someday he'll get
+back to it and put out a 1.0 release of tcc that's a drop-in gcc replacement.
+(And if he does, I'll add a make implementation to toybox so we don't need
+to use any of the gnu toolchain).  But in the meantime the only open source
+compiler that can build a complete Linux system is still the gnu compiler.</p>
+
+<p>The gnu compiler actually consists of three packages <b>(binutils, gcc, and
+make)</b>, which is why it's generally called the gnu "toolchain".  (The split
+between binutils and gcc is for purely historical reasons, and you have
+to match the right versions with each other or things break.)</p>
+
+<p>This means that to compile a minimal build environment, you need seven
+packages, and to actually run the result we use an eighth package (QEMU).</p>
+
+<p>This can actually be made to work.  The next question is how?</p>
+
+<h2>Additional complications</h2>
+
+<h3>Cross-compiling and avoiding root access</h3>
+
+<p>The first problem is that we're cross-compiling.  We can't help it.
+You're cross-compiling any time you create target binaries that won't run on
+the host system.  Even when both the host and target are on the same processor,
+if they're sufficiently different that one can't run the other's binaries, then
+you're cross-compiling.  In our case, the host is usually running both a
+different C library and an older kernel version than the target, even when
+it's the same processor.</p>
+
+<p>The second problem is that we want to avoid requiring root access to build
+Firmware Linux.  If the build can run as a normal user, it's a lot more
+portable and a lot less likely to muck up the host system if something goes
+wrong.  This means we can't modify the host's / directory (making anything
+that requires absolute paths problematic).  We also can't mknod, chown, chgrp,
+mount (for --bind, loopback, tmpfs)...</p>
+
+<p>In addition, the gnu toolchain (gcc/binutils) is chock-full of hardwired
+assumptions, such as what C library it's linking binaries against, where to look
+for #included headers, where to look for libraries, the absolute path the
+compiler is installed at...  Silliest of all, it assumes that if the host and
+target use the same processor, you're not cross-compiling (even if they have
+a different C library and a different kernel, and even if you ./configure it
+for cross-compiling it switches that back off because it knows better than
+you do).  This makes it very brittle, and it also tends to leak its assumptions
+into the programs it builds.  New versions may someday fix this, but for now we
+have to hit it on the head repeatedly with a metal bar to get anything remotely
+useful out of it, and run it in a separate filesystem (chroot environment) so
+it can't reach out and grab the wrong headers or wrong libraries despite
+everything we've told it.</p>
+
+<p>The absolute paths problem affects target binaries because all dynamically
+linked apps expect their shared library loader to live at an absolute path
+(in this case /lib/ld-uClibc.so.0).  This directory is only writeable by root,
+and even if we could install it there, polluting the host like that is just
+ugly.</p>
+
+<p>The Firmware Linux build has to assume it's cross-compiling because the host
+is generally running glibc, and the target is running uClibc, so the libraries
+the target binaries need aren't installed on the host.  Even if they're
+statically linked (which also mitigates the absolute paths problem somewhat),
+the target often has a newer kernel than the host, so the set of syscalls
+uClibc makes (thinking it's talking to the new kernel, since that's the ABI
+described by the kernel headers it was built against) may not be entirely
+understood by the old kernel, leading to segfaults.  (One of the reasons glibc
+is larger than uClibc is it checks the kernel to see if it supports things
+like long filenames or 32-bit device nodes before trying to use them.  uClibc
+should always work on a newer kernel than the one it was built to expect, but
+not necessarily an older one.)</p>
+
+<h2>Ways to make it all work</h2>
+
+<h3>Cross compiling vs native compiling under emulation</h3>
+
+<p>Cross compiling is a pain.  There are a lot of ways to get it to sort of
+kinda work for certain versions of certain packages built on certain versions
+of certain distributions.  But making it reliable or generally applicable is
+hard to do.</p>
+
+<p>I wrote an <a href=/writing/docs/cross-compiling.html>introduction
+to cross-compiling</a> which explains the terminology, plusses and minuses,
+and why you might want to do it.  Keep in mind that I wrote that for a company
+that specializes in cross-compiling.  Personally, I consider cross-compiling
+a necessary evil to be minimized, and that's how Firmware Linux is designed.
+We cross-compile just enough stuff to get a working native build environment
+for the new platform, which we then run under emulation.</p>
+
+<h3>Which emulator?</h3>
+
+<p>The emulator Firmware Linux 0.8x used was User Mode Linux (here's a
+<a href=http://www.landley.net/writing/docs/UML.html>UML mini-howto</a> I wrote
+while getting this to work).  Since we already need the linux-kernel source
+tarball anyway, building User Mode Linux from it was convenient and minimized
+the number of packages we needed to build the minimal system.</p>
+
+<p>The first stage of the build compiled a UML kernel and ran the rest of the
+build under that, using UML's hostfs to mount the parent's root filesystem as
+the root filesystem for the new UML kernel.  This solved both the kernel
+version and the root access problems.  The UML kernel was the new version, and
+supported all the new syscalls and ioctls and such that the uClibc was built to
+expect, translating them to calls to the host system's C library as necessary.
+Processes running under User Mode Linux had root access (at least as far as UML
+was concerned), and although they couldn't write to the hostfs mounted root
+partition, they could create an ext2 image file, loopback mount it, --bind
+mount in directories from the hostfs partition to get the apps they needed,
+and chroot into it.  Which is what the build did.</p>
+
+<p>Current Firmware Linux has switched to a different emulator, QEMU, because
+as long as we're cross-compiling anyway we might as well have the
+ability to cross-compile for non-x86 targets.  We still build a new kernel
+to run the uClibc binaries with the new kernel ABI; we just build a bootable
+kernel and run it under QEMU.</p>
+
+<p>The main difference with QEMU is a sharper dividing line between the host
+system and the emulated target.  Under UML we could switch to the emulated
+system early and still run host binaries (via the hostfs mount).  This meant
+we could be much more relaxed about cross compiling, because we had one
+environment that ran both types of binaries.  But this doesn't work if we're
+building an ARM, PPC, or x86-64 system on an x86 host.</p>
+
+<p>Instead, we need to sequence more carefully.  We build a cross-compiler,
+use that to cross-compile a minimal intermediate system from the seven packages
+listed earlier, and build a kernel and QEMU.  Then we run the kernel under QEMU
+with the new intermediate system, and have it build the rest natively.</p>
+
+<p>It's possible to use other emulators instead of QEMU, and I have a todo
+item to look at armulator from uClinux.  (I looked at another nommu system
+simulator at Ottawa Linux Symposium, but after resolving the third unnecessary
+environmental dependency and still not being able to get it to finish compiling
+yet, I gave up.  Armulator may be a patch against an obsolete version of gdb,
+but I could at least get it to build.)</p>
--- a/www/design.html	Mon Nov 27 19:39:22 2006 -0500
+++ b/www/design.html	Tue Nov 28 16:08:42 2006 -0500
@@ -1,289 +1,64 @@
 <title>Flimsy rationalizations for all of my design mistakes</title>
 
-<h1>Build Process</h1>
-
-<h2>Executive summary</h2>
-
-<p>Cross-compile just enough to get a native compiler for the new environment,
-and then emulate the new environment with QEMU to build the final system
-natively.</p>
-
-<p>The intermediate system is built and run using only the following eight
-packages:</p>
+<h2>What is it?</h2>
 
-<ul>
-<li>linux-kernel</li>
-<li>uclibc</li>
-<li>busybox</li>
-<li>binutils</li>
-<li>gcc</li>
-<li>make</li>
-<li>bash<li>
-<li>QEMU</li>
-</ul>
-
-<h2>The basic theory</h2>
-
-<p>What we want to do is build a minimal intermediate system with just enough
-packages to be able to compile stuff, chroot into that, and build the final
-system from there.  This isolates the host from the target, which means you
-should be able to build under a wide variety of distributions.  It also means
-the final system is built with a known set of tools, so you get a consistent
-result.</p>
-
-<p>A minimal build environment consists of a C library, a compiler, and BusyBox.
-So in theory you just need three packages:</p>
+<h2>Firmware Linux is a bootable single file linux system.</h2>
 
-<ul>
-  <li>A C library (uClibc)</li>
-  <li>A toolchain (tcc)</li>
-  <li>BusyBox</li>
-</ul>
-
-<p>Unfortunately, that doesn't work yet.</p>
-
-<h2>Some differences between theory and reality.</h2>
-
-<h3>Environmental dependencies.</h2>
-
-<p>Environmental dependencies are things that need to be installed before you
-can build or run a given package.  Lots of packages depend on things like zlib,
-SDL, texinfo, and all sorts of other strange things.  (The GnuCash project
-stalled years ago after it released a version with so many environmental
-dependencies it was impossible to build or install.  Environmental dependencies
-have a complexity cost, and are thus something to be minimized.)</p>
-
-<p>A good build system will scan its environment to figure out what it has
-available, and disable functionality that depends on stuff that isn't
-available.  (This is generally done with autoconf, which is disgusting but
-suffers from a lack of alternatives.)  That way, the complexity cost is
-optional: you can build a minimal version of the package if that's all you
-need.</p>
-
-<p>A really good build system can be told that the environment
-it's building in and the environment the result will run in are different,
-so just because it finds zlib on the build system doesn't mean that the
-target system will have zlib installed on it.  (And even if it does, it may not
-be the same version.  This is one of the big things that makes cross-compiling
-such a pain.  One big reason for statically linking programs is to eliminate
-this kind of environmental dependency.)</p>
+<p>Firmware Linux is one file containing a kernel, initramfs, read-only root
+filesystem, and cryptographic signature.  You can boot Linux from this file
+as if it were a normal kernel image (a slightly modified LILO is required on
+x86; patches for other bootloaders are a to-do item).  You can upgrade
+your entire OS (and any applications in the root filesystem) atomically, by
+downloading a new file and pointing your bootloader at it.</p>
 
-<p>The Firmware Linux build process is structured the way it is to eliminate
-environmental dependencies.  Some are unavoidable (such as C libraries needing
-kernel headers or gcc needing binutils), but the intermediate system is
-the minimal fully functional Linux development environment I currently know
-how to build, and then we chroot into that and work our way back up from there
-by building more packages in the new environment.</p>
-
-<h3>Resolving environmental dependencies.</h2>
-
-<p><b>To build uClibc you need kernel headers</b> identifying the syscalls and
-such it can make to the OS.  Way back when you could use the kernel headers
-straight out of the Linux kernel 2.4 tarball and they'd work fine, but sometime
-during 2.5 the kernel developers decided that exporting a sane API to userspace
-wasn't the kernel's job, and stopped doing it.</p>
+<p>Firmware Linux is a Linux distro using busybox and uClibc as the basis for
+a self-hosting development environment.  The only gnu utilities used in the
+build are gcc, binutils, make, and bash.  At some point in the future tcc and
+toybox should be able to replace these, at which point a version of Linux
+will exist that even the FSF can't stick an undeserved GNU/ prefix on.</p>
 
-<p>The 0.8x series of Firmware Linux used
-<a href=http://ep09.pld-linux.org/~mmazur/linux-libc-headers/>kernel
-headers manually cleaned up by Mariusz Mazur</a>, but after the 2.6.12 kernel
-he had an attack of real life and fell too far behind to catch up again.</p>
-
-<p>The current practice is to use the 2.6.18 kernel's "make headers_install"
-target, created by David Woodhouse.  This runs various scripts against the
-kernel headers to sanitize them for use by userspace.  This was merged in
-2.6.18-rc1, so as of 2.6.18 we can use the Linux Kernel tarball as a source of
-headers again.</p>
-
-<p>Another problem is that the busybox shell situation is a mess with four
-implementations that share little or no code (depending on how they're
-configured).  The first question when trying to fix them is "which of the four
-do you fix?", and I'm just not going there.  So until bbsh goes in we
-<b>substitute bash</b>.</p>
+<h2>Packaging</h2>
 
-<p>Finally, <b>most packages expect gcc</b>.  The tcc project isn't a drop-in
-gcc replacement yet, and doesn't include a "make" program.  Most importantly,
-tcc development appears stalled because Fabrice Bellard's other major project
-(qemu) is taking up all his time these days.  In 2004 Fabrice
-<a href=http://fabrice.bellard.free.fr/tcc/tccboot.html>built a modified Linux
-kernel with tcc</a>, and
-<a href=http://fabrice.bellard.free.fr/tcc/tccboot_readme.html>listed</a>
-what needed to be upgraded in TCC to build an unmodified kernel, but
-since then he hardly seems to have touched tcc.  Hopefully, someday he'll get
-back to it and put out a 1.0 release of tcc that's a drop-in gcc replacment.
-(And if he does, I'll add a make implementation to BusyBox so we don't need
-to use any of the gnu toolchain).  But in the meantime the only open source
-compiler that can build a complete Linux system is still the gnu compiler.</p>
+<p>The single file packaging combines a linux kernel, initramfs, squashfs
+partition, and cryptographic signature.</p>
 
-<p>The gnu compiler actually consists of three packages <b>(binutils, gcc, and
-make)</b>, which is why it's generally called the gnu "toolchain".  (The split
-between binutils and gcc is for purely historical reasons, and you have
-to match the right versions with each other or things break.)</p>
-
-<p>This means that to compile a minimal build environment, you need seven
-packages, and to actually run the result we use an eighth package (QEMU).</p>
-
-<p>This can actually be made to work.  The next question is how?</p>
-
-<h2>Additional complications</h2>
-
-<h3>Cross-compiling and avoiding root access</h2>
-
-<p>The first problem is that we're cross-compiling.  We can't help it.
-You're cross-compiling any time you create target binaries that won't run on
-the host system.  Even when both the host and target are on the same processor,
-if they're sufficiently different that one can't run the other's binaries, then
-you're cross-compiling.  In our case, the host is usually running both a
-different C library and an older kernel version than the target, even when
-it's the same processor.</p>
+<p>In 2.6, the kernel and initramfs are already combined into a single file.
+At the start of this file is either the obsolete floppy boot sector (just
+a stub in 2.6), or an ELF header which has 12 used bytes followed by 8 unused
+bytes.  Either way, we can generally use the 4 bytes starting at offset 12 to
+store the original length of the kernel image, then append a squashfs root
+partition to the file, followed by a whole-file cryptographic signature.</p>
 
-<p>The second problem is that we want to avoid requiring root access to build
-Firmware Linux.  If the build can run as a normal user, it's a lot more
-portable and a lot less likely to muck up the host system if something goes
-wrong.  This means we can't modify the host's / directory (making anything
-that requires absolute paths problematic).  We also can't mknod, chown, chgrp,
-mount (for --bind, loopback, tmpfs)...</p>
-
-<p>In addition, the gnu toolchain (gcc/binutils) is chock-full of hardwired
-assumptions, such as what C library it's linking binaries against, where to look
-for #included headers, where to look for libraries, the absolute path the
-compiler is installed at...  Silliest of all, it assumes that if the host and
-target use the same processor, you're not cross-compiling (even if they have
-a different C library and a different kernel, and even if you ./configure it
-for cross-compiling it switches that back off because it knows better than
-you do).  This makes it very brittle, and it also tends to leak its assumptions
-into the programs it builds.  New versions may someday fix this, but for now we
-have to hit it on the head repeatedly with a metal bar to get anything remotely
-useful out of it, and run it in a separate filesystem (chroot environment) so
-it can't reach out and grab the wrong headers or wrong libraries despite
-everything we've told it.</p>
-
-<p>The absolute paths problem affects target binaries because all dynamically
-linked apps expect their shared library loader to live at an absolute path
-(in this case /lib/ld-uClibc.so.0).  This directory is only writeable by root,
-and even if we could install it there polluting the host like that is just
-ugly.</p>
+<p>A User Mode Linux executable or non-x86 ELF image should still run just fine
+(if loading is controlled by the ELF segments, the appended data is ignored).
+Note: don't strip the file or the appended data will be lost.</p>
 
-<p>The Firmware Linux build has to assume it's cross-compiling because the host
-is generally running glibc, and the target is running uClibc, so the libraries
-the target binaries need aren't installed on the host.  Even if they're
-statically linked (which also mitigates the absolute paths problem somewhat),
-the target often has a newer kernel than the host, so the set of syscalls
-uClibc makes (thinking it's talking to the new kernel, since that's what the
-ABI the kernel headers it was built against describe) may not be entirely
-understood by the old kernel, leading to segfaults.  (One of the reasons glibc
-is larger than uClibc is it checks the kernel to see if it supports things
-like long filenames or 32-bit device nodes before trying to use them.  uClibc
-should always work on a newer kernel than the one it was built to expect, but
-not necessarily an older one.)</p>
-
-<h2>Ways to make it all work</h2>
-
-<h3>Cross compiling vs native compiling under emulation</h3>
-
-<p>Cross compiling is a pain.  There are a lot of ways to get it to sort of
-kinda work for certain versions of certain packages built on certain versions
-of certain distributions.  But making it reliable or generally applicable is
-hard to do.</p>
-
-<p>I wrote an
-<a href=https://crossdev.timesys.com/documentation/introduction-to-cross-compiling-for-linux/>introduction
-to cross-compiling</a> which explains the terminology, plusses and minuses,
-and why you might want to do it.  Keep in mind that I wrote that for a company
-that specializes in cross-compiling.  Personally, I consider cross-compiling
-a necessary evil to be minimized, and that's how Firmware Linux is designed.
-We cross-compile just enough stuff to get a working native compiler for the
-new platform, which we then run under emulation.</p>
-
-<h3>Which emulator?</h3>
-
-<p>The emulator Firmware Linux 0.8x used was User Mode Linux (here's a
-<a href=http://www.landley.net/code/UML.html>UML mini-howto</a> I wrote
-while getting this to work).  Since we already need the linux-kernel source
-tarball anyway, building User Mode Linux from it was convenient and minimized
-the number of packages we needed to build the minimal system.</p>
+<p>Loading an x86 bzImage kernel requires a modified boot loader that
+can be told the original size of the kernel, rather than querying the current
+file length, which would be too long.  Hence the patch to LILO allowing
+a "length=xxx" argument in the config file.</p>
 
-<p>The first stage of the build compiled a UML kernel and ran the rest of the
-build under that, using UML's hostfs to mount the parent's root filesystem as
-the root filesystem for the new UML kernel.  This solved both the kernel
-version and the root access problems.  The UML kernel was the new version, and
-supported all the new syscalls and ioctls and such that the uClibc was built to
-expect, translating them to calls to the host system's C library as necessary.
-Processes running under User Mode Linux had root access (at least as far as UML
-was concerned), and although they couldn't write to the hostfs mounted root
-partition, they could create an ext2 image file, loopback mount it, --bind
-mount in directories from the hostfs partition to get the apps they needed,
-and chroot into it.  Which is what the build did.</p>
-
-<p>Current Firmware Linux has switched to a different emulator, QEMU, because
-as long as we're we're cross-compiling anyway we might as well have the
-ability to cross-compile for non-x86 targets.  We still build a new kernel
-to run the uClibc binaries with the new kernel ABI, we just build a bootable
-kernel and run it under QEMU.</p>
-
-<p>The main difference with QEMU is a sharper dividing line between the host
-system and the emulated target.  Under UML we could switch to the emulated
-system early and still run host binaries (via the hostfs mount).  This meant
-we could be much more relaxed about cross compiling, because we had one
-environment that ran both types of binaries.  But this doesn't work if we're
-building an ARM, PPC, or x86-64 system on an x86 host.</p>
-
-<p>Instead, we need to sequence more carefully.  We build a cross-compiler,
-use that to cross-compile a minimal intermediate system from the seven packages
-listed earlier, and build a kernel and QEMU.  Then we run the kernel under QEMU
-with the new intermediate system, and have it build the rest natively.</p>
-
-<p>It's possible to use other emulators instead of QEMU, and I have a todo
-item to look at armulator.  (I looked at another nommu system simulator at
-Ottawa Linux Symposium, but after resolving the third unnecessary environmental
-dependency and still not being able to get it to finish compiling yet, I
-gave up.  Armulator may be a patch against an obsolete version of gdb, but I
-could at least get it to build.)</p>
-
-<h3>Alternatives to emulation</h3>
+<p>Upon boot, the kernel runs the initramfs code which finds the firmware
+file.  In the case of User Mode Linux, the symlink /proc/self/exe points
+to the path of the file.  A bootable kernel needs a command line argument
+of the form firmware=device:/path/to/file (it can look up the device in
+/sys/block and create a temporary device node to mount it with; this is
+in expectation of dynamic major/minor happening sooner or later).
+Once the file is found, /dev/loop0 is bound to it with an offset (losetup -o,
+with a value extracted from the 4 bytes stored at offset 12 in the file), and
+the resulting squashfs is used as the new root partition.</p>
 
-<p>The main downsides of emulation are that is it's slow, can use a lot of
-memory, and can be tricky to debug if something goes wrong in the emulated
-environment.  Cross compiling is sufficiently harder than native compiling that
-I consider it a good trade-off, but there are alternatives.</p>
-
-<p>Some other build systems (such as uClibc's Buildroot) use a package called
-<a href=http://freshmeat.net/projects/fakeroot/>fakeroot</a>, which is sort
-of a halfway emulator.  It creates an environment where binaries run as if
-they had root access, but without being able to do anything that actually
-requires root access.  This is nice if you want to create tarballs with
-device nodes and different ownership in them, but not so useful if you want
-to actually use one of those device nodes, or twiddle mount points.  Firmware
-Linux doesn't use fakeroot (we use a real emulator instead), but it's
-an option.</p> 
-
-<p>In theory, we could work around the "host hasn't got uClibc" problem by
-statically linking our apps for the intermediate system, and work around the
-"host kernel older than the kernel headers we're using" problem by either
-building the intermediate version of uClibc with the host's kernel headers
-or just linking against glibc instead of uClibc.</p>
-
-<p>This has a number of
-downsides: harvesting the host's kernel headers is distribution-specific, and
-could easily leak bits of the host into the final system.  Linking the host
-tools against glibc (or a temporary version of uClibc built with different
-kernel headers) doesn't give us as much evidence that the resulting system
-will be able to rebuild itself under itself, and statically linking against
-glibc wastes a regrettable amount of space.  None of this works with real
-cross-compiling between different processors (such as building an ARM system
-from x86).</p>
-
-<p>We'd still have to solve the other problems (such as gcc wanting absolute
-paths) anyway, there just wouldn't be a switchover point where we could
-run the binaries we were building and start native compiling.  Instead we'd
-have to keep cross-compiling all the way to the final system, and if anything's
-wrong with it we wouldn't find out until we tried to run it.  With the native
-build, we've given the tools a bit of a workout during the build, so if the
-build completes then the finished system shouldn't have anything too
-fundamentally wrong with it.</p>
-
-<p>(Note: QEMU can export a host directory to the target through the emulated
-network card as an smb filesystem, but you don't want to run your root
-filesystem on smb.)</p>
+<p>The cryptographic signature can be verified on boot, but more importantly
+it can be verified when upgrading the firmware.  New firmware images can
+be installed beside old firmware, and LILO can be updated with boot options
+for both firmware, with a default pointing to the _old_ firmware.  The
+lilo -R option sets the command line for the next boot only, and that can
+be used to boot into the new firmware.  The new firmware can run whatever
+self-diagnostic is desired before permanently changing the default.  If the
+new firmware doesn't boot (or fails its diagnostic), power cycle the machine
+and the old firmware comes up.  (Note that grub does not have an equivalent
+for LILO's -R option, which would mean that if the new firmware doesn't run,
+you have a brick.)</p>
 
 <h2>Filesystem Layout</h2>
 
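Note on the loop-mount step in the added paragraph above: the following is a minimal sketch, not the project's actual initramfs code, of how the value for "losetup -o" could be read from the 4 bytes at offset 12 of the firmware file. The function names, the little-endian byte order, and the assumption that the squashfs starts exactly at the stored value are all illustrative guesses; the text only says the value is "extracted from" those bytes.

```python
# Hypothetical helper names; only "4 bytes at offset 12" and "losetup -o"
# come from the text above.
FIELD_POS = 12   # position of the stored value in the firmware file
FIELD_LEN = 4    # width of the stored value in bytes


def read_squashfs_offset(path, byteorder="little"):
    """Return the integer stored in the 4 bytes at offset 12 of the file.

    The byte order is an assumption (little-endian, as on x86).
    """
    with open(path, "rb") as f:
        f.seek(FIELD_POS)
        raw = f.read(FIELD_LEN)
    if len(raw) != FIELD_LEN:
        raise ValueError("file too short to contain the offset field")
    return int.from_bytes(raw, byteorder)


def losetup_command(path, loop_device="/dev/loop0", byteorder="little"):
    """Build (but do not run) the losetup invocation as an argument list."""
    offset = read_squashfs_offset(path, byteorder)
    return ["losetup", "-o", str(offset), loop_device, path]
```

The command is returned as a list rather than executed so that it can be inspected without root access; passing it to subprocess.run() on a real system would bind the loop device.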
--- a/www/index.html	Mon Nov 27 19:39:22 2006 -0500
+++ b/www/index.html	Tue Nov 28 16:08:42 2006 -0500
@@ -5,8 +5,9 @@
 
 <b><h2>What is it?</h2></b>
 
-<p>Firmware Linux is an embedded Linux distribution builder.  It's basically
-a shell script that builds a complete Linux system from source code.</p>
+<p>Firmware Linux is an embedded Linux distribution builder that creates a
+bootable single file Linux system, based on uClibc and BusyBox/toybox.  It's
+basically a shell script that builds a complete Linux system from source code.</p>
 
 <p>FWL builds a cross-compiler and then uses it to build a minimal system
 containing a native compiler, BusyBox and uClibc.  Then it runs this minimal
@@ -24,7 +25,7 @@
 <p>The current stuff is available from <a href=/hg/firmware>the mercurial
 repository.  That's the new (QEMU-based, capable of cross-compiling for
 different hardware platforms) design I'm working on now, and where new
-development happens.</p>
+development happens.  To use it, download the tarball and run "./build.sh".</p>
 
 <p>The old (UML-based, x86 only) design is still available from <a href=old>the
 old website</a>, which is hideously out of date but contains a working
@@ -38,7 +39,10 @@
 <p>Here is a description of <a href=design.html>the design of Firmware
 Linux</a>.</p>
 
-<b><h2>Status</h2></b>
+<p>As always, read <a href=/notes.html>my development log</a> to see what I've
+been up to on this project.</p>
+
+<b><h2>History</h2></b>
 
 <p>I've been working on this project on and off since 1999, it's what 
 got me into BusyBox and uClibc and compilers and so on.  Now it's where I put
@@ -46,171 +50,31 @@
 actually works and give it a good stress-test.  (Eating your own dogfood,
 and all that.)</p>
 
-<p>The project stalled while I was BusyBox maintainer (2005-2006) due
-to lack of time, and since then most of my spare programming time has gone
-into launching toybox.  But sinice one of the main goals of toybox is to
-replace BusyBox in Firmware Linux, as toybox matures it'll naturally lead
-to spending more time working on FWL.</p>
-
-<p>This server does not currently run on Firmware Linux.  Making it do so
-is a TODO item.  After that, I'd like to get it to the point where I can
-use it on my laptop. :)</p>
-
-<hr>
-
-<p>Firmware Linux is a bootable single file linux system, based on busybox and
-uClibc.  After downloading the
-<a href=downloads/firmware-build-0.8.9.tar.bz2>source code</a>, extract
-it, read the README, and run "./build.sh".  This will (eventually)
-create a self-contained firmware-uml executable you can use to try out an
-emulated version of Firmware Linux.</p>
-
-<p>Prebuilt versions are available: The <a href=downloads/base-uml>basic
-build</a> is 2.5 megabytes (which includes the linux kernel, all command line
-utilities, and the minimal set of shared libraries).  The
-<a href=downloads/devel-uml>full development environment</a> is 15 megabytes
-(the base system plus enough development tools for Firmware Linux to rebuild
-itself from source code).</p>
-
-<h2>News</h2>
-<p>Read <a href=/notes.html>my development log</a> to see what I've been up
-to on this project.</p>
-
-<p>Here's the <a href=lilo-length.patch>length patch</a> I'm using on lilo.</p>
-
-<p>I wrote a <a href=/writing/docs/UML.html>Quick and Dirty User Mode Linux HOWTO</a> if you've never played with UML before.</p>
-
-<h2>What is it?</h2>
-
-<h2>Firmware Linux is a bootable single file linux system.</h2>
-
-<p>Firmware Linux is one file containing a kernel, initramfs, read-only root
-filesystem, and cryptographic signature.  You can boot Linux from this file
-as if it was a normal kernel image (a slightly modified LILO is required,
-patches for GRUB and other bootloaders are a to-do item).  You can upgrade
-your entire OS (and any applications in the root filesystem) atomically, by
-downloading a new file and pointing your bootloader at it.</p>
-
-<h2>Firmware Linux is a Linux distro using busybox and uClibc as the basis for
-a self-hosting development environment.</h2>
-
 <p>When the Firmware Linux project started, busybox applets like sed and sort
 weren't powerful enough to handle the "./configure; make; make install" of
 packages like binutils or gcc.  Busybox was usable in an embedded router or
 rescue floppy, but trying to get real work done with it revealed numerous
 bugs and limitations.</p>
 
-<p>Busybox has now been fixed, and in Firmware Linux Busybox functions as an
-effective replacement for bzip2, coreutils, e2fsprogs, file, findutils, gawk,
-grep, inetutils, less, modutils, net-tools, patch, procps, sed, shadow,
-sysklogd, sysvinit, tar, util-linux, and vim.  (Eventually, it should be
-capable of replacing bash and diffutils as well, but it's not there yet.)</p>
-
-<p>The base system consists of uClibc-0.9.28, busybox-1.1-pre1, and
-linux-2.6.13.2.  (Currently, bash-2.05 is also included due to limitations in
-the busybox built-in shell.  This is a temporary measure until Busybox
-is further improved.)</p>
-
-<p>The build toolchain uses the base system plus binutils, gcc-core, bison,
-linux-libc-headers, and make.</p>
-
-<p>The install software uses lilo, which needs bin86 and as86 to build it.</p>
-
-<p>The full development system (which creates a development environment
-sufficient for Firmware Linux to rebuild itself from source) adds
-m4, flex, bison, diffutils, zlib, bin86, and nasm.</p>
-
-<p>Busybox is effectively replacing all the following packages:
-bzip2, coreutils, e2fsprogs, file, findutils, gawk, grep, inetutils, less,
-modutils, net-tools, patch, procps, sed, shadow, sysklogd, sysvinit, tar,
-util-linux, and vim.  (Eventually, it should be capable of replacing bash
-and diffutils as well, but it's not there yet.)</p>
-
-<h2>Design</h2>
-
-<p>The single file packaging combines a linux kernel (either a bootable kernel
-or User Mode Linux executable), initramfs, squashfs partition, and
-cryptographic signature.</p>
-
-<p>In 2.6, the kernel and initramfs are already combined into a single file.
-At the start of this file is either the obsolete floppy boot sector (just
-a stub in 2.6), or an ELF header which has 12 used bytes followed by 8 unused
-bytes.  Either way, we can use the 4 bytes starting at offset 12 to store the
-original length of the kernel image, then append a squashfs root partition
-to the file, followed by a whole-file cryptographic signature.</p>
-
-<p>A User Mode Linux executable should still run just fine (loading is
-controlled by the ELF segments, the appended data is ignored).  Note: don't
-strip the file or the appended data will be lost.</p>
-
-<p>Loading the bootable kernel image requires a modified boot loader that
-can be told the original size of the kernel, rather than querying the current
-file length which would be too long.  Hence the patch to Lilo allowing
-a "length=xxx" argument in the config file.</p>
+<p>So I spent about 3 years improving BusyBox (and pestering other people into
+improving their bits), and along the way accidentally became the BusyBox
+maintainer (at least until the project's crazy-uncle founder showed up and
+<a href=http://lwn.net/Articles/202106/>drove me away again</a>).  The result
+is that in Firmware Linux, BusyBox now functions as an effective replacement
+for bzip2, coreutils, diffutils, e2fsprogs, file, findutils, gawk, grep,
+inetutils, less, modutils, net-tools, patch, procps, sed, shadow, sysklogd,
+sysvinit, tar, util-linux, and vim.  I was in the process of writing a new
+shell to replace bash with when I left.</p>
 
-<p>Upon boot, the kernel runs the initramfs code which finds the firmware
-file.  In the case of User Mode Linux, the symlink /proc/self/exe points
-to the path of the file.  A bootable kernel needs a command line argument
-of the form firmware=device:/path/to/file (it can lookup the device in
-/sys/block and create a temporary device node to mount it with; this is
-in expectation of dynamic major/minor happening sooner or later).
-Once the file is found, /dev/loop0 is bound to it with an offset (losetup -o,
-with a value extracted from the 4 bytes stored at offset 12 in the file), and
-the resulting squashfs is used as the new root partition.</p>
-
-<p>The cryptographic signature can be verified on boot, but more importantly
-it can be verified when upgrading the firmware.  New firmware images can
-be installed beside old firmware, and LILO can be updated with boot options
-for both firmware, with a default pointing to the _old_ firmware.  The
-lilo -R option sets the command line for the next boot only, and that can
-be used to boot into the new firmware.  The new firmware can run whatever
-self-diagnostic is desired before permanently changing the default.  If the
-new firmware doesn't boot (or fails its diagnostic), power cycle the machine
-and the old firmware comes up.  (Note that grub does not have an equivalent
-for LILO's -R option; which would mean that if the new firmware doesn't run,
-you have a brick.)</p>
-
-<h2>Notes</h2>
+<p>Firmware Linux stalled while I was BusyBox maintainer (2005-2006) due to
+lack of time, and since that ended most of my spare programming time has gone
+into launching toybox.  But one of the main goals of toybox is to replace
+BusyBox in Firmware Linux, so as toybox matures it'll naturally lead to more
+of my time spent working on FWL.</p>
 
-<p>Currently, the version of gcc it builds and uses only has a C compiler, not
-c++.  This restricts the packages I can build with it, and you'd be amazed
-what kind of things need c++.  (Python is the one I'm really missing at the
-moment.)  Possibly uclibc++ could help here, but I'm also looking at tcc
-instead of gcc and binutils.  (tcc doesn't quite build an unmodified Linux
-kernel yet, and who knows what other packages would need to be tweaked,
-but it's definitely worth a look once version 1.0 comes out.  Perhaps
-some kind of front-end could make it do c++?)</p>
-
-<p>User Mode Linux is used during the build for a number of reasons.  The
-kernel headers used to build the C library may be newer than the kernel the
-system doing the build is using, and this may result in programs linked against
-this C library trying to use new features the existing kernel doesn't have.
-This tends to result in programs segfaulting.  Building a UML kernel to
-run these programs under during the build solves this problem, by translating
-the calls into ones the host system understands.</p>
-
-<p>UML also avoids the need to run the build as root.  The build needs to
-mount partitions, associate files with looback devices, create device nodes,
-create absolute paths requiring new entries in the root directory, chroot,
-and so on.  Doing all of this within the emulated UML environment avoids the
-need for root permissions on the host.</p>
-
-<p>That said, if you're running the same kernel version the Firmware Linux
-build is using, and you have root access, you can skip the UML wrapper to
-speed up the build and make things more easily debuggable.  I need to make a
-wrapper script for this, but basically in sources/scripts, stage 0.0, 1.1,
-and 2.2 are still needed, then 2.3 to package the final result (although 2.3
-depends on an executable built in 0.1).</p>
-
-<h2>How to build it</h2>
-
-<p>Run "./build.sh".  This runs all the stages (numbered files in
-sources/scripts) in sequence.  It'll start by downloading all the source code
-needed to build everything (which it'll keep around in the sources/packages
-directory for future builds).  If you just want to download the source,
-run "sources/scripts/0.0-*".  The 1-* stages create a cross-compile environment
-independent of the parent system.  The 2-* stages build the final system
-by using that cross-compile environment.</p>
+<p>My server does not currently run on Firmware Linux.  Making it do so
+is a TODO item.  After that, I'd like to get it to the point where I can
+use it on my laptop. :)</p>
 
 <h2>Contact</h2>