
<!--#include file="header.html" -->

<h1>The Firmware Linux build process</h1>

<h1>Overview</h1>

<h2>User Interface</h2>

<p>The script "./<b>build.sh</b>" calls the other scripts in the correct
sequence to perform a full build for one or more architectures.  Its argument
is the architecture to build, and running it without arguments lists the
available architectures.  (Configuration information for each architecture is
in the sources/targets directory.)</p>
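<p>For example, to build everything for a hypothetical armv5l target:</p>

<blockquote>
<p>./build.sh armv5l</p>
</blockquote>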

<p>Each stage can also be run (or re-run) individually.  The individual stages
are described in the next section.</p>


<p>The script "<b>sources/forkbomb.sh</b>" builds all architectures at once.
Its command line arguments specify whether to build them in series (--nofork) or in
parallel (--fork).  The output of each architecture's build is saved in a log
file named "out-$ARCH.txt".  A quick scan of each log for success or failure
is available with "./forkbomb.sh --stat".  Run it with no arguments to see
all available options.</p>

<p>Setting the following environment variables to non-blank values can modify
the behavior of the build (a combined example follows the list):</p>

<ul>
<li><p><b>NATIVE_NOTOOLCHAIN</b> - This tells mini-native.sh not to
include a compiler toolchain (binutils, gcc, bash, make, and distcc), but
instead just build a small uClibc/busybox system.</p>

<p>Setting NATIVE_NOTOOLCHAIN="headers" will leave the libc and kernel
header files in the appropriate include directory, for use by a compiler such
as pcc, llvm/clang, or tinycc.  (Building and installing additional tools
such as "make" remains your problem.)</p>
</li>

<li><p><b>NATIVE_NOTOOLSDIR</b> - Path to the top level directory within
the native filesystem.  If this is blank, configure sets this to a default value
of "/tools", to act as a Linux From Scratch style chroot environment.  For
a more traditional filesystem layout, set this to "/" or to "/usr".</p></li>

<li><p><b>FWL_RECORD_COMMANDS</b> - Records all command lines used to build each
package.</p>

<p>Tells host-tools.sh to build a logging wrapper (sources/toys/wrappy.c) and
populate a directory (build/wrapper) with symlinks to that wrapper for each
command name in $PATH.  This allows later build stages to write log files in
the build directory (named "cmdlines.${STAGE_NAME}.${PACKAGE_NAME}") recording
each command run.  (When build/wrapper exists, include.sh sets the wrapper
control variables $WRAPPY_LOGDIR, $WRAPPY_LOGPATH, and $WRAPPY_REALPATH,
then adjusts $PATH to point to the wrapper directory, which makes the wrapper
append to the appropriate log file before calling the actual command.)</p>

<p>Afterwards, the script "sources/toys/report_recorded_commands.sh" can
generate a big report on which commands were used to build each package for
each architecture.  To get a single list of the command names used by
everything, do:</p>

<blockquote>
<p>echo $(find build -name "cmdlines.*" | xargs awk '{print $1}' | sort -u)</p>
</blockquote>

<p>(Note: this will miss things which call executables via absolute paths
instead of checking $PATH, but the only interesting ones so far are the
#!/bin/bash type lines at the start of shell scripts.)</p>
</li>

<li><p><b>CROSS_BUILD_STATIC</b> - Tells cross-compiler.sh to statically link all
binaries in the cross compiler toolchain it creates.</p>

<p>The prebuilt binary versions in the download directory are statically linked
against uClibc, by building a mini-native environment and re-running the build
under that with CROSS_BUILD_STATIC=1.</p></li>

<li><p><b>FWL_PREFERRED_MIRROR</b> - Tells download.sh to try to download
packages from this URL first, before falling back to the normal mirror list.
For example, "FWL_PREFERRED_MIRROR=http://landley.net/code/firmware/mirror".</p></li>

<li><p><b>FWL_USE_TOYBOX</b> - Tells host-tools.sh and mini-native.sh to
install the <a href=http://landley.net/code/toybox>toybox</a> implementation
of commands (where available) instead of the busybox versions.</p></li>
</ul>
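<p>A combined example, using the mirror URL mentioned above and an
illustrative target name:</p>

<blockquote>
<p>FWL_PREFERRED_MIRROR=http://landley.net/code/firmware/mirror FWL_USE_TOYBOX=1 ./build.sh armv5l</p>
</blockquote>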

<h1>Implementation</h1>

<p>The top level wrappers (<b>build.sh</b> or <b>forkbomb.sh</b>) call the
other stages in sequence.  The other stages are <b>download.sh</b>,
<b>host-tools.sh</b>, <b>cross-compiler.sh</b>, <b>mini-native.sh</b>, and
<b>package-mini-native.sh</b>.  Each script sources the common file
<b>include.sh</b>.</p>

<p>In theory, the stages are orthogonal.  If you have an existing cross
compiler, you can add it to the $PATH and skip cross-compiler.sh.  Or you
can use _just_ cross-compiler.sh to create a cross compiler, and then go build
something else with it.  The host-tools.sh stage can often be skipped
entirely.</p>

<h2>Stage 0: Setup</h2>

<p>Before building anything we're going to keep, we need to do some setup
work.</p>

<ul>
<li><p><b>include.sh</b> - header file defining common environment variables
and functions, and performing other miscellaneous setup.</p>

<p>This script is not run directly, instead it's included by all the other
scripts.</p>

<p>For building packages, this file defines the functions:</p>
<ul>
<li><p><b>setupfor</b> - extracts a source package (named in the first
argument) into a temporary directory, and changes the current directory
to it.</p>

<p>Source code is cached, meaning each package's source tarball is only
actually extracted once (into build/sources) and the temporary copies
are directories full of hard links to the cached source.</p>
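<p>The effect is roughly this sketch ($1 being the package name; the
temporary directory is illustrative, and the real function does more
housekeeping):</p>

<blockquote>
<p>tar xjf sources/packages/"$1"-*.tar.bz2 -C build/sources</p>
<p>cp -rl build/sources/"$1" build/temp/"$1"</p>
<p>cd build/temp/"$1"</p>
</blockquote>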
</li>

<li><p><b>cleanup</b> - deletes the temporary copy of the source code after
the build.</p>
</li>
</ul>
</li>

<li><p><b>download.sh</b> - Download source code packages from the web.</p>

<p>This file is a series of calls to the <b>download</b> function (defined in
include.sh).  If a copy of the tarball matching the sha1sum $SHA1 isn't
already in the sources/packages directory, it uses wget to fetch it from
$URL (or from a series of fallback mirrors).</p>
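<p>A typical entry sets the variables and calls the function, something like
(hypothetical package and checksum):</p>

<blockquote>
<p>URL=http://example.com/mypackage-1.0.tar.bz2 SHA1=0123456789abcdef0123456789abcdef01234567 download</p>
</blockquote>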

<p>A blank value for $SHA1 will accept any file as correct, ignoring its
contents.</p>

<p>After downloading all tarballs, the function <b>cleanup_oldfiles</b> deletes
any old files (meaning any files in sources/packages with a timestamp
before the call to download.sh, such as previous versions left over
after a package upgrade).</p>

<p>Running this stage with the argument "--extract-all" will extract all
the tarballs, to prepopulate the cache used by setupfor.  (This is primarily
used to avoid race conditions in forkbomb.sh --fork.)</p></li>

<li><p><b>host-tools.sh</b> - Set up a known environment on the host</p>

<p>This script populates the <b>build/host</b> directory with
host versions of the busybox and toybox command line tools (the same ones
that the target's eventual root filesystem will contain), plus symlinks to the
host's compiler toolchain.</p>

<p>This allows the calling scripts to trim the $PATH to point to just this
one directory (as shown after the following list), which serves several
purposes:</p>

<ul>

<li><p><b>Isolation</b> - This prevents the ./configure stages of the source
packages from finding and including unexpected dependencies on random things
installed on the host.</p></li>

<li><p><b>Portability</b> - Using a known set of command line utilities
insulates the build from variations in the host's Linux distribution (such as
Ubuntu's /bin/echo lacking support for the -e option).</p></li>

<li><p><b>Testing</b> - It ensures the resulting system can rebuild itself
under itself, since the initial build was done with the same tools we
install into the target's root filesystem.  The initial build acts as a smoke
test of most of the packages used to create the resulting system, and
restricting $PATH ensures that no necessary commands are missing.  (Variation
can still show up between x86/arm/powerpc versions, of course.)</p></li>

<li><p><b>Enumeration</b> - The FWL_RECORD_COMMANDS functionality (see
environment variables, above) starts here and continues into the later build
scripts.</p></li>
</ul>
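<p>Trimming the $PATH is then as simple as something like:</p>

<blockquote>
<p>export PATH="$(pwd)/build/host"</p>
</blockquote>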

<p>This stage is optional.  If the build/host directory doesn't exist (or
doesn't contain a "busybox" executable), the build simply uses the host's
original $PATH.</p></li>
</ul>

<h2>Stage 1: Build a cross-compiler.</h2>

<ul>
<li><p><b>cross-compiler.sh</b> - build a cross compiler for a target.</p>

<p>This script builds a cross-compiler, which runs on the host system and
produces binaries that run on the target system.  (See my
<a href=http://landley.net/writing/docs/cross-compiling.html>Introduction to
cross compiling</a> if you're unfamiliar with cross compiling.)</p>

<p>The build requires a cross-compiler even if the host and target system are
both x86, because the host and target usually use different C libraries.  If
the host has glibc and the target uses uClibc, then the (dynamically linked)
target binaries we produce won't run on the host.  (Target binaries that won't
run on the host are what distinguishes cross-compiling from native compiling.
Different processors are just one reason for it.)</p>

<p>Building a cross-compiler toolchain requires four packages: binutils,
gcc, uClibc, and the Linux kernel (for header files).</p>
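<p>The resulting toolchain can then be used on its own by adding it to the
$PATH, something like (directory name and tool prefix illustrative):</p>

<blockquote>
<p>export PATH=build/cross-compiler-armv5l/bin:$PATH</p>
<p>armv5l-gcc hello.c -o hello</p>
</blockquote>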

</li>
</ul>

<h2>Stage 2: Cross compile a bootable target system.</h2>

<ul>
<li><p><b>mini-native.sh</b> - Build a minimal native development environment
for the target system.</p>

<p>This script uses the cross compiler from the previous step to build a kernel
and root filesystem for the target, including a native compiler toolchain.
The resulting system should boot and run on the target, or under an
appropriate emulator.</p>

<p>Because cross-compiling is persnickety and difficult, we do as little of
it as possible.  This script should perform all the cross compiling anyone
ever needs to do: it uses the cross-compiler to generate the smallest
possible native build environment for the target that's capable of
rebuilding itself under itself, and then the rest of the build runs in that
environment, under an emulator.</p>

<p>Anything else that needs to be built for the target can then be built natively,
by running this kernel and root filesystem under an emulator and building
new packages there, bootstrapping up to a full system if necessary.</p>

<p>The emulator we use is QEMU.  The minimal build environment powerful enough
to boot and compile a complete Linux system requires seven packages: the Linux
kernel, binutils, gcc, uClibc, BusyBox, make, and bash.  It's packaged
using the <a href=http://www.linuxfromscratch.org>Linux From Scratch</a>
/tools directory approach, staying out of the way so the minimal build
environment doesn't get mixed into the final system.</p>

</li>
</ul>

<h2>Stage 3: Run the target's native build environment under an emulator to
build the final system.</h2>

<p>Running a native build under QEMU is much slower than cross-compiling,
but it's a lot easier and more reliable.</p>
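<p>For the x86 target, booting the result looks conceptually like this (file
names and flags illustrative; the packaging stage wraps up an appropriate
invocation for each target):</p>

<blockquote>
<p>qemu -nographic -kernel bzImage -hda mini-native.ext2 -append "root=/dev/hda console=ttyS0"</p>
</blockquote>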

<p>A trick to accelerate the build is to use distcc to call out to the
cross-compiler, feeding the results back into the emulator through the virtual
network.  (This is still a TODO item.)</p>

<p>The actual build run under stage 3 can be a fairly straightforward
<a href=http://www.linuxfromscratch.org>Linux From Scratch</a> approach,
or another source based Linux build system like Gentoo.</p>

<h2>Stage 4: Package the system into a firmware file.</h2>

<p>The reason for the name Firmware Linux is that the entire operating system
(kernel, initramfs, and read-only squashfs root filesystem) is glued together
into a single file.  A modified version of LILO is included which can boot and
run this file on x86.</p>

<hr>

<h1>Evolution of the Firmware Linux build process.</h1>

<h2>The basic theory</h2>

<p>The Linux From Scratch approach is to build a minimal intermediate system
with just enough packages to be able to compile stuff, chroot into that, and
build the final system from there.  This isolates the host from the target,
which means you should be able to build under a wide variety of distributions.
It also means the final system is built with a known set of tools, so you get
a consistent result.</p>

<p>A minimal build environment consists of a C library, a compiler, and BusyBox.
So in theory you just need three packages:</p>

<ul>
  <li>A C library (uClibc)</li>
  <li>A toolchain (tcc)</li>
  <li>BusyBox</li>
</ul>

<p>Unfortunately, that doesn't work yet.</p>

<h2>Some differences between theory and reality.</h2>

<h3>Environmental dependencies.</h3>

<p>Environmental dependencies are things that need to be installed before you
can build or run a given package.  Lots of packages depend on things like zlib,
SDL, texinfo, and all sorts of other strange things.  (The GnuCash project
stalled years ago after it released a version with so many environmental
dependencies it was impossible to build or install.  Environmental dependencies
have a complexity cost, and are thus something to be minimized.)</p>

<p>A good build system will scan its environment to figure out what it has
available, and disable functionality that depends on stuff that isn't
available.  (This is generally done with autoconf, which is disgusting but
suffers from a lack of alternatives.)  That way, the complexity cost is
optional: you can build a minimal version of the package if that's all you
need.</p>

<p>A really good build system can be told that the environment
it's building in and the environment the result will run in are different,
so just because it finds zlib on the build system doesn't mean that the
target system will have zlib installed on it.  (And even if it does, it may not
be the same version.  This is one of the big things that makes cross-compiling
such a pain.  One big reason for statically linking programs is to eliminate
this kind of environmental dependency.)</p>

<p>The Firmware Linux build process is structured the way it is to eliminate
as many environmental dependencies as possible.  Some are unavoidable (such as
C libraries needing kernel headers or gcc needing binutils), but the
intermediate system is the minimal fully functional Linux development
environment I currently know how to build, and then we switch into that and
work our way back up from there by building more packages in the new
environment.</p>

<h3>Resolving environmental dependencies.</h3>

<p><b>To build uClibc you need kernel headers</b> identifying the syscalls and
such it can make to the OS.  Way back when you could use the kernel headers
straight out of the Linux kernel 2.4 tarball and they'd work fine, but sometime
during 2.5 the kernel developers decided that exporting a sane API to userspace
wasn't the kernel's job, and stopped doing it.</p>

<p>The 0.8x series of Firmware Linux used
<a href=http://ep09.pld-linux.org/~mmazur/linux-libc-headers/>kernel
headers manually cleaned up by Mariusz Mazur</a>, but after the 2.6.12 kernel
he had an attack of real life and fell too far behind to catch up again.</p>

<p>The current practice is to use the Linux kernel's "make headers_install"
target, created by David Woodhouse.  This runs various scripts against the
kernel headers to sanitize them for use by userspace.  This was merged in
2.6.18-rc1, and was more or less debugged by 2.6.19.  So we can use the Linux
kernel tarball as a source of headers again.</p>
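<p>For example, headers for an ARM target can be exported from the kernel
source directory with:</p>

<blockquote>
<p>make headers_install ARCH=arm INSTALL_HDR_PATH=/path/to/staging</p>
</blockquote>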

<p>Another problem is that the busybox shell situation is a mess with four
implementations that share little or no code (depending on how they're
configured).  The first question when trying to fix them is "which of the four
do you fix?", and I'm just not going there.  So until bbsh goes in we
<b>substitute bash</b>.</p>

<p>Finally, <b>most packages expect gcc</b>.  The tcc project isn't a drop-in
gcc replacement yet, and doesn't include a "make" program.  Most importantly,
tcc development appears stalled because Fabrice Bellard's other major project
(qemu) is taking up all his time these days.  In 2004 Fabrice
<a href=http://fabrice.bellard.free.fr/tcc/tccboot.html>built a modified Linux
kernel with tcc</a>, and
<a href=http://fabrice.bellard.free.fr/tcc/tccboot_readme.html>listed</a>
what needed to be upgraded in TCC to build an unmodified kernel, but
since then he hardly seems to have touched tcc.  Hopefully, someday he'll get
back to it and put out a 1.0 release of tcc that's a drop-in gcc replacement.
(And if he does, I'll add a make implementation to toybox so we don't need
to use any of the gnu toolchain).  But in the meantime the only open source
compiler that can build a complete Linux system is still the gnu compiler.</p>

<p>The gnu compiler actually consists of three packages <b>(binutils, gcc, and
make)</b>, which is why it's generally called the gnu "toolchain".  (The split
between binutils and gcc is for purely historical reasons, and you have
to match the right versions with each other or things break.)</p>

<p>This means that to compile a minimal build environment, you need seven
packages, and to actually run the result we use an eighth package (QEMU).</p>

<p>This can actually be made to work.  The next question is how?</p>

<h2>Additional complications</h2>

<h3>Cross-compiling and avoiding root access</h3>

<p>The first problem is that we're cross-compiling.  We can't help it.
You're cross-compiling any time you create target binaries that won't run on
the host system.  Even when both the host and target are on the same processor,
if they're sufficiently different that one can't run the other's binaries, then
you're cross-compiling.  In our case, the host is usually running both a
different C library and an older kernel version than the target, even when
it's the same processor.</p>

<p>The second problem is that we want to avoid requiring root access to build
Firmware Linux.  If the build can run as a normal user, it's a lot more
portable and a lot less likely to muck up the host system if something goes
wrong.  This means we can't modify the host's / directory (making anything
that requires absolute paths problematic).  We also can't mknod, chown, chgrp,
mount (for --bind, loopback, tmpfs)...</p>

<p>In addition, the gnu toolchain (gcc/binutils) is chock-full of hardwired
assumptions, such as what C library it's linking binaries against, where to look
for #included headers, where to look for libraries, the absolute path the
compiler is installed at...  Silliest of all, it assumes that if the host and
target use the same processor, you're not cross-compiling (even if they have
a different C library and a different kernel, and even if you ./configure it
for cross-compiling it switches that back off because it knows better than
you do).  This makes it very brittle, and it also tends to leak its assumptions
into the programs it builds.  New versions may someday fix this, but for now we
have to hit it on the head repeatedly with a metal bar to get anything remotely
useful out of it, and run it in a separate filesystem (chroot environment) so
it can't reach out and grab the wrong headers or wrong libraries despite
everything we've told it.</p>

<p>The absolute paths problem affects target binaries because all dynamically
linked apps expect their shared library loader to live at an absolute path
(in this case /lib/ld-uClibc.so.0).  This directory is only writeable by root,
and even if we could install it there polluting the host like that is just
ugly.</p>

<p>The Firmware Linux build has to assume it's cross-compiling because the host
is generally running glibc, and the target is running uClibc, so the libraries
the target binaries need aren't installed on the host.  Even if they're
statically linked (which also mitigates the absolute paths problem somewhat),
the target often has a newer kernel than the host, so the set of syscalls
uClibc makes (thinking it's talking to the new kernel, since that's the ABI
described by the kernel headers it was built against) may not be entirely
understood by the old kernel, leading to segfaults.  (One of the reasons glibc
is larger than uClibc is it checks the kernel to see if it supports things
like long filenames or 32-bit device nodes before trying to use them.  uClibc
should always work on a newer kernel than the one it was built to expect, but
not necessarily an older one.)</p>

<h2>Ways to make it all work</h2>

<h3>Cross compiling vs native compiling under emulation</h3>

<p>Cross compiling is a pain.  There are a lot of ways to get it to sort of
kinda work for certain versions of certain packages built on certain versions
of certain distributions.  But making it reliable or generally applicable is
hard to do.</p>

<p>I wrote an <a href=/writing/docs/cross-compiling.html>introduction
to cross-compiling</a> which explains the terminology, plusses and minuses,
and why you might want to do it.  Keep in mind that I wrote that for a company
that specializes in cross-compiling.  Personally, I consider cross-compiling
a necessary evil to be minimized, and that's how Firmware Linux is designed.
We cross-compile just enough stuff to get a working native build environment
for the new platform, which we then run under emulation.</p>

<h3>Which emulator?</h3>

<p>The emulator Firmware Linux 0.8x used was User Mode Linux (here's a
<a href=http://www.landley.net/writing/docs/UML.html>UML mini-howto</a> I wrote
while getting this to work).  Since we already need the linux-kernel source
tarball anyway, building User Mode Linux from it was convenient and minimized
the number of packages we needed to build the minimal system.</p>

<p>The first stage of the build compiled a UML kernel and ran the rest of the
build under that, using UML's hostfs to mount the parent's root filesystem as
the root filesystem for the new UML kernel.  This solved both the kernel
version and the root access problems.  The UML kernel was the new version, and
supported all the new syscalls and ioctls and such that the uClibc was built to
expect, translating them to calls to the host system's C library as necessary.
Processes running under User Mode Linux had root access (at least as far as UML
was concerned), and although they couldn't write to the hostfs mounted root
partition, they could create an ext2 image file, loopback mount it, --bind
mount in directories from the hostfs partition to get the apps they needed,
and chroot into it.  Which is what the build did.</p>

<p>Current Firmware Linux has switched to a different emulator, QEMU, because
as long as we're cross-compiling anyway we might as well have the
ability to cross-compile for non-x86 targets.  We still build a new kernel
to run the uClibc binaries with the new kernel ABI, we just build a bootable
kernel and run it under QEMU.</p>

<p>The main difference with QEMU is a sharper dividing line between the host
system and the emulated target.  Under UML we could switch to the emulated
system early and still run host binaries (via the hostfs mount).  This meant
we could be much more relaxed about cross compiling, because we had one
environment that ran both types of binaries.  But this doesn't work if we're
building an ARM, PPC, or x86-64 system on an x86 host.</p>

<p>Instead, we need to sequence more carefully.  We build a cross-compiler,
use that to cross-compile a minimal intermediate system from the seven packages
listed earlier, and build a kernel and QEMU.  Then we run the kernel under QEMU
with the new intermediate system, and have it build the rest natively.</p>

<p>It's possible to use other emulators instead of QEMU, and I have a todo
item to look at armulator from uClinux.  (I looked at another nommu system
simulator at Ottawa Linux Symposium, but after resolving the third unnecessary
environmental dependency and still not being able to get it to finish
compiling, I gave up.  Armulator may be a patch against an obsolete version
of gdb,
but I could at least get it to build.)</p>

<h1>Packaging</h1>

<p>The single file packaging combines a Linux kernel, initramfs, squashfs
partition, and cryptographic signature.</p>

<p>In Linux 2.6, the kernel and initramfs are already combined into a single
file.  At the start of this file is either the obsolete floppy boot sector
(just a stub in 2.6), or an ELF header which has 12 used bytes followed by 8
unused bytes.  Either way, we can generally use the 4 bytes starting at offset
12 to store the original length of the kernel image, then append a squashfs
root partition to the file, followed by a whole-file cryptographic
signature.</p>
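<p>For instance, the stored length could be read back with something like
this (assuming a little-endian host):</p>

<blockquote>
<p>dd if=firmware.bin bs=1 skip=12 count=4 | od -An -tu4</p>
</blockquote>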

<p>Loading an ELF kernel (such as User Mode Linux or a non-x86 ELF kernel)
is controlled by the ELF segments, so the appended data is ignored.
(Note: don't strip the file or the appended data will be lost.)  Loading an x86
bzImage kernel requires a modified boot loader that can be told the original
size of the kernel, rather than querying the current file length (which would
be too long).  Hence the patch to Lilo allowing a "length=xxx" argument in the
config file.</p>
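<p>A hypothetical config file entry, the length= line being what the patch
adds:</p>

<blockquote>
<p>image = /firmware.bin</p>
<p>label = firmware-new</p>
<p>length = 1234567</p>
</blockquote>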

<p>Upon boot, the kernel runs the initramfs code which finds the firmware
file.  In the case of User Mode Linux, the symlink /proc/self/exe points
to the path of the file.  A bootable kernel needs a command line argument
of the form firmware=device:/path/to/file (it can lookup the device in
/sys/block and create a temporary device node to mount it with; this is
in expectation of dynamic major/minor happening sooner or later).
Once the file is found, /dev/loop0 is bound to it with an offset (losetup -o,
with a value extracted from the 4 bytes stored at offset 12 in the file), and
the resulting squashfs is used as the new root partition.</p>
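<p>I.E. the initramfs code ends up doing something morally equivalent to
(names illustrative, $LENGTH being the value read from offset 12):</p>

<blockquote>
<p>losetup -o $LENGTH /dev/loop0 /path/to/file</p>
<p>mount -t squashfs -o ro /dev/loop0 /newroot</p>
</blockquote>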

<p>The cryptographic signature can be verified on boot, but more importantly
it can be verified when upgrading the firmware.  New firmware images can
be installed beside old firmware, and LILO can be updated with boot options
for both firmware, with a default pointing to the _old_ firmware.  The
lilo -R option sets the command line for the next boot only, and that can
be used to boot into the new firmware.  The new firmware can run whatever
self-diagnostic is desired before permanently changing the default.  If the
new firmware doesn't boot (or fails its diagnostic), power cycle the machine
and the old firmware comes up.  (Note that grub does not have an equivalent
for LILO's -R option, which means that if the new firmware doesn't run,
you have a brick.)</p>
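<p>So an upgrade is roughly (label name illustrative):</p>

<blockquote>
<p>lilo -R firmware-new</p>
<p>reboot</p>
</blockquote>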

<h2>Filesystem Layout</h2>

<p>Firmware Linux's directory hierarchy is a bit idiosyncratic: some redundant
directories have been merged, with symlinks from the standard positions
pointing to their new positions.  On the bright side, this makes it easy to
make the root partition read-only.</p>

<h3>Simplifying the $PATH.</h3>

<p>The set "bin->usr/bin, sbin->usr/sbin, lib->usr/lib" all serve to consolidate
all the executables under /usr.  This has a bunch of nice effects: making a
a read-only run-from-CD filesystem easier to do, allowing du /usr to show
the whole system size, allowing everything outside of there to be mounted
noexec, and of course having just one place to look for everything.  (Normal
executables are in /usr/bin.  Root only executables are in /usr/sbin.
Libraries are in /usr/lib.)</p>
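<p>I.E. from the root of the new filesystem:</p>

<blockquote>
<p>ln -s usr/bin bin</p>
<p>ln -s usr/sbin sbin</p>
<p>ln -s usr/lib lib</p>
</blockquote>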

<p>For those of you wondering why /bin and /usr/bin were split in the first
place, the answer is it's because Ken Thompson and Dennis Ritchie ran out
of space on the original 2.5 megabyte RK-05 disk pack their root partition
lived on in 1971, and leaked the OS into their second RK-05 disk pack where
the user home directories lived.  When they got more disk space, they created
a new directory (/home) and moved all the user home directories there.</p>

<p>The real reason we kept it is tradition.  The excuse is that the root
partition contains early boot stuff and /usr may get mounted later, but these
days we use initial ramdisks (initrd and initramfs) to handle that sort of
thing.  The version skew issues of actually trying to mix and match different
versions of /lib/libc.so.* living on a local hard drive with a /usr/bin/*
from the network mount are not pretty.</p>

<p>I.E. The separation is just a historical relic, and I've consolidated it in
the name of simplicity.</p>

<p>On a related note, there's no reason for "/opt".  After the original Unix
leaked into /usr, Unix shipped out into the world in semi-standardized forms
(Version 7, System III, the Berkeley Software Distribution...) and sites that
installed these wanted places to add their own packages to the system without
mixing their additions in with the base system.  So they created "/usr/local"
and created a third instance of bin/sbin/lib and so on under there.  Then
Linux distributors wanted a place to install optional packages, and they had
/bin, /usr/bin, and /usr/local/bin to choose from, but the problem with each
of those is that they were already in use and thus might be cluttered by who
knows what.  So a new directory was created, /opt, for "optional" packages
like firefox or open office.</p>

<p>It's only a matter of time before somebody suggests /opt/local, and I'm
not humoring this.  Executables for everybody go in /usr/bin, ones usable
only by root go in /usr/sbin.  There's no /usr/local or /opt.  /bin and
/sbin are symlinks to the corresponding /usr directories, but there's no
reason to put them in the $PATH.</p>

<h3>Consolidating writeable directories.</h3>

<p>All the editable stuff has been moved under "var", starting with symlinking
tmp->var/tmp.  Although /tmp is much less useful these days than it used to
be, some things (like X) still love to stick things like named pipes in there.
Long ago in the days of little hard drive space and even less ram, people made
extensive use of temporary files and they threw them in /tmp because ~home
had an ironclad quota.  These days, putting anything in /tmp with a predictable
filename is a security issue (symlink attacks, you can be made to overwrite
any arbitrary file you have access to).  Most temporary files for things
like the printer or email migrated to /var/spool (where there are
persistent subdirectories with known ownership and permissions) or in the
user's home directory under something like "~/.kde".</p>

<p>The theoretical difference between /tmp and /var/tmp is that the contents
of /tmp should be deleted by the system init scripts on every
reboot, but the contents of /var/tmp may be preserved across reboots.  Except
there's no guarantee that the contents of any temp directory won't be deleted.
So any program that actually depends on the contents of /var/tmp being
preserved across a reboot is obviously broken, and there's no reason not to
just symlink them together.</p>

<p>(In case it hasn't become apparent yet, there's 30 years of accumulated
cruft in the standards, covering a lot of cases that don't apply outside of
supercomputing centers where 500 people share accounts on a mainframe that
has a dedicated support staff.  They serve no purpose on a laptop, let alone
an embedded system.)</p>

<p>The corner case is /etc, which can be writeable (we symlink it to
var/etc) or a read-only part of the / partition.   It's really a question of
whether you want to update configuration information and user accounts in a
running system, or whether that stuff should be fixed before deploying.
We're doing some cleanup, but leaving /etc writeable (as a symlink to
/var/etc).  Firmware Linux symlinks /etc/mtab->/proc/mounts, which
is required by modern stuff like shared subtrees.  If you want a read-only
/etc, use "find /etc -type f | xargs ls -lt" to see what gets updated on the
live system.  Some specific cases are that /etc/adjtime was moved to /var
by LSB and /etc/resolv.conf should be a symlink somewhere writeable.</p>

<h3>The resulting mount points</h3>

<p>The result of all this is that a running system can have / be mounted read
only (with /usr living under that), /var can be ramfs or tmpfs with a tarball
extracted to initialize it on boot, /dev can be ramfs/tmpfs managed by udev or
mdev (with /dev/pts as devpts under that: note that /dev/shm naturally inherits
/dev's tmpfs and some things like User Mode Linux get upset if /dev/shm is
mounted noexec), /proc can be procfs, /sys can bs sysfs.  Optionally, /home
can be be an actual writeable filesystem on a hard drive or the network.</p>

<p>Remember to put root's home directory somewhere writeable (I.E. /root
should move to either /var/root or /home/root; change the passwd entry
accordingly), and life is good.</p>
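<p>One way to do that, picking /home/root (a hypothetical sed invocation run
against the new root filesystem's passwd file):</p>

<blockquote>
<p>sed -i 's@:/root:@:/home/root:@' etc/passwd</p>
</blockquote>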


<!--
<p>Firmware Linux is an embedded Linux distribution builder, which creates a
bootable single file Linux system based on uClibc and BusyBox/toybox.  It's
basically a shell script that builds a complete Linux system from source code
for an arbitrary target hardware platform.</p>

<p>The FWL script starts by building a cross-compiler for the appropriate
target.  Then it cross-compiles a small Linux system for the target, which
is capable of acting as a native development environment when run on the
appropriate hardware (or under an emulator such as QEMU).  Finally the
build script creates an ext2 root filesystem image, and packages it with
a kernel configured to boot under QEMU and shell scripts to invoke qemu
appropriately.</p>
-->



<!--#include file="footer.html" -->