view master.idx @ 104:7bcba6e2acc3

OLS papers are up to 2009 now.
author Rob Landley <>
date Fri, 01 Jan 2010 04:54:13 -0600
parents d00b5e91cba8

<title>Linux Kernel Documentation</title>

<h2>Linux Kernel Documentation Index</h2>

<p>This page collects and organizes documentation about the Linux kernel, taken
from many different sources.  What is the kernel, how do you build it, how do
you use it, how do you change it...</p>

<p>This is a work in progress, and probably always will be.  Please let us know
on the
<a href=>linux-doc</a> mailing
list about any documentation you'd like added to this
index, and feel free to ask about any topics that aren't covered here yet.
This index is maintained by Rob Landley, and tracked in
<a href=>this mercurial repository</a>.  The
canonical location for the page is <a href=>here</a>.</p>




<span id="Sources of documentation">

<p>These are various upstream sources of documentation, many of which are linked
into the <a href=>linux kernel documentation index</a>.</p>

<span id="From the kernel">
<li><a href=Documentation>Text files in the kernel's Documentation directory.</a></li>
<li><a href=htmldocs>Output of kernel's "make htmldocs".</a></li>
<li><a href=makehelp.txt>Output of kernel's "make help".</a></li>
<li><a href=menuconfig>Menuconfig/kconfig help for each configuration option.</a></li>
<li><a href=readme>Linux kernel README files</a></li>
<li><a href=rfc-linux.html>IETF RFCs referred to by kernel source files.</a></li>
<span id="Out on the web">
<li><a href=>Linux man-pages website, includes HTML versions of man pages</a></li>
<li><a href=>Linux Weekly News kernel articles</a></li>
<li>Linux Device Drivers book (<a href=>third edition</a>) (<a href=>second edition</a>)</li>
<li><a href=ols>Ottawa Linux Symposium papers</a></li>
<li><a href=als1999>Atlanta Linux Showcase CD (1999)</a></li>
<li><a href=>Linux Journal archives</a></li>
<li><a href=>IBM Developerworks Linux Library</a> (also <a href=>here</a>)</li>
<li><a href=>Linux Kernel Mailing List FAQ</a></li>
<li><a href=>Kernel Planet (blog aggregator)</a></li>
<li><a href=video.html>Selected videos of interest</a></li>
<li><a href=local>Some locally produced docs</a></li>

<span id="Standards">
<li><a href=>Single Unix Specification v4</a> (Also known as Open Group Base Specifications issue 7, and POSIX 2008.  See especially <a href=>system interfaces</a>)</li>
<li>C99 standard (defining the C programming language): <a href=>ISO/IEC 9899 PDF</a>, <a href=>html</a>, or <a href=>searchable website</a>.</li>
<li><a href=>Linux Foundation's specs page</a> (ELF, Dwarf, ABI...)</li>
</span id="Standards">

<span id="Translations">
<li><a href=>Linux Kernel Translation Project</a></li>
<li><a href=>Kernel Newbies regional pages</a></li>
<li><a href=>Japanese</a></li>
<li><a href=>Chinese</a></li>
</span id="Translations">

</span id="Sources of documentation">

<span id="Building from source">
  <span id="User interface">
<p>Building source packages is usually a three step process: configure, build,
and install.</p>

<p>The Linux kernel is configured with the command "make menuconfig", built
with the command "make", and installed either manually or with the command
"make install".</p>

<pre>tar xvjf linux-2.6.??.tar.bz2
cd linux-2.6.??
make menuconfig
make
make install
</pre>

<p>For a description of the make options and targets, type <a href=makehelp.txt>
make help</a>.</p>

<span id="Get and extract the source">
<p>The Linux kernel source code is distributed by
<a href=></a> as tar archives.  Grab the most recent
"stable" release (using the tiny little letter F link), a file of the
form "linux-2.6.*.tar.bz2", from the
<a href=>Linux 2.6 releases
directory</a>.  Then extract this archive with the command "tar xvjf
linux-2.6.*.tar.bz2".  (Type the command "man tar" for more information on the
tar command.)  Then cd into the directory created by extracting the archive.</p>

<p>To obtain other Linux kernel versions (such as development releases and
kernels supplied by <a href="#Distibutions">distributions</a>)
see <a href="Following_Linux_development">Following Linux development</a>.</p>

<p>To return your linux kernel source directory to its original (unconfigured)
condition after configuring and building in it, either delete the
directory with "rm -r" and re-extract it from the tar archive, or run the
command "make distclean".</p>

<span id="Configuring">

<p>Before you can build the kernel, you have to configure it.  Configuring
selects which features this kernel build should include, and specifies other
technical information such as buffer sizes and optimization strategies.
This information is stored in a file named ".config" in the top level directory
of the kernel source code.  To see the various user interfaces to the
configuration system, type "<a href=makehelp.txt>make help</a>".</p>

<p>Note that "make clean" does not delete configuration information, but the
more thorough "make distclean" does.</p>

<span id="Using an existing configuration">

<p>Often when building a kernel, an existing .config file is supplied from
elsewhere.  Copy it into place, and optionally run "make oldconfig" to run
the kernel's diagnostics against it to ensure it matches the kernel version
you're using, updating anything that's out of sync.</p>
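<p>The copy-then-oldconfig step above can be sketched as a small shell helper.
The function name <b>reuse_config</b> and both of its arguments are
hypothetical, chosen for illustration.  Note that "make oldconfig" may stop to
ask questions about symbols the supplied .config doesn't mention.</p>

```shell
# Hypothetical helper: reuse_config CONFIG_FILE KERNEL_SRC_DIR
# Copies an existing configuration into place, then lets the kernel's
# own "oldconfig" target bring it in sync with this source tree.
reuse_config() {
    config_file="$1"
    src_dir="$2"

    # Refuse to continue unless the config and a kernel Makefile both exist.
    [ -f "$config_file" ] || { echo "no such config: $config_file" >&2; return 1; }
    [ -f "$src_dir/Makefile" ] || { echo "no kernel source in: $src_dir" >&2; return 1; }

    cp "$config_file" "$src_dir/.config" &&
    make -C "$src_dir" oldconfig
}

# Example (paths are placeholders):
#   reuse_config /boot/config-2.6.32 ./linux-2.6.32
```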

<p>Several preset configurations are shipped with the kernel source code.
Run the command <b>find . -name "*_defconfig"</b> in the kernel source
directory to see them all.  Any of these can be copied to .config and
used as a starting point.</p>

<p>The kernel can also automatically generate various configurations,
mostly to act as starting points for customization:</p>
<li><b>make defconfig</b> - Set all options to default values</li>
<li><b>make allnoconfig</b> - Set all yes/no options to "n"</li>
<li><b>make allyesconfig</b> - Set all yes/no options to "y"</li>
<li><b>make allmodconfig</b> - Set all yes/no options to "y" and all "yes/module/no" options to "m"</li>
<li><b>make randconfig</b> - Set each option randomly (for debugging purposes).</li>
<li><b>make oldconfig</b> - Update a .config file from a previous version of the kernel to work with the current version.</li>

<span id="Creating a custom kernel configuration">
<p>The most common user interface for configuring the kernel is
<b>menuconfig</b>, an interactive terminal based menuing interface invoked
through the makefiles via "<b>make menuconfig</b>".  This interface groups the
configuration questions into a series of menus, showing the current values
of each symbol and allowing them to be changed in any order.  Each symbol
has associated help text, explaining what the symbol does and where to find
more information about it.  This help text is
<a href=menuconfig>also available as html</a>.</p>

<p>The menuconfig interface is controlled with the following keys:</p>

<li><b>cursor up/down</b> - move to another symbol</li>
<li><b>tab</b> - switch action between edit, help, and exit</li>
<li><b>enter</b> - descend into menu (or help/exit if you hit tab first)</li>
<li><b>esc</b> - exit menu (prompted to save if exiting top level menu)</li>
<li><b>space</b> - change configuration symbol under cursor</li>
<li><b>?</b> - view help for this symbol</li>
<li><b>/</b> - search for a symbol by name</li>

<p>Other configuration interfaces (functionally equivalent to menuconfig):</p>
<li><b>make config</b> - a simple text based question and answer interface
(which does not require curses support, or even a tty)</li>
<li><b>make xconfig</b> - Qt based graphical interface</li>
<li><b>make gconfig</b> - GTK based graphical interface</li>


    <span id="building">
      <span id="Building out of tree">
    <span id="Installing">
    <span id="running">
    <span id="debugging">
      <span id="QEMU">
    <span id="cross compiling">
<span id="Cross compiling vs native compiling">
<p>By default, Linux builds for the same architecture the host system is
running.  This is called "native compiling".  An x86 system building an x86
kernel, x86-64 building x86-64, or powerpc building powerpc are all examples
of native compiling.</p>

<p>Building different binaries than the host runs is called cross compiling.
<a href=>Cross
compiling is hard</a>.  The build system for the Linux kernel supports cross
compiling via a two step process: 1) Specify a different architecture (ARCH)
during the configure, make, and install stages.  2) Supply a cross compiler
(CROSS_COMPILE) which can output the correct kind of binary code.  An
example cross compile command line (building the "arm" architecture) looks
like this:</p>

<pre>make ARCH=arm menuconfig
make ARCH=arm CROSS_COMPILE=armv5l-
</pre>

<p>To specify a different architecture than the host, either define the "ARCH"
environment variable or else add "ARCH=xxx" to the make command line for each
of the make config, make, and make install stages.  The acceptable values for
ARCH are the names of the directories in the "arch" subdirectory of the Linux
kernel source code, see <a href="#Architectures">Architectures</a> for
details.  All stages of the build must use the same ARCH value, and building a
second architecture in the same source directory requires "make distclean".
(Just "make clean" isn't sufficient, things like the include/asm symlink need
to be removed and recreated.)</p>

<p>To specify a cross compiler prefix, define the CROSS_COMPILE environment
variable (or add CROSS_COMPILE= to each make command line).  Native compiler
tools, which output code aimed at the environment they're running in, usually
have a simple name ("gcc", "ld", "strip").  Cross compilers usually add a
prefix to the name of each tool, indicating the target they produce code for.
To tell the Linux kernel build to use a cross compiler named "armv4l-gcc" (and
corresponding "armv4l-ld" and "armv4l-strip") specify "CROSS_COMPILE=armv4l-".
(Prefixes ending in a dash are common, and forgetting the trailing dash in
CROSS_COMPILE is a common mistake.  Don't forget to add the cross compiler
tools to your $PATH.)</p>
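<p>As a quick sanity check before building, the following sketch expands a
prefix into the tool names described above and reports whether each one is
reachable through $PATH.  The helper name <b>cross_tool_names</b> is
hypothetical, and the list of three tools is only the subset mentioned here,
not everything the kernel build uses.</p>

```shell
# Hypothetical helper: print the tool names a CROSS_COMPILE prefix implies.
cross_tool_names() {
    prefix="$1"
    for tool in gcc ld strip; do
        printf '%s%s\n' "$prefix" "$tool"
    done
}

# Check that each implied tool can actually be found through $PATH.
for name in $(cross_tool_names armv4l-); do
    if command -v "$name" >/dev/null 2>&1; then
        echo "found:   $name"
    else
        echo "missing: $name (is the cross compiler in \$PATH?)"
    fi
done
```

<p>Leaving the trailing dash off the prefix shows the mistake immediately:
the helper would then print names like "armv4lgcc", which won't exist.</p>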

      <span id="User Mode Linux">
  <span id="Infrastructure">
    <span id="kconfig">
<p>The Linux configuration system is called Kconfig.  The various
configuration front-ends (such as menuconfig) parse data files
written in the <a href=Documentation/kconfig-language.txt>Kconfig language</a>,
which define the available symbols and provide default values, help entries,
and so on.</p>

<p>The source code for the front ends is in scripts/kconfig.  The
Makefile in this directory defines the make targets for the configuration
front ends.</p>

    <span id="kbuild">
    <span id="build and link (tmppiggy)">

<span id="Installing and using the kernel">
  <span id="Installing">
    <span id="Kernel image">
    <span id="Bootloader">
  <span id="A working Linux root filesystem">
<p><a href=ols/2002/ols2002-pages-176-182.pdf>Advanced Boot Scripts</a></p>
    <span id="Finding and mounting /">
      <span id="initramfs, switch_root vs pivot_root, /dev/console">
    <span id="Running programs">
      <span id="init program and PID 1">
        <span id="What does daemonizing really mean?">
      <span id="Executable formats">
<p>The Linux kernel runs programs in response to the
<a href=xmlman/man3/exec.html>exec</a> syscall, which is called on a
file.  This file must have the
executable bit set, and must be on a filesystem that implements mmap() and which
isn't mounted with the "noexec" option.  The kernel understands
several different <a href="#executable_file_formats">executable file
formats</a>, the most common of which are shell scripts and ELF binaries.</p>
        <span id="Shell scripts">
<p>If the first two bytes of an executable file are the characters "#!", the
file is treated as a script file.  The kernel parses the first line of the file
(until the first newline), and the first word following the #! is used as
the absolute path to the script's interpreter,
which must be an executable file.  Any remaining text on the first
line of the file is passed to that interpreter as a single argument (Linux
does not split it on whitespace).  The interpreter's next argument is the name of
the script file, followed by the arguments given on the command line.</p>

<p>To see this behavior in action, run the following:</p>
<pre>echo '#!/bin/echo hello' > temp
chmod +x temp
./temp one two three
</pre>

<p>The result should be:</p>
<blockquote>hello ./temp one two three</blockquote>

<p>This is how shell scripts, perl, python, and other scripting languages
work.  Even C code can be run as a script by installing the
<a href=>tinycc</a> package,
adding "#!/usr/bin/tcc -run" to the start of the .c file, and setting the
executable bit on the .c file.</p>
        <span id="ELF">
          <span id="Shared libraries">

      <span id="C library">
<p>Most userspace programs access operating system functionality through a C
library, usually installed at "/lib/*".  The C library wraps system
calls, and provides implementations of various standard functions.</p>

<p>Because almost all other programming languages are implemented in C
(including python, perl, php, java, javascript, ruby, flash, and just about
everything else), programs written in other languages also make use of the
C library to access operating system services.</p>

<p>The most common C library implementations for Linux are
<a href=>glibc</a>
and <a href=>uClibc</a>.  Both are full-featured
implementations capable of supporting a full-featured desktop Linux
distribution.</p>

<p>The main advantage of glibc is that it's the standard implementation used by the
largest desktop and server distributions, and has more features than any other
implementation.  The main advantage of uClibc is that it's much smaller and
simpler than glibc while still implementing almost all the same functionality.
For comparison, a "hello world" program statically linked against glibc is half
a megabyte when stripped, while the same program statically linked against
uClibc strips down to 7k.</p>

<p>Other commonly used special-purpose C library implementations include
<a href=>klibc</a> and
<a href=>newlib</a>.</p>

<span id="Exporting kernel headers">
<p>Building a C library from source code requires a special set
of Linux kernel header files, which describe the API of the specific version
of the Linux kernel the C library will interface with.  However, the header
files in the kernel source code are designed to build the kernel and contain
a lot of internal information that would only confuse userspace.  These
kernel headers must be "exported", filtering them for use by user space.</p>

<p>Modern Linux kernels export kernel headers via
the "make headers_install" command.  See
<a href=Documentation/make/headers_install.txt>exporting kernel headers for
use by userspace</a> for more information.</p>
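<p>A minimal sketch of exporting headers to a directory of your choosing.
INSTALL_HDR_PATH is the kernel makefile variable that selects the destination;
the helper name <b>export_kernel_headers</b> and the example paths are
hypothetical.</p>

```shell
# Hypothetical helper: export_kernel_headers KERNEL_SRC_DIR DEST_DIR
# Runs the kernel's "headers_install" target, placing the filtered
# headers under DEST_DIR/include.
export_kernel_headers() {
    src_dir="$1"
    dest_dir="$2"

    # Only run make inside something that looks like a source tree.
    [ -f "$src_dir/Makefile" ] || { echo "no kernel source in: $src_dir" >&2; return 1; }

    make -C "$src_dir" headers_install INSTALL_HDR_PATH="$dest_dir"
}

# Example (paths are placeholders):
#   export_kernel_headers ./linux-2.6.32 /tmp/kernel-headers
```

<p>When cross compiling a C library, add the same ARCH= value to the make
command line that the kernel build itself uses.</p>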
      <span id="Dynamic loader">
    <span id="FHS directories">
      <p>FHS spec</p>
      <a href="pending/hotplug.txt">populating /dev from sysfs</a>.

<span id="Reading the source code">
  <span id="Source code layout">
    <span id="Following the boot process">
    <span id="Major subsystems">
    <span id="Architectures">
  <span id="Concept vs implementation">
    <p>Often the first implementation of a concept gets replaced.
       Journaling != reiserfs, virtualization != xen, devfs gave way to udev...
       Don't let your excitement for the concept blind you to the possibility
       of alternate implementations.</p>
  <span id="Concepts">
    <span id="rbtree">
    <span id="rcu">
<p>RCU stands for "Read Copy Update".  The technique is a lockless way to manage data structures
(such as linked lists or trees) on SMP systems, using a specific sequence of reads and updates,
plus a garbage collection step, to avoid the need for locks in both the read and the update
paths.</p>

<p>RCU was invented by Paul McKenney, who maintains an excellent page of
<a href=>RCU documentation</a>.
The Linux kernel also contains some <a href=Documentation/RCU>additional RCU
documentation</a>.</p>

<p>RCU cannot be configured out of the kernel, but the kconfig symbol
<a href=menuconfig/lib-Kconfig.debug.html#RCU_TORTURE_TEST>CONFIG_RCU_TORTURE_TEST</a> controls the
<a href=Documentation/RCU/torture.txt>RCU Torture test module</a>.</p>

<li><a href=ols/2001/read-copy.pdf>Read-Copy Update</a> (OLS 2001)</li>


<span id="Kernel infrastructure">
  <span id="Process Scheduler">

<span id="History of the Linux Process Scheduler">
<p>The original Linux process scheduler was a simple design based on
a goodness() function that recalculated the priority of every task at every
context switch, to find the next task to switch to.  This served almost
unchanged through the 2.4 series, but didn't scale to large numbers of
processes, nor to SMP.  By 2001 there were calls for
change (such as the OLS paper <a href=ols/2001/elss.pdf>Enhancing Linux
Scheduler Scalability</a>), and the issue
<a href=>came to a head</a> in December 2001.</p>

<p>In January 2002, Ingo Molnar
<a href=>introduced the "O(1)" process scheduler</a> for the 2.5 kernel series, a design
based on separate "active" and "expired" arrays, one per processor.  As the name
implied, this found the next task to switch to in constant time no matter
how many processes the system was running.</p>

<p>Other developers (<a href=>such as Con Kolivas</a>) started working on it,
and began a period of extensive scheduler development.  The early history
of Linux O(1) scheduler development was covered by the website Kernel

<p>During 2002 this work included
<a href=>preemption</a>,
<a href=>User Mode Linux support</a>,
<a href=>new drops</a>,
<a href=>runtime tuning</a>,
<a href=>NUMA support</a>,
<a href=>cpu affinity</a>,
<a href=>scheduler hints</a>,
<a href=>64-bit support</a>,
<a href=>backports to the 2.4 kernel</a>,
<a href=>SCHED_IDLE</a>,
discussion of <a href=>gang scheduling</a>,
<a href=>more NUMA</a>,
and <a href=>even more NUMA</a>.  By the end of 2002, the O(1) scheduler was becoming
the standard <a href=>even in the 2.4 series</a>.</p>

<p>2003 saw support added for
<a href=>hyperthreading as a NUMA variant</a>,
<a href=>interactivity bugfix</a>,
<a href=>starvation and affinity bugfixes</a>,
<a href=>more NUMA improvements</a>,
<a href=>interactivity improvements</a>,
<a href=>even more NUMA improvements</a>,
a proposal for <a href=>Variable Scheduling Timeouts</a> (the first rumblings of what
would later come to be called "dynamic ticks"),
<a href=>more on hyperthreading</a>...</p>

<p>In 2004 there was work on <a href=>load balancing and priority handling</a>, and
<a href=>still more work on hyperthreading</a>...</p>

<p>In 2004 developers proposed several extensive changes to the O(1) scheduler.
Linux Weekly News wrote about Nick Piggin's
<a href=>domain-based scheduler</a>
and Con Kolivas' <a href=>staircase scheduler</a>.  The follow-up article <a href=>Scheduler tweaks get serious</a> covers both.  Nick's scheduling domains
were merged into the 2.6 series.</p>

<p>Linux Weekly News also wrote about other scheduler work:</p>

<li><a href=>Filtered wakeups</a></li>
<li><a href=>When should a process be migrated</a></li>
<li><a href=>Pluggable and realtime schedulers</a></li>
<li><a href=>Low latency for audio applications</a></li>
<li><a href=>Solving starvation problems in the scheduler</a></li>
<li><a href=>SMPnice</a></li>

<p>In 2007, Con Kolivas proposed a new scheduler, <a href=>The Rotating Staircase Deadline Scheduler</a>, which
<a href=>hit a snag</a>.  Ingo
Molnar came up with a new scheduler, which he named the
<a href=>Completely Fair Scheduler</a>,
described in the LWN writeups
<a href=>Schedulers: the plot thickens</a>,
<a href=>this week in the scheduling discussion</a>, and
<a href=>CFS group scheduling</a>.</p>

<p>The CFS scheduler was merged into 2.6.23.</p>

    <span id="fork, exec">
    <span id="sleep">
    <span id="realtime">
<p><a href=ols/2001/rtai.pdf>The Real-Time Application Interface</a> (OLS 2001, obsolete)</p>
  <span id="Timers">
    <span id="Interrupt handling">
  <span id="memory management">
    <li><a href="gorman">Understanding the Linux Virtual Memory Manager</a>, online book by Mel Gorman.</li>
    <li> What every programmer should know about memory, article series by Ulrich Drepper,
<a href=>one</a>,
<a href=>two</a>,
<a href=>three</a>,
<a href=>four</a>,
<a href=>five</a>.</li>
    <li>Ars Technica RAM guide, article series by Jon "Hannibal" Stokes, parts
<a href=>one</a>,
<a href=>two</a>,
<a href=>three</a></li>
    <span id="mmap, DMA">
  <span id="vfs">
    <span id="Pipes, files, and ttys">
<p>A pipe can be read from or written to, transmitting a sequence of bytes
in order.</p>

<p>A file can do what a pipe can, and adds the ability to seek to a location,
query the current location, and query the length of the file (all of which are
an integer number of bytes from the beginning of the file).</p>

<p>A tty can do what a pipe can, and adds a speed (in bits per second)
and cursor location (X and Y, with the upper left corner at 0,0).  Oh, and
you can make it go beep.</p>

<p>Note that you can't call lseek() on a tty and you can't call termios
(man 3 termios) functions on a file.  Each can be treated as a pipe.</p>
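<p>The three kinds of object described above can be told apart from a shell
script.  This is only a sketch: the helper name <b>stdin_kind</b> is
hypothetical, and it assumes /dev/stdin is available (as it is on a normal
Linux system).</p>

```shell
# Hypothetical helper: report what kind of object standard input is.
stdin_kind() {
    if [ -t 0 ]; then
        echo tty            # a terminal: termios functions apply
    elif [ -p /dev/stdin ]; then
        echo pipe           # a pipe: sequential bytes only, no seeking
    elif [ -f /dev/stdin ]; then
        echo file           # a regular file: lseek() works
    else
        echo other
    fi
}

# Feed the helper a pipe, then a regular file.
echo hello | stdin_kind
scratch=$(mktemp)
stdin_kind < "$scratch"
rm -f "$scratch"
```

<p>Run with no redirection from an interactive terminal, the same helper
reports a tty.</p>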
    <span id="Filesystems">
      <span id="Types of filesystems (see /proc/filesystems)">
        <span id="Block backed">
<span id="ext2">
  <li><a href=ols/2002/ols2002-pages-117-129.pdf>Online ext2 and ext3 Filesystem Resizing</a> (OLS 2002)</li>
<span id="jffs2">
  <li><a href=ols/2001/jffs2.pdf>JFFS: The Journalling Flash Filesystem</a> (OLS 2001)</li>
<span id="vxfs">
  <li><a href=menuconfig/fs-Kconfig.html#VXFS_FS>CONFIG_VXFS_FS</a></li>
  <li><a href=ols/2002/ols2002-pages-191-196.pdf>Reverse engineering an advanced filesystem</a></li>
        <span id="Ram backed">
          <span id="ramfs">
          <span id="tmpfs">
        <span id="Synthetic">
          <span id="proc">
          <span id="sys">
<p>Although the sysfs filesystem probably wasn't intentionally named after the
Greek myth about pushing a rock to the top of a hill only to see it forever
roll back down again, this is a remarkably accurate analogy for the
task of documenting sysfs.</p>

<p>The maintainers of sysfs do not believe in a stable API, and change
userspace-visible elements from release to release.  The rationale is that
sysfs exports information from inside the kernel to outside the kernel
(what API doesn't?) and the kernel internals change, thus sysfs changes to
reflect it.  This doesn't explain why sysfs regularly changes things that aren't
dictated by kernel internals, such as moving partition directories under block
device directories after initially exporting them at the same level, moving
/sys/block into /sys/devices, removing the "devices" symlink, and so on.</p>

<p>In reality, sysfs is treated as a private API exported for the use of the
"udev" program, which is maintained by the same developers as sysfs.  Any
attempt to use sysfs directly from other programs is condemned by sysfs'
authors as an abuse of sysfs, and attempts to document it are actively resisted
and ridiculed.  (Yes, you must often update udev when you update the kernel.)</p>

<p>The following documentation reflects the current state of sysfs.  This is
likely to change in future, as its maintainers break compatibility with
existing userspace programs they didn't personally write.</p>

          <span id="internal (pipefs)">
          <span id="usbfs">
          <span id="devpts">
          <span id="rootfs">
          <span id="devfs (obsolete)">
<p>Devfs was the first attempt to do a dynamic /dev directory which could change
in response to hotpluggable hardware, by doing the seemingly obvious thing of
creating a kernel filesystem to mount on /dev which would adjust itself as
the kernel detected changes in the available hardware.</p>

<p>Devfs was an interesting learning experience, but turned out to be the wrong
approach, and was replaced by sysfs and udev.  Devfs was removed in kernel
version 2.6.18.  See
<a href=local/hotplug-history.html>the history of hotplug</a> for details.</p>

        <span id="Network">
          <span id="nfs">
<p><a href=ols/2001/nfsv4_ols.pdf>Linux NFS Version 4: Implementation and Administration</a> (OLS 2001)</p>
          <span id="smb/cifs">
          <span id="FUSE">
      <span id="Filesystem drivers">
        <span id="Using">
        <span id="Writing">

  <span id="Drivers">
    <span id="Filesystem">
    <span id="Block (block layer, scsi layer)">
      <span id="SCSI layer">
	<li><a href="Documentation/scsi">Documentation/scsi</a> scsi.txt scsi_mid_low_api.txt scsi-generic.txt scsi_eh.txt</li>
        <li><a href="">SCSI Generic (sg) HOWTO</a></li>
	<li><a href="xmlman/man4/sd.html">man 4 sd</a></li>
        <li><a href="">SCSI standards</a></li>
        <li><a href=ols/2002/ols2002-pages-40-49.pdf>Incrementally Improving the Linux SCSI Subsystem</a> (OLS 2002)</li>
    <span id="Character">
      <span id="serial">
      <span id="keyboard">
      <span id="tty">
        <span id="pty">
      <span id="audio">
      <span id="null">
      <span id="random/urandom">
<p><a href=>Analysis of the Linux Random Number Generator</a> - Zvi Gutterman, Benny Pinkas, Tzachy Reinman (IACR 2006)</p>
      <span id="zero">
    <span id="DRI">
    <span id="Network">

  <span id="Hotplug">
<p><a href=>Hotpluggable devices and the Linux kernel</a> (OLS 2001)</p>
<p><a href=local/hotplug-history.html>The history of hotplug</a></p>
  <span id="Input core">
  <span id="Network">
<p><a href=ols/2001/mipl.pdf>MIPL Mobile IPv6 for Linux in HUT Campus Network MediaPoli</a> (OLS 2001)</p>
<p><a href=ols/2001/sctp.pdf>Linux Kernel SCTP: The Third Transport</a> (OLS 2001)</p>
<p><a href=ols/2002/ols2002-pages-8-30.pdf>TCPIP Network Stack Performance in Linux Kernel 2.4 and 2.5</a></p>
  <span id="Modules">
    <span id="Exported symbols">
      <p>List of exported symbols.</p>
  <span id="Busses">
  <span id="Security">
    <span id="Traditional Unix security model">
Users, groups, files (rwx), signals.
    <span id="More complicated security models">
<p>The traditional Unix security model is too simple to satisfy the
certification requirements of large corporate and governmental organizations,
so several add-on security models have been implemented to increase
complexity.  There is some debate as to which of these (if any) are actually an
improvement.</p>

      <span id="Posix capabilities">
      <span id="SELinux">
<p><a href=ols/2001/selinux.pdf>Meeting Critical Security Objectives with Security-Enhanced Linux</a> (OLS 2001)</p>
<p><a href=ols/2002/ols2002-pages-65-72.pdf>SE Debian: how to make NSA SE Linux work in a distribution</a> (OLS 2002)</p>
    <span id="Encryption">
<p><a href=ols/2002/ols2002-pages-73-92.pdf>The Long Road to the Advanced Encryption Standard</a></p>
  <span id="API (how userspace talks to the kernel)">
    <span id="Syscalls">
    <span id="ioctls">
    <span id="executable file formats">
      <span id="a.out">
      <span id="elf">
        <span id="css, bss, etc.">
      <span id="scripts">
      <span id="flat">
      <span id="misc">
    <span id="Device nodes">
    <span id="Pipes (new pipe infrastructure)">
    <span id="Synthetic filesystems (as API)">

<span id="Hardware">
  <span id="Architectures">
<p>Linux supports many more hardware platforms than its original PC.
The first modern port of Linux was to the DEC Alpha processor
[TODO: open sources]</p>

<p>The most widely used modern ports are i386, x86-64, ARM, mips, and powerpc,
all of which are supported by the emulator <a href=>QEMU</a>.
Bootable kernel and filesystem images for those platforms (bootable under
QEMU) are available <a href=>here</a>.</p>

<p>Alpha, sparc, parisc, and itanium are primarily of historical interest.
Each of those platforms used to have a bigger developer community than it
does now, but has peaked and gone into a pronounced decline.</p>

<p>Most of the other platforms have special-purpose niches.  For example,
super-hitachi is widely used in the Japanese auto industry.</p>


<span id="alpha">
<p>The now-obsolete
<a href=>DEC Alpha</a> was one of the
first 64-bit processors, one of the fastest and cleanest processor
designs of its time, and still has fans to this day.  Despite excellent
performance and widespread use in supercomputers, manufacturing of
Alpha was
<a href=,1000000091,2127122,00.htm>repeatedly
disrupted</a> by a series of acquisitions by PC vendors uninterested in any
non-PC architecture.  Despite pressure from users and even
<a href=>government intervention</a>
to preserve the Alpha processor, new development of the hardware ceased
towards the end of the 1990's.</p>

<p>The legacy of Alpha lives on in the x86-64 architecture.  When
<a href=>Compaq
bought DEC</a> it acquired the rights to the Alpha processor, but not the
chip design team.  Many ex-Alpha chip designers wound up at AMD, where they
<a href=>designed
the Athlon</a> (x86) and Opteron (x86-64) processors.  Intel also
<a href=>licensed</a> and <a href=>incorporated</a>
Alpha technology into all its processor lines.  Internally, modern PC
processors owe more to the Alpha than to the original 8086 processor.</p>

<p>Alpha is of great historical importance to Linux as the
<a href=>first non-PC
port incorporated into Linus's tree</a>, as well as the first 64-bit port.
<a href=>QEMU</a> recently grew preliminary support for
emulating Alpha processors.</p>

<li><a href=></a></li>

<span id="arm">
<p>The ARM processor is the most popular embedded processor, powering 80-90% of
the cell phone market and most battery powered handheld devices.  The iPod,
iPhone, Nokia N800, and Nintendo DS are all ARM-based.</p>

<p>While the x86 family has the world's leading price/performance ratio,
the ARM processor family has the best ratio of power consumption to performance.
By delivering the best bang for the watt, ARM has become overwhelmingly popular
in embedded devices.</p>

<p>ARM originally stood for "Acorn RISC Machine", a processor designed by a British
company in the early 80's to replace the 8-bit 6502 in Acorn's successful BBC
Micro.  Unlike most RISC design efforts which focused on using RISC techniques
to increase performance, Acorn focused on creating a small, simple processor
design, initially with under 25,000 transistors (these days with about 43,000
transistors worth of core logic, before adding a cache and memory controller).
In 1990, the processor design team moved to a new company,
<a href=>ARM Ltd</a>, which doesn't manufacture chips but
instead licenses its designs to other companies interested in fabricating
chips.  This also allows ARM designs to be easily customized, and
embedded in things like network cards or system-on-chip designs.</p>

<p>ARM processor generations are divided by "architecture", which among other
things indicates the instruction set the processor can run:</p>
<li>ARMv3 - The oldest 32-bit ARM architecture, now considered obsolete.</li>
<li>ARMv4 - The oldest architecture still in widespread use.</li>
<li>ARMv5 - The oldest architecture still in production.  The baseline
"modern" architecture.</li>
<li>ARMv6,v7 - An architecture ARM Ltd has used NDA terms to prevent QEMU
developers from releasing support for (apparently because it wants to sell
proprietary emulators, and considers a GPLed emulator a threat).  These
processors run ARMv4/v5 code.</li>

<p>The newest architecture that can be emulated by
<a href=>QEMU</a> is ARMv5TEJ (i.e. ARMv5
with the Thumb, Enhanced DSP, and Java extensions).  Unfortunately, ARM Ltd.
has leveraged its NDAs with prominent open source developers to
<a href=>explicitly
forbid</a> them from contributing ARMv6 support to QEMU, apparently because
it's trying to sell a competing proprietary emulation product.</p>

<p>Newer ARM processors run older instruction sets, and are thus backwards
compatible.  The advantage of newer instruction sets is that they execute
faster (and are thus more energy efficient), and some produce smaller binary
sizes (the "Thumb" extensions are designed specifically for small code size,
but may exchange performance to get it).  Recompiling an ARMv4 program as ARMv5
usually results in a 25% performance improvement.</p>

<li><a href="">The ARM instruction set architecture</a></li>
<li><a href="">List of ARM processors</a></li>
<li><a href="">The ARM Linux web page</a></li>
<li><a href=>List of
over 1500 known ARM systems</a>.</li>
<li><a href=>History of the ARM CPU</a></li>

<span id="ia64">
<p>The Itanium was a failed attempt to create a 64-bit successor to the x86, a
role that went to AMD's x86-64 design instead.  In 1994, Intel partnered with
HP to produce a successor to both x86 and HP's PA-RISC, with a new instruction
set ("ia64") fundamentally different from both.  To support software written
for the older processors, the designers included a complete implementation of
each, because the new chip was already so big and complex that including _two_
entire previous processors wasn't a significant increase to either.  (If this
sounds unlikely to end well...)</p>

<p>The result was a late, slow, inefficient chip that was difficult to
manufacture, more expensive than available alternatives, difficult to write
efficient compilers for, quickly nicknamed "Itanic" and essentially ignored by
the market.  (This was remarkably similar to Intel's earlier
<a href=>i432 project</a>,
a 1970's attempt to jump straight from the 8-bit 8080 to a
32-bit processor which also resulted in a slow, late, overcomplicated
and overpriced design which the industry ignored.  The i432 was finally killed
off by the arrival of the 80286, which outperformed it by a factor of four.
History does repeat itself.)</p>

<p>For comparison purposes, the
<a href=>Ford Edsel</a> sold
64,000 units in its first year.  Itanium took over four years to sell that
many: only <a href=>500 units in
2001</a>, <a href=>3,500 in 2002</a>,
and around
<a href=>19,000 in
2003 and 30,000 in 2004</a>.  In 2005, x86-64 systems emerged as the new
64-bit PC standard, at which point Dell and IBM discontinued their Itanium
servers and HP discontinued its Itanium workstations.</p>

<p>To give a <a href=>sense of perspective</a>, in the first quarter of
2007, the licensees of ARM Ltd. shipped 724 million ARM processors.  (In one
quarter, not a full year.) In the third quarter of 2007, the PC market shipped
<a href=>>68.1 million</a> systems (mostly x86-64).
Over in <a href=>PowerPC land</a>,
from their launch through August 2007 the Wii had sold 9 million units, Xbox
360 8.9 million, and Playstation 3 3.7 million (all three PowerPC based).
Shipments of
<a href=>many other
interesting processor families</a> each number in the millions of units
annually.  The Itanium's cumulative total of 0.05 million
in its first four years combined doesn't even show up on the same graph.</p>
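<p>The arithmetic behind that comparison can be checked with a few lines of
Python.  This is only a sketch using the unit figures quoted above (several of
which are themselves approximate); the variable names are made up for the
example.</p>

```python
# Unit figures as quoted in the text above (some are rounded estimates).
itanium_by_year = {2001: 500, 2002: 3_500, 2003: 19_000, 2004: 30_000}
edsel_first_year = 64_000
arm_one_quarter = 724_000_000   # Q1 2007, all ARM licensees combined
pc_one_quarter = 68_100_000     # Q3 2007, PC market

# Cumulative Itanium sales over its first four years.
itanium_total = sum(itanium_by_year.values())
itanium_millions = itanium_total / 1_000_000

# How many times larger a single quarter of ARM or PC shipments is.
arm_ratio = arm_one_quarter / itanium_total
pc_ratio = pc_one_quarter / itanium_total

print(f"Itanium 2001-2004: {itanium_total:,} units ({itanium_millions:.2f} million)")
print(f"Below the Edsel's first year: {itanium_total < edsel_first_year}")
print(f"One quarter of ARM shipments: about {arm_ratio:,.0f} times that total")
print(f"One quarter of PC shipments: about {pc_ratio:,.0f} times that total")
```

<p>The four yearly figures add up to 53,000 units, which is where the
"0.05 million" above comes from.</p>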

<p>The history of Itanium through 2003 was extensively detailed
<a href=>here</a>.
A more recent obituary for the chip is zdnet's
<a href=>Itanium: A cautionary

<p>Despite the Itanium's failure to gain any marketplace traction
(and <a href=>Linus Torvalds'
personal disdain for the chip</a>), the billions
of dollars poured into Itanium resulted in lots of corporate engineers assigned
to developing extensive Linux support for this virtually nonexistent hardware.
But despite a documented instruction set, no open source emulators run Itanium
code due to lack of interest.  (HP does offer a binary-only Itanium
emulator called
<a href=>SKI</a>, last
updated in 2004.)</p>

<p>Silicon Graphics still produces Itanium systems.  HP no longer produces
Itanium workstations, but offers some Itanium servers.  Intel still spends
money on it.</p>

<span id="m68knommu">
<p>The most popular nommu 68k variant is Coldfire, which uses a subset of the
68k instruction set and has no memory management unit.  Coldfire is currently
used in a small number of high volume devices.  (I.E. Coldfire isn't used in
many different products, but the products it's used in are produced in high
volume.)</p>

<p><a href=ols/2002/ols2002-pages-130-145.pdf>Running Linux on a DSP: Exploiting the Computational Resources of a programmable DSP Micro-Processor with uClinux</a> (OLS 2002)</p>
<span id="mips">
<p>MIPS is probably the main competitor to ARM.  One advantage of MIPS is its
availability as an FPGA program, allowing easy prototyping of custom

<p>SGI produced primarily MIPS systems back in the Irix days.  Sony's
PlayStation 2 and PSP are MIPS based, as are some TiVo and Linksys devices.</p>

<p><a href=>MIPS architecture</a></p>
<p><a href=>The Linux/MIPS web page</a></p>

<span id="parisc">
<p>The PA-RISC is from Hewlett Packard.  It was scheduled to be discontinued in
favor of the Itanium, but the failure of ia64 led to a restart of
PA-RISC development.</p>

  <a href=ols/2002/ols2002-pages-183-190.pdf>Porting Drivers to HP ZX1</a>
<span id="powerpc">
<p>The PowerPC was created in the early 90's by a partnership between IBM,
Apple, and Motorola.  Apple switched to x86-64 in 2005 and Motorola spun off
its processor division as Freescale (which now also manufactures Coldfire and
ARM processors).  But IBM is still strongly behind PowerPC, and the various
users of PowerPC formed a <a href=>consortium</a> to
promote and develop it.</p>

<p>PowerPC is commonly used in high volume set-top boxes and game consoles
such as the PlayStation 3, Xbox 360, and Nintendo Wii.  PowerPC
is the third most common processor type in the
<a href=>Top 500</a> supercomputers list, and was used in
older cell phones (before Motorola spun off Freescale).</p>

<p>The most interesting recent PowerPC development is the <a href=>Cell processor</a>,
which combines a PowerPC core with 8 DSP-like "synergistic processing
units" which can offload compute-intensive tasks like 3D acceleration,
compression, encryption, and so on.</p>

<p>The PowerPC 7xx is the "386" of PowerPC systems, meaning most modern PowerPC
processors can run code compiled for PowerPC 7xx (although such older code
may not take full advantage of the new chip's capabilities, especially
with regard to performance).  The PowerPC family also has
<a href=>64-bit variants</a> (an
early version of which Apple marketed as the "G5") that can still run 32-bit
PowerPC code.</p>

<p>The main exceptions to 7xx compatibility are two embedded subsets of the
PowerPC, which were separately developed by IBM (the 4xx series) and Motorola
(the 8xx series) for use in low power devices.  These are stripped down PowerPC
processors in roughly the same way Coldfire was a stripped down 68k:
instructions were removed from the architecture to get the transistor count
down, and thus code must be recompiled to avoid using those instructions.
Unfortunately, the two vendors chose different subsets of the PowerPC
instruction set, so code compiled for 4xx won't run on 8xx, and vice versa.</p>

<p>The 4xx line was purchased by <a href=>AMCC</a>
(which has the most annoying website design ever, click one of the tabs to
get it to STOP MOVING).  Freescale mostly seems to have lost interest in the
8xx now that Motorola has switched its cell phones to ARM, but information
is <a href=>still available</a>.</p>

<p>The Linux PowerPC developers hang out on the #mklinux channel on</p>

  <a href=ols/2001/iseries.pdf>The Linux Kernel on iSeries</a> (OLS 2001)
  <a href=ols/2001/ppc64.pdf>PowerPC 64-bit Kernel Internals</a> (OLS 2001)
  <a href=>PowerPC implementation reference for QEMU</a>

<span id="ppc">
<p>The "ppc" architecture is obsolete, and
<a href="Documentation/feature-removal-schedule.txt">scheduled for removal
in June 2008</a>.</p>

<p>Once upon a time, ARCH=ppc was for 32-bit PowerPC processors (7xx and up),
and ARCH=powerpc was for 64-bit (970/G5 and up), but the two architectures were
merged together and support for most boards has since been ported over to
powerpc.  If you care about any of the remaining boards, bug the powerpc

<p>Note that ARCH=ppc does not support newer features like "make
headers_install", but ARCH=powerpc does.</p>
<span id="um">
<p>User Mode Linux is a port of Linux to run as a userspace program.  Instead
of talking to the hardware, it makes system calls to the C library.  Instead
of using a memory management unit it makes clever use of mmap.</p>
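<p>As a rough illustration of the mmap idea (not UML's actual code), the
Python sketch below treats an ordinary temporary file as a stand-in for
"physical memory" and maps the same page of it twice.  That backing-file
arrangement and every name in the example are assumptions made for
illustration.  A write through one mapping is visible through the other, which
is the kind of effect an MMU would normally provide by pointing two virtual
addresses at one physical page.</p>

```python
import mmap
import tempfile

# Mapping offsets must be a multiple of this value.
PAGE = mmap.ALLOCATIONGRANULARITY

with tempfile.TemporaryFile() as backing:
    # Hypothetical "physical memory": a two-page file.
    backing.truncate(2 * PAGE)

    # Two independent shared mappings of the same (second) page.
    view_a = mmap.mmap(backing.fileno(), PAGE, offset=PAGE)
    view_b = mmap.mmap(backing.fileno(), PAGE, offset=PAGE)

    # A store through one view shows up when reading through the other.
    view_a[0:5] = b"hello"
    seen = bytes(view_b[0:5])

    view_a.close()
    view_b.close()

print(seen)
```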

<p>UML is sort of like an emulator: it can run Linux programs under itself
(its processes show up as threads to the host system).  It's sometimes used
as a superior "fakeroot", and sometimes used to provide an emulated system
for honeypots or shared hosting services.  It's an excellent tool for
learning and debugging the Linux kernel, because you can use all the normal
userspace debugging techniques, up to and including putting "printf()"
statements into the source code to see what it's doing.  (It's great for
developing things like filesystems, not so good for device drivers.)</p>

<li><a href=ols/2001/uml.pdf>User-Mode Linux</a> (OLS 2001)</li>
<li><a href=ols/2002/ols2002-pages-107-116.pdf>Making Linux Safe for Virtual Machines</a> (OLS 2002)</li>
<li><a href=>User Mode Linux HOWTO</a></li>
<span id="x86_64">
<p><a href=>x86-64</a> is the 64-bit successor to x86, and
the new dominant PC processor.  Essentially all current PCs are now shipping
with x86-64 processors, including traditionally non-x86 architectures such
as Apple's Macintosh and Sun's servers.</p>

  <a href=ols/2001/x86-64.pdf>Porting Linux to x86-64</a> (OLS 2001)

  <span id="DMA, IRQ, MMU (mmap), IOMMU, port I/O">
  <span id="Busses">
    <span id="PCI, USB">
<p><a href=ols/2001/pci.pdf>PCIComm: A Linux Device Driver for Communication over PCI Shared Memory</a> (OLS 2001)</p>
<p><a href=ols/2001/powertweak.pdf>Linux performance tuning using Powertweak</a> (OLS 2001)</p>

<span id="Following Linux development">
  <span id="Distibutions">
  <span id="Releases">
    <span id="Source control">

<p>Linux releases from 0.01 through 2.4.x used no source control system, just
release tarballs.  Releases 2.5.0 through 2.6.12-rc2 used a proprietary
source control system called BitKeeper.  Releases 2.6.12-rc2 through the
present use a source control system called git.</p>

<p>Early Linux development didn't use source control.  Instead Linus would
apply patches to his copy of the source, and periodically release tarball
snapshots of his development tree with a hand-edited changelog file noting who
contributed each patch he'd applied.  Most of these patches were posted to the
Linux Kernel Mailing List, and with a little effort could be fished out of the
mailing list archives.</p>

<p>This worked for many years, but didn't scale as Linux development grew.
Eventually the issue came to a head [link], and after some discussion Linus
decided to use a proprietary distributed source control system called
BitKeeper for the 2.5 development branch.  Linux releases v2.5.0 through
v2.6.12-rc2 were put out this way.</p>

<p>Linux development no longer uses BitKeeper, due to the sudden
<a href=>expiration of the
"Don't piss off Larry license"</a> under which BitKeeper was made available
to the Linux community (<a href=>more here</a>).
This prompted Linus to take a month off from Linux development to write his own
distributed source control system, git.  This is why the current source control
history in the main git development repository goes back to 2.6.12-rc2.
(The revision history from the BitKeeper era was later
<a href=;a=summary>converted to git</a>, but remains separate for historical reasons.)</p>

<p>Linus initially chose BitKeeper because he wanted a distributed source
control system, and the open source alternatives available at the time were
all centralized source control systems.</p>

<p>In a distributed source control
system, every user has a complete copy of the project's entire revision
history, which they can add their own changes to locally.  A centralized source
control system requires a single central location, with user accounts to
control access and either locking the tree or rejecting attempts to apply out
of date patches.  A distributed source control system is instead designed to
download and merge changes from many different repositories after they're
checked in to those other repositories.  The source control system handles
almost all of this merging automatically, because it can trace the changes in
each repository back to a common ancestor, and then use three-way merge
algorithms to better understand the changes.  (Patches don't indicate
which version they apply to.  A distributed source control system has
more information available to it, and uses that information to automatically
merge changes more effectively.)</p>
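<p>The common-ancestor idea can be sketched in a few lines of Python.  This is
a simplified illustration, not how git or BitKeeper are implemented: it merges
whole files rather than individual lines, and the commit names, snapshot
layout, and function names are all invented for the example.</p>

```python
from collections import deque

def ancestors(parents, commit):
    """Return the set holding commit and everything reachable via parents."""
    seen, stack = set(), [commit]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen

def common_ancestor(parents, a, b):
    """Walk back breadth-first from a until reaching a commit b descends from."""
    reachable_from_b = ancestors(parents, b)
    queue, visited = deque([a]), {a}
    while queue:
        node = queue.popleft()
        if node in reachable_from_b:
            return node
        for parent in parents.get(node, ()):
            if parent not in visited:
                visited.add(parent)
                queue.append(parent)
    return None

def three_way_merge(base, ours, theirs):
    """Merge two {path: content} snapshots against their common ancestor."""
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o              # both sides agree
        elif o == b:
            result = t              # only their side changed it
        elif t == b:
            result = o              # only our side changed it
        else:
            conflicts.append(path)  # both changed it differently
            continue
        if result is not None:      # None means the file was deleted
            merged[path] = result
    return merged, conflicts

# Hypothetical history: two repositories that both started from commit "r1".
parents = {"r1": [], "linus": ["r1"], "maintainer": ["r1"]}
snapshots = {
    "r1": {"Makefile": "v1", "README": "v1"},
    "linus": {"Makefile": "v2", "README": "v1"},
    "maintainer": {"Makefile": "v1", "README": "v1", "driver.c": "new"},
}

base = common_ancestor(parents, "linus", "maintainer")
merged, conflicts = three_way_merge(
    snapshots[base], snapshots["linus"], snapshots["maintainer"])
print(base, merged, conflicts)
```

<p>Because each side is compared against the shared ancestor, a change made
in only one repository is taken automatically, and only files changed
differently on both sides are left for a human to resolve.</p>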

<p>This allows Linux subsystem maintainers to develop
and test their own local versions, then send changes to Linus in large batches
(without smearing together the individual patches they committed), and finally
resync with Linus's repository to get everyone else's changes.  Open source
development is already distributed, so distributed source control is a better
fit.  In this development model, Linus's repository serves as a coordination
point, but not a development bottleneck for anything except putting out
releases (which come from Linus's repository).</p>

<p>Linus described the appeal of distributed source control, and his reasons
for developing git, in the Google Video tech talk
<a href=>Linus Torvalds on git</a>.</p>

<p>The Linux kernel source is also available as a
<a href=>mercurial repository</a>, another
popular open source distributed source control system.</p>

<p>This paper still serves as a decent introduction to distributed source
control: <a href=>BitKeeper
for Kernel Development</a> (OLS 2002, obsolete)</p>

  <span id="community">
  lwn, kernel traffic, kernelplanet.
  The four layer (developer, maintainer, subsystem, linus) model.
    Stable API nonsense
    Why reiser4 not in.
  </span id="community">
  <span id="Submitting Patches">

<span id="Glossary">