Linux Kernel Documentation Index

This page collects and organizes documentation about the Linux kernel, taken from many different sources. What is the kernel, how do you build it, how do you use it, how do you change it...

This is a work in progress, and probably always will be. Please let us know on the linux-doc mailing list (on vger.kernel.org) about any documentation you'd like added to this index, and feel free to ask about any topics that aren't covered here yet. This index is maintained by Rob Landley <rob@landley.net>, and tracked in this Mercurial repository. The canonical location for the page is here.


1 Sources of documentation

2 Building from source

3 Installing and using the kernel

4 Reading the source code

5 Kernel infrastructure

6 Hardware

7 Following Linux development

8 Glossary


1 Sources of documentation

These are various upstream sources of documentation, many of which are linked into the linux kernel documentation index.

1.1 From the kernel

1.2 Out on the web

1.3 Standards

1.4 Translations

2 Building from source

2.1 User interface

Building source packages is usually a three step process: configure, build, and install.

The Linux kernel is configured with the command "make menuconfig", built with the command "make", and installed either manually or with the command "make install".

tar xvjf linux-2.6.??.tar.bz2
cd linux-2.6.??
make menuconfig
make
make install

For a description of the make options and targets, type make help.

2.1.1 Get and extract the source

The Linux kernel source code is distributed by kernel.org as tar archives. Grab the most recent "stable" release (using the tiny little letter F link), a file of the form "linux-2.6.*.tar.bz2", from the Linux 2.6 releases directory. Then extract this archive with the command "tar xvjf linux-2.6.*.tar.bz2". (Type the command "man tar" for more information on the tar command.) Then cd into the directory created by extracting the archive.

To obtain other Linux kernel versions (such as development releases and kernels supplied by distributions) see Following Linux development.

To return your Linux kernel source directory to its original (unconfigured) condition after configuring and building in it, either delete the directory with "rm -r" and re-extract it from the tar archive, or run the command "make distclean".

2.1.2 Configuring

Before you can build the kernel, you have to configure it. Configuring selects which features this kernel build should include, and specifies other technical information such as buffer sizes and optimization strategies. This information is stored in a file named ".config" in the top level directory of the kernel source code. To see the various user interfaces to the configuration system, type "make help".
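
As a rough illustration of what ends up in .config, the sketch below looks up a single symbol in a configuration file. It assumes the usual .config line formats ("CONFIG_NAME=value" for enabled symbols, "# CONFIG_NAME is not set" for disabled ones). The helper name config_value is made up for this example and is not part of the kernel build system.

```shell
# config_value FILE SYMBOL
# Print the value of CONFIG_<SYMBOL> in FILE: "y", "m", a number or a
# string, or "n" when the symbol is disabled or absent.
# (Hypothetical helper, for illustration only.)
config_value() {
    file=$1
    sym=$2
    # Enabled symbols look like: CONFIG_FOO=y
    line=$(grep "^CONFIG_${sym}=" "$file" 2>/dev/null | tail -n 1)
    if [ -n "$line" ]; then
        # Strip everything up to and including the first "=".
        printf '%s\n' "${line#*=}"
    else
        # Disabled symbols look like: # CONFIG_FOO is not set
        printf 'n\n'
    fi
}
```

For example, "config_value .config MODULES" run in a configured source directory reports whether loadable module support was selected.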

Note that "make clean" does not delete configuration information, but the more thorough "make distclean" does.

2.1.2.1 Using an existing configuration

Often when building a kernel, an existing .config file is supplied from elsewhere. Copy it into place, and optionally run "make oldconfig" to run the kernel's diagnostics against it to ensure it matches the kernel version you're using, updating anything that's out of sync.

Several preset configurations are shipped with the kernel source code. Run the command find . -name "*_defconfig" in the kernel source directory to see them all. Any of these can be copied to .config and used as a starting point.

The kernel can also automatically generate various configurations, mostly to act as starting points for customization:

  • make defconfig - Set all options to default values
  • make allnoconfig - Set all yes/no options to "n"
  • make allyesconfig - Set all yes/no options to "y"
  • make allmodconfig - Set all yes/no options to "y" and all "yes/module/no" options to "m"
  • make randconfig - Set each option randomly (for debugging purposes).
  • make oldconfig - Update a .config file from a previous version of the kernel to work with the current version.

2.1.2.2 Creating a custom kernel configuration

The most common user interface for configuring the kernel is menuconfig, an interactive terminal based menuing interface invoked through the makefiles via "make menuconfig". This interface groups the configuration questions into a series of menus, showing the current values of each symbol and allowing them to be changed in any order. Each symbol has associated help text, explaining what the symbol does and where to find more information about it. This help text is also available as html.

The menuconfig interface is controlled with the following keys:

  • cursor up/down - move to another symbol
  • tab - switch action between edit, help, and exit
  • enter - descend into menu (or help/exit if you hit tab first)
  • esc - exit menu (prompted to save if exiting top level menu)
  • space - change configuration symbol under cursor
  • ? - view help for this symbol
  • / - search for a symbol by name

Other configuration interfaces (functionally equivalent to menuconfig) include:

  • make config - a simple text based question and answer interface (which does not require curses support, or even a tty)
  • make xconfig - QT based graphical interface
  • make gconfig - GTK based graphical interface

2.1.3 building

2.1.3.1 Building out of tree

2.1.4 Installing

2.1.5 running

2.1.6 debugging

2.1.6.1 QEMU

2.1.7 cross compiling

2.1.7.1 Cross compiling vs native compiling

By default, Linux builds for the same architecture the host system is running. This is called "native compiling". An x86 system building an x86 kernel, x86-64 building x86-64, or powerpc building powerpc are all examples of native compiling.

Building different binaries than the host runs is called cross compiling. Cross compiling is hard. The build system for the Linux kernel supports cross compiling via a two step process: 1) Specify a different architecture (ARCH) during the configure, make, and install stages. 2) Supply a cross compiler (CROSS_COMPILE) which can output the correct kind of binary code. An example cross compile command line (building the "arm" architecture) looks like:

make ARCH=arm menuconfig
make ARCH=arm CROSS_COMPILE=armv5l-

To specify a different architecture than the host, either define the "ARCH" environment variable or else add "ARCH=xxx" to the make command line for each of the make config, make, and make install stages. The acceptable values for ARCH are the names of the directories in the "arch" subdirectory of the Linux kernel source code; see Architectures for details. All stages of the build must use the same ARCH value, and building a second architecture in the same source directory requires "make distclean". (Just "make clean" isn't sufficient: things like the include/asm symlink need to be removed and recreated.)
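
Since the acceptable ARCH values are just directory names, they can be listed straight from the source tree. The following is a minimal sketch; the helper name list_arch_values is hypothetical, and it assumes it is pointed at the top of an extracted kernel source directory.

```shell
# list_arch_values [SRCDIR]
# Print one candidate ARCH value per line: the name of each directory
# under SRCDIR/arch. SRCDIR defaults to the current directory.
# (Hypothetical helper, for illustration only.)
list_arch_values() {
    src=${1:-.}
    for dir in "$src"/arch/*/; do
        # Skip the unexpanded pattern when arch/ is missing or empty.
        [ -d "$dir" ] || continue
        basename "$dir"
    done
}
```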

To specify a cross compiler prefix, define the CROSS_COMPILE environment variable (or add CROSS_COMPILE= to each make command line). Native compiler tools, which output code aimed at the environment they're running in, usually have a simple name ("gcc", "ld", "strip"). Cross compilers usually add a prefix to the name of each tool, indicating the target they produce code for. To tell the Linux kernel build to use a cross compiler named "armv4l-gcc" (and corresponding "armv4l-ld" and "armv4l-strip") specify "CROSS_COMPILE=armv4l-". (Prefixes ending in a dash are common, and forgetting the trailing dash in CROSS_COMPILE is a common mistake. Don't forget to add the cross compiler tools to your $PATH.)
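
The two common mistakes mentioned above (a missing trailing dash, and tools that are not in $PATH) can be caught before starting a long build. This is only a sketch: check_cross_compile is a made-up helper name, and it checks for the "gcc" tool alone rather than the whole toolchain.

```shell
# check_cross_compile PREFIX
# Sanity-check a CROSS_COMPILE prefix before starting a build.
# Returns 0 if "${PREFIX}gcc" is found in $PATH, 1 otherwise.
# (Hypothetical helper; the kernel build does not provide it.)
check_cross_compile() {
    prefix=$1
    case $prefix in
        *-) ;;  # ends in a dash, the usual convention
        *)  echo "warning: prefix '$prefix' has no trailing dash" >&2 ;;
    esac
    if command -v "${prefix}gcc" >/dev/null 2>&1; then
        echo "found ${prefix}gcc"
    else
        echo "error: ${prefix}gcc not found in PATH" >&2
        return 1
    fi
}
```

A build script might then run something like: check_cross_compile armv4l- && make ARCH=arm CROSS_COMPILE=armv4l-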

2.1.7.2 User Mode Linux

2.2 Infrastructure

2.2.1 kconfig

The Linux configuration system is called Kconfig. The various configuration front-ends (such as menuconfig) parse data files written in the Kconfig language, which define the available symbols and provide default values, help entries, and so on.

The source code for the front ends is in scripts/kconfig. The Makefile in this directory defines the make targets for the configuration system.
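
To give a feel for the Kconfig language, the sketch below writes a small made-up Kconfig fragment to a temporary file and lists the symbols it declares. The symbols EXAMPLE_FEATURE and EXAMPLE_BUFFER_SIZE and the helper kconfig_symbols are invented for illustration; the real front ends use a full parser rather than a one-line awk script.

```shell
# kconfig_symbols FILE
# Print the name of every symbol declared in a Kconfig file by a
# "config" or "menuconfig" entry.
# (Hypothetical helper, for illustration only.)
kconfig_symbols() {
    awk '$1 == "config" || $1 == "menuconfig" { print $2 }' "$1"
}

# A made-up fragment: each symbol has a type, a prompt, a default
# value, and (optionally) help text.
example=$(mktemp)
cat > "$example" <<'EOF'
config EXAMPLE_FEATURE
    bool "Enable the example feature"
    default y
    help
      Hypothetical option used only for illustration.

config EXAMPLE_BUFFER_SIZE
    int "Example buffer size"
    default 4096
EOF

kconfig_symbols "$example"
```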

2.2.2 kbuild

2.2.3 build and link (tmppiggy)

3 Installing and using the kernel

3.1 Installing

3.1.1 Kernel image

3.1.2 Bootloader

3.2 A working Linux root filesystem

Advanced Boot Scripts

3.2.1 Finding and mounting /

3.2.1.1 initramfs, switch_root vs pivot_root, /dev/console

3.2.2 Running programs

3.2.2.1 init program and PID 1

3.2.2.1.1 What does daemonizing really mean?

3.2.2.2 Executable formats

The Linux kernel runs programs in response to the exec syscall, which is called on a file. This file must have the executable bit set, and must be on a filesystem that implements mmap() and which isn't mounted with the "noexec" option. The kernel understands several different executable file formats, the most common of which are shell scripts and ELF binaries.
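
The effect of the executable bit is easy to observe from a shell. The sketch below assumes the temporary directory chosen by mktemp is not on a filesystem mounted "noexec"; the function name exec_bit_demo is made up for this example.

```shell
# exec_bit_demo
# Create a tiny script, try to run it without and then with the
# executable bit set, and report what happened.
# (Hypothetical demonstration, for illustration only.)
exec_bit_demo() {
    dir=$(mktemp -d) || return 1
    printf '#!/bin/sh\necho "hello from exec"\n' > "$dir/hello"

    # Without the executable bit the exec is refused.
    chmod a-x "$dir/hello"
    if "$dir/hello" 2>/dev/null; then
        echo "ran without the executable bit"
    else
        echo "refused without the executable bit"
    fi

    # With the bit set, the same file runs.
    chmod a+x "$dir/hello"
    "$dir/hello"
    rm -rf "$dir"
}

exec_bit_demo
```

On a typical system this should report the refusal first and then print the script's greeting.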

3.2.2.2.1 Shell scripts

If the first two bytes of an executable file are the characters "#!", the file is treated as a script file. The kernel parses the first line of the file (until the first newline), and the first argument (immediately following the #! with no space) is used as the absolute path to the script's interpreter, which must be an executable file. Any additional text on the first line of the file is passed as the first argument to that interpreter executable. (Linux passes the rest of the line as a single argument and does not split it on whitespace.) The interpreter's next argument is the name of the script file, followed by the arguments given on the command line.

To see this behavior in action, run the following:

echo '#!/bin/echo hello' > temp
chmod +x temp
./temp one two three

The result should be:

hello ./temp one two three

This is how shell scripts, perl, python, and other scripting languages work. Even C code can be run as a script by installing the tinycc package, adding "#!/usr/bin/tcc -run" to the start of the .c file, and setting the executable bit on the .c file.

3.2.2.2.2 ELF

3.2.2.2.2.1 Shared libraries

3.2.2.3 C library

Most userspace programs access operating system functionality through a C library, usually installed at "/lib/libc.so.*". The C library wraps system calls, and provides implementations of various standard functions.

Because almost all other programming languages are implemented in C (including python, perl, php, java, javascript, ruby, flash, and just about everything else), programs written in other languages also make use of the C library to access operating system services.

The most common C library implementations for Linux are glibc and uClibc. Both are full-featured implementations capable of supporting a full-featured desktop Linux distribution.

The main advantage of glibc is that it's the standard implementation used by the largest desktop and server distributions, and has more features than any other implementation. The main advantage of uClibc is that it's much smaller and simpler than glibc while still implementing almost all the same functionality. For comparison, a "hello world" program statically linked against glibc is half a megabyte when stripped, while the same program statically linked against uClibc strips down to 7k.

Other commonly used special-purpose C library implementations include klibc and newlib.

3.2.2.3.1 Exporting kernel headers

Building a C library from source code requires a special set of Linux kernel header files, which describe the API of the specific version of the Linux kernel the C library will interface with. However, the header files in the kernel source code are designed to build the kernel and contain a lot of internal information that would only confuse userspace. These kernel headers must be "exported", filtering them for use by user space.

Modern Linux kernels (based on 2.6.19.1 and newer) export kernel headers via the "make headers_install" command. See exporting kernel headers for use by userspace for more information.
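
A minimal sketch of exporting headers into a separate directory follows. It relies on the INSTALL_HDR_PATH make variable to choose the destination (the headers land in an "include" directory beneath it). The wrapper name export_kernel_headers and the MAKE override are assumptions made for this example.

```shell
# export_kernel_headers SRCDIR DESTDIR
# Run "make headers_install" in the kernel source tree SRCDIR, placing
# the filtered headers under DESTDIR/include. DESTDIR should be an
# absolute path, because the command runs from inside SRCDIR.
# Set MAKE in the environment to substitute another command.
# (Hypothetical wrapper, for illustration only.)
export_kernel_headers() {
    src=$1
    dest=$2
    if [ ! -f "$src/Makefile" ]; then
        echo "error: $src does not look like a kernel source tree" >&2
        return 1
    fi
    (cd "$src" && ${MAKE:-make} headers_install INSTALL_HDR_PATH="$dest")
}
```

For example: export_kernel_headers /path/to/linux /path/to/sysroot/usr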

3.2.2.4 Dynamic loader

3.2.3 FHS directories

FHS spec

populating /dev from sysfs.

4 Reading the source code

4.1 Source code layout

4.1.1 Following the boot process

4.1.2 Major subsystems

4.1.3 Architectures

4.2 Concept vs implementation

Often the first implementation of a concept gets replaced. Journaling != reiserfs, virtualization != xen, devfs gave way to udev... Don't let your excitement for the concept blind you to the possibility of alternate implementations.

4.3 Concepts

4.3.1 rbtree

4.3.2 rcu

RCU stands for "Read Copy Update". The technique is a lockless way to manage data structures (such as linked lists or trees) on SMP systems, using a specific sequence of reads and updates, plus a garbage collection step, to avoid the need for locks in both the read and the update paths.

RCU was invented by Paul McKenney, who maintains an excellent page of RCU documentation. The Linux kernel also contains some additional RCU Documentation.

RCU cannot be configured out of the kernel, but the kconfig symbol CONFIG_RCU_TORTURE_TEST controls the RCU Torture test module.

References:

5 Kernel infrastructure

5.1 Process Scheduler

5.1.1 History of the Linux Process Scheduler

The original Linux process scheduler was a simple design based on a goodness() function that recalculated the priority of every task at every context switch, to find the next task to switch to. This served almost unchanged through the 2.4 series, but didn't scale to large numbers of processes, nor to SMP. By 2001 there were calls for change (such as the OLS paper Enhancing Linux Scheduler Scalability), and the issue came to a head in December 2001.

In January 2002, Ingo Molnar introduced the "O(1)" process scheduler for the 2.5 kernel series, a design based on separate "active" and "expired" arrays, one per processor. As the name implied, this found the next task to switch to in constant time no matter how many processes the system was running.

Other developers (such as Con Kolivas) started working on it, and began a period of extensive scheduler development. The early history of Linux O(1) scheduler development was covered by the website Kernel Traffic.

During 2002 this work included preemption, User Mode Linux support, new drops, runtime tuning, NUMA support, cpu affinity, scheduler hints, 64-bit support, backports to the 2.4 kernel, SCHED_IDLE, discussion of gang scheduling, more NUMA, and even more NUMA. By the end of 2002, the O(1) scheduler was becoming the standard even in the 2.4 series.

2003 saw support added for hyperthreading as a NUMA variant, interactivity bugfix, starvation and affinity bugfixes, more NUMA improvements, interactivity improvements, even more NUMA improvements, a proposal for Variable Scheduling Timeouts (the first rumblings of what would later come to be called "dynamic ticks"), more on hyperthreading...

In 2004 there was work on load balancing and priority handling, and still more work on hyperthreading...

In 2004 developers proposed several extensive changes to the O(1) scheduler. Linux Weekly News wrote about Nick Piggin's domain-based scheduler and Con Kolivas' staircase scheduler. The follow-up article Scheduler tweaks get serious covers both. Nick's scheduling domains were merged into the 2.6 series.

Linux Weekly News also wrote about other scheduler work:

In 2007, Con Kolivas proposed a new scheduler, The Rotating Staircase Deadline Scheduler, which hit a snag. Ingo Molnar came up with a new scheduler, which he named the Completely Fair Scheduler, described in the LWN writeups Schedulers: the plot thickens, this week in the scheduling discussion, and CFS group scheduling.

The CFS scheduler was merged into 2.6.23.

5.1.2 fork, exec

5.1.3 sleep

5.1.4 realtime

The Real-Time Application Interface (OLS 2001, obsolete)

5.2 Timers

5.2.1 Interrupt handling

5.3 memory management

5.3.1 mmap, DMA

5.4 vfs

5.4.1 Pipes, files, and ttys

A pipe can be read from or written to, transmitting a sequence of bytes in order.

A file can do what a pipe can, and adds the ability to seek to a location, query the current location, and query the length of the file (all of which are an integer number of bytes from the beginning of the file).

A tty can do what a pipe can, and adds a speed (in bits per second) and cursor location (X and Y, with the upper left corner at 0,0). Oh, and you can make it go beep.

Note that you can't call lseek() on a tty and you can't call termios (man 3 termios) functions on a file. Each can be treated as a pipe.
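
From a shell, the three cases can be told apart with the standard file tests. The sketch below assumes /dev/stdin is available (as it is on Linux); the function name classify_stdin is made up for this example.

```shell
# classify_stdin
# Report whether standard input is a tty, a pipe, a regular file, or
# something else.
# (Hypothetical helper, for illustration only.)
classify_stdin() {
    if [ -t 0 ]; then
        echo tty      # termios functions apply, lseek() does not
    elif [ -p /dev/stdin ]; then
        echo pipe     # a sequence of bytes in order, nothing more
    elif [ -f /dev/stdin ]; then
        echo file     # seekable, with a current location and a length
    else
        echo other
    fi
}
```

For example, "echo data | classify_stdin" reports a pipe, while "classify_stdin < /etc/passwd" reports a file.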

5.4.2 Filesystems

5.4.2.1 Types of filesystems (see /proc/filesystems)
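
As a starting point, /proc/filesystems lists the filesystem types the running kernel knows about, marking with "nodev" those that do not need a block device (ram backed, synthetic, and network filesystems). The sketch below splits the list on that marker; the helper name list_filesystems is made up for this example.

```shell
# list_filesystems [FILE]
# Label each filesystem type as "block" (needs a block device) or
# "nodev" (does not). FILE defaults to /proc/filesystems.
# (Hypothetical helper, for illustration only.)
list_filesystems() {
    awk '$1 == "nodev" { print "nodev: " $2; next }
         NF            { print "block: " $1 }' "${1:-/proc/filesystems}"
}
```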

5.4.2.1.1 Block backed

5.4.2.1.1.1 ext2

5.4.2.1.1.2 jffs2

5.4.2.1.1.3 vxfs

5.4.2.1.2 Ram backed

5.4.2.1.2.1 ramfs

5.4.2.1.2.2 tmpfs

5.4.2.1.3 Synthetic

5.4.2.1.3.1 proc

5.4.2.1.3.2 sys

Although the sysfs filesystem probably wasn't intentionally named after the Greek myth about pushing a rock to the top of a hill only to see it forever roll back down again, this is a remarkably accurate analogy for the task of documenting sysfs.

The maintainers of sysfs do not believe in a stable API, and change userspace-visible elements from release to release. The rationale is that sysfs exports information from inside the kernel to outside the kernel (what API doesn't?) and the kernel internals change, thus sysfs changes to reflect it. This doesn't explain why sysfs regularly changes things that aren't dictated by kernel internals, such as moving partition directories under block device directories after initially exporting them at the same level, moving /sys/block into /sys/devices, removing the "devices" symlink, and so on.

In reality, sysfs is treated as a private API exported for the use of the "udev" program, which is maintained by the same developers as sysfs. Any attempt to use sysfs directly from other programs is condemned by sysfs' authors as an abuse of sysfs, and attempts to document it are actively resisted and ridiculed. (Yes, you must often update udev when you update the kernel.)

The following documentation reflects the current state of sysfs. This is likely to change in future, as its maintainers break compatibility with existing userspace programs they didn't personally write.
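
As one small example of reading sysfs directly, the sketch below prints the major:minor device numbers of each block device. It assumes every directory under /sys/block contains a "dev" attribute file holding "major:minor", which (as noted above) may not hold for every kernel version. The helper name list_block_devices is made up for this example.

```shell
# list_block_devices [SYSFS_ROOT]
# Print "name major:minor" for each block device found under
# SYSFS_ROOT/block. SYSFS_ROOT defaults to /sys.
# (Hypothetical helper, for illustration only.)
list_block_devices() {
    root=${1:-/sys}
    for devdir in "$root"/block/*; do
        # Skip entries without a readable "dev" attribute.
        [ -r "$devdir/dev" ] || continue
        printf '%s %s\n' "$(basename "$devdir")" "$(cat "$devdir/dev")"
    done
}
```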

5.4.2.1.3.3 internal (pipefs)

5.4.2.1.3.4 usbfs

http://www.linux-usb.org/USB-guide/x173.html

http://www.linux-usb.org/USB-guide/c607.html

http://www.linuxjournal.com/article/7466

5.4.2.1.3.5 devpts

5.4.2.1.3.6 rootfs

5.4.2.1.3.7 devfs (obsolete)

Devfs was the first attempt to do a dynamic /dev directory which could change in response to hotpluggable hardware, by doing the seemingly obvious thing of creating a kernel filesystem to mount on /dev which would adjust itself as the kernel detected changes in the available hardware.

Devfs was an interesting learning experience, but turned out to be the wrong approach, and was replaced by sysfs and udev. Devfs was removed in kernel version 2.6.18. See the history of hotplug for details.

5.4.2.1.4 Network

5.4.2.1.4.1 nfs

Linux NFS Version 4: Implementation and Administration (OLS 2001)

5.4.2.1.4.2 smb/cifs

5.4.2.1.4.3 FUSE

5.4.2.2 Filesystem drivers

5.4.2.2.1 Using

5.4.2.2.2 Writing

5.5 Drivers

5.5.1 Filesystem

5.5.2 Block (block layer, scsi layer)

5.5.2.1 SCSI layer

5.5.3 Character

5.5.3.1 serial

5.5.3.2 keyboard

5.5.3.3 tty

5.5.3.3.1 pty

5.5.3.4 audio

5.5.3.5 null

5.5.3.6 random/urandom

Analysis of the Linux Random Number Generator - Zvi Gutterman, Benny Pinkas, Tzachy Reinman (iacr 2006)

5.5.3.7 zero

5.5.4 DRI

5.5.5 Network

5.6 Hotplug

Hotpluggable devices and the Linux kernel (OLS 2001)

The history of hotplug

5.7 Input core

5.8 Network

physical
  plip
  serial/slip/ppp
  ethernet
routing
  ipv4
  ipv6

MIPL Mobile IPv6 for Linux in HUT Campus Network MediaPoli (OLS 2001)

Linux Kernel SCTP: The Third Transport (OLS 2001)

TCPIP Network Stack Performance in Linux Kernel 2.4 and 2.5

5.9 Modules

5.9.1 Exported symbols

EXPORT_SYMBOL() vs EXPORT_SYMBOL_GPL()

List of exported symbols.

5.10 Busses

5.11 Security

5.11.1 Traditional Unix security model

Users, groups, files (rwx), signals.

5.11.2 More complicated security models

The traditional Unix security model is too simple to satisfy the certification requirements of large corporate and governmental organizations, so several add-on security models have been implemented to increase complexity. There is some debate as to which of these (if any) are actually an improvement.

5.11.2.1 Posix capabilities

http://www.gentoo.org/proj/en/hardened/capabilities.xml

5.11.2.2 SELinux

Meeting Critical Security Objectives with Security-Enhanced Linux (OLS 2001)

SE Debian: how to make NSA SE Linux work in a distribution (OLS 2002)

5.11.3 Encryption

The Long Road to the Advanced Encryption Standard

5.12 API (how userspace talks to the kernel)

5.12.1 Syscalls

5.12.2 ioctls

5.12.3 executable file formats

5.12.3.1 a.out

5.12.3.2 elf

5.12.3.2.1 css, bss, etc.

5.12.3.3 scripts

5.12.3.4 flat

5.12.3.5 misc

5.12.4 Device nodes

5.12.5 Pipes (new pipe infrastructure)

5.12.6 Synthetic filesystems (as API)

6 Hardware

6.1 Architectures

Linux supports many more hardware platforms than its original PC. The first modern port of Linux was to the DEC Alpha processor [TODO: open sources]

The most widely used modern ports are i386, x86-64, ARM, mips, and powerpc, all of which are supported by the emulator QEMU. Bootable kernel and filesystem images for those platforms (bootable under QEMU) are available here.

Alpha, sparc, parisc, and itanium are primarily of historical interest. Each of those platforms used to have a bigger developer community than it does now, but has peaked and gone into a pronounced decline.

Most of the other platforms have special-purpose niches. For example, SuperH (the "sh" architecture, originally from Hitachi) is widely used in the Japanese auto industry.

avr32
blackfin
cris
frv
h8300
i386
m32r
m68k
s390
sh
sh64
sparc
sparc64
v850
xtensa

6.1.1 alpha

The now-obsolete DEC Alpha was one of the first 64-bit processors, one of the fastest and cleanest processor designs of its time, and still has fans to this day. Despite excellent performance and widespread use in supercomputers, manufacturing of Alpha was repeatedly disrupted by a series of acquisitions by PC vendors uninterested in any non-PC architecture. Despite pressure from users and even government intervention to preserve the Alpha processor, new development of the hardware ceased towards the end of the 1990s.

The legacy of Alpha lives on in the x86-64 architecture. When Compaq bought DEC it acquired the rights to the Alpha processor, but not the chip design team. Many ex-Alpha chip designers wound up at AMD, where they designed the Athlon (x86) and Opteron (x86-64) processors. Intel also licensed and incorporated Alpha technology into all its processor lines. Internally, modern PC processors owe more to the Alpha than to the original 8086 processor.

Alpha is of great historical importance to Linux as the first non-PC port incorporated into Linus's tree, as well as the first 64-bit port. QEMU recently grew preliminary support for emulating Alpha processors.

6.1.2 arm

The ARM processor is the most popular embedded processor, powering 80-90% of the cell phone market and most battery powered handheld devices. The iPod, iPhone, Nokia N800, and Nintendo DS are all ARM-based.

While the x86 family has the world's leading price/performance ratio, the ARM processor family has the best ratio of power consumption to performance. By delivering the best bang for the watt, ARM has become overwhelmingly popular in embedded devices.

ARM originally stood for "Acorn RISC Machine", a processor designed by a British company in the early 80s to replace the 8-bit 6502 in Acorn's successful BBC Micro. Unlike most RISC design efforts, which focused on using RISC techniques to increase performance, Acorn focused on creating a small, simple processor design, initially with under 25,000 transistors (these days with about 43,000 transistors worth of core logic, before adding a cache and memory controller). In 1990, the processor design team moved to a new company, ARM Ltd, which doesn't manufacture chips but instead licenses its designs to other companies interested in fabricating chips. This also allows ARM designs to be easily customized, and embedded in things like network cards or system-on-chip designs.

ARM processor generations are divided by "architecture", which among other things indicates the instruction set the processor can run:

  • ARMv3 - The oldest 32-bit ARM architecture, now considered obsolete.
  • ARMv4 - The oldest architecture still in widespread use.
  • ARMv5 - The oldest architecture still in production. The baseline "modern" architecture.
  • ARMv6, v7 - Architectures that ARM Ltd. has used NDA terms to prevent QEMU developers from releasing support for (apparently because it wants to sell proprietary emulators, and considers a GPLed emulator a threat). These processors run ARMv4/v5 code.

The newest architecture that can be emulated by QEMU is ARMv5TEJ (i.e. ARMv5 with the Thumb, Enhanced DSP, and Java extensions). Unfortunately, ARM Ltd. has leveraged its NDAs with prominent open source developers to explicitly forbid them from contributing ARMv6 support to QEMU, apparently because it's trying to sell a competing proprietary emulation product.

Newer ARM processors run older instruction sets, and are thus backwards compatible. The advantage of newer instruction sets is that they execute faster (and are thus more energy efficient), and some produce smaller binary sizes (the "Thumb" extensions are designed specifically for small code size, but may exchange performance to get it). Recompiling an ARMv4 program as ARMv5 usually results in a 25% performance improvement.

6.1.3 ia64

The Itanium was a failed attempt to create a 64-bit successor to the x86, a role that went to AMD's x86-64 design instead. In 1994, Intel partnered with HP to produce a successor to both x86 and HP's PA-RISC, with a new instruction set ("ia64") fundamentally different from both. To support software written for the older processors, the designers included a complete implementation of each, because the new chip was already so big and complex that including _two_ entire previous processors wasn't a significant increase to either. (If this sounds unlikely to end well...)

The result was a late, slow, inefficient chip that was difficult to manufacture, more expensive than available alternatives, difficult to write efficient compilers for, quickly nicknamed "Itanic" and essentially ignored by the market. (This was remarkably similar to Intel's earlier i432 project, a 1970s attempt to jump straight from the 8-bit 8080 to a 32-bit processor, which also resulted in a slow, late, overcomplicated and overpriced design which the industry ignored. The i432 was finally killed off by the arrival of the 80286, which outperformed it by a factor of four. History does repeat itself.)

For comparison purposes, the Ford Edsel sold 64,000 units in its first year. Itanium took over four years to sell that many: only 500 units in 2001, 3,500 in 2002, and around 19,000 in 2003 and 30,000 in 2004. In 2005, x86-64 systems emerged as the new 64-bit PC standard, at which point Dell and IBM discontinued their Itanium servers and HP discontinued its Itanium workstations.

To give a sense of perspective, in the first quarter of 2007, the licensees of ARM Ltd. shipped 724 million ARM processors. (In one quarter, not a full year.) In the third quarter of 2007, the PC market shipped 68.1 million systems (mostly x86-64; see http://www.news.com/8301-10784_3-9825843-7.html). Over in PowerPC land, from their launch through August 2007 the Wii had sold 9 million units, Xbox 360 8.9 million, and PlayStation 3 3.7 million (all three PowerPC based). Shipments of many other interesting processor families each number in the millions of units annually. The Itanium's cumulative total of 0.05 million in its first four years combined doesn't even show up on the same graph.

The history of Itanium through 2003 was extensively detailed here. A more recent obituary for the chip is zdnet's Itanium: A cautionary tale.

Despite the Itanium's failure to gain any marketplace traction (and Linus Torvalds' personal disdain for the chip), the billions of dollars poured into Itanium resulted in lots of corporate engineers assigned to developing extensive Linux support for this virtually nonexistent hardware. But despite a documented instruction set, no open source emulators run Itanium code due to lack of interest. (HP does offer a binary-only Itanium emulator called SKI, last updated in 2004.)

Silicon Graphics still produces Itanium systems. HP no longer produces Itanium workstations, but offers some Itanium servers. Intel still spends money on it.

6.1.4 m68knommu

The most popular nommu 68k variant is Coldfire, which uses a subset of the 68k instruction set and has no memory management unit. Coldfire is currently used in a small number of high volume devices. (That is, Coldfire isn't used in many different products, but the products it's used in are produced in high volume.)

Running Linux on a DSP: Exploiting the Computational Resources of a programmable DSP Micro-Processor with uClinux (OLS 2002)

6.1.5 mips

MIPS is probably the main competitor to ARM. One advantage of MIPS is its availability as an FPGA program, allowing easy prototyping of custom hardware.

SGI produced primarily MIPS systems back in the Irix days. Sony's PlayStation 2 and PSP are MIPS based, as are some Tivo and Linksys devices.

MIPS architecture

The Linux/MIPS web page

6.1.6 parisc

The PA-RISC is from Hewlett Packard. It was scheduled to be discontinued in favor of the Itanium, but the failure of ia64 led to a restart of PA-RISC development.

Porting Drivers to HP ZX1

6.1.7 powerpc

The PowerPC was created in the early 90s by a partnership between IBM, Apple, and Motorola. Apple switched to x86-64 in 2005 and Motorola spun off its processor division as Freescale (which now also manufactures Coldfire and ARM processors). But IBM is still strongly behind PowerPC, and the various users of PowerPC formed a consortium to promote and develop it.

PowerPC is commonly used in high volume set-top boxes and game consoles such as the PlayStation 3, Xbox 360, and Nintendo Wii. PowerPC is the third most common processor type in the Top 500 supercomputers list, and was used in older cell phones (before Motorola spun off Freescale).

The most interesting recent PowerPC development is the Cell processor, which combines a PowerPC core with 8 DSP-like "synergistic processing units" which can offload compute-intensive tasks like 3D acceleration, compression, encryption, and so on.

The PowerPC 7xx is the "386" of PowerPC systems, meaning most modern PowerPC processors can run code compiled for PowerPC 7xx (although such older code may not take full advantage of the new chip's capabilities, especially with regard to performance). The PowerPC family also has 64-bit variants (an early version of which Apple marketed as the "G5") that can still run 32-bit PowerPC code.

The main exceptions to 7xx compatibility are two embedded subsets of the PowerPC, which were separately developed by IBM (the 4xx series) and Motorola (the 8xx series) for use in low power devices. These are stripped down PowerPC processors in roughly the same way Coldfire was a stripped down 68k: instructions were removed from the architecture to get the transistor count down, and thus code must be recompiled to avoid using those instructions. Unfortunately, the two vendors chose different subsets of the PowerPC instruction set, so code compiled for 4xx won't run on 8xx, and vice versa.

The 4xx line was purchased by AMCC (which has the most annoying website design ever, click one of the tabs to get it to STOP MOVING). Freescale mostly seems to have lost interest in the 8xx now that Motorola has switched its cell phones to ARM, but information is still available.

The Linux PowerPC developers hang out on the #mklinux channel on irc.freenode.net.

The Linux Kernel on iSeries (OLS 2001)

PowerPC 64-bit Kernel Internals (OLS 2001)

PowerPC implementation reference for QEMU

6.1.8 ppc

The "ppc" architecture is obsolete, and scheduled for removal in June 2008.

Once upon a time, ARCH=ppc was for 32-bit PowerPC processors (7xx and up), and ARCH=powerpc was for 64-bit (970/G5 and up), but the two architectures were merged together and support for most boards has since been ported over to powerpc. If you care about any of the remaining boards, bug the powerpc maintainers.

Note that ARCH=ppc does not support newer features like "make headers_install", but ARCH=powerpc does.

6.1.9 um

User Mode Linux is a port of Linux to run as a userspace program. Instead of talking to the hardware, it makes system calls to the C library. Instead of using a memory management unit, it makes clever use of mmap.

UML is sort of like an emulator: it can run Linux programs under itself (its processes show up as threads to the host system). It's sometimes used as a superior "fakeroot", and sometimes used to provide an emulated system for honeypots or shared hosting services. It's an excellent tool for learning and debugging the Linux kernel, because you can use all the normal userspace debugging techniques, up to and including putting "printf()" statements into the source code to see what it's doing. (It's great for developing things like filesystems, not so good for device drivers.)

6.1.10 x86_64

x86-64 is the 64-bit successor to x86, and the new dominant PC processor. Essentially all current PCs now ship with x86-64 processors, including traditionally non-x86 product lines such as Apple's Macintosh and Sun's servers.

Porting Linux to x86-64 (OLS 2001)

6.2 DMA, IRQ, MMU (mmap), IOMMU, port I/O

6.3 Busses

6.3.1 PCI, USB

http://www.linux-usb.org/USB-guide/book1.html

Documentation/usb

PCIComm: A Linux Device Driver for Communication over PCI Shared Memory (OLS 2001)

Linux performance tuning using Powertweak (OLS 2001)

7 Following Linux development

7.1 Distributions

7.2 Releases

7.2.1 Source control

Linux releases from 0.01 through 2.4.x used no source control system, just release tarballs. Releases 2.5.0 through 2.6.12-rc2 used a proprietary source control system called BitKeeper. Releases 2.6.12-rc2 through the present use a source control system called git.

Early Linux development didn't use source control. Instead Linus would apply patches to his copy of the source, and periodically release tarball snapshots of his development tree with a hand-edited changelog file noting who contributed each patch he'd applied. Most of these patches were posted to the Linux Kernel Mailing List, and with a little effort could be fished out of the mailing list archives.

This worked for many years, but didn't scale as Linux development grew. Eventually the issue came to a head [link], and after some discussion Linus decided to use a proprietary distributed source control system called BitKeeper for the 2.5 development branch. Linux releases v2.5.0 through v2.6.12-rc2 were put out this way.

Linux development no longer uses BitKeeper, due to the sudden expiration of the "Don't piss off Larry license" under which BitKeeper was made available to the Linux community (more here). This prompted Linus to take a month off from Linux development to write his own distributed source control system, git. This is why the current source control history in the main git development repository goes back to 2.6.12-rc2. (The revision history from the BitKeeper era was later converted to git, but remains separate for historical reasons.)

Linus initially chose BitKeeper because he wanted a distributed source control system, and the open source alternatives available at the time were all centralized source control systems.

In a distributed source control system, every user has a complete copy of the project's entire revision history, which they can add their own changes to locally. A centralized source control system requires a single central location, with user accounts to control access and either locking the tree or rejecting attempts to apply out of date patches. A distributed source control system is instead designed to download and merge changes from many different repositories after they're checked in to those other repositories. The source control system handles almost all of this merging automatically, because it can trace the changes in each repository back to a common ancestor, and then use three-way merge algorithms to better understand the changes. (Patches don't indicate which version they apply to. A distributed source control system has more information available to it, and uses that information to automatically merge changes more effectively.)
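The common-ancestor merge described above can be tried on a small scale with git. What follows is a minimal sketch, not a description of the actual kernel workflow: the repository names "upstream" and "maintainer", the file notes.txt, and the commit identity are all made up for illustration, and everything happens in a throwaway temporary directory.

```shell
#!/bin/sh
# Sketch of a distributed merge. Repository names, file contents, and the
# commit identity are hypothetical, chosen only for this illustration.
set -e

work=$(mktemp -d)
cd "$work"

# Identity for the throwaway commits, so no global git config is needed.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

# 1. The coordinating repository starts with one commit: the common ancestor.
git init -q upstream
printf '%s\n' one two three four five six seven > upstream/notes.txt
git -C upstream add notes.txt
git -C upstream commit -q -m "common ancestor"
ancestor=$(git -C upstream rev-parse HEAD)

# 2. A maintainer clones the complete history and commits a change locally.
git clone -q upstream maintainer
printf '%s\n' one two three four five six 'seven (maintainer)' > maintainer/notes.txt
git -C maintainer commit -q -a -m "maintainer: change last line"

# 3. Meanwhile the coordinating repository gains a different change.
printf '%s\n' 'one (upstream)' two three four five six seven > upstream/notes.txt
git -C upstream commit -q -a -m "upstream: change first line"

# 4. Fetch the maintainer's history and find where the two sides diverged.
git -C upstream fetch -q ../maintainer HEAD
base=$(git -C upstream merge-base HEAD FETCH_HEAD)

# 5. Three-way merge: comparing both sides against the ancestor shows which
#    side changed which line, so both changes are kept automatically.
git -C upstream merge -q --no-edit FETCH_HEAD

if [ "$base" = "$ancestor" ]; then
    echo "merge base is the original commit"
fi
cat upstream/notes.txt
```

After the merge, notes.txt in the "upstream" repository carries both the changed first line and the changed last line, and neither side had to say which version its change applied to. The sketch uses fetch followed by merge rather than "git pull" so that each step is visible separately.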

This allows Linux subsystem maintainers to develop and test their own local versions, then send changes to Linus in large batches (without smearing together the individual patches they committed), and finally resync with Linus's repository to get everyone else's changes. Open source development is already distributed, so distributed source control is a better fit. In this development model, Linus's repository serves as a coordination point, but not a development bottleneck for anything except putting out releases (which come from Linus's repository).

Linus described the appeal of distributed source control, and his reasons for developing git, in the Google Video tech talk Linus Torvalds on git.

The Linux kernel source is also available as a repository for Mercurial, another popular open source distributed source control system.

This paper still serves as a decent introduction to distributed source control: BitKeeper for Kernel Development (OLS 2002, obsolete)

7.3 Community

  CATB
  http://vger.kernel.org/vger-lists.html
  http://www.tux.org/lkml/
  lwn, kernel traffic, kernelplanet.
  http://www.kernel.org/faq
    http://www.kernel.org/kdist/rss.xml
  git/mercurial
  Documentation/{CodingStyle,SubmitChecklist}
  The four layer (developer, maintainer, subsystem, linus) model.
  Politics
    Stable API nonsense
    Why reiser4 not in.

7.4 Submitting Patches

8 Glossary