view master.idx @ 67:992a411c98b6

History of the process scheduler, notes about cross compiling, and some tweaks.
author Rob Landley <>
date Wed, 10 Oct 2007 04:51:31 -0500

<title>Linux Kernel Documentation</title>

<h2>Linux Kernel Documentation Index</h2>

<p>This page collects and organizes documentation about the Linux kernel, taken
from many different sources.  What is the kernel, how do you build it, how do
you use it, how do you change it...</p>

<p>This is a work in progress, and probably always will be.  Please let us know
on the
<a href=>linux-doc</a> mailing
list about any documentation you'd like added to this
index, and feel free to ask about any topics that aren't covered here yet.
This index is maintained by Rob Landley &lt;;, and tracked in
<a href=>this mercurial repository</a>.  The
canonical location for the page is <a href=>here</a>.</p>




<span id="Sources of documentation">

<p>These are various upstream sources of documentation, many of which are linked
into the <a href=>linux kernel documentation index</a>.</p>

<li><a href=Documentation>Text files in the kernel's Documentation directory.</a></li>
<li><a href=htmldocs>Output of kernel's "make htmldocs".</a></li>
<li><a href=menuconfig>Menuconfig help</a></li>
<li><a href=readme>Linux kernel README files</a></li>
<li><a href=xmlman>html version of man-pages package</a></li>
<li><a href=>Linux Weekly News kernel articles</a></li>
<li>Linux Device Drivers book (<a href=>third edition</a>) (<a href=>second edition</a>)</li>
<li><a href=ols>Ottawa Linux Symposium papers</a></li>
<li><a href=>Linux Journal archives</a></li>
<li><a href=>IBM Developerworks Linux Library</a> (also <a href=>here</a>)</li>
<li><a href=>Linux Kernel Mailing List FAQ</a></li>
<li><a href=>Kernel Planet (blog aggregator)</a></li>
<li><a href=video.html>Selected videos of interest</a></li>
<li><a href=local>Some locally produced docs</a></li>

<span id="Standards">
<li><a href=>Single Unix Specification v3</a> (Also known as Open Group Base Specifications issue 6, and closely overlapping with Posix.  See especially <a href=>system interfaces</a>)</li>
<li><a href=>ISO/IEC 9899</a>, the "C99" standard defining the C programming language.</li>
<li><a href=>Linux Foundation's specs page</a> (ELF, Dwarf, ABI...)</li>
</span id="Standards">

<span id="Translations">
<li><a href=>Linux Kernel Translation Project</a></li>
<li><a href=>Kernel Newbies regional pages</a></li>
<li><a href=>Japanese</a></li>
<li><a href=>Chinese</a></li>
</span id="Translations">

</span id="Sources of documentation">

<span id="Building from source">
  <span id="User interface">
    <span id="Configuring">
    <span id="building">
      <span id="Building out of tree">
    <span id="Installing">
    <span id="running">
    <span id="debugging">
      <span id="QEMU">
    <span id="cross compiling">
      <span id="User Mode Linux">
  <span id="Infrastructure">
    <span id="kconfig">
    <span id="kbuild">
    <span id="build and link (tmppiggy)">

<span id="Installing and using the kernel">
  <span id="Installing">
    <span id="Kernel image">
    <span id="Bootloader">
  <span id="A working Linux root filesystem">
    <span id="Finding and mounting /">
      <span id="initramfs, switch_root vs pivot_root, /dev/console">
    <span id="Running programs">
      <span id="init program and PID 1">
        <span id="What does daemonizing really mean?">
      <span id="Executable formats">
<p>The Linux kernel runs programs in response to the
<a href=xmlman/man3/exec.html>exec</a> syscall, which is called on a
file.  This file must have the
executable bit set, and must be on a filesystem that implements mmap() and which
isn't mounted with the "noexec" option.  The kernel understands
several different <a href="#executable_file_formats">executable file
formats</a>, the most common of which are shell scripts and ELF binaries.</p>
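<p>To see the executable bit requirement in action, try the following
(a throwaway demonstration; the /tmp/exec-demo filename is arbitrary):</p>

```shell
# create a trivial script, but leave the executable bit off
printf '#!/bin/sh\necho ran\n' > /tmp/exec-demo
chmod -x /tmp/exec-demo
# exec refuses the file with "Permission denied"
/tmp/exec-demo 2>/dev/null || echo "exec refused"
# set the executable bit, and the very same file runs
chmod +x /tmp/exec-demo
/tmp/exec-demo
```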
        <span id="Shell scripts">
<p>If the first two bytes of an executable file are the characters "#!", the
file is treated as a script file.  The kernel parses the first line of the file
(until the first newline), and the first argument (immediately following
the #! with no space) is used as absolute path to the script's interpreter,
which must be an executable file.  Any additional arguments on the first
line of the file (separated by whitespace) are passed as the first arguments
to that interpreter executable.  The interpreter's next argument is the name of
the script file, followed by the arguments given on the command line.</p>

<p>To see this behavior in action, run the following:</p>
<pre>echo "#!/bin/echo hello" > temp
chmod +x temp
./temp one two three</pre>

<p>The result should be:</p>
<blockquote>hello ./temp one two three</blockquote>

<p>This is how shell scripts, perl, python, and other scripting languages
work.  Even C code can be run as a script by installing the
<a href=>tinycc</a> package,
adding "#!/usr/bin/tcc -run" to the start of the .c file, and setting the
executable bit on the .c file.</p>
        <span id="ELF">
          <span id="Shared libraries">

      <span id="C library">
<p>Most userspace programs access operating system functionality through a C
library, usually installed at "/lib/*".  The C library wraps system
calls, and provides implementations of various standard functions.</p>

<p>Because almost all other programming languages are implemented in C
(including python, perl, php, java, javascript, ruby, flash, and just about
everything else), programs written in other languages also make use of the
C library to access operating system services.</p>
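<p>One quick way to see this on a running system is to ask the dynamic linker
which shared libraries a program pulls in; the C library shows up in nearly
every list (exact names and paths vary by system):</p>

```shell
# ldd lists the shared libraries a dynamically linked binary loads;
# look for the C library (libc) in the output
ldd /bin/ls
```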

<p>The most common C library implementations for Linux are
<a href=>glibc</a>
and <a href=>uClibc</a>.  Both are full-featured
implementations capable of supporting a complete desktop Linux distribution.</p>

<p>The main advantage of glibc is that it's the standard implementation used by the
largest desktop and server distributions, and has more features than any other
implementation.  The main advantage of uClibc is that it's much smaller and
simpler than glibc while still implementing almost all the same functionality.
For comparison, a "hello world" program statically linked against glibc is half
a megabyte when stripped, while the same program statically linked against
uClibc strips down to 7k.</p>

<p>Other commonly used special-purpose C library implementations include
<a href=>klibc</a> and
<a href=>newlib</a>.</p>

<span id="Exporting kernel headers">
<p>Building a C library from source code requires a special set
of Linux kernel header files, which describe the API of the specific version
of the Linux kernel the C library will interface with.  However, the header
files in the kernel source code are designed to build the kernel and contain
a lot of internal information that would only confuse userspace.  These
kernel headers must be "exported", filtering them for use by user space.</p>

<p>Modern Linux kernels export kernel headers via the
"make headers_install" command.  See
<a href=local/headers_install.txt>exporting kernel headers for use by
userspace</a> for more information.</p>
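<p>The export step itself is a single make target, run from the top of a
kernel source tree (the ARCH value and destination path below are just
examples):</p>

```shell
# run from the top of a Linux kernel source tree
make ARCH=i386 headers_install INSTALL_HDR_PATH=/tmp/exported-headers
# the sanitized headers now live under /tmp/exported-headers/include
```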
      <span id="Dynamic loader">
    <span id="FHS directories">
      <p>FHS spec</p>
      <a href="pending/hotplug.txt">populating /dev from sysfs</a>.

<span id="Reading the source code">
  <span id="Source code layout">
    <span id="Following the boot process">
    <span id="Major subsystems">
    <span id="Architectures">
  <span id="Concept vs implementation">
    <p>Often the first implementation of a concept gets replaced.
       Journaling != reiserfs, virtualization != xen, devfs gave way to udev...
       Don't let your excitement for the concept blind you to the possibility
       of alternate implementations.</p>
  <span id="Concepts">
    <span id="rbtree">
    <span id="rcu">

<span id="Kernel infrastructure">
  <span id="Process Scheduler">

<span id="History of the Linux Process Scheduler">
<p>The original Linux process scheduler was a simple design based on
a goodness() function that recalculated the priority of every task at every
context switch, to find the next task to switch to.  This served almost
unchanged through the 2.4 series, but didn't scale to large numbers of
processes, nor to SMP.  By 2001 there were calls for
change (such as <a href=ols/2001/elss.pdf>this OLS paper</a>), and the
issue <a href=>came to a head</a> in December 2001.</p>

<p>In January 2002, Ingo Molnar
<a href=>introduced the "O(1)" process scheduler</a> for the 2.5 kernel series, a design
based on separate "active" and "expired" arrays, one per processor.  As the name
implied, this found the next task to switch to in constant time no matter
how many processes the system was running.</p>

<p>Other developers (<a href=>such as Con Colivas</a>) started working on it,
beginning a period of extensive scheduler development.  The early history
of Linux O(1) scheduler development was covered by the website Kernel Traffic.</p>

<p>During 2002 this work included
<a href=>preemption</a>,
<a href=>User Mode Linux support</a>,
<a href=>new drops</a>,
<a href=>runtime tuning</a>,
<a href=>NUMA support</a>,
<a href=>cpu affinity</a>,
<a href=>scheduler hints</a>,
<a href=>64-bit support</a>,
<a href=>backports to the 2.4 kernel</a>,
<a href=>SCHED_IDLE</a>,
discussion of <a href=>gang scheduling</a>,
<a href=>more NUMA</a>,
<a href=>even more NUMA</a>.  By the end of 2002, the O(1) scheduler was becoming
the standard <a href=>even in the 2.4 series</a>.</p>

<p>2003 saw support added for
<a href=>hyperthreading as a NUMA variant</a>,
<a href=>interactivity bugfix</a>,
<a href=>starvation and affinity bugfixes</a>,
<a href=>more NUMA improvements</a>,
<a href=>interactivity improvements</a>,
<a href=>even more NUMA improvements</a>,
a proposal for <a href=>Variable Scheduling Timeouts</a> (the first rumblings of what
would later come to be called "dynamic ticks"),
<a href=>more on hyperthreading</a>...</p>

<p>In 2004 there was work on <a href=>load balancing and priority handling</a>, and
<a href=>still more work on hyperthreading</a>...</p>

<p>Also in 2004, developers proposed several extensive changes to the O(1) scheduler.
Linux Weekly News wrote about Nick Piggin's
<a href=>domain-based scheduler</a>
and Con Colivas' <a href=>staircase scheduler</a>.  The follow-up article <a href=>Scheduler tweaks get serious</a> covers both.  Nick's scheduling domains
were merged into the 2.6 series.</p>

<p>Linux Weekly News also wrote about other scheduler work:</p>

<li><a href=>Filtered wakeups</a></li>
<li><a href=>When should a process be migrated</a></li>
<li><a href=>Pluggable and realtime schedulers</a></li>
<li><a href=>Low latency for audio applications</a></li>
<li><a href=>Solving starvation problems in the scheduler</a></li>
<li><a href=>SMPnice</a></li>

<p>In 2007, Con Colivas proposed a new scheduler, <a href=>The Rotating Staircase Deadline Scheduler</a>, which
<a href=>hit a snag</a>.  Ingo
Molnar came up with a new scheduler, which he named the
<a href=>Completely Fair Scheduler</a>,
described in the LWN writeups
<a href=>Schedulers: the plot thickens</a>,
<a href=>this week in the scheduling discussion</a>, and
<a href=>CFS group scheduling</a>.</p>

<p>The CFS scheduler was merged into 2.6.23.</p>

    <span id="fork, exec">
    <span id="sleep">
  <span id="Timers">
    <span id="Interrupt handling">
  <span id="memory management">
    <li><a href="gorman">Understanding the Linux Virtual Memory Manager</a>, by Mel Gorman.</li>
    <li><a href=>What every programmer should know about memory</a> by Ulrich Drepper.</li>
    <li>Ars Technica RAM guide, parts
<a href=>one</a>,
<a href=>two</a>, and
<a href=>three</a></li>
    <span id="mmap, DMA">
  <span id="vfs">
    <span id="Filesystems">
      <span id="Types of filesystems (see /proc/filesystems)">
        <span id="Block backed">
        <span id="Ram backed">
          <span id="ramfs">
          <span id="tmpfs">
        <span id="Synthetic">
          <span id="proc">
          <span id="sys">
          <span id="internal (pipefs)">
          <span id="usbfs">
          <span id="devpts">
          <span id="rootfs">
          <span id="devfs (obsolete)">
<p>Devfs was the first attempt to do a dynamic /dev directory which could change
in response to hotpluggable hardware, by doing the seemingly obvious thing of
creating a kernel filesystem to mount on /dev which would adjust itself as
the kernel detected changes in the available hardware.</p>

<p>Devfs was an interesting learning experience, but turned out to be the wrong
approach, and was replaced by sysfs and udev.  Devfs was removed in kernel
version 2.6.18.  See
<a href=local/hotplug-history.html>the history of hotplug</a> for details.</p>

        <span id="Network">
          <span id="nfs">
          <span id="smb/cifs">
          <span id="FUSE">
      <span id="Filesystem drivers">
        <span id="Using">
        <span id="Writing">

  <span id="Drivers">
    <span id="Filesystem">
    <span id="Block (block layer, scsi layer)">
      <span id="SCSI layer">
	<li><a href="Documentation/scsi">Documentation/scsi</a> scsi.txt scsi_mid_low_api.txt scsi-generic.txt scsi_eh.txt</li>
        <li><a href="">SCSI Generic (sg) HOWTO</a></li>
	<li><a href="xmlman/man4/sd.html">man 4 sd</a></li>
        <li><a href="">SCSI standards</a></li>
    <span id="Character">
      <span id="serial">
      <span id="keyboard">
      <span id="tty">
        <span id="pty">
      <span id="audio">
      <span id="null">
      <span id="random/urandom">
      <span id="zero">
    <span id="DRI">
    <span id="Network">

  <span id="Hotplug">
  <span id="Input core">
  <span id="Network">
  <span id="Modules">
    <span id="Exported symbols">
      <p>List of exported symbols.</p>
  <span id="Busses">
  <span id="API (how userspace talks to the kernel)">
    <span id="Syscalls">
    <span id="ioctls">
    <span id="executable file formats">
      <span id="a.out">
      <span id="elf">
        <span id="css, bss, etc.">
      <span id="#!">
      <span id="flat">
      <span id="misc">
    <span id="Device nodes">
    <span id="Pipes (new pipe infrastructure)">
    <span id="Synthetic filesystems (as API)">

<span id="Hardware">
<span id="Cross compiling vs native compiling">
<p>By default, Linux builds for the same architecture the host system is
running.  This is called "native compiling".  An x86 system building an x86
kernel, x86-64 building x86-64, or powerpc building powerpc are all examples
of native compiling.</p>

<p>Building different binaries than the host runs is called cross compiling.
<a href=>Cross
compiling is hard</a>.  The build system for the Linux kernel supports cross
compiling via a two-step process: 1) Specify a different architecture (ARCH)
during the configure, make, and install stages.  2) Supply a cross compiler
(CROSS_COMPILE) which can output the correct kind of binary code.</p>

<p>To specify a different architecture than the host, either define the "ARCH"
environment variable or else add "ARCH=xxx" to the make command line for each
of the make config, make, and make install stages.  The acceptable values for
ARCH are the names of the directories in the "arch" subdirectory of the Linux
kernel source code.  All stages of the build must use the same architecture.
(Building a second architecture in the same source directory requires "make
distclean"; just "make clean" isn't sufficient.)</p>

<p>To specify a cross compiler prefix, define the CROSS_COMPILE environment
variable (or add CROSS_COMPILE= to each make command line).  Native compiler
tools, which output code aimed at the environment they're running in, usually
have simple names ("gcc", "ld", "strip").  Cross compilers usually add a prefix
to the name of each tool, indicating the target they produce code for.  To tell
the Linux kernel build to use a cross compiler named "armv4l-gcc" (and
corresponding "armv4l-ld" and "armv4l-strip"), specify "CROSS_COMPILE=armv4l-".
(Prefixes ending in a dash are common, and forgetting the trailing dash in
CROSS_COMPILE is a common mistake.  Don't forget to add the cross compiler
tools to your $PATH.)</p>
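<p>Putting both steps together, building an ARM kernel on an x86 host with a
hypothetical "armv4l-" toolchain (substitute your own prefix; the kernel's
make variable for the prefix is spelled CROSS_COMPILE) looks like:</p>

```shell
# run from the top of a kernel source tree, with armv4l-gcc etc. in $PATH
make ARCH=arm CROSS_COMPILE=armv4l- defconfig
make ARCH=arm CROSS_COMPILE=armv4l-
```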

  <span id="Architectures">

  <span id="DMA, IRQ, MMU (mmap), IOMMU, port I/O">
  <span id="Busses">
    <span id="PCI, USB">

<span id="Following Linux development">
  <span id="Distributions">
  <span id="Releases">
    <span id="Source control">
  <span id="community">
  lwn, kernel traffic, kernelplanet.
  The four layer (developer, maintainer, subsystem, linus) model.
    Stable API nonsense
    Why reiser4 not in.
  </span id="community">
  <span id="Submitting Patches">

<span id="Glossary">