view local/hotplug-history.html @ 84:598061b944e5

Minor update to
author Rob Landley <>
date Wed, 07 Nov 2007 01:45:27 -0600
parents 87598e3a8e3c

<title>The history of hotplug</title>

<h1><b>The history of hotplug.</b></h1>

<h2>What is hotplug, what problems does it solve, why do we have the current
set of hotplug mechanisms, and what legacy mechanisms did the current
hotplug implementation obsolete?</h2>

<ul>
<li><a href="#before">Before hotplug</a></li>
<li><a href="#removable">Removable media</a></li>
<li><a href="#modules">Modules</a></li>
</ul>

<h2><a name="before"><b>Before hotplug: static everything.</b></a></h2>

<p>Originally, the kernel had no hotplug capability.  A kernel without hotplug
manages a fixed set of hardware, all of which is detected and initialized at
boot time, and all of which remains present until the system is shut down.
This is very simple, but also very limited.</p>

<p>This meant device drivers statically linked into the kernel image,
and a /dev directory full of device nodes for every potential device, created
when the system was installed.  A program that wanted to probe for available
hardware sifted through /dev and opened the devices it found there, eliminating
the ones which gave an -ENODEV error.</p>
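<p>The probing loop described above can be sketched from userspace as follows.
This is a minimal illustration, not code from any particular program: the
function names are hypothetical, and treating ENXIO the same as ENODEV is an
assumption made for the sketch.</p>

```python
import errno
import os

# errno values taken to mean "the node exists but no hardware answers".
# Assumption for this sketch: ENXIO is treated like ENODEV.
ABSENT_ERRNOS = {errno.ENODEV, errno.ENXIO}


def probe_node(path):
    """Classify one device node as 'present', 'absent', or 'unknown'."""
    try:
        # O_NONBLOCK keeps the open from stalling on slow or idle devices.
        fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    except OSError as exc:
        # Other errors (EACCES, ENOENT, EBUSY...) say nothing about hardware.
        return "absent" if exc.errno in ABSENT_ERRNOS else "unknown"
    os.close(fd)
    return "present"


def present_nodes(paths):
    """Keep only the nodes that could actually be opened."""
    return [p for p in paths if probe_node(p) == "present"]
```

<p>A caller would hand present_nodes() a list of candidate paths under /dev and
keep whatever comes back, paying the cost of one open() per node.</p>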

<p>One reason for this was simplicity, but equally important was that early PCs
were not designed around hotpluggable hardware, and Linux started out on a PC.
When the PC was introduced, users couldn't even switch keyboards while the
machine was on without risking hardware damage.</p>

<p>[FOOTNOTE]Of course users widely ignored this constraint whenever they could.  Users
hotplugged keyboards, serial, and parallel devices all the time, no matter
what the manufacturer said, and by the late 80's most hardware developers had
adapted to reality and buffered their more vulnerable external I/O ports.  But
the number of keyboard, serial, and parallel ports on the machine remained
fixed, and each port could handle only one device at a time, so drivers focused
on handling ports and left figuring out what device was behind an I/O port to
the userspace application trying to talk to that device.[/FOOTNOTE]</p>

<h2><a name="removable">Removable media</a></h2>

<p>The original PC did have one type of early hotplug: it had removable storage
media in the form of floppy drives (and later, CD-ROM drives, zip disks,
and DVD drives).  The first stirrings of hotplug support came from Linux's
need to cope with removable media.</p>

<p>With removable media, the contents of the corresponding block devices
changed, including even the size of the media represented by those block
devices.  Since filesystems could depend on those block devices, and processes
depended on those filesystems, in extreme cases ejecting a floppy could
lead to a kernel panic.</p>

<p>The kernel's response to this was to ignore as much of the problem as
possible, and work around the rest.  Removable media was treated as a special
case, and the kernel grew workarounds rather than any real systematic
solution to a larger generic problem.</p>

<p>Since the drives themselves stayed around awaiting the insertion of
new media, media were treated as a property of drives, and most drives could
have exactly one instance of removable media in them at a time anyway.
[FOOTNOTE]There were "jukebox" style multi-CD changers, but they were poorly
supported and mostly treated like a single drive with multiple[/FOOTNOTE]
So device drivers used device nodes representing the drive instead
of the media, and when the drive contained no media the driver would respond to
attempts to access the drive's device node with error codes.</p>

<p>Poor hardware support for hotplug continued to be a problem: most removable
media provided no notification mechanism to inform the system
when media was inserted or removed.  The driver could probe the device to
see what media it contained at any given moment, but no interrupt was
generated to signal changes.  Thus the kernel
had no way to respond to a block device's removal except via extensive
error handling after the fact.</p>
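<p>Without an interrupt, the only way to notice a media change is to probe
repeatedly and compare results.  The sketch below models that with a
hypothetical helper, media_events(), that takes a list of polled states
(True meaning media was present at that poll) rather than talking to real
hardware.</p>

```python
def media_events(observations):
    """Turn a sequence of polled states into insert/remove events.

    Each observation is True if media was present at that poll.
    Returns a list of (poll_index, event_name) tuples.
    """
    events = []
    previous = None
    for tick, present in enumerate(observations):
        # An event is only visible when two consecutive polls disagree.
        if previous is not None and present != previous:
            events.append((tick, "inserted" if present else "removed"))
        previous = present
    return events
```

<p>Note what the model cannot see: if a disk is swapped for another between two
polls, both polls report media present and no event is produced, which is why
error handling after the fact remained necessary.</p>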

<p>Applications using removable media probed for them or
received error codes on attempted access to an empty drive, and the kernel
developers' advice about dealing with the problems of removing a mounted
volume was "don't do that": inserting or removing media when the system didn't
expect it was dismissed as user error.</p>

<h3>A workaround: drive locking</h3>

<p>To avoid being surprised by the unexpected removal of media containing
a mounted filesystem, some drives grew the ability for software to "lock"
a drive, preventing it from ejecting its media until it was unlocked.
(Pressing the eject button still didn't generate an interrupt; it simply had
no effect until the software unlocked the drive.)  This let the operating
system force users to eject removable media from software (via the "eject"
command) rather than by pressing the button on the drive, allowing the OS to
safely use the block device at the expense of annoying users.</p>
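<p>As a rough illustration of locking and software eject, here is a sketch
using the CDROM_LOCKDOOR and CDROMEJECT ioctl request numbers from the
kernel's linux/cdrom.h.  The helper names are hypothetical, and
CDROM_LOCKDOOR belongs to the later uniform CD-ROM interface, so this should
not be read as what drivers of the Linux 1.0 era supported.</p>

```python
import fcntl
import os

# Request numbers as defined in the kernel's <linux/cdrom.h>.
CDROMEJECT = 0x5309
CDROM_LOCKDOOR = 0x5329


def _cdrom_ioctl(path, request, arg=0):
    """Issue one ioctl on a drive node; return True on success, else False."""
    try:
        # O_NONBLOCK lets the open succeed even when the drive is empty.
        fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    except OSError:
        return False
    try:
        fcntl.ioctl(fd, request, arg)
        return True
    except OSError:
        # Includes drives that lack the feature or refuse the request.
        return False
    finally:
        os.close(fd)


def set_door_lock(path, locked):
    """Lock (True) or unlock (False) the drive door."""
    return _cdrom_ioctl(path, CDROM_LOCKDOOR, 1 if locked else 0)


def eject(path):
    """Ask the drive to eject its media."""
    return _cdrom_ioctl(path, CDROMEJECT)
```

<p>A caller would typically unlock before ejecting, for example
set_door_lock(path, False) followed by eject(path), where path is whatever
node represents the drive on that system.</p>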

<p>By locking the drive to prevent unauthorized eject, the hotplug-less kernel
avoided having to unmount filesystems on short notice.  This meant the kernel
didn't have to promptly flush buffers when data was written to the device (to
minimize unmount time), and that the kernel could veto attempts to unmount a
filesystem that was still in use for any reason, such as due to any process
having open files in that filesystem or that filesystem containing any
process's current directory.  (Yes, even though a process's current directory
could be deleted, it couldn't be unmounted.  Not for any deep technical reason;
support for it simply hadn't been implemented.  Removable media was a poorly
supported afterthought.)</p>

<p>Of course some drives (most notably PC floppy drives) had no provision for
locking the drive; ejecting a floppy was a manual process controlled by a
purely mechanical button.  Users who didn't remember to manually unmount a
floppy lost data, and were largely mocked as clueless by traditional Unix
developers (or else PC hardware was mocked for not having drive locking
support).  The "mtools" package provided a popular workaround, reading and
writing FAT files directly through a floppy disk's unmounted block device,
probing for media before each command, and flushing all data after each
command.  (It even accepted dos-style names for floppy drives.)</p>

<p>As late as Linux 1.0, support for eject was still a special case for CD-ROM
drives (/include/linux/cdrom.h had a "#define CDROMEJECT"), and door locking
support was a special case for SCSI (/drivers/scsi/scsi_ioctl.h #defined
SCSI_IOCTL_DOORLOCK).  As late as Linux 2.4, the underlying problems
with dynamically unplugging block devices were considered too hard to
properly solve.</p>

<h2><a name="modules">Modules</a></h2>

<p>The first serious hotplug mechanism in Linux was modules, because modules
allow device drivers to be loaded after the kernel boots and unloaded again
before shutdown.</p>

<p>Hotplug was a side effect, since most hardware used by Linux still wasn't
hotpluggable.  The primary motivation for modules was reducing the memory
footprint of kernels, which was increasing due to the proliferation of device
drivers.  A generically configured kernel, such as those in the emerging Linux
distributions, needed to be built with drivers for every piece of hardware it
might encounter, but most systems it ran on would use only a small subset of
those drivers.</p>

<p>Modules meant that device drivers could probe for hardware present when the
module was inserted, after the kernel booted.  A module that failed to find any
devices (of the kind it contained a device driver for) could refuse to load, so
attempting to load all modules was a simple way of probing for available
hardware.  Modules could even be removed and re-inserted to scan for and handle
new hardware.</p>

<p>This was an improvement, but not a complete solution.  Using modules as the
primary hotplug mechanism quickly revealed numerous deficiencies: the
granularity was wrong, there were insufficient notification mechanisms, and
this approach didn't handle configuration issues like device nodes.</p>

<p>The granularity is wrong because a module encapsulates a device driver, and
one instance of a driver can manage multiple instances of a device.  When
inserting a second instance of a device into a system, removing the module to
reinsert it (and thus find the new instance of the device) takes the first
instance of the device offline.  This is an extremely unpleasant side effect,
and not always possible if the device is in use.</p>
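<p>The granularity problem can be shown with a toy model.  Nothing here is
kernel code: DriverModule, the bus list, and the device names are all
hypothetical, and the only behavior modeled is that a module scans for
hardware once, at load time.</p>

```python
import errno


class DriverModule:
    """Toy driver module that only looks for hardware when it is loaded."""

    def __init__(self, prefix):
        self.prefix = prefix  # devices whose names start with this are ours
        self.bound = []       # devices this driver instance manages

    def load(self, bus):
        # Probe once; refuse to load if no matching hardware is found.
        found = [dev for dev in bus if dev.startswith(self.prefix)]
        if not found:
            raise OSError(errno.ENODEV, "no matching hardware")
        self.bound = found

    def unload(self):
        # Unloading takes every managed device offline, not just one.
        offline, self.bound = self.bound, []
        return offline


bus = ["nic0"]
driver = DriverModule("nic")
driver.load(bus)

bus.append("nic1")                        # a second card is inserted
seen_before_reload = list(driver.bound)   # still only the first card

taken_offline = driver.unload()           # the first card goes down too
driver.load(bus)                          # the rescan finds both cards
```

<p>The reload does pick up the second card, but only by first dropping the
card that was already working.</p>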

More to come...

Ad-hoc mechanisms to rescan busses 

module manages multiple devices; insert a second ethernet card.
    Need a way to tell module to rescan devices.
  Still need a separate notification mechanism to trigger module loading.
    Module loading either has to happen _after_ device insertion, or module
    has to be told to rescan after module loaded.
  Doesn't handle unplug.
    Notification problem worse: ideally need to know before device goes away
    so flush buffers, umount filesystems, close file handles, etc.  Cleanup.
    Unloading module.
  Doesn't handle /dev entries.
    Fill up /dev with every possible device, there could be thousands of them.
    (Every possible partition on every possible hard drive...)
    How does userspace know which ones are active?  (In a static /dev, presence
    of an entry gives no information about whether or not the device is there.
    Iterate through the lot and test.  Very slow, timeouts, generates spurious
    activity that can have unwanted side effects like spinning up drives...)

<h2>PCMCIA</h2> The arrival of PCMCIA (a hotpluggable 16-bit expansion card bus for early laptops) provided the first ,  USB, laptop docking stations, 

static everything.
The /proc directory
  Rescan scsi bus via /proc/scsi
laptops (pcmcia)
  Added in 2.2.7, written by Linus Torvalds throwing out earlier work.
  Having the driver detect the presence of hardware is backwards.  Driver
  loads in response to device being plugged in, but the driver detects the
  existence of the device...  chicken and egg problem.

<p>The biggest problem with devfs
is that device nodes and physical devices aren't the same thing.  Some device
nodes have no corresponding hardware (such as /dev/null), and some devices
have multiple device nodes (such as partitioned hard drives).  Devfs had no
way to tell userspace what actual devices the system had separate from what
drivers were loaded, which was especially problematic when a newly inserted
device required some action from userspace (loading modules or firmware) before
device nodes could be created for it.  And some USB devices can be driven
entirely from userspace, with no kernel device driver.

  A device that requires some action to
be taken (such as a userspace program loading firmware into it) between
device insertion and device node creation,   Suppose device that requires firmware to
be loaded (by userspace) before a device driver can probe it and create
appropriate /dev nodes
of information was missing from userspace.</p>

  Finally, the modern approach.
  /sbin/hotplug vs netlink

  Also, devfs provides /dev entries, but that's the wrong layer.  Some
  hardware provides multiple /dev entries (partitioned hard drives), some
  dev entries have no underlying hardware (/dev/zero, /dev/null, network
  block devices), and some devices have no /dev entry (ethernet, for
  historical reasons).

<h2>Static drivers, modules, and hotplug.</h2>

A kernel without any hotplug capability manages a fixed set of hardware,
all of which is detected and initialized at boot time, and all of which remains
present until the system is shut down.

The Linux module loading mechanism allowed drivers to be loaded after the
system boots

Early atte

Hotplug allows the kernel to dynamically respond to the addition or removal of
hardware.  At boot time, 

/sbin/hotplug or netlink.
Support for hotplug in Linux evolved out of Linus's rewrite of USB

  What's the probe for removable media?  (ioctl()?)