changeset 77:27dcbe1b4669

The git quickstart guide I've been working on for ages.
author Rob Landley <rob@landley.net>
date Thu, 18 Oct 2007 18:40:17 -0500
parents 75251bfa6b33
children 307408bf8982
files pending/git-quick.html
diffstat 1 files changed, 409 insertions(+), 0 deletions(-)
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/pending/git-quick.html	Thu Oct 18 18:40:17 2007 -0500
@@ -0,0 +1,409 @@
+<title>Following Linux kernel development with git</title>
+
+<h2>A "git bisect HOWTO" with a few extras.</h2>
+
+<p>This document tells you how to follow Linux kernel development (and
+examine its history) with git.  It does not assume you've ever used a source
+control system before, nor does it assume that you're familiar with
+"distributed" vs "centralized" source control systems.</p>
+
+<p>This document describes a read-only approach, suitable for trying out
+recent versions quickly, using "git bisect" to track down bugs, and
+applying patches temporarily to see if they work for you.
+If you want to learn how to save changes into your copy of the git history and
+submit them back to the kernel developers through git, you'll need
+<a href=http://www.kernel.org/pub/software/scm/git/docs/tutorial.html>a much
+larger tutorial</a> that explains concepts like "branches".  This one
+shouldn't get in the way of doing that sort of thing, but it doesn't go there.</p>
+
+<h2>Installing git</h2>
+
+<p>First, install a recent version of git.  (Note that the user interface
+changed drastically in git-1.5.0, and this page only describes the new
+interface.)</p>
+
+<p>If your distro doesn't have a recent enough version, you can grab a
+<a href=http://www.kernel.org/pub/software/scm/git/>source tarball</a> and
+build it yourself.  (There's no ./configure; as root, run
+"make install prefix=/usr".  It needs zlib, libssl, libcurl, and libexpat.)</p>
+
+<p>When building from source, the easy way to get the man pages is to download
+the appropriate git-manpages tarball (at the same URL as the source code)
+and extract it into /usr/share/man.  You want the man pages because "git help"
+displays them.</p>
+
+<h2>Downloading the kernel with git</h2>
+
+<p>The following command will download the current linux-kernel repository into
+a local directory called "linux-git":</p>
+
+<blockquote>
+git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git linux-git
+</blockquote>
+
+<p>This downloads a local copy of the entire revision history (back to
+2.6.12-rc2), which takes a couple hundred megabytes.  It extracts the most
+recent version of all the files into your linux-git directory, but that's just
+a snapshot (generally referred to by git people as your
+"<a href=http://www.kernel.org/pub/software/scm/git/docs/glossary.html#def_working_tree>working copy</a>").
+The history is actually stored in the subdirectory "linux-git/.git", and the
+snapshot can be recreated from that (or changed to match any historical
+version) via various git commands explained below.</p>
+
+<p>You start with an up-to-the-minute copy of the linux kernel source, which
+you can use just like an extracted tarball (ignoring the extra files in the
+".git" directory).  If you're interested in history from the bitkeeper days
+(before 2.6.12-rc2), that's stored in a separate repository,
+"<b>git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git</b>".
+(<a href=http://git.kernel.org>Here is a list of all git repositories hosted
+on kernel.org</a>.)</p>
+
+<p>(If you forget the URL a git repository came from, it's recorded in
+".git/config" (and in ".git/FETCH_HEAD" after a pull).  Normally you shouldn't
+need to care, since git remembers it.)</p>
+
+<h2>Updating your local copy</h2>
+
+<p>The command "<b>git pull</b>" downloads all the changes committed to Linus's
+git repository since the last time you updated your copy, and appends those
+commits to your copy of the repository (in the .git subdirectory).  In addition,
+this will automatically update the files in your working copy as appropriate.
+(If your working copy was set to a historical version, it won't be changed,
+but returning your working copy to the present after a pull will get you the
+newest version.)</p>
+
+<p>Note that this copies the revision history into your local .git directory.
+Other git commands (log, checkout, tag, blame, etc.) don't need to talk to
+the server, so you can work on your laptop without an internet connection (or
+with a very slow one) and still have access to the complete revision history
+you've already downloaded.</p>
+
+<h2>Looking at historical versions</h2>
+
+<p>The <b>git log</b> command lists the changes recorded in your repository,
+starting with the most recent and working back.  The big hexadecimal numbers
+are unique identifiers (sha1sum) for each commit.  If you want to specify a
+commit, you only need the first few digits, enough to form a unique prefix.
+(Six digits should be plenty.)</p>
+
+<p>You can limit the log to a specific file or directory, which lists
+only the commits changing that file/directory.  Just add the file(s)
+you're interested in to the end of the <b>git log</b> command line.</p>
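+<p>As a rough illustration, here is a sketch that builds a throwaway
+repository (not the kernel tree) and limits the log to one file.  The file
+names, commit messages, and the "Demo User" identity are made up for the
+example.</p>

```shell
set -e
# Build a scratch repository in a temporary directory; nothing here
# touches your kernel checkout.  (Hypothetical file names and messages.)
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.email demo@example.com
git config user.name "Demo User"

echo a > README
git add README
git commit -q -m "add README"

echo b > main.c
git add main.c
git commit -q -m "add main.c"

echo c >> README
git add README
git commit -q -m "update README"

# Full history: all three commits, newest first.
git log --pretty=oneline

# History limited to one file: only the commits that changed README.
readme_commits=$(git log --pretty=oneline README | wc -l | tr -d ' ')
echo "commits touching README: $readme_commits"
```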
+
+<p>The <b>git tag -l</b> command shows all the tagged releases.  These
+human-readable names can be used as synonyms for the appropriate commit
+identifier, which is useful when doing things like checkout and diff.
+The special name "<b>master</b>" (strictly a branch, not a tag) points to the most recent commit.</p>
+
+<p>The <b>git blame $FILE</b> command displays all the changes that resulted in
+the current state of a file.  It shows each line, prefixed with the commit
+identifier which last changed that line.  (If the default version of <b>git
+blame</b> is difficult to read on an 80 character terminal, try <b>git blame
+$FILE | sed 's/(.*)//'</b> to see more of the file itself.)</p>
+
+<h2>Working with historical versions</h2>
+
+<p>The <b>git checkout</b> command changes your working copy of the source to a
+specific version.  The -f option to checkout backs out any local changes
+you've made to the files.  The <b>git clean</b> command deletes any extra files
+in your working directory (ones which aren't checked into the repository).
+The -d option to clean deletes untracked directories as well as files.</p>
+
+<p>So to reset your working copy of the source to a historical version, go
+<b>git checkout -f $VERSION; git clean -d</b> where $VERSION is the tag or
+sha1sum identifier of the version you want.  If you don't specify a $VERSION,
+git will default to "master" which is the most recent commit in Linus's
+tree (what mercurial calls "tip" and Subversion calls HEAD), returning you
+to the present and removing any uncommitted local changes.</p>
+
+<p>Another way to undo all changes to your copy is to do "rm -rf *" in
+the linux-git directory (which doesn't delete hidden files like ".git"),
+followed by "git checkout -f" to grab fresh copies from the repository in
+the .git subdirectory.  This generally isn't necessary.  Most of the time,
+<b>git checkout -f</b> is sufficient to reset your working copy to the most
+recent version in the repository.</p>
+
+<p>If you lose track of which version is currently checked out as your working
+copy, use <b>git log</b> to see the most recent commits to the version you're
+looking at, and <b>git log master</b> to compare against the most recent
+commits in the repository.</p>
+
+<h2>Using git diff</h2>
+
+<p>The command "git diff" shows differences between git versions.  You can
+ask it to show differences between:</p>
+<ul>
+<li><b>git diff</b> - the current version checked out from the repository and all files in the working directory</li>
+<li><b>git diff v2.6.21</b> - a specific historical version and all files in the working directory</li>
+<li><b>git diff v2.6.20 v2.6.21</b> - all files in two different historical
+versions</li>
+<li><b>git diff init/main.c</b> - specific locally modified files in the
+working directory that don't match what was checked out from the repository</li>
+<li><b>git diff v2.6.21 init/main.c</b> - specific file(s) in a specific historical version of the repository vs those same files in the working directory.</li>
+<li><b>git diff v2.6.20 v2.6.21 init/main.c</b> - specific files in two
+different historical versions of the repository</li>
+</ul>
+
+<p>What git is doing is checking each argument to see if it recognizes it
+as a historical version sha1sum or tag, and if it isn't it checks to see if
+it's a file.  If this is likely to cause confusion, you can use the magic
+argument "--" to indicate that all the arguments before that are versions
+and all the arguments after that are filenames.</p>
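+<p>To see where "--" matters, here is a sketch in a throwaway repository
+that deliberately gives a file and a tag the same name.  The name "v1" and
+the "Demo User" identity are invented for the example, and this is only
+one way to provoke the ambiguity.</p>

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.email demo@example.com
git config user.name "Demo User"

# Commit a file awkwardly named "v1", then tag that commit "v1" too,
# so the bare name could mean either the version or the file.
echo one > v1
git add v1
git commit -q -m "first version"
git tag v1
echo two > v1          # uncommitted local change

# Without "--", git cannot tell which "v1" is meant.
if git diff v1 >/dev/null 2>&1; then
    ambiguous=no
else
    ambiguous=yes
fi
echo "plain 'git diff v1' rejected as ambiguous: $ambiguous"

# With "--", the v1 before it is the tag and the v1 after it is the file.
out=$(git diff v1 -- v1)
printf '%s\n' "$out"
```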
+
+<p>The argument <b>--find-copies-harder</b> tells git diff to detect renamed or
+copied files.  Notice that git diff has a special syntax to indicate renamed
+or copied files, which is much more concise and portable than the traditional
+behavior of removing all lines from one file and adding them to another.
+(This behavior may become the default in a future version.)</p>
+
+<h2>Creating tarballs</h2>
+
+<p>The <b>git archive $VERSION</b> command creates a tarball (written to
+stdout) of the given version.  Note that "master" isn't the default here;
+you have to specify that if you want the most up-to-date version.
+You can pipe it through bzip2 and write it to a file (<b>git archive master |
+bzip2 > master.tar.bz2</b>) or you can use git archive to grab a clean copy
+out of your local git repository and extract it into another directory, ala:</p>
+
+<blockquote>
+<pre>
+mkdir $COPY
+git archive master | tar xCv $COPY
+</pre>
+</blockquote>
+
+<p>You can also use the standard Linux kernel out-of-tree building
+infrastructure on the git working directory, ala:</p>
+
+<blockquote>
+<pre>
+cd $GITDIR
+make allnoconfig O=$OTHERDIR
+cd $OTHERDIR
+make menuconfig
+make
+</pre>
+</blockquote>
+
+<p>Finally, you can build in your git directory, and then clean it up
+afterwards with <b>git checkout -f; git clean -d</b>.  (Better than
+"make distclean".)</p>
+
+<h2>Bisect</h2>
+
+<p>Possibly the most useful thing git does for non-kernel developers is
+<b>git bisect</b>, which can track down a bug to a specific revision.  This
+is a multi-step process which binary searches through the revision history
+to find a specific commit responsible for a testable change in behavior.</p>
+
+<p>(You don't need to know this, but bisect turns out to be nontrivial to
+implement in a distributed source control system, because the revision history
+isn't linear.  When the history branches and comes back together again, binary
+searching through it requires remembering more than just a single starting and
+ending point.  That's why bisect works the way it does.)</p>
+
+<p>The git bisect commands are:</p>
+<ul>
+<li><b>git bisect start</b> - start a new bisect.  This opens a new (empty)
+log file tracking all the known good and bad versions.</li>
+<li><b>git bisect bad $VERSION</b> - Identify a known broken version.  (Leaving
+$VERSION blank indicates the current version, "master".)</li>
+<li><b>git bisect good $VERSION</b> - Identify a version that was known to
+work.</li>
+<li><b>git bisect log</b> - Show bisection history so far this run.</li>
+<li><b>git bisect replay $LOGFILE</b> - Reset to an earlier state using the output of git bisect log.</li>
+<li><b>git bisect reset</b> - Finished bisecting, clean up and return to
+head.  (If git bisect start says "won't bisect on seeked tree", you forgot
+to do this last time and should do it now.)</li>
+</ul>
+
+<p>To track down the commit that introduced a bug via git bisect, start with
+<b>git bisect reset master</b> (just to be safe), then <b>git bisect start</b>.
+Next identify the last version known to work (ala <b>git bisect good
+v2.6.20</b>), and identify the first bad version you're aware of (if it's
+still broken, use "master").</p>
+
+<p>After you identify one good and one bad version, git will grind for a bit
+and reset the working directory state to some version in between, displaying
+the version identifier it selected.  Test this version (build and run your
+test), then identify it as good or bad with the appropriate git bisect
+command.  (Just "git bisect good" or "git bisect bad"; there's no need to
+identify the version here because it's the current version.)  After each such
+identification, git will grind for a bit and find another version to test,
+resetting the working directory state to the new version until it narrows
+it down to one specific commit.</p>
+
+<p>The biggest problem with <b>git bisect</b> is hitting a revision that
+doesn't compile properly.  When the build breaks, you can't determine
+whether or not the current version is good or bad.  This is where
+<b>git bisect log</b> comes into play.</p>
+
+<p>When in doubt, save the git bisect log output to a file
+(<b>git bisect log > ../bisect.log</b>).  Then make a guess
+whether the commit you can't build would have shown the problem if you
+could build it.  If you guess wrong (hint: every revision bisect wants
+to test after that comes out the opposite of your guess, all the way to the
+end) do a <b>git bisect replay ../bisect.log</b> to restart from your
+saved position, and guess the other way.  If you realize after the fact you
+need to back up, the bisect log is an easily editable text file you can
+always chop a few lines off the end of.</p>
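+<p>The save-guess-replay cycle can be tried out safely in a scratch
+repository first.  The following is only a sketch: the eight dummy commits,
+the tags c1 through c8, and the "Demo User" identity are made up, and no
+real test is run between steps.</p>

```shell
set -e
tmp=$(mktemp -d)
log=$(mktemp)
cd "$tmp"
git init -q
git config user.email demo@example.com
git config user.name "Demo User"

# Eight linear commits, each tagged (c1..c8) so they are easy to name.
for i in 1 2 3 4 5 6 7 8; do
    echo "$i" > counter
    git add counter
    git commit -q -m "commit $i"
    git tag "c$i"
done

git bisect start >/dev/null
git bisect good c1 >/dev/null
git bisect bad c8 >/dev/null

# Save the bisect state before guessing.
git bisect log > "$log"
saved=$(git rev-parse HEAD)

# Make a guess (pretend this revision would not build)...
git bisect good >/dev/null
moved=$(git rev-parse HEAD)

# ...then rewind to the saved position so we can guess the other way.
git bisect replay "$log" >/dev/null
restored=$(git rev-parse HEAD)

echo "saved:    $saved"
echo "moved to: $moved"
echo "restored: $restored"

git bisect reset >/dev/null 2>&1
```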
+
+<h2>Example git bisect run</h2>
+
+<p>Here is a real git bisect run I did on the <a href=http://qemu.org>qemu</a>
+git repository (git://git.kernel.dk/data/git/qemu) to figure out why
+the PowerPC Linux kernel I'd built was hanging during IDE controller
+initialization under the current development version of qemu-system-ppc
+(but not under older versions).</p>
+
+<blockquote>
+<pre><b>$ git bisect reset master</b>
+Already on branch "master"
+<b>$ git bisect good release_0_8_1</b>
+You need to start by "git bisect start"
+Do you want me to do it for you [Y/n]? y
+<b>$ git bisect bad master</b>
+Bisecting: 753 revisions left to test after this
+[7c8ad370662b706b4f46497f532016cc7a49b83e] Embedded PowerPC Device Control Registers infrastructure.
+<b>$ ./configure && make -j 2 && ~/mytest</b>
+...
+Unhappy :(
+<b>$ git bisect bad # The test failed</b>
+Bisecting: 376 revisions left to test after this
+[255d4f6dd496d2d529bce38a85cc02199833f080] Simplify error handling again.
+<b>$ ./configure && make -j 2 && ~/mytest</b>
+WARNING: "gcc" looks like gcc 4.x
+Looking for gcc 3.x
+./configure: 357: Syntax error: Bad fd number
+<b>$ git bisect log > ../bisect.log  # Darn it, build break.  Save state and...</b>
+<b>$ git bisect good # Wild guess because I couldn't run the test.</b>
+Bisecting: 188 revisions left to test after this
+[16bcc6b31799ca01cd389db7cb90a345e9b68dd9] Fix wrong interrupt number for the second serial interface.
+<b>$ ./configure && make -j 2 && ~/mytest</b>
+...
+Happy :)
+<b>$ git bisect good # Hey, maybe my guess was right</b>
+Bisecting: 94 revisions left to test after this
+[37781cc88f69e45624c1cb15321ddd2055cf74b6] Fix usb hid and mass-storage protocol revision, by Juergen Keil.
+<b>$ ./configure && make -j 2 && ~/mytest</b>
+...
+Happy :)
+<b>$ git bisect good</b>
+Bisecting: 47 revisions left to test after this
+[30347b54b7212eba09db05317217dbc65a149e25] Documentation update
+<b>$ ./configure && make -j 2 && ~/mytest</b>
+...
+Happy :)
+<b>$ git bisect good</b>
+Bisecting: 23 revisions left to test after this
+[06a21b23c22ac18d04c9f676b9b70bb6ef72d7f1] Set proper BadVAddress value for unaligned instruction fetch.
+<b>$ ./configure && make -j 2 && ~/mytest</b>
+...
+Happy :)
+<b>$ git bisect good</b>
+Bisecting: 11 revisions left to test after this
+[da77e9d7918cabed5b0725f87496a1dc28da8b8c] Fix exception handling cornercase for rdhwr.
+<b>$ ./configure && make -j 2 && ~/mytest</b>
+...
+Happy :)
+<b>$ git bisect good</b>
+Bisecting: 5 revisions left to test after this
+[36f447f730f61ac413c5b1c4a512781f5dea0c94] Implement embedded IRQ controller for PowerPC 6xx/740 & 750.
+<b>$ ./configure && make -j 2 && ~/mytest</b>
+...
+Unhappy :(
+<b>$ git bisect bad # Oh good, I was getting worried I'd guessed wrong above...</b>
+Bisecting: 2 revisions left to test after this
+[d4838c6aa7442fae62b08afbf4c358200f10ec74] Proper handling of reserved bits in the context register.
+<b>$ ./configure && make -j 2 && ~/mytest</b>
+...
+Happy :)
+<b>$ git bisect good</b>
+Bisecting: 1 revisions left to test after this
+[a8b64e6f4c7f3c4850be5fd303bf590564264294] Fix monitor disasm output for Sparc64 target
+<b>$ ./configure && make -j 2 && ~/mytest</b>
+...
+Happy :)
+<b>$ git bisect good</b>
+36f447f730f61ac413c5b1c4a512781f5dea0c94 is first bad commit
+commit 36f447f730f61ac413c5b1c4a512781f5dea0c94
+Author: j_mayer <j_mayer>
+Date:   Mon Apr 9 22:45:36 2007 +0000
+
+    Implement embedded IRQ controller for PowerPC 6xx/740 & 750.
+    Fix PowerPC external interrupt input handling and lowering.
+    Fix OpenPIC output pins management.
+    Fix multiples bugs in OpenPIC IRQ management.
+    Fix OpenPIC CPU(s) reset function.
+    Fix Mac99 machine to properly route OpenPIC outputs to the PowerPC input pins.
+    Fix PREP machine to properly route i8259 output to the PowerPC external
+      interrupt pin.
+
+:100644 100644 0eabacd6434b8e40876581605c619513bf9ac512 284cb92ae83a2a36e05137d3532106ff85167364 M      cpu-exec.c
+:040000 040000 68740f5b1330c7859abfea3ce31062cb92adaa7f 5c48b0d20f1c4d3115881b5e9e5b6c1d681f4880 M      hw
+:040000 040000 3ad1f0d09c60d8190d98b28318519ebaaccbb569 69efc274cec1801848de9238ae71e97681978433 M      target-ppc
+:100644 100644 2f87946e874e8f6cbf9afd47c65e0baff236dc45 b40ff3747530d275181ff071c9cc9cff1d5ba02d M      vl.h
+<b>$ git bisect reset</b>
+</pre>
+</blockquote>
+
+
+<h2>Command summary</h2>
+
+<p><b>git help</b> - List available commands.  You can also go
+<b>git help COMMANDNAME</b> to see help on a specific command.  Note,
+this displays the man page for the appropriate command, so you need to have
+the git man pages installed for it to work.</p>
+
+<p><b>git clone git://blah/blah/blah localdir</b> - Download a repository
+from the web into "localdir".  Linus's current repository is at
+"git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git",
+the old history from the bitkeeper days (before 2.6.12-rc2) is at
+"git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git", and
+there are lots of <a href=http://git.kernel.org>other trees hosted on
+kernel.org</a> and elsewhere.</p>
+
+<p><b>git pull</b> - Freshen up your local copy of the repository, downloading
+and merging all of Linus's changes since last time you did this.  In addition
+to appending lots of commits to your repository in the .git directory, this
+also updates the snapshot of the files (if it isn't already pointing into
+the past).</p>
+
+<p><b>git log</b> - List the changes recorded in your repository, starting with
+the most recent and working back.  Note: the big hex numbers are unique
+identifiers (sha1sum) for each commit.  If you want to specify a commit, you
+only need a unique prefix (generally the first six digits are enough).</p>
+
+<p><b>git tag -l</b> - Show all the tagged releases.  These human-readable
+names can be used as synonyms for the appropriate commit identifier when
+doing things like checkout and diff.  (Note, the special name "master"
+points to the most recent commit.)</p>
+
+<p><b>git checkout -f; git clean -d</b> - reset your snapshot to the most recent
+commit.  The "checkout" command updates your snapshot to a specific version
+(defaulting to the tip of the current branch).  The -f argument says to back
+out any local changes you've made to the files, and "clean -d" says to
+delete any extra files in the snapshot that aren't checked into the
+repository.</p>
+
+<p><b>git diff</b> - Show differences between two commits, such as
+"git diff v2.6.20 v2.6.21".  You can also specify specific files you're
+interested in, ala "git diff v2.6.20 v2.6.21 README init/main.c".  If you
+specify one version it'll compare your working directory against that version,
+and if you specify no versions it'll compare the version you checked out
+against your working directory.  Anything that isn't recognized as the start of
+a commit identifying sha1sum, or a tagged release, is assumed to be a filename.
+If this causes problems, you can add "--" to the command line to explicitly
+specify that arguments before that (if any) are version identifiers and all the
+arguments after that are filenames.  Add "--find-copies-harder" to detect
+renames.</p>
+
+<h2>Linus Torvalds talks about git</h2>
+
+<p>In <a href=http://youtube.com/watch?v=4XpnKHJAok8>this Google Tech Talk</a></p>
+
+<!--
+ "git show @{163}"... one character less...
+
+http://www.kernel.org/pub/software/scm/git/docs/glossary.html#def_working_tree
+-->