The git quickstart guide I've been working on for ages.
author | Rob Landley <rob@landley.net> |
---|---|
date | Thu, 18 Oct 2007 18:40:17 -0500 |
<title>Following Linux kernel development with git</title>
<h2>A "git bisect HOWTO" with a few extras.</h2>
<p>This document tells you how to follow Linux kernel development (and examine its history) with git. It does not assume you've ever used a source control system before, nor does it assume that you're familiar with "distributed" vs "centralized" source control systems.</p>
<p>This document describes a read-only approach, suitable for trying out recent versions quickly, using "git bisect" to track down bugs, and applying patches temporarily to see if they work for you. If you want to learn how to save changes into your copy of the git history and submit them back to the kernel developers through git, you'll need <a href=http://www.kernel.org/pub/software/scm/git/docs/tutorial.html>a much larger tutorial</a> that explains concepts like "branches". This one shouldn't get in the way of doing that sort of thing, but it doesn't go there.</p>
<h2>Installing git</h2>
<p>First, install a recent version of git. (Note that the user interface changed drastically in git-1.5.0, and this page only describes the new interface.)</p>
<p>If your distro doesn't have a recent enough version, you can grab a <a href=http://www.kernel.org/pub/software/scm/git/>source tarball</a> and build it yourself. (There's no ./configure; as root, run "make install prefix=/usr". It needs zlib, libssl, libcurl, and libexpat.)</p>
<p>When building from source, the easy way to get the man pages is to download the appropriate git-manpages tarball (at the same URL as the source code) and extract it into /usr/share/man.
You want the man pages because "git help" displays them.</p>
<h2>Downloading the kernel with git</h2>
<p>The following command will download the current linux-kernel repository into a local directory called "linux-git":</p>
<blockquote>
git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git linux-git
</blockquote>
<p>This downloads a local copy of the entire revision history (back to 2.6.12-rc2), which takes a couple hundred megabytes. It extracts the most recent version of all the files into your linux-git directory, but that's just a snapshot (generally referred to by git people as your "<a href=http://www.kernel.org/pub/software/scm/git/docs/glossary.html#def_working_tree>working copy</a>"). The history is actually stored in the subdirectory "linux-git/.git", and the snapshot can be recreated from that (or changed to match any historical version) via various git commands explained below.</p>
<p>You start with an up-to-the-minute copy of the Linux kernel source, which you can use just like an extracted tarball (ignoring the extra files in the ".git" directory). If you're interested in history from the BitKeeper days (before 2.6.12-rc2), that's stored in a separate repository, "<b>git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git</b>". (<a href=http://git.kernel.org>Here is a list of all git repositories hosted on kernel.org</a>.)</p>
<p>(If you forget the URL a git repository came from, it's recorded in the file ".git/config" and, once you've done a pull, in ".git/FETCH_HEAD". Normally you shouldn't need to care, since git remembers it.)</p>
<h2>Updating your local copy</h2>
<p>The command "<b>git pull</b>" downloads all the changes committed to Linus's git repository since the last time you updated your copy, and appends those commits to your copy of the repository (in the .git subdirectory). In addition, this will automatically update the files in your working copy as appropriate.
(If your working copy was set to a historical version, it won't be changed, but returning your working copy to the present after a pull will get you the newest version.)</p>
<p>Note that this copies the revision history into your local .git directory. Other git commands (log, checkout, tag, blame, etc.) don't need to talk to the server, so you can work on your laptop without an internet connection (or with a very slow one) and still have access to the complete revision history you've already downloaded.</p>
<h2>Looking at historical versions</h2>
<p>The <b>git log</b> command lists the changes recorded in your repository, starting with the most recent and working back. The big hexadecimal numbers are unique identifiers (sha1sum) for each commit. If you want to specify a commit, you only need the first few digits, enough to form a unique prefix. (Six digits should usually be plenty.)</p>
<p>You can limit the log to a specific file or directory, which lists only the commits changing that file/directory. Just add the file(s) you're interested in to the end of the <b>git log</b> command line.</p>
<p>The <b>git tag -l</b> command shows all the tagged releases. These human-readable names can be used as synonyms for the appropriate commit identifier, which is useful when doing things like checkout and diff. The special name "<b>master</b>" (technically a branch rather than a tag) points to the most recent commit.</p>
<p>The <b>git blame $FILE</b> command shows where the current contents of a file came from. It shows each line, prefixed with the identifier of the commit which last changed that line. (If the default output of <b>git blame</b> is difficult to read on an 80 character terminal, try <b>git blame $FILE | sed 's/([^)]*)//'</b> to strip the parenthesized author and date column and see more of the file itself.)</p>
<h2>Working with historical versions</h2>
<p>The <b>git checkout</b> command changes your working copy of the source to a specific version. The -f option to checkout backs out any local changes you've made to the files.
The <b>git clean</b> command deletes any extra files in your working directory (ones which aren't checked into the repository). The -d option to clean deletes untracked directories as well as files. (Newer versions of git refuse to delete anything unless you also pass -f, as in <b>git clean -d -f</b>.)</p>
<p>So to reset your working copy of the source to a historical version, go <b>git checkout -f $VERSION; git clean -d</b> where $VERSION is the tag or sha1sum identifier of the version you want. Use "master" as the $VERSION to return to the present: it names the most recent commit in Linus's tree (what Mercurial calls "tip" and Subversion calls HEAD). If you don't specify a $VERSION, git stays on whatever version is currently checked out and just removes any uncommitted local changes.</p>
<p>Another way to undo all changes to your copy is to do "rm -rf *" in the linux-git directory (which doesn't delete hidden files like ".git"), followed by "git checkout -f" to grab fresh copies from the repository in the .git subdirectory. This generally isn't necessary. Most of the time, <b>git checkout -f master</b> is sufficient to reset your working copy to the most recent version in the repository.</p>
<p>If you lose track of which version is currently checked out as your working copy, use <b>git log</b> to see the most recent commits to the version you're looking at, and <b>git log master</b> to compare against the most recent commits in the repository.</p>
<h2>Using git diff</h2>
<p>The command "git diff" shows differences between git versions.
You can ask it to show differences between:</p>
<ul>
<li><b>git diff</b> - the current version checked out from the repository and all files in the working directory</li>
<li><b>git diff v2.6.21</b> - a specific historical version and all files in the working directory</li>
<li><b>git diff v2.6.20 v2.6.21</b> - all files in two different historical versions</li>
<li><b>git diff init/main.c</b> - specific locally modified files in the working directory that don't match what was checked out from the repository</li>
<li><b>git diff v2.6.21 init/main.c</b> - specific file(s) in a specific historical version of the repository vs those same files in the working directory</li>
<li><b>git diff v2.6.20 v2.6.21 init/main.c</b> - specific files in two different historical versions of the repository</li>
</ul>
<p>What git is doing is checking each argument to see if it recognizes it as a historical version (sha1sum or tag), and if it doesn't, it checks to see if it's a file. If this is likely to cause confusion, you can use the magic argument "--" to indicate that all the arguments before that are versions and all the arguments after that are filenames.</p>
<p>The argument <b>--find-copies-harder</b> tells git diff to detect renamed or copied files. Notice that git diff has a special syntax to indicate renamed or copied files, which is much more concise and portable than the traditional behavior of removing all lines from one file and adding them to another. (This behavior may become the default in a future version.)</p>
<h2>Creating tarballs</h2>
<p>The <b>git archive $VERSION</b> command creates a tarball (written to stdout) of the given version. Note that "master" isn't the default here; you have to specify it if you want the most up-to-date version.
You can pipe it through bzip2 and write it to a file (<b>git archive master | bzip2 > master.tar.bz2</b>) or you can use git archive to grab a clean copy out of your local git repository and extract it into another directory, ala:</p>
<blockquote>
<pre>
mkdir $COPY
git archive master | tar xCv $COPY
</pre>
</blockquote>
<p>You can also use the standard Linux kernel out-of-tree building infrastructure on the git working directory, ala:</p>
<blockquote>
<pre>
cd $GITDIR
make allnoconfig O=$OTHERDIR
cd $OTHERDIR
make menuconfig
make
</pre>
</blockquote>
<p>Finally, you can build in your git directory, and then clean it up afterwards with <b>git checkout -f; git clean -d</b>. (Better than "make distclean".)</p>
<h2>Bisect</h2>
<p>Possibly the most useful thing git does for people who aren't kernel developers is <b>git bisect</b>, which can track down a bug to a specific revision. This is a multi-step process which binary searches through the revision history to find the specific commit responsible for a testable change in behavior.</p>
<p>(You don't need to know this, but bisect turns out to be nontrivial to implement in a distributed source control system, because the revision history isn't linear. When the history branches and comes back together again, binary searching through it requires remembering more than just a single starting and ending point. That's why bisect works the way it does.)</p>
<p>The git bisect commands are:</p>
<ul>
<li><b>git bisect start</b> - Start a new bisect. This opens a new (empty) log file tracking all the known good and bad versions.</li>
<li><b>git bisect bad $VERSION</b> - Identify a known broken version.
(Leaving $VERSION blank indicates the version currently checked out, normally "master".)</li>
<li><b>git bisect good $VERSION</b> - Identify a version that was known to work.</li>
<li><b>git bisect log</b> - Show bisection history so far this run.</li>
<li><b>git bisect replay $LOGFILE</b> - Reset to an earlier state using the output of git bisect log.</li>
<li><b>git bisect reset</b> - Finished bisecting, clean up and return to head. (If git bisect start says "won't bisect on seeked tree", you forgot to do this last time and should do it now.)</li>
</ul>
<p>To track down the commit that introduced a bug via git bisect, start with <b>git bisect reset master</b> (just to be safe), then <b>git bisect start</b>. Next identify the last version known to work (ala <b>git bisect good v2.6.20</b>), and identify the first bad version you're aware of (if it's still broken, use "master").</p>
<p>After you identify one good and one bad version, git will grind for a bit and reset the working directory state to some version in between, displaying the version identifier it selected. Test this version (build and run your test), then identify it as good or bad with the appropriate git bisect command. (Just "git bisect good" or "git bisect bad"; there's no need to identify the version here because it's the current version.) After each such identification, git will grind for a bit and find another version to test, resetting the working directory state to the new version until it narrows it down to one specific commit.</p>
<p>The biggest problem with <b>git bisect</b> is hitting a revision that doesn't compile properly. When the build breaks, you can't determine whether the current version is good or bad. This is where <b>git bisect log</b> comes into play.</p>
<p>When in doubt, save the git bisect log output to a file (<b>git bisect log > ../bisect.log</b>). Then make a guess whether the commit you can't build would have shown the problem if you could build it.
If you guess wrong (hint: every revision bisect wants to test after that comes out the opposite of your guess, all the way to the end) do a <b>git bisect replay ../bisect.log</b> to restart from your saved position, and guess the other way. If you realize after the fact that you need to back up, the bisect log is an easily editable text file you can always chop a few lines off the end of.</p>
<h2>Example git bisect run</h2>
<p>Here is a real git bisect run I did on the <a href=http://qemu.org>qemu</a> git repository (git://git.kernel.dk/data/git/qemu) to figure out why the PowerPC Linux kernel I'd built was hanging during IDE controller initialization under the current development version of qemu-system-ppc (but not under older versions).</p>
<blockquote>
<pre><b>$ git bisect reset master</b>
Already on branch "master"
<b>$ git bisect good release_0_8_1</b>
You need to start by "git bisect start"
Do you want me to do it for you [Y/n]? y
<b>$ git bisect bad master</b>
Bisecting: 753 revisions left to test after this
[7c8ad370662b706b4f46497f532016cc7a49b83e] Embedded PowerPC Device Control Registers infrastructure.
<b>$ ./configure && make -j 2 && ~/mytest</b>
...
Unhappy :(
<b>$ git bisect bad # The test failed</b>
Bisecting: 376 revisions left to test after this
[255d4f6dd496d2d529bce38a85cc02199833f080] Simplify error handling again.
<b>$ ./configure && make -j 2 && ~/mytest</b>
WARNING: "gcc" looks like gcc 4.x
Looking for gcc 3.x
./configure: 357: Syntax error: Bad fd number
<b>$ git bisect log > ../bisect.log # Darn it, build break. Save state and...</b>
<b>$ git bisect good # Wild guess because I couldn't run the test.</b>
Bisecting: 188 revisions left to test after this
[16bcc6b31799ca01cd389db7cb90a345e9b68dd9] Fix wrong interrupt number for the second serial interface.
<b>$ ./configure && make -j 2 && ~/mytest</b>
...
Happy :)
<b>$ git bisect good # Hey, maybe my guess was right</b>
Bisecting: 94 revisions left to test after this
[37781cc88f69e45624c1cb15321ddd2055cf74b6] Fix usb hid and mass-storage protocol revision, by Juergen Keil.
<b>$ ./configure && make -j 2 && ~/mytest</b>
...
Happy :)
<b>$ git bisect good</b>
Bisecting: 47 revisions left to test after this
[30347b54b7212eba09db05317217dbc65a149e25] Documentation update
<b>$ ./configure && make -j 2 && ~/mytest</b>
...
Happy :)
<b>$ git bisect good</b>
Bisecting: 23 revisions left to test after this
[06a21b23c22ac18d04c9f676b9b70bb6ef72d7f1] Set proper BadVAddress value for unaligned instruction fetch.
<b>$ ./configure && make -j 2 && ~/mytest</b>
...
Happy :)
<b>$ git bisect good</b>
Bisecting: 11 revisions left to test after this
[da77e9d7918cabed5b0725f87496a1dc28da8b8c] Fix exception handling cornercase for rdhwr.
<b>$ ./configure && make -j 2 && ~/mytest</b>
...
Happy :)
<b>$ git bisect good</b>
Bisecting: 5 revisions left to test after this
[36f447f730f61ac413c5b1c4a512781f5dea0c94] Implement embedded IRQ controller for PowerPC 6xx/740 & 750.
<b>$ ./configure && make -j 2 && ~/mytest</b>
...
Unhappy :(
<b>$ git bisect bad # Oh good, I was getting worried I'd guessed wrong above...</b>
Bisecting: 2 revisions left to test after this
[d4838c6aa7442fae62b08afbf4c358200f10ec74] Proper handling of reserved bits in the context register.
<b>$ ./configure && make -j 2 && ~/mytest</b>
...
Happy :)
<b>$ git bisect good</b>
Bisecting: 1 revisions left to test after this
[a8b64e6f4c7f3c4850be5fd303bf590564264294] Fix monitor disasm output for Sparc64 target
<b>$ ./configure && make -j 2 && ~/mytest</b>
...
Happy :)
<b>$ git bisect good</b>
36f447f730f61ac413c5b1c4a512781f5dea0c94 is first bad commit
commit 36f447f730f61ac413c5b1c4a512781f5dea0c94
Author: j_mayer &lt;j_mayer&gt;
Date: Mon Apr 9 22:45:36 2007 +0000
Implement embedded IRQ controller for PowerPC 6xx/740 & 750.
Fix PowerPC external interrupt input handling and lowering.
Fix OpenPIC output pins management.
Fix multiples bugs in OpenPIC IRQ management.
Fix OpenPIC CPU(s) reset function.
Fix Mac99 machine to properly route OpenPIC outputs to the PowerPC input pins.
Fix PREP machine to properly route i8259 output to the PowerPC external interrupt pin.
:100644 100644 0eabacd6434b8e40876581605c619513bf9ac512 284cb92ae83a2a36e05137d3532106ff85167364 M cpu-exec.c
:040000 040000 68740f5b1330c7859abfea3ce31062cb92adaa7f 5c48b0d20f1c4d3115881b5e9e5b6c1d681f4880 M hw
:040000 040000 3ad1f0d09c60d8190d98b28318519ebaaccbb569 69efc274cec1801848de9238ae71e97681978433 M target-ppc
:100644 100644 2f87946e874e8f6cbf9afd47c65e0baff236dc45 b40ff3747530d275181ff071c9cc9cff1d5ba02d M vl.h
<b>$ git bisect reset</b>
</pre>
</blockquote>
<h2>Command summary</h2>
<p><b>git help</b> - List available commands. You can also go <b>git help COMMANDNAME</b> to see help on a specific command. Note that this displays the man page for the appropriate command, so you need to have the git man pages installed for it to work.</p>
<p><b>git clone git://blah/blah/blah localdir</b> - Download a repository from the web into "localdir". Linus's current repository is at "git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git", the old history from the BitKeeper days (before 2.6.12-rc2) is at "git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git", and there are lots of <a href=http://git.kernel.org>other trees hosted on kernel.org</a> and elsewhere.</p>
<p><b>git pull</b> - Freshen up your local copy of the repository, downloading and merging all of Linus's changes since the last time you did this. In addition to appending lots of commits to your repository in the .git directory, this also updates the snapshot of the files (if it isn't already pointing into the past).</p>
<p><b>git log</b> - List the changes recorded in your repository, starting with the most recent and working back.
Note: the big hex numbers are unique identifiers (sha1sum) for each commit. If you want to specify a commit, you only need a unique prefix (generally the first six digits are enough).</p>
<p><b>git tag -l</b> - Show all the tagged releases. These human-readable names can be used as synonyms for the appropriate commit identifier when doing things like checkout and diff. (Note that the special name "master", technically a branch rather than a tag, points to the most recent commit.)</p>
<p><b>git checkout -f; git clean -d</b> - Reset your snapshot to the most recent commit. The "checkout" command updates your snapshot to a specific version (defaulting to the tip of the current branch). The -f argument says to back out any local changes you've made to the files, and "clean -d" says to delete any extra files in the snapshot that aren't checked into the repository.</p>
<p><b>git diff</b> - Show differences between two commits, such as "git diff v2.6.20 v2.6.21". You can also specify the files you're interested in, ala "git diff v2.6.20 v2.6.21 README init/main.c". If you specify one version it'll compare your working directory against that version, and if you specify no versions it'll compare the version you checked out against your working directory. Anything that isn't recognized as the start of a commit-identifying sha1sum, or a tagged release, is assumed to be a filename. If this causes problems, you can add "--" to the command line to explicitly specify that arguments before that (if any) are version identifiers and all the arguments after that are filenames. Add "--find-copies-harder" to detect renames.</p>
<h2>Linus Torvalds talks about git</h2>
<p>Watch <a href=http://youtube.com/watch?v=4XpnKHJAok8>this Google Tech Talk</a>.</p>
<!-- "git show @{163}"... one character less... http://www.kernel.org/pub/software/scm/git/docs/glossary.html#def_working_tree -->
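<h2>Estimating how long a bisect will take</h2>
<p>The "revisions left to test" counts in the example bisect run earlier (753, 376, 188, ...) roughly halve at each step, which is why ten build-and-test rounds were enough to search more than a thousand revisions. The following is only a minimal sketch of that arithmetic in plain shell. It never calls git, the function name <b>bisect_steps</b> is made up for illustration, and it assumes a clean halving at every step, which real history with merges only approximates.</p>

```shell
# bisect_steps N: hypothetical helper, for illustration only (it never runs git).
# N is the number from git's "Bisecting: N revisions left to test after this".
# It counts how many entries the halving sequence N, N/2, N/4, ... has before
# it reaches zero. In the example run, each entry was one build-and-test round.
bisect_steps() {
    remaining=$1
    steps=0
    while [ "$remaining" -gt 0 ]; do
        remaining=$((remaining / 2))   # integer division, as in 753 -> 376
        steps=$((steps + 1))
    done
    echo "$steps"
}

bisect_steps 753   # prints 10
```

<p>Called with 753 it prints 10, matching the ten build-and-test rounds in the example run (one of which was a guess after a build break).</p>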