December 31, 2009

Happy new year. I haven't done anything worth blogging about in a week. (I've twittered, but that's a service _designed_ for having very little to say. Sort of the point of the thing, really.)

I have to move the blog file and start a new one tomorrow. (Actually what I have to move is the "notes.html" symlink to point to the new file. And possibly I should start making the suckers so the date is a link to the individual post...)

Or just use a real blogging service. Can't quite bring myself to make my old livejournal important again now that the russian mafia owns it. But the span tags I've been putting in the entries don't show up in the rss feed, and every time I ponder doing something about that it always brings up the fact there's no comments and no individual entry URLs (beyond the hash tags)...

I dunno, what blogging services out there these days are worth using?

December 28, 2009

Wow, am I an agnostic.

I just added somebody to my spam filter over an argument about a technical topic, and not even because I particularly disagree with him (the approach he takes works fine for his use case). It's that he was arguing that what he'd chosen to do was the One True Way, conflating a half-dozen different choices that are in reality orthogonal, insisting his personal preferences were universal truths... and I just don't want to argue with people like that anymore. It's no fun. Possibly I used up my current stock of patience in last week's LWN thread, dunno.

Christmas came and went, New Year's approaches, it's the holiday week between the two and I've been sleeping for 11 hours a night. Not quite sure why, just tired all the time. (Fade points out the time of year. If I do wind up in Dublin I'm going to need big fluorescent lights, right next to my desk. Yeah, that's still an if.)

December 25, 2009

Merry Christmas! Watched a festive holiday movie (Groundhog Day), opened presents, cooked an enormous roast...

I have been _massively_ slacking off due to the holidays. Haven't made any progress on any computer related things since last week. (In part because my current FWL thingy is trying to cut a gordian knot of complexity that I'm writing an explanation of for the mailing list, but mostly because it's christmas and I just want to relax and enjoy it.)

Wrote up a status report on FWL, and then moved it to the mailing list instead.

December 23, 2009

Sigh. Back in the 90's I found a marvelous orange bread recipe on Ye Olde Internet, which was accidentally vegan. (The whole "that carrot has been murdered, I must go straight edge on you now" aspect was incidental, I liked that it was actually _good_.) I dug the recipe up out of a box again years later and it was still marvelous, but alas it seems long lost since sometime before the move to Pittsburgh.

What I really remember about it is it was trivial to make. The main ingredients were flour and orange juice, with a bit of salt and baking powder/soda. (I still can't quite keep those straight. Possibly both.) Oh, and a teaspoon or so of lemon juice, optional. There might have been a little sugar, although the orange juice had rather a lot already. I do remember that the only liquid ingredient was the juice. (No eggs, butter, oil, shortening, or anything like that.) Also, no yeast. (It was important not to stir it too much, you just fold the mixture over once or twice until it's wet and then _stop_stirring_.)

Unfortunately, attempting to google for this recipe keeps bringing up pages and pages of irrelevant crap. No grated peel, no nuts, no bananas, no pumpkin, no cranberry. No milk, no chunks of fruit.

By the time I've put in all those filters, I think I broke Google's brain. I'm now getting a video recipe for "Frog's Eye Salad", a page about "Catsup Cake Flour" (?), a PDF from a page that claims to be a tax publication but apparently contains a grocery list, a recipe for "Orange Roughy Roll-ups with Creamy Dill Sauce" (if that has any relation to fruit roll-ups, I'm going to pass, thanks), fondue recipes, "Holiday Baking with Whole Grains", and a page from the North Dakota Wheat Commission. That's all in the first two pages of results.

I suspect my best option is to just fiddle around and try to recreate this stuff by trial and error. Except I don't currently have any orange juice.

Wow. I stopped directly replying to His Specialness over on lwn (because it just wasn't accomplishing anything), and then later stopped posting to the thread at all (well, except for this) when Jonathan Corbet (editor of Linux Weekly News) asked us to stop. (Then I fell behind on email and twitter for a couple days because it's christmas and I was busy, so I'm just catching up now.)

But other people didn't stop posting, and I still get emailed the replies to replies because I checked the little boxes, and apparently His Specialness has decided that anyone who disagrees with him must really be me in disguise.

I find that hilarious. When have I ever been reluctant to insult him to his face? The only reason I've stopped using His Specialness's name is I think he's a glory hound on the order of those white house party crashers, balloon boy's parents, and that toupee that used to govern Illinois. Getting his name repeated seems to be what he wants, with the context in which it's repeated a very distant secondary consideration. I suspect the only reason he hasn't done one of those reality television shows where you eat a bug on camera is they wouldn't take him.

LWN doesn't seem to have user profiles, so out of morbid curiosity I spent the 30 seconds to type " syspig" into google, which found a comment from 2006 on its first page of hits. His Specialness apparently did not bother to do this.

I repeat my comment from the last big post I made to the thread before the moratorium: "I'm tired of arguing with people who don't bother to do their homework". (And I note yesterday's squirrel analogy on the difference between clever and smart.)

December 22, 2009

Yesterday at the mall there were tons of banners for this thing, and today I clicked on a web ad for 'em, because I'm all for any alternative to Time Warner (which needs to die on general principles because they keep trying to retroactively sneak usage limits into their contracts), and these guys look like they've got reasonable plans with no usage caps... if you can find them.

Unfortunately, their web site was designed by somebody who thinks that clicking through five pages to answer a simple question somehow improves matters, and that all pages should be generated by javascript based on cookies so that when you finally DO dredge an interesting piece of info out of their navigational morass, you can never actually get a URL directly to that page so you can send said URL to someone else. (Oh, and two of the pages I tried to visit had firefox bring up a "redirect loop detected" page, that was nice.)

It's a very _pretty_ website. Lovely plumage.

They had a "live chat" service which was answered by a very nice lady who was very polite, gave prompt replies... and didn't actually have any new information to give me. (She was looking stuff up on the website for me, and apparently she couldn't find some of it either.)

Hopefully if I come back in a month their teething troubles will have worked through, their support people will be more experienced, and maybe they'll have burned their existing website to the ground and replaced it with something that doesn't spend all its effort on being clever and none of it on being smart. (I can hope. When squirrels hide nuts so ingeniously they can't find them again, they're being clever but not smart. It's less cute in a large corporation.)

December 21, 2009

Ok, the FROM_ARCH, FROM_HOST, CROSS_HOST, and CROSS_TARGET stuff is once again TOO CONFUSING. It works, and it was worked out laboriously via extensive trial and error, but it's too nasty and tangled for me to keep it straight in my head, and I wrote it.

Let's see...

if CROSS_HOST is blank, it's $(uname -m)-walrus-linux
if CROSS_TARGET is blank, it's $ARCH-unknown-linux
  (but if CROSS_TARGET is set by an arch dir, such as powerpc-440fp, it overrides FROM_HOST)
if FROM_HOST is blank, FROM_HOST=$FROM_ARCH-thingy-linux

Ok, FROM_ARCH defaults equal to $ARCH. (Probably $ARCH should just be called $TARGET these days.) This is blanked for the binutils/gcc/ccwrap stages when we're building a simple cross compile. It's set to the host for static builds (currently hardwired to i686), and then set equal to the target arch for native builds.

That's overcomplicated because the toolchain build and root filesystem stuff is glued together. Setting it to the toolchain host for cross and native builds makes sense, but the "default equal to $ARCH, except when it's blank" bit is crazy. There should be some way to signal we're building a simple compiler, but overloading FROM_ARCH seems a bit silly.

FROM_HOST is derived from FROM_ARCH, and overridden by CROSS_TARGET under magic circumstances. That's a mess, and I believe it's a mess for binutils. Vaguely recall much pain last time I fiddled with this, and this is just where I left off because it was SUCH a mess.

CROSS_HOST exists solely to humor the binutils and gcc builds, because the people who wrote that are crazy. It's always `uname -m`-walrus-linux. It's proof positive that autoconf is useless because it can't figure out what the current system it's running the build on is, and must be told; if I can't figure out how to remove it I think I'm going to hardwire it into the call sites.
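The defaulting rules above, restated as a quick shell sketch (hedged: the variable names are from these notes, the real scripts differ in detail, and "powerpc" is just a demo value):

```shell
# Sketch of the defaulting described above; ARCH is the target arch.
ARCH=${ARCH:-powerpc}

# Each variable only gets a default if it's currently blank.
[ -z "$CROSS_HOST" ] && CROSS_HOST="$(uname -m)-walrus-linux"
[ -z "$CROSS_TARGET" ] && CROSS_TARGET="${ARCH}-unknown-linux"
[ -z "$FROM_ARCH" ] && FROM_ARCH="$ARCH"
[ -z "$FROM_HOST" ] && FROM_HOST="${FROM_ARCH}-thingy-linux"

echo "$CROSS_HOST $CROSS_TARGET $FROM_HOST"
```

Written out like that, the "FROM_HOST follows FROM_ARCH which follows ARCH except when blanked" chain is at least visible, which is more than the tangled originals manage.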

Sigh. Attempting to clean up the multi-variable mess while separating out the toolchain build turned into a large mess itself. I need to do this in stages, but it's all tangled together. Where to start...

December 20, 2009

From the "fun to quote people out of context" department:

"Can I mention one other aspect of this debate that's really unfortunate? I've been around here for more than 20 years." - John McCain.

(The fact that the point he was actually trying to make was a bald faced lie is par for the course for his party. Or perhaps he's following Reagan's Alzheimer's defense, "You can't hold me responsible for anything I don't remember.")

December 19, 2009

Today's wastes of time: this, this, and of course this.

(I am amused that the LWN articles about things I've done all have titles about whatever the project du jour is, from the "patch penguin" stuff through "busy busy busybox", but all the ones about a certain extra-special developer have his name as the title. Shows what our respective priorities are, doesn't it?)

I've also been asked why I didn't complete the forensic study years ago. (Christian Michon suggested I use "git blame"). The thing is, I vaguely recall I _did_ complete it, and found nothing of interest, but never bothered finishing the write-up because Mr. Special had gone away, which was the point of the exercise. (These days I'm out of it, and when he's not breathing down my neck he's simply not worth any effort.)

(Also, git blame isn't that great a forensic tool, because if you change the whitespace or a single variable name in a line, blame will show the newer commit but the line is still substantially similar to the previous line. I was trying for a deeper read into the code than that.)
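(For what it's worth, git blame does grow a few flags that mitigate exactly this problem: -w ignores whitespace-only changes, and -M/-C chase moved and copied lines. A self-contained demo, using a throwaway repo and identity:)

```shell
# Demo: a whitespace-only edit normally "takes the blame" for a line,
# but git blame -w sees through it to the original commit.
dir=$(mktemp -d) && cd "$dir"
git init -q .
git config user.email you@example.com
git config user.name you

printf 'int x = 1;\n' > file.c
git add file.c && git commit -qm original

printf '\tint x = 1;\n' > file.c        # reindent: whitespace-only change
git commit -qam reindent

git blame --porcelain file.c | grep summary     # blames "reindent"
git blame -w --porcelain file.c | grep summary  # blames "original"
```

(It still doesn't help when the variable names change, which is the case described above; -w only papers over whitespace, and -M/-C need the line to survive mostly intact.)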

Cleaning continues. Found a bunch of old t-shirts I like in one of Fade's big foot locker chest things, stuff I haven't seen since Pittsburgh. (It's like shopping without having to spend money or go find stuff I like in piles of stuff I don't.)

I'm trying to break the toolchain generation out into its own script. Right now the build can do up to four toolchains: the simple cross compiler, the static cross compiler, the static native compiler, and the dynamically linked native compiler that's part of the final root filesystem.

In theory, that should all be factored out into a single "build a compiler" script that can do all four variants based on config flags. In practice, the root filesystem might need to build uClibc even if it hasn't got a compiler, if stuff in it is dynamically linked. (But in that case it just needs the *.so files, not the *.a or *.o files or the headers. And don't get me started on...)

I suppose the static native compiler and the dynamic native compiler could be collapsed together. Static linking against uClibc isn't much overhead and we're talking 13 megabytes of compiler and headers and such already, so micro-optimizations are at the wrong scale to matter.

But that's still three separate roles for one script to perform, and they expose complexity that's been glossed over so far. So the question is, what user interface should all this have?

Once again, easy to do, hard to make _clean_...

Ok, building the simple cross compiler is:

# Build binutils, gcc, and ccwrap

FROM_ARCH= PROGRAM_PREFIX="${ARCH}-" build_section binutils
FROM_ARCH= PROGRAM_PREFIX="${ARCH}-" build_section gcc
FROM_ARCH= PROGRAM_PREFIX="${ARCH}-" build_section ccwrap

# Build C Library

build_section linux-headers
HOST_UTILS=1 build_section uClibc

Building the non-simple cross compiler or the native compiler is just:

# Build C Library

build_section linux-headers
build_section uClibc

# Build binutils, gcc, and ccwrap

build_section binutils
build_section gcc
build_section ccwrap

The difference being how PROGRAM_PREFIX and FROM_ARCH are set.

So basically, if FROM_ARCH is blank, build the C library afterwards, otherwise build it first, and leave the setting of those two variables to the calling script? Or should that information be on the command line? And then there's distinguishing the static cross compiler for the host arch (prefixed) from a native compiler (unprefixed).
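One possible shape for it, sketched with a stubbed-out build_section so the ordering is visible (hedged: build_compiler is a hypothetical name, and the real build_section obviously does actual builds):

```shell
# Stub standing in for the real build_section, so we can see the ordering.
build_section() { echo "build $1"; }

# Hypothetical unified compiler build: blank FROM_ARCH means "simple cross
# compiler" (toolchain before C library), anything else builds the C
# library first.
build_compiler() {
  if [ -z "$FROM_ARCH" ]; then
    PROGRAM_PREFIX="${ARCH}-" build_section binutils
    PROGRAM_PREFIX="${ARCH}-" build_section gcc
    PROGRAM_PREFIX="${ARCH}-" build_section ccwrap
    build_section linux-headers
    HOST_UTILS=1 build_section uClibc
  else
    build_section linux-headers
    build_section uClibc
    build_section binutils
    build_section gcc
    build_section ccwrap
  fi
}

ARCH=armv5l FROM_ARCH= build_compiler
```

That collapses the ordering question into one if statement, but it punts the real decision (who sets FROM_ARCH and PROGRAM_PREFIX, caller or command line) exactly as described above.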

Hmmm... Implementation's fairly straightforward; figuring out a sane UI is still the hard part.

December 18, 2009

I'm still arguing with the internet. This is par for the course, I suppose.

Got my big ccwrap.c update checked in yesterday, and today I taught the build_section logic to run two flavors of section: the first does setup/cleanup for the package automatically, and with the second you do your own setup and cleanup. Basically I decided to automate only the easy cases, and let the others set themselves up manually. (So any package that uses a separate build directory sets itself up, so I don't have to figure out how to pass in the extra data, for example.)

During the upcoming disentanglement of the download/extract/setupfor logic I might strip out the build directory logic from setupfor and just have the script do that by hand. (mkdir ../thingy isn't hard, and I already have to list the build directory explicitly when doing cleanup, so just having that be an rm -rf lets me remove that logic from the cleanup function.)

The old "repetition vs clarity" debate rages on...

Random amusement.

December 17, 2009

Why am I Arguing with The Bruce again? Come on, I know better, I even own one of these.

Um, actually I think I gave my copy to Beth. Right, I need to get a _new_ one of those and hang it somewhere prominent. (Scan it in and use it as desktop wallpaper, perhaps. Or just use the original strip...)

I'm amused that Bruce and I can oppose the lawsuit for such completely different reasons. I'm also somewhat amused that Bruce has been flogging his rant (which I still haven't read more than the first few sentences of) more or less as a press release, of _vital_ importance for the community at large to read, and it's gotten all sorts of coverage about how he was never involved and still isn't involved and is all indignant about it.

Meanwhile my blog entry explaining that while I used to be involved I've stopped, and my reasoning for doing so (based on actual inside information), is not on slashdot or LWN or such. (I didn't expect it to be, but it's probably more informative than anything Bruce has to say.)

Oh well. Self-promotion is Bruce's thing. Me, I write code instead.

December 16, 2009

There are times I want to slap Ulrich Drepper.

The fix he claims doesn't exist is to "#undef _FORTIFY_SOURCE" before your #includes, because complaining that I'm not checking the return value of asprintf() is just _stupid_. (The malloc only fails if you run out of _virtual_ memory, not out of physical memory. Exhaustion of physical memory happens later when you dirty the pages, and triggers the OOM killer which you can't do much about. Yes, nommu can have a harder time of it but that can fail allocating your program's _stack_.)

And no, the suggested fix of introducing an unnecessary typecast to (void) on each occurrence no longer seems to work. I'd guess they found out that people were actually doing it, so they removed it.

(I had a macro wrapper that was going int unused=asprintf(), but then it complained about an unused variable. THERE IS NOTHING I WANT TO DO WITH THIS VALUE. GO AWAY.)

And you wonder why I want an alternative to glibc? You do not know what's best for me, and even if you try to give me no choice, you can't take away the choice to use a different implementation than yours. Idiots.

December 15, 2009

Some people are assuming I'm involved in the most recent round of BusyBox lawsuits. I'm not. I lost faith in the SFLC's judgement about what is and isn't a good lawsuit to file back when they filed suit against Cisco for a 5 year old toolchain that Cisco got from a vendor (Broadcom) and which predated the original BusyBox vs Linksys lawsuit circa 2003.

The result of that was that Cisco's big "Strategic investment in Linux" ended. Cisco had brought me in to consult on that, their plans to base every single new product on Linux, and schedule a week of license auditing and publication work between the time they got a golden master image and the time it actually went to manufacturing. It was an aggressive, "bet the company on Linux" move, and the week the suit was filed it went from "our first project using this new process will be the in-development MediaHub" to "our corporate use of Linux at all 'is under review'". A few weeks later I was told the CTO who had been behind the big Linux push was "spending more time with his family" or some such (never did get the details). And then a few months later, the entire Linksys division was reorganized out of existence (at the end of August). The brand name still exists, but the engineers I knew have all left the company (rather than be reassigned to work on non-Linux projects).

The remaining Linux support (for existing products that already shipped) seems to have been outsourced to Red Hat. Last I checked (a few months ago) I couldn't find any Linux work Cisco was still doing in-house, and that massive 180 seemed to have been in direct response to the "Mepis II: Even Dumber" lawsuit the SFLC filed on behalf of the FSF.

The toolchain in question came from Broadcom back around 2002. Cisco never _had_ the source code to it, because Broadcom refused to give it to them. The SFLC never made Broadcom a party to the suit, thus the company that was cooperating with the Linux community was punished and the one that wasn't was ignored.

The only reason that toolchain hadn't already been replaced was that after the first lawsuit back in 2003, Cisco washed their hands of Linux for five years and outsourced maintenance of existing projects to Taiwan, and switched their new routers to VxWorks instead. They had just gotten over that a few months before this lawsuit was filed, and their new projects were using toolchains built from source in-house (instead of supplied by an outside vendor).

The toolchain in question contained no interesting IP, projects like OpenWRT had been happily targeting that hardware with fully open source toolchains for _years_. From an engineering perspective, there was NO REASON to file the lawsuit that got filed, unless you wanted to STOP Cisco from cooperating with the Linux community and doing new Linux development. Which seems to be what happened. It was INCREDIBLY STUPID, and destructive, and massively counterproductive, and it made the community look bad.

That's why I'm not involved in the current round of Busybox suits. I have zero faith that the SFLC understands which lawsuits are in the interests of the community, and which are the equivalent of patent trolling. Humoring the more zealous and clueless excesses of the FSF's idealism division (the parts that call Tim O'Reilly a "parasite" for producing good documentation for sale, and say that having a day job writing proprietary code is "a sin")... That's not helpful to Linux. We have enough trouble distancing ourselves from the lunatic fringe as it is.

Erik Andersen's still going along with it, although based on the rounds I was involved in, I'd guess his involvement is limited to signing whatever paperwork they send him and mailing it back. Erik hasn't had any real involvement with BusyBox since about 2006, he's busy with other things. His last commit to the busybox repository was October 19, 2006. He hasn't posted to the busybox mailing list at all this year, and in 2007 and 2008 combined he posted a total of 5 messages: one about the lawsuit, twice under the title "New mailserver test -- ignore", and twice asking people to delete files off the server when its disk filled up. (I note that OSUOSL reinstalled that server and took over its administration a year ago, after it got cracked due to lack of maintenance.)

Last time I talked to the SFLC about this I suggested they get current Busybox maintainer Denys Vlasenko involved. They declined. The current Busybox project has no connection to these lawsuits that I am aware of, and the SFLC seems to prefer it that way.

Oh, one other fun detail: the code drops we did get from the rounds of suits I was involved with? As far as I know, it didn't result in a single line added to the BusyBox repository. From an engineering perspective, the entire exercise was _completely_useless_.

By the way, I got a blanket email monday sent out to all "Conservancy" members (I'm still on that mailing list, apparently), which contained this chunk:

However, if you do speak or write publicly about this, please be very clear in such communications that you only speak for yourself and/or your project, but not for the Conservancy as a whole. Also, we would appreciate it if you would avoid commenting specifically about the violations, products and the companies mentioned in the litigation. Finally, if you want to avoid this matter entirely, you are absolutely welcome to refer inquiries related to GPL enforcement and litigation directly to me or to SFLC's press coordinator, Lysandra Ohrstrom.

Obviously I'm not speaking for the SFLC, or for BusyBox. I haven't even read the new press release, so I don't know who they sued this time, or over what. I think the last time I had any communication with them was about 6 months ago, but they kept switching who it was I was talking to, so searching for the email would be more trouble than it's worth.

I dunno how "public" a blog entry is. (My sick cat is doing much better, thanks.) But yeah: not involved this time, done enough damage already. Driving companies away from Linux while giving the FSF's screwball division a bigger soapbox is not what I signed up for, so I didn't re-enlist for the current round.

Update: I have been pointed at this guy but haven't got the stomach to read it. That man is the reason I stopped doing BusyBox development.

As for why the SFLC chooses not to include him in the lawsuits, perhaps the fact that he was proven years ago not to have any code remaining in it (let alone today) might have something to do with it?

Bruce abandoned BusyBox two years before Erik Andersen picked up the dead code and re-launched it, and then Bruce didn't bother to post or subscribe to Erik's new list for most of a decade (until he suddenly showed up not to contribute code but to be a license troll, at which time Bruce's web page said that BusyBox was maintained by Lineo, a company that Erik had left something like seven years earlier).

It wouldn't help the SFLC to involve Bruce because he has nothing to do with BusyBox. That's a significant reason BusyBox has been as successful as it has: Bruce's _lack_ of involvement in it.

If any of the above comes off like I'm dissing Erik: I'm not. Erik Andersen is the reason BusyBox exists today. He could just as easily have grabbed Red Hat's Nash or started his own new codebase from scratch, that just wasn't his style. Erik wandered away from BusyBox development partly because uClibc and BuildRoot were eating his time, and then because he had his own company to run and a family to care for. (Erik continued to administer the server it ran on even after that, albeit somewhat distractedly.)

Erik found a new maintainer (me), just as I found a new maintainer (Denys) to hand the project off to. Bruce let his version die, and the current one is at least as separate from his version as modern BSD is from the Bell Labs code of 1975. Whatever else may be going on, the SFLC is right to keep itself away from Bruce. If only they'd kept themselves separate from the rest of the extremist wing of the FSF...

Right. Enough of that. Coding...

Ha. Changing /etc/acpi/ to this means the stupid hal/dbus layer doesn't get a chance to veto the suspend:


chvt 1
echo mem > /sys/power/state

I should probably swap that in for the low battery stuff as well, so it doesn't go "yesterday, VLC was playing some media, and it didn't notify HAL when it _stopped_, thus it seems to think it's playing three media streams simultaneously when in fact VLC isn't currently running. So refuse to suspend, instead just let it run out of power and have the hard drive do its emergency park."

And you wonder why the drive's got a bad sector on it...

I would like to find the developers responsible for this:

        char inFile[80]={0}, outFile[80]={0};
        sscanf(argv[2], "%s", inFile);
        sscanf(argv[3], "%s", outFile);
        // Read data and generate header
        fh = open(inFile, O_RDONLY);
        if ( fh == -1 ) {
                printf("Open input file error!\n");
                free( pHeader );
        lseek(fh, 0L, SEEK_SET);
        if ( read(fh, buf, status.st_size) != status.st_size) {
                printf("Read file error!\n");

And I would like to ask them what they were smoking. The unnecessary lseek, printing error messages to stdout instead of stderr, closing and freeing data right before exiting the program... Those are just noise. But going out of your way to _introduce_ a buffer overrun like that is kind of impressively bad. (It's from cvimg.c by some guy at realtek, circa 2004.)

There are times you look at a program and just start over from scratch. This may be one of those times.

December 14, 2009

So the larger contract I'm working on right now is to come up with a decent board support package for a router that's using "not mips" processors. The relationship between Mips and the Lexra 5200 is a bit like m68k vs coldfire or PowerPC vs ppc-440, except with more historical acrimony.

Once upon a time a company called Lexra made a mips clone, based on the freely published MIPS instruction set. Unfortunately, four of those instructions had patents on them (unaligned loads and stores, because obviously nobody had ever done _that_ before 1986). So Lexra removed those instructions from the chips they designed, coming up with entirely original circuitry implementing a less efficient but patent-free subset of the original MIPS instruction set.

Alas, Mips still sued them, arguing that when the chip encountered these "illegal" instructions it threw an exception, and somebody could write an exception handler in software that would then implement the original behavior (even though the hardware didn't), and thus the chips violated the patents _anyway_.

The lawsuit dragged on for years. As far as I can piece together, Lexra made progress convincing the PTO to invalidate the patent, but MIPS still tied Lexra up in court until the smaller company ran out of money, then MIPS more or less bought them out with a large cash payment and a free MIPS license in exchange for Lexra voluntarily going out of business.

Really. Here's a quote from that last link, the joint press release Mips and Lexra put out about the settlement. It is as impressive a piece of corporate-speak as you'll ever see:

Lexra will shift from the intellectual property (IP) licensing business and become a fabless semiconductor company while at the same time licensing the MIPS32 architecture for use in its upcoming network infrastructure chip. Under the terms of the license agreement, Lexra acknowledges the validity and enforceability of MIPS Technologies patents. As Lexra is exiting the IP licensing business, Lexra will assign its processor IP assets over to MIPS Technologies and both companies will work together to convert Lexra's current customers into MIPS Technologies' customers.

"We strongly support Lexra's new direction as a fabless semiconductor company," said John Bourgoin, chairman and CEO of MIPS Technologies.

Becoming a "fabless semiconductor company" while "exiting the IP licensing business" means you don't make chips and you don't license technology to anybody else who does make chips. Your customers have now been switched over to another company anyway, to which you assigned all your existing chip design intellectual property, after declaring that gosh, those patents we spent all that time fighting really _do_ cover using purely theoretical software you never actually wrote to implement things the hardware goes out of its way to avoid doing.

This is what the CEO of MIPS "strongly support"ed, with a straight face. Is it any wonder the rest of the industry switched en masse over to ARM chips instead?

But before all that happened, the Lexra 5200 processor design was licensed to Realtek, which used it in a pair of System On Chip designs, the RTL 8181 and the RTL 8186. These were a Lexra 5200 processor plus integrated serial, ethernet, wifi, PCI, and so on. (The two chips used the same "subset of mips" compiler, but different kernel .configs for the different peripheral device versions integrated onto the chip. Also, for some reason the /proc/cpuinfo in the kernel identifies this sucker as a "Philips Nino". Presumably another licensee of the Lexra chip designs?)

After Mips put Lexra out of business, Realtek discontinued the 8186 rather than be "converted" into a MIPS Technologies customer, but that didn't stop Chinese companies from manufacturing their own RTL-8186 variants, which they still do to this day. They weren't the ones who got sued, and the patent in question was filed in 1986 and presumably expired circa 2006 anyway. These chips are small, low power, and really, really cheap.

However, because of the FUD cloud surrounding the chip design, the patches to support this stuff in Linux never made it upstream. All the toolchain tweaks do is remove the four unimplemented instructions, so those changes are trivial to forward-port, but they never got cleaned up so they're a proper ./configure target instead of a hardwired #if 0-ing out of chunks of the source code. So the compilers available targeting this hardware are of the 3.3.3 vintage.

Worse, the integrated peripherals on the SoC are all hand-rolled Realtek stuff, the drivers for which never went upstream either. Thus the kernels that use them are circa 2.4.18.

I can't do much about the kernel right now (realtek only released binary only drivers for its SoC peripherals, although there seem to be some open source rewrites floating around out there on and such). But the toolchain and user space I can clean up and upgrade.

December 13, 2009

Hah. My father's on Linkedin. I should probably figure out how to send an invite, but I don't actually use linkedin. (Just respond when people poke me about it.)

Sigh. I need to sit down and spend the hour to write up a C version of the first perl removal patch (the hard part being surgery on the Makefile, not the actual C code). I just feel totally uninspired.

Misplaced my credit/debit card a day or two back. Phoned up to cancel it. Found the now-cancelled card five minutes later, in a bundle of papers on top of the fridge. (I refer to this as the "credit card summoning ritual", and find it quite reliable.) Thus I am on a cash economy this week, spending only what I can carry with me. Probably good discipline, but it means I'm worrying about money, which is a source of stress I'm not good at coping with.

Backing up my laptop to the terabyte drive, it encountered a bad sector. At first I thought it was in the external drive, but no, it's in my _laptop_. It's just the one, and it encountered the same bad sector in two different files (indicating the suckers are probably crosslinked), and it's in big media files I was copying from Mark so I hadn't noticed (because I'm not up to Mythbusters season 4 or Penn and Teller season 6 yet).

It's possible that the bad sector is because the power got cut during a write (which might also explain the crosslinking if reading a bad sector confused ext3's journal replay or something). This might have happened if I forgot to plug the sucker in while leaving it copying, and VLC vetoed the suspend when the battery ran out (because hey, yesterday it was playing some media and it _remembers_, man -- it does this all the time and I have yet to figure out how to shoot HAL/DBUS in the face with a bazooka for being _stupid_).

Darn it, I actually don't know. It sounds like I'm trying to talk myself out of recognizing a problem as serious. If I was fully employed I'd just buy new hardware and not worry about it. This laptop is 3 years old anyway, the new laptop smell has definitely worn off. (So have the letters on the n, m, c, and d keys, and the > sign.)
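For my own reference before I do anything rash: as I understand it, ext3's debugfs can map a bad block back to whichever files claim it, which would confirm or refute the crosslink theory. A sketch (the device and block numbers are made up, but icheck/ncheck are real debugfs commands):

```shell
# Map a bad block back to its owning file(s) on ext2/3.
find_bad_block_owner() {
    dev="$1"; block="$2"
    # block number -> inode number:
    debugfs -R "icheck $block" "$dev"
    # then inode -> path(s), run by hand with the inode icheck printed:
    #   debugfs -R "ncheck <inode>" "$dev"
    # the same block resolving to two different paths would confirm
    # the files really are crosslinked.
}
```

(Run against the real partition this needs root, and ideally the filesystem unmounted.)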

But in the past few months, I've had a cat with skin cancer, another cat with (treatable) liver failure, my car's brakes replaced, my sister going through a divorce, two people buying health insurance more or less as individuals (which costs more than the mortgage, condo association fee, and electricity combined), and of course trying to put Fade through college (which is important and should not be interrupted by financial concerns). Coming up, annual property taxes on the condo, the upstairs carpet needs to be replaced (cats again), trip to visit Fade's relatives, have Ike's or Lamb's frown professionally at the new clunking noise my car's making (hoping it's a CV joint and not the transmission)...

Intermittent contract work keeps the lights on, but in a recession it is _not_ comfortable. Once upon a time I could always let my apartment lease expire, put my stuff in storage, and go visit relatives or friends for a few months to get away from it all. Money just wasn't that important because I barely spent any. But I haven't got that option anymore, now I have a mortgage and a wife and cats, and the bills keep coming. There are expenses I have to _keep_up_ with. It's disconcerting. (Adulthood sort of sneaks up on you, doesn't it?)

It turns out what I'm really not good at coping with is the _annoyance_ of small expenses. I drove off without my gas cap, I'm not even sure where to buy another one of those (Fry's?). The light bulb in the fridge burned out, I need to find a place that sells _those_ (and probably replace it myself instead of just hiring a guy to Make Problem Go Away). My laptop battery now lasts half an hour on a full charge, is replacing that a luxury or a necessity? My noise cancelling headphones are held together with duct tape. It's time to renew the little car tax sticker that says Nelda has gotten her tithe.

These are incidentals. I've spent more at restaurants than any of them costs. I'm not used to them adding up and becoming stress. I'm used to having a financial cushion between me and them, but it's worn a bit thin of late. I currently have the money to pay for these incidentals, but I'm reluctant to do so because I can't _stop_ another cat from becoming sick, or the noise the car's making from being serious, or my laptop dying, or my glasses breaking, or sudden unexpected dental work, or The Unexpected in general.

This is why I was willing to move to Pittsburgh for a job back in 2006. I could earn twice per hour what they were paying me, but they _kept_ doing it so I didn't have to worry about where next month's work would come from. Alas, by the time I was on my third CEO, dozenth co-worker farewell luncheon, and fourth boss (who _I_ had to train), I had to admit that "stability" was not really what that job was providing. (Turns out it's hard to combine "stability" with "interesting work", or maybe I'm just not good at finding it.)

These days I'm hoping to move to Dublin for another shot at such interesting stability. I'm still waiting to see if I'll get the opportunity to do so. (I believe I'm entering week 7 of the decision process. The search for stability leads to prolonged uncertainty. Adulthood is overrated.)

Oh well. In the meantime, I have two other contracts to work on, and expense reimbursement paperwork to fill out. Reupholstering the financial cushion would be nice. I'm just not convinced it'll hold up to the next cat attack.

December 12, 2009

Yay, Andre's routers arrived. (Paying work, always fun.)

Pushing perl patches upstream, somebody asked Peter Anvin to ack 'em. It did not end well. On the bright side, his objection du jour is something I can actually address. (Admittedly it's useless and almost certainly another proxy for his objection to removing perl in the first place, but hey, I can call this bluff. If I can just clear my plate to make time and energy for it.)

Ripping ccwrap.c a new one, as mentioned on the FWL list.

Why does everybody keep playing Zombie Sinatra? The man couldn't sing when he was _alive_, and they JUST WON'T STOP...

Fade has resubscribed to World of Warcraft. Much watching over her shoulder. It's addictive, and they've made it even more pretty...

I note that tube-feeding a cat every 4 hours is a lot more disruptive to one's work schedule than you'd think. (Although when I don't get decent sleep, it's hard to do anything particularly demanding the next day anyway.)

December 9, 2009

Yesterday's bisectathon went off into one of those "git bisect skip is stuck after 2 dozen attempts" cul-de-sacs, which was my cue to declare that particular build break "good" so I could track down the commit that introduced its fix. The fix was:

--- alt-linux/arch/arm/common/vic.c     2009-09-15 11:51:09.000000000 -0500
+++ git/arch/arm/common/vic.c   2009-12-09 03:26:49.000000000 -0600
@@ -22,6 +22,7 @@


Which needed to be applied to bisect through a range of a thousand or so patches which otherwise wouldn't compile. Luckily, it _did_ apply to that range, and thus I could bisect through it.
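In theory "git bisect run" could automate this sort of thing with a helper script that applies the fix before each build and un-applies it afterward, using git's magic exit code 125 for "can't test this commit, skip it". A sketch (the paths and cross compiler prefix are illustrative, and the actual boot-and-check-serial step is left as a comment because mine is manual):

```shell
# Write out a hypothetical "git bisect run" helper.
# Exit codes: 0 = good, 1 = bad, 125 = skip this commit.
cat > bisect-vic.sh << 'EOF'
#!/bin/sh
git apply ../vic-fix.patch || exit 125       # fix doesn't apply: skip
make ARCH=arm CROSS_COMPILE=armv5l- zImage
RESULT=$?
git checkout -- arch/arm/common/vic.c        # un-apply before bisect moves on
[ $RESULT -eq 0 ] || exit 125                # unrelated build break: skip
# boot the zImage under the emulator and look for serial output here;
# exit 0 if output appears, 1 if it doesn't.
exit 1
EOF
chmod +x bisect-vic.sh
```

Usage would be "git bisect start <bad> <good>" followed by "git bisect run ./bisect-vic.sh".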

That meant that the bisectathon wound up with this, on about the third attempt:

There are only 'skip'ped commits left to test.
The first bad commit could be any of:
We cannot bisect more!

So the first 9 of those 11 commits are a range of patches by Catalin Marinas committed on July 24. The other two are branch merges by Russell King (the arm maintainer), so shouldn't be introducing any new code.

Which means that the commit that introduced this bug should be one of Catalin's commits. According to the order "git log" shows commits in, those first nine are a contiguous block (of _course_ not in the order git bisect just dumped 'em in though, that would be too easy). The last commit before any of those was 05efde9d04ccc, which worked. The next one isn't by Catalin but by Hyok Choi of samsung, and it's introducing a new config symbol. There you go.

I'm not at all surprised by this. I actually suspected something like it, but it shouldn't be that much of a pain to bisect to find a simple tweak you need to make.

Ok, I need a config upgrade script. I need to expand a miniconfig against "stable", copy that .config to the alt- tree, do an oldconfig, and then compress it back down to a miniconfig... Hmmm, what UI should an automated script to do that have... Ok, that might work.
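A first stab at what that script might look like. (Sketch only: KCONFIG_ALLCONFIG is the kernel's real miniconfig-expansion hook, but "miniconfig.sh" is a placeholder name for whatever does the re-compression.)

```shell
# Hypothetical config upgrade helper: expand a miniconfig against the
# stable tree, port the result to the alt- tree, recompress.
upgrade_miniconfig() {
    mini="$1"; stable="$2"; alt="$3"
    # expand: allnoconfig seeded with the miniconfig yields a full .config
    (cd "$stable" && make allnoconfig KCONFIG_ALLCONFIG="$mini") &&
    cp "$stable/.config" "$alt/.config" &&
    # take the defaults for any symbols new to the alt- tree
    (cd "$alt" && yes "" | make oldconfig) &&
    # compress back down (miniconfig.sh is a stand-in name)
    miniconfig.sh "$alt/.config" > "$mini.new"
}
```

The UI question remains: probably three arguments, miniconfig file plus the two tree directories, like the function above takes.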

December 8, 2009

And lo, FWL 0.9.9 has shipped. (Using linux 2.6.31 rather than 2.6.32 because .32 still needs fixing.)

I spent bits of the weekend and the first part of this week tracking down what to do about the PowerPC pmac_zilog serial panic, and now I've checked in a one line fix. (It's a test for null, averting the null pointer dereference. Doesn't seem to hurt anything else.) Probably not the right fix (since it's in common code and not pmac_zilog.c), but I posted said one liner to the linuxppc-dev list for them to _properly_ fix it upstream, and it's small enough for me to deal with it myself in the meantime. Not _currently_ my problem. (I also posted the updated perl removal patches to linux-kernel, fairly _early_ in the merge window this time. Here's hoping.)

So why is git's user interface so horrible? Here's an example:

Now I'm trying to bisect why the arm targets no longer produce any serial console output in 2.6.32. The kind of random breakage you get each new release, only now in arm rather than ppc.

I start bisecting between 2.6.31 (good) and 2.6.32 (bad). I set up my build environment to yank all the patches I'm applying, and symlink perl from the build/host directory just so I don't have to deal with what applies to what. (I'm only running anyway, not trying to cross compile the root filesystem. No ./configures involved.)

The previous bisect hiccuped on some random file conflict, so I did a checkout -f and then restarted it. (git bisect reset, git bisect good, git bisect bad.) The first one is a git bisect skip, and the second is git bisect bad. And then it does this:

Bisecting: 3098 revisions left to test after this
error: Untracked working tree file 'arch/arm/mach-w90x900/mach-w90p910evb.c' would be overwritten by merge.

I happen to know you can fix that with "git clean -fdx", which I probably should have done last time (although nothing anywhere _hints_ that you're supposed to do this, they just expect you to memorize the magic incantation). But here's my question: what state is the thing in now? Do I need to do "git bisect bad" again? Will another checkout -f put me before or after the bisect bad? It gives me no hints what to do next. It just expects me to _know_.
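For my own future reference, here's the un-wedging sequence as I understand it (hedged: pieced together from memory, with "git bisect replay" doing the part git never tells you about, re-running a saved log so you don't lose your good/bad answers):

```shell
# Recover a bisect wedged by untracked/modified files, keeping the
# good/bad decisions made so far.
recover_bisect() {
    git bisect log > /tmp/bisect.log &&  # save the decisions made so far
    git clean -fdx &&                    # remove untracked files blocking checkout
    git checkout -f &&                   # throw away modified tracked files
    git bisect reset &&                  # back to the pre-bisect branch
    git bisect replay /tmp/bisect.log    # replay the saved decisions
}
```

None of which the tool ever suggests to you, of course.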

I do happen to know to do a "git bisect log", and I get this:

# good: [74fca6a42863ffacaf7ba6f1936a9f228950f657] Linux 2.6.31
git bisect good 74fca6a42863ffacaf7ba6f1936a9f228950f657
# bad: [22763c5cf3690a681551162c15d34d935308c8d7] Linux 2.6.32
git bisect bad 22763c5cf3690a681551162c15d34d935308c8d7
# skip: [73c583e4e2dd0fbbf2fafe0cc57ff75314fe72df] Merge branch 'omap-for-linus' of git://
git bisect skip 73c583e4e2dd0fbbf2fafe0cc57ff75314fe72df
# bad: [ae19ffbadc1b2100285a5b5b3d0a4e0a11390904] Merge branch 'master' into for-linus
git bisect bad ae19ffbadc1b2100285a5b5b3d0a4e0a11390904

Note that the comment lines repeat the information from the actual data lines, and of course all the identifiers are machine readable but mean absolutely nothing to humans. The "number of revisions left to test" data is not included in the comments, of course.

The common failure mode of code that's primarily used by the people who wrote it is that you have to understand how it was implemented in order to make it work: you must know how it works internally to command it. They _assume_ you know how to do this, and it never occurs to them to list your options for what to do next, because obviously you have them memorized (or can look in the source code).

This is not true of cars, phones, refrigerators, or any of the other modern marvels that achieve widespread acceptance. You don't have to know how a light bulb works to flip a switch.

Linux's 20th anniversary is in a couple years. Its desktop penetration is expected to remain below 1%. There is a reason for this.

December 7, 2009

Couldn't face my laptop yesterday, after being stuck with it in an airport for 24 hours of consecutive awakeness. Caught up on twitter and livejournal, then borrowed Fade's computer and played Sims 3 well into the morning.

Poking at the PMAC_ZILOG issue on PowerPC.

Dragon is at the vet. She has feline hepatic lipidosis, and treating her is expected to cost about as much as I paid for my car. How did cats ever survive in the wild?

December 6, 2009

At 4:30 security started letting people back into the area with the nice google beanbags and the promise of a food court opening at 5 am where I could get a beverage. Unfortunately, yesterday's boarding pass no longer worked, so the security people directed me to the Alaska desk where they assured me someone would be there.

And there was someone, at every other airline's booth. But the Alaska people didn't show up until an hour after everyone else (including again the Frontier booth next door). In the meantime, their computer insisted I wasn't booked on that flight, and would not issue a boarding pass. So I spent an hour standing there in front of the only unstaffed desk, holding an orange "security" pass that would let me skip about 3 minutes of screening line. Whee.

Apparently the boarding pass they gave me yesterday was a specially coded sort of scarlet letter, telling every Alaska employee who saw it that I was to be treated like a transgressor. The new one looked completely different, today went much better, and the planes they exist to fly around are relatively pleasant once you're on them.

For a second time my safety information card was stuck together by something spilled into its little plastic pocket. This time it wasn't still damp with chunks stuck to it, nor was there a puddle of it on the floor in the space I was expected to put my backpack into, which was an improvement over the flight out. (Instead it tore in half when I tried to open it.) Didn't care. Slept on the plane, and got something to drink when I woke up. Bliss.

Not flying that airline again voluntarily, though. Ever.

December 5, 2009

Stuck in an airport theatre presents: my day as described in the email I sent to the person in charge of Alaska Airlines' twitter account:

I normally fly SouthWest, and admit I have been spoiled by them.

Southwest sees its mission as getting people to their destinations. If they have to cram you into an overhead bin to get you there, they will. (Southwest is proof positive you can get the contents of a Greyhound bus airborne.) But by Pharamond they will GET you there, generally on time. It's what they do. It's what Southwest is _for_. (I once had a southwest employee take me in their _own_car_ to another nearby airport when a flight was overbooked.)

In contrast, legacy carriers see their business as flying planes around from place to place. Whether or not you're _on_ them isn't really their problem, they have more important things to worry about. They take your money, which pays to fly the plane, and everything else is unimportant. I had forgotten this.

Human resources people tend not to book on southwest. It's apparently some side effect of the fact that they're not set up to give discounts to travel agents. The price on southwest's website is the cheapest price they offer to anybody. (They might as well just offer it directly to customers.) So it wasn't too surprising when a resources person in Europe booked me on Alaska Air between Austin and San Jose, and I managed to get here ok. (Although it was the first time I've deplaned onto the tarmac in years. Reminded me of Kwajalein.) My flight back was a red eye this morning. I even went to bed early.

The problems started when the phone in my hotel room did not ring for the wake up call I'd requested. Instead my phone alarm woke me up, but it was still set at the time I had to get up at _yesterday_. AAAHHH!!!

Still, it was 8am and the plane took off at 8:55 and the airport's 20 minutes away. I wasn't checking any bags, wasn't returning a rental car, and I'd made tighter time crunches before. (On Southwest.)

I tried to call the lobby to get them to pre-order me a taxi: the phone can't call the lobby. It is completely dead. Great.

Reluctantly skip the shower, shove everything in my backpack, head downstairs (elevator, cannot make it go faster), had the front desk call a taxi for immediate pickup (and informed them about the broken phone, which they confirmed wasn't working before the taxi arrived). The nice taxi guy only took about 5 minutes to get there and then did 80 getting me to the airport. So I arrived at 8:30, meaning I still had 25 minutes left to make the flight.

The last time I had a similar problem was about five years ago, back when the post-911 security theatre frenzy was even worse. I arrived at the airport with 12 minutes before my flight took off, and Southwest got me on the plane with almost a full minute to spare. Admittedly that was _awesome_ of them (and started my loyal-customer-ness), but this time I had over twice that much time. It was tight, but I might still make it!

Run to the alaska check-in machines... which wouldn't let me check in for the flight. Doesn't acknowledge that flight number exists. What? Ok, run to the Alaska counter, stand in line to speak to the bureaucrat at the desk, and the lady there _spent_time_arguing_with_me_ that it was too late to get on the plane. But she eventually agreed to let me try since the plane had not, in fact, taken off yet. (My argument was something like "The plane is right over there, it hasn't left yet, can I try to get on it please? I just need a boarding pass.")

Run to the security theatre line, spent a dozen or so minutes getting through that. (Yes of course my bag was pulled out for manual screening. I was in a hurry. Same bag that I took to SJC, containing the same stuff that didn't get screened on the way here, but oh well.)

Anyway, I got out of security theatre. RUN to the gate. There were still 6 minutes left (according to the clock at the gate) until the scheduled take-off time of the flight. Victory! And... the desk has nobody at it? Over to the next gate's desk... Where is the plane? They tell me the plane is "already in the air".

At this point, I refrained from asking the bureaucrat at this new desk what the time listed on my reservation actually represents. (That's when it achieves cruising altitude? If I was taking a bus, train, or SouthWest, making it physically to the vehicle boarding point six minutes before the scheduled departure time would work. But apparently, not in this context.)

Raising that sort of issue is not helpful. I know never to debate policy with the worker bees in a bureaucracy.

So instead I asked her the standard bureaucracy navigation question: "What should I do now?" And here is the most important difference between Alaska and Southwest: she did not, in fact, care. The plane was in good hands, I could go hang.

She books me on tomorrow's version of the same flight, hands me the pass, and turns away from a job well done. I ask her "So I should wait here in the airport for 24 hours?", and she replied (and I quote), "I guess so", and turned away again. (Note: there was nobody in line behind me, she was turning to face the other person at the desk.)

I'm aware at this point that the lady wants me to go away, presumably because she thinks I've committed some kind of sin arriving at the gate only 6 minutes before the time listed on my flight information printout. She clearly feels that she is the wronged party, and keeps turning away from me at the slightest opportunity. (Ok, I know I had to skip my shower-and-shave this morning, and I probably have hilarious gravity-defying bed hair, but would a little eye contact kill you?)

Nevertheless, I persist for a few follow-up questions, which clearly annoy her:

1) Was there a later flight? No, there are no later flights, Alaska Airlines only has one flight to Austin per day.

2) Do they fly to San Antonio? Oh no, of course not. (She seemed to think I was crazy for asking.)

3) Anywhere else in Texas at all, from where I could get a bus...? Not today.

At this point the lady makes her first non-prompted suggestion: I could buy a ticket on another airline, with my own money. She makes sure to clarify this last part of her suggestion, which again is the first actual suggestion she's made about what I can do next.

I take the hint. I wander to a corner of the waiting area near an outlet, find a chair, and plug in my laptop. In a few minutes the sign changes to a Sacramento flight. After the Sacramento flight boards and departs, there's a shift change and a new lady at the desk. I have a number of questions the previous lady obviously wasn't interested in answering, and I try one on the new person:

Can I get a transfer through another airport, three or four hops even? She checks her computer: It's not even 10am yet and Alaska Air does not have any flights, from anywhere in the country, going to anywhere in Texas, until tomorrow. (Apparently travel on this airline offers no fallback plans whatsoever. Good to know.)

I go back and sit in my corner, but this lady (on her own initiative) starts working on my case. After about 15 minutes on the phone she rebooks me on another airline. (Yay!) She gives me a little printout thing on a boarding pass, which I gladly accept, thank her _profusely_, and follow her directions back out through security and to the Frontier Airlines desk next to the Alaska desk. At this point, despite how the morning has gone, I am _happy_. It is FIXED! I get to go HOME!

The woman at Frontier Airlines doesn't know what to do with the Alaska stub. That's fine, I've got an hour and a half until the flight, there is plenty of time. She suggests they might be able to accept my ticket at face value and let me pay the difference? Sure, I just want to get home, thank you. She takes it to the Alaska desk next door where she and two other people huddle over the computer.

At this point, a fourth person comes to the Alaska counter. It's the lady who wasted time arguing with me this morning while I was rushing to catch my flight. From over at the Frontier desk, over the noise of the people in line, I hear her yell "No, he was late! He agreed to go on tomorrow's flight! We're not doing that!". This is news to me: not having had caffeine yet, my contributions to our earlier interaction had been things like "Please, can you just let me go to the gate? I've still got X minutes", while looking at my watch (still on Texas time).

I go over to the Alaska counter, where the lady from this morning is busily un-fixing my issue. Apparently she had unilaterally attached conditions to allowing me to attempt to get to my flight before it took off, and was angry at me for altering a bargain I had not, in point of fact, noticed. (When wastes-time-with-arguments showed up, the Frontier lady went right back to her desk to serve the next people in line. I envy her escape.) So the lady from this morning returns things to her idea of the status quo (decided the moment she saw me, apparently) while trying to start a fresh argument about it (or perhaps continue the one from this morning).

At this point, I have meaningfully interacted with three Alaska Air employees. Two of them treated me like a species of bug that had done something nasty on the carpet, and the third (who actually tried to _help_) seems to have been cut off at the knees for having done so (I assume it's her being yelled at through the phone). Not wanting to get the only nice one in further trouble, I don't argue with the one in front of me as she switches my ticket back to what she feels is right and proper. Since she's already under the impression I must have agreed with her earlier, I now _do_ agree it must all be my mistake, and ask her the "So what do I do now" question, which is ignored.

I ask about a hotel. This gets a better response. She's obviously happy at the idea of me accepting my place, taking my punishment of waiting until tomorrow, and most importantly going away and not bothering her anymore. And thus she deigns to answer this one. She says that I can go to the rental car area and call a hotel from there, some of them have "distressed flyer" discount rates. She does not name a specific chain, despite repeated attempts at clarification.

At this point I decide that volunteering information is against company policy. She points me towards the baggage claim area. Our conversation was over.

I wasn't really interested in leaving for a hotel and potentially having this problem all over again tomorrow morning. I am at the airport. Leaving the airport is not progress. (Besides, my ability to get lost is _epic_, wandering around a city I don't know is not my idea of fun.)

I'm tired. I just want to go home.

I go back to wastes-time-with-arguments and ask her how I get to the SouthWest terminal. This elicits the first smile I've seen from an Alaska Airlines employee so far. Also the first truly helpful response. Dunno if she's happy at the idea of me paying twice for the same trip, or if it's merely the idea of losing me as a customer to another airline.

So I went over to Southwest to reassure myself that competent airlines do still exist. They were nice, they _cared_ (or faked it darn well), they listened to my sad story, and they had three more flights that day to Austin, but the best they could do was offer me a last minute ticket for about the same price Alaska costs with 2 weeks advance purchase. (No, this wasn't a special deal, this was the price on their website.) Called Fade and fiddled with my frequent flyer program to try to overcome the fact I'm 5 credits short of a free flight.

Alas, I really couldn't convince myself that my comfort was worth more than a new subnotebook, especially since I've got my laptop with me, free wireless, electrical outlets galore, half a tin of penguin mints, a food court (alas, at airport prices), and a flight I'm not paying for tomorrow morning. Refreshing as it was to hang around people who actually _wanted_ me to reach my destination, my journey had not been entrusted to their care this time around. (To both sides' clear regret.)

I eventually bit the bullet, went back, and sat at gate C10. Didn't interact with another Alaska Airlines employee for several hours, until the nice british lady working the gate for the incoming austin flight (around 5pm local time) struck up a conversation with me. (She's very nice, warning me that this area would close overnight around 11pm, and talking a bit about the weather. She's actually helping another customer too now, and being _helpful_. So the nice/nasty record is 2 and 2. Makes me feel better about the company in general, but I'm still sitting in Terminal C with no better option than to wait until morning.) And around 7pm there was a nice security guard, who let me stay in the semi-shuttered security area (the motion sensors had turned the lights off in the Alaska Airlines area around noon), until 11pm when they were "closing the gates" and I finally had to go out to the lobby.

Return trips have not been working out for me this weekend. I hope that returning home from the airport in Austin turns out ok. (Hey, I have a +6 familiarity modifier to navigation rolls made in Austin, _and_ the 100 flyer shuttle to UT from which I was planning to walk home anyway, living on west campus and all. Note to self: do not get hit by car or anything during this. Make sure bus does not explode.)

December 4, 2009

The Googleplex!

Wow that's a nice campus. With fun people in it.

(I note that getting lost and winding up walking a couple miles back to my hotel afterwards was entirely my fault. Well, nobody told me that the "hotel shuttle to local businesses" that got me there in the morning stops early on fridays. Still don't know where the bus stop I was looking for is, but I needed the exercise anyway, and didn't exactly have plans for the evening. Hotel. Bed early, red-eye home in the morning.)

December 3, 2009

Hanging out at the rental car place in San Jose waiting for Solar and Calculus (from freenode) to pick me up for dinner. (The hotel turns out to have a shuttle to businesses within 10 miles, so I don't need to rent a car, assuming I can get back to the airport on saturday.)

So Firmware Linux is currently creating two cross compilers. The first is a quick and dirty simple one called "cross-compiler" which has no thread support, incomplete c++ support, and is dynamically linked against the host's C library. But it's enough to build the root-filesystem, and to do distcc acceleration (for both C and C++).

The second cross compiler is called "cross-static". It has thread support, uClibc++, and it's statically linked against uClibc for portability. It's (optionally) produced as a convenience for those crazy people who want to cross compile extensive amounts of stuff.

There's little point in packaging up the simple cross compiler for the downloads/binaries directory. If you're downloading prebuilt binaries, the second one is more capable, more portable, and generally superior in all respects. It's also a lot more complicated to build, among other things requiring _TWO_ of the simple cross compilers to build with (one for the host, one for the target) using a modified canadian cross process to build uClibc binaries on a non-uClibc host. (The native compiler builds the same way, it just uses the same simple target compiler for both its "host" and "target".)
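For anyone (including future me) trying to picture the canadian cross: it's the case where gcc's three configure triplets all differ. A sketch with illustrative triplet names, not FWL's actual invocation:

```shell
# Canadian cross sketch: compile gcc ON x86_64, to RUN on an i686-uclibc
# host, PRODUCING armv5l binaries.  Assumes two simple cross compilers
# (one per --host/--target triplet) are already in $PATH.
cat > canadian-cross-sketch.sh << 'EOF'
#!/bin/sh
mkdir -p build-gcc && cd build-gcc &&
../gcc/configure --build=x86_64-unknown-linux-gnu \
                 --host=i686-uclibc-linux \
                 --target=armv5l-uclibc-linux &&
make && make install
# The native target compiler is the degenerate case:
# --host and --target are both armv5l-uclibc-linux.
EOF
chmod +x canadian-cross-sketch.sh
```

When --build and --host match you have an ordinary cross compiler; the full three-way split is what makes the static stage so much hairier to build.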

Thus the quick and dirty one isn't going away, since creating that is a necessary first step before creating the second more polished version. But the "cross-compiler" I pack up isn't the one created by "". That creates the _simple_ cross compiler. The full-featured one is actually created by with some evil build flags. Part of the Great Refactoring should allow me to untangle and rename this, but to what? Building a system image currently requires calling,, and, and that's a clear story. The more complicated cross compiler is a side branch.

Also, I'm building the native compiler twice, once as part of and once standalone. I should clean that up. Maybe the new toolchain builder script should be in sources/more? Except isn't currently calling anything out of sources/more...

Argh. Dismantling code that works because it's a MESS. Always fun. And figuring out the technical infrastructure is easy, figuring out what to call it so other people can intuitively understand it without needing a lot of documentation they'll never read is the hard part.

December 2, 2009

So I missed the deadline to get a release out at the end of november. At this point I'm aiming for monday, I gotta catch a plane tomorrow.

The ongoing Great Refactoring horked the static toolchain, but I figured out what I screwed up (moving the headers to the install location before building libsupc++.a means the build can't find those headers). I'm not quite sure why the native toolchain builds and the static toolchain doesn't. Investigation proceeds, to the point where I've got several different changes backed up and needing to be untangled so I can check them in individually. Such is life...

It is expected to snow in Austin on friday (during my time in California, of course). This is weird. Many years it doesn't even get cold until January.

December 1, 2009

Digging through some old files I found recordings of an old Linucon panel I never got to attend, called "This Topic Intentionally Left Blank" (part 1, part 2). This was Steve Jackson, Eric Raymond, and Howard Taylor put in a room for an hour to talk about whatever they felt like, back in 2004. (Note that Howard had just quit his day job at Novell to become a full-time webcomics guy about a month before this con, this was his first Guest of Honor spot.) Jay "the tron guy" Maynard was in the audience and participated a bit too.

Unfortunately, there's a lot of tape hiss nearly drowning out the participants (especially Steve, who speaks softly and was probably farthest away from the recorder), and what I really need to do is type up a transcript.

There's probably more of these buried in various backups...

November 30, 2009

I've been reading a few articles about Google's job interview process, first this one and more recently this one. The second of those reminded me of an old discussion I had with Eric back when we were working on The Art of Unix Programming.

Some programmers are mathematicians at heart, and others are engineers at heart. Eric is a mathematician (as is the guy who wrote the second of the above articles). I'm an engineer.

Deep down, the mathematicians seem to believe in perfect abstract algorithms, that flaws in programs are deviations from that platonic ideal, and problems are either properly solved or not really solved. The engineers think that programs are data that controls the operation of circuitry, that the algorithm is at best an approximation of what the machine's actually _doing_ (many other aspects of which might wind up being important), and that it's usually possible to improve on an existing solution.

Another way to look at it is that mathematicians are looking for truth, and engineers are looking for consistent behavior they can improve upon. Heuristics generally creep mathematicians out because they're not even _trying_ to achieve perfection, but they're just another tool in the engineer's toolbox which produce expected and controllable errors.

Good programmers of either flavor can see the other side's point of view, but it's not where their instincts lie.

That's why when I read "most things are graph theory if you look hard enough" in the above article, I thought "if all you have is a hammer, everything looks like a nail". That doesn't mean it isn't true, but I try to be agnostic about the solution before I understand the problem. Most problems can be solved in a half-dozen ways, all of which suck differently, and you min/max your way through it and either do the best you can by deadline or change direction when you get a better understanding of the _actual_ problem halfway through solving what you _thought_ the problem was. "I can solve this problem with graph theory" and "I can solve this problem with regexes" are both probably true.

November 29, 2009

I don't understand why anyone ever votes Republican. I don't get it.

Herbert Hoover was a Republican. He's the guy who put the "Great" in The Great Depression, and thus had Hoovervilles named after him. Joe McCarthy was a Republican. He led the most overwrought witch-hunts since Salem, and publicly equated being a Democrat with treason (yes really). Richard Nixon was a Republican. The whole Watergate scandal was about Nixon's cronies getting caught breaking into the offices of the Democratic National Committee.

How could anybody vote for these people? How could any political party survive that legacy? The guy who undid Hoover's damage was a Democrat. Bill Clinton balanced the budget and paid down some of the debt between the first and second Bushes' historic deficits. Kennedy simultaneously faced down the Russians in Cuba and launched the space program that got us to the moon _while_ paying down huge amounts of our leftover debt from World War II.

The guy the Republicans hold up as their current idol, Ronald Reagan, suffered from Alzheimer's while in office. He pursued communists with the same blind zeal Joe McCarthy had, and got himself hip deep in scandal because of it, and his _defense_ was Alzheimer's. Not that he hadn't done stuff, but that he didn't _remember_. What kind of stuff? Funding terrorist organizations to attack anybody we didn't like in South America or the Middle East; the same nutball terrorists we're fighting today are the ones WE trained. (Yes, that includes the Taliban, and Saddam Hussein.) Oh, and pursuing Reagan's agenda racked up trillions in debt, for the first time since World War II (including looting the Social Security trust fund), essentially inventing our current budget problems (which two Bushes exacerbated without even the excuse of facing down the Soviets, but it _started_ with the massive fiscal irresponsibility of the Reagan administration). That's their hero.

Who else is left? Harry Truman (Democrat, FDR's vice president) used the atomic bomb to end World War II and the Marshall Plan to clean up after it, and helped set up the United Nations and NATO to make sure it wouldn't happen _again_. His first executive order desegregated the military, his second made it illegal to discriminate in civil service hiring based on race. Truman was so successful that Republicans invoke his record all the time in their campaign speeches! Both the first and the second Bush repeatedly invoked Harry Truman in their speeches, despite voting against him, having Truman's relatives ask them to stop, and so on. Truman himself equated "Republican" with "rapscallion" (his word, if Wikipedia is to be believed).

Eisenhower was a Republican, elected because he was a World War II general, and like Grant before him his administration was full of corruption (Nixon was his Vice President, and McCarthy rose to power on his watch unopposed). But the guy himself was more defined by his military career than his political party, and also had his hands full cleaning up after World War II.

And of course Carter shows the main failure mode of the democrats: their ability to be massively ineffective despite the best of intentions. (The current group has the presidency, large majorities in the house and the senate, and they still can't get anything done. Of course when the positions were reversed, they couldn't prevent the other side from passing their entire agendas virtually unopposed.)

I've said for years, the democrats are incompetent and the republicans are evil. But one side means well, and the other is doing things I'd go to great lengths to _prevent_. I can see wanting to replace most Democrats with somebody with a spine, but what I don't understand is why anybody would vote _for_ a Republican.

Sigh. Remind me to make an rss feed generator that takes the span tags into account, so people who don't want the "politics" topic (or who only want the "programming" topic) can filter. For that matter, possibly I should set up a real blog somewhere for the FWL semi-news posts that are just going to the mailing list at the moment.

November 28, 2009

Yay, Andre Ruiz got back to me with a contract to work on a board support package, somewhere between 50 and 100 hours of work total. Assuming the details work out, that nicely fills up december.

Working on granularity in FWL. Breaking up the various sources/sections scripts so each one handles exactly one package, and moving more package builds into sources/sections.

I introduced sources/sections to factor out common code, and if some package builds live there they might as well all live there (so they're all in one place). Again, the age-old tension between being explicit and not repeating yourself...

While I'm at it, time to audit sources/functions and see if some of those functions are specific to FWL (read_arch_dir) and some are more generic (download). Possibly split that file into two files? Have to work out what's best. I've already got some out-of-tree users of this code, and just discussed some more use cases. Trying to make it easy, the question is how?

November 27, 2009

So far the best I've come up with for beating the path to the standard headers out of gcc is:

echo '#include <stdio.h>' | gcc -E - | sed -n 's@.*"\(.*\)/stdio\.h".*@\1@p;T;q'

Poking at the sources/sections refactoring. Lots of today was spent frowning at why a gcc ./configure section was insisting on using /lib/cpp instead of "i686-cc -E", until I figured out it wanted to use "i686-c++ -E" (Um, ok?) and that I'd broken the C++ link names. (Horrible error reporting in this stuff. Horrible.)

I think for Christmas, I'd like noise cancelling bluetooth headphones. If I had bluetooth headphones, I could get up and pace while listening to stuff. (Ok, I should also reinstall the software to update the contents of the ipod nano, which would also let me pace.)

November 26, 2009

Starbucks is open until 4 on thanksgiving! Yay!

Today's "OS/2 could do that in 1994, why can't XFCE?" is that I can't change the font size of a terminal window. I can change it for _all_ terminal windows, but I can't go "this terminal window is monitoring a build process, shrink it to 6 point font size and stick it in the corner set to always on top". (Of course you can do this with kde, etc. But not with xfce.)

Picking away at the sources/sections refactoring. I've mostly gotten binutils/gcc/ccwrap broken up into individual scripts, although making them reasonably orthogonal is a challenge. For one thing, gcc is always going to depend on binutils. For another ccwrap expects a very particular layout of the binutils and gcc output, and the logical thing to do is have the first two scripts install things where ccwrap expects them, rather than having ccwrap move them from one place to another. (Because otherwise ccwrap would just expect them in that _other_ place, which is silly.)

After I get that working again, the next step is to make ccwrap take an existing functional toolchain and wrap it. This is different from wrapping the output of a fresh unpatched binutils/gcc build, because that toolchain doesn't _work_. If I ask this gcc "where is your compiler include directory", it can't actually _find_ it. A functional toolchain could respond to "gcc --print-file-name=include" with a path. Adding the wrapper is what makes the FWL toolchain functional.

But if I have an existing functional toolchain, then I should be able to beat the six paths out of it. For compiler libraries, print-file-name of "crtbegin.o" and "crt1.o" and grab all the ".o" files out of those directories. Do the same for "libgcc.a", "libgcc_eh.a", and "" and grab all the .a and .so files. Then "libc.a" and "" into the non-cc lib directory.

I can also do "gcc --print-search-dirs", which gives "install", "programs", and "libraries" directories (remember to chop off the pointless prepended = sign from paths that start with that). This might be useful to find the linker and such to populate the "tools" directory.
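The "=" chopping can be a sed one-liner. Here's a sketch run against a canned sample line, since real --print-search-dirs output varies per toolchain (the paths below are placeholders):

```shell
# Parse one line of `gcc --print-search-dirs` output: strip the label,
# chop the pointless leading "=", then split the list on colons.
# The sample line is canned; real output differs per toolchain.
line='libraries: =/usr/lib/gcc/i686-linux-gnu/4.3/:/usr/lib/:/lib/'
libdirs="$(echo "$line" | sed 's/^[a-z]*: *//; s/^=//' | tr ':' '\n')"
echo "$libdirs"
```

That leaves one directory per line, ready for a for loop to walk copying .a/.so files.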

But how do I find the system headers? It's not in --print-search-dirs, the --print-file-name won't find stdio.h...

echo '#include <stdio.h>' | gcc -E - | sed -n 's@.*"\(.*\)/stdio\.h".*@\1@p;T;q'

November 25, 2009

So the big todo item is a new ccwrap setup script. The current one is the last third of sources/sections/, and that needs to be broken up anyway. Thus I'm splitting sources/sections/ into individual package builds, and trying to figure out how to handle the fact that some of them need "setupfor" and "cleanup", and some of them don't. (For example, the ccwrap build just uses a single .c file out of sources/toys, and thus hasn't got an associated tarball.)

Coming up with _a_ solution is easy. Doing nothing, and having each script start with setupfor and cleanup when it needs it, is the easy answer. But it means there's a bit of repeated code in almost all section scripts, and it seems like that would be easy to factor out and have build_section do it for you. But then you have to specify when _not_ to do it, and the problem is it's done before the script launches so specifying it _in_ the script is awkward. My options are grep the script for a marker before running it, or giving the scripts different names.

This is the hard kind of design problem because all three could work just fine, it's a question of which option is least ugly. Having plumbing like setupfor and cleanup calls repeated in each caller is ugly, because it's easy to screw up just because you've got to get it right in so many places, and if it ever needs to be changed it's a pain to track down and change every occurrence. But it makes explicit what's happening. Having it happen automatically means there's magic behavior you can't see unless you look somewhere _else_, which is bad.

I can try to mitigate the magicness by making it clear what's going on. Perhaps the different names could be meaningful, something like ccwrap.nosetup? (For that matter, the "" names are currently somewhat misleading because they're not scripts you can run standalone, they don't include their own prerequisites to define variables and functions. Actually the whole build_section handoff is more magic than I like right now. If the magic was centralized to there so that's the one place you had to look and then you understood a lot, that would be one thing...)
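A sketch of how the name-based opt-out might look, with everything below (the function bodies, the ".nosetup" suffix itself) purely hypothetical:

```shell
# Dispatch sketch: build_section does the setupfor/cleanup plumbing
# unless the section's name carries a hypothetical ".nosetup" suffix.
# The setupfor/cleanup bodies are stubs; only the dispatch is the point.
setupfor() { echo "setupfor $1"; }
cleanup()  { echo "cleanup $1"; }

build_section() {
  local name="$1"
  if [ "${name%.nosetup}" = "$name" ]; then
    setupfor "$name"
    echo "running $name"
    cleanup "$name"
  else
    echo "running ${name%.nosetup}"   # script handles its own setup, if any
  fi
}

build_section binutils         # wrapped in setupfor/cleanup
build_section ccwrap.nosetup   # prints: running ccwrap
```

The appeal of the suffix over grepping for a marker is that the opt-out is visible in an ls listing; the downside is the magic still lives somewhere other than the script itself.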

I admit this instance is just a minor irritant, but the tension between "make it explicit what's happening" and "factor out common code" has been a fundamental design problem all through this project. Figuring out how to satisfy both is always the hard part.

For the moment I can punt, leaving the setupfor/cleanup in place in each script and worrying about potentially genericizing it later. Right now, genericizing the ccwrap script is fiddly enough. I want it to be able to wrap the output of the binutils/gcc build stages, but also wrap an existing cross compiler with a known prefix in the $PATH. Since the unwrapped gcc from my build doesn't know where anything is (that's what the wrapper's for), those scripts should install their output in the right place and the wrapper should detect if components are already where it expects them and not wrap in that case.

But what counts as "already installed"? The prefix-cc gets renamed to prefix-rawcc, and the wrapper dropped in the old place. Should that renaming be done by the ccwrap script, or by the compiler script installing its output?

Hmmm... has to be the compiler script. If ccwrap checks for anything _other_ than rawcc, then it doesn't know if the rename has already been done or not, so running it twice would cause "issues".

November 24, 2009

I can extract more parallelism from the FWL build scripts. The ./ stage can be parallelized, including --extract, although that means teaching setupfor's extract to work in parallel which has always been fiddly. (The main problem with creating a $PID directory to do the temporary extract in is cleaning up half-finished residue from earlier interrupted extracts. Hmmm. Probably a trap EXIT to clean up half-finished tarball extraction?)
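A minimal sketch of that trap EXIT idea (every path and name below is made up, not FWL's actual layout): extract into a private PID-suffixed directory, rename into place only on success, and let the trap sweep up anything half-finished.

```shell
# Sketch of interrupt-safe extraction for parallel builds: work in a
# $PID directory so concurrent extracts don't collide, and only mv the
# result into place once the tarball fully extracted. Names are made up.
extract_package() {
  local tarball="$1" dest="$2" tmp="$2.extract.$$"
  trap "rm -rf '$tmp'" EXIT    # interrupted? residue gets cleaned on exit
  mkdir -p "$tmp" &&
  tar -xf "$tarball" -C "$tmp" &&
  mv "$tmp" "$dest"            # $dest only appears when extraction finished
}

# Demo: build a tiny tarball and extract it.
mkdir -p pkg/src && echo 'int main(void){return 0;}' > pkg/src/hello.c
tar -cf pkg.tar -C pkg src
extract_package pkg.tar build/src-armv5l
ls build/src-armv5l/src   # prints: hello.c
```

An EXIT trap doesn't fire on kill -9, so a later run would still want to sweep stale "*.extract.*" directories, but the recognizable name makes that sweep safe.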

The other thing is moving the rest of the package builds to sources/sections and then running "maybe_fork build_section" as appropriate. Most of and can run in parallel. The host-utils build waits for the busybox build so it can adjust the executable search $PATH, and then everything else can build in parallel. The build waits for kernel headers and uClibc so it can adjust the header/library search path, and the uClibc++ build has to wait for the glibc build to finish because of an incestuous link between those. But I don't think gcc has to wait for binutils (due to the canadian cross), and the busybox, make, bash, and distcc builds should be entirely orthogonal. (Modulo that business with the /bin/sh symlink between busybox and bash, but I think I fixed that with a config change...)

Anyway, bisecting the uClibc breakage. Now that NPTL has been merged (ignore the obsolete "master" branch, the current nptl branch now has a proper history and thus supersedes that branch), I'm trying to debug what's wrong with it. I'm attempting to build all my old architectures using the old threading library and the same basic configs that worked with, and of course everything's broken.

Bisecting to find the breakage in the current version runs into the fact that I can't get a clean build from any of the intermediate versions either, just for different reasons. (The tree has been crap for ages.) So when git bisect comes up with a version, is it "good" or "bad" or is the unrelated bug merely hiding the one I'm looking for? So I bisect to find where each specific bug was fixed and collect patches that fixed 'em.

So I bisected the "undefined NULL" bug to where it was fixed at git 6d3ed00a41a948. Doing a fresh bisect, the next one I hit was the "conflicting types for getline" bug. While looking for where that got fixed, I saw the bug where it tries to use fcntl64 and can't find it, and the fun little bug where ldconfig.c explodes with an unexpected something before 'err'.

I'm trying to bisect one bug! There are so many things _wrong_ with this tree that A) I can't isolate the bug, B) I can't get a _good_ build since the last release to compare against.

This is not a healthy development process.

November 23, 2009

Pondering an "expect" implementation for toybox. (Something written in C, not in tcl.)

When you look at "expect" scripts:

set timeout 30

expect "login:"
send "username"
expect "Password:"
send "password"


It looks like all you need to do is write a command line utility that eats data until it matches a regex (and listens to an environment variable specifying a timeout), and then pipe data through a shell script. The "send" command is essentially just echo.

There are two problems with this. First, switching the tty in and out of "raw" mode (so you can match incomplete lines) is a bit of a pain when you keep relaunching commands. If you leave it in raw mode when your command exits, you inconvenience the next command, but if you don't, input clumps weirdly.

The more serious problem is you need a circular pipe. The input of "expect" and output of "send" are some other program (through a pipe, character device, etc.), which the expect script is processing. If you run that process, and pipe its output into the expect script, you then need to pipe the output of the expect script _back_ into the first process.

This means you need to run it as a child process of something _other_ than a shell, because shells don't do that. Some versions of the netcat command do (toybox netcat -l), so I could extend netcat, but then using "expect" as a normal command run from the shell doesn't work, because the shell isn't in control anymore...

Cue the Marvin the Martian voice, "Design, design..."
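For what it's worth, a shell script _can_ fake the circular plumbing with named pipes, it's just too clunky to be the answer. A sketch, using a read/echo loop as a stand-in for the program being controlled (all names hypothetical):

```shell
# Circular pipe via fifos: the script both feeds the controlled process
# and reads its output, deciding what to send next based on what came
# back. The while loop stands in for the real program under control.
mkfifo to_prog from_prog

(while read line; do echo "got: $line"; done < to_prog > from_prog) &

exec 3> to_prog 4< from_prog   # hold both fifo ends open in this shell
echo "username" >&3            # "send"
read reply <&4                 # "expect": wait for a response
echo "password" >&3            # react to what came back
read reply2 <&4
exec 3>&- 4<&-                 # closing the write end ends the loop
rm to_prog from_prog

echo "$reply $reply2"          # prints: got: username got: password
```

Having to juggle explicit fds and fifo files is exactly the awkwardness that makes a dedicated C tool (which can just pipe()/fork() both directions itself) look attractive.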

November 21, 2009

Ok, looking at the todo list from last week: Coleman's patch got integrated, I started a FAQ, and I analyzed Grzegorz's issue and added it to the FAQ.

My vague poking at the hw-uml target has turned from adding futimes() to uClibc to fixing User Mode Linux to use futimens() instead. (Work in progress. I posted my futimes() patch to the uClibc list and it reignited the perennial "the uClibc development process is moribund" flamewar. Patching linux looks _way_ more attractive just now...)

Ben Zanin converted the presentation slides to text (all 260 of them), and cleaned up the HTML a bit. I added an index and trimmed a couple things, and put it up. Yay! (The pdf version is in the downloads directory now.)

Jean's ld search path issue and Vladimir's redone snapshots page are still pending.

At the moment, I'm working on Marc's arm-eabi issue. It turns out the problem he was seeing was an issue with his build scripts, which are based on top of the FWL scripts but extend them to do extensive cross compiling. (I had a "Wow, that's impressive! Why?" moment.)

The problem is he builds a new uClibc and adds it to the library path with -L, which doesn't remove the _old_ one from the linker path, and then the two fight.

The reason he used -L is A) It sort of worked, sometimes, when he was lucky, in the right light, with a tailwind. B) It was easy to do. But my solution is to use a wrapper script (sources/toys/ccwrap.c) which removes the existing headers and libraries from the $PATH and builds a whole new set of paths from scratch. (You can't make gcc do this itself, gcc is totally incompetent when it comes to path logic. You can make it add stuff to its $PATHs, but the only way to get it to _remove_ stuff is to tell it to blank them and start over.)
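As a sketch of what "blank them and start over" means on the resulting command line: the flags below are real gcc options, but every path is a hypothetical placeholder for wherever the wrapped toolchain actually lives.

```shell
# Assemble the kind of command line a wrapper passes to the renamed
# "raw" compiler: wipe the built-in search paths, then re-add exactly
# one known-good set. All paths here are placeholders.
TOP="/home/user/cross/armv5l"                  # hypothetical top directory
args="-nostdinc -isystem $TOP/include"         # no built-in header paths
args="$args -nostdlib -L$TOP/lib"              # no built-in library paths
args="$args $TOP/lib/crt1.o $TOP/lib/crti.o"   # start files, spelled out
echo rawcc "$args"
```

With everything explicit like this, a stale -L from the environment can't fight a new one, because there's nothing left over for it to fight with.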

For a while I've needed to write a script to wrap an existing toolchain, building a copy of the wrapper and constructing a new wrapper directory to put it in. There are six main paths that compilers look at:

The only one that gcc _doesn't_ usually screw up and hardwire in strange behavior for is #6. (You'd think #5 would just use $PATH, but no.) So the wrapper has to force all of this, and it does so by assuming that the wrapper itself is installed at ~/bin for some ~, and then appends ../lib and such as needed to construct the other paths.

The wrapper has a WRAPPER_TOPDIR directory that lets you explicitly feed it a different value for ~, and Marc suggested I give him more granular environment variables so he could tell it where to find the things it's not finding. Unfortunately, if you start specifying this stuff yourself, you have to specify at least five different paths, and that's assuming that Path #4 above is where the compiler's "crtbegin.o" and friends live, and that path #3 is where the C library's "crtn.o" and friends live. (This is where my setup puts them, but in the wild they can wind up all over the place what with /lib and /usr/lib and /usr/lib/gcc/x86_64-linux-gnu/4.3/ and so on. On Ubuntu, /usr/lib/gcc/x86_64-linux-gnu/4.3/ is a symlink to /lib/, so the name the linker is looking for isn't in _any_ of those locations. Wheee...)

So the fix to Marc's problem is to do a "" that takes an existing compiler and sets it up with ccwrap via lots and lots of symlinks. There's some design work to do, though...

November 19, 2009

Adrienne asked why I don't like the new "retweet mangling" misfeature of twitter, and since there's no possible way to express the horror in 140 characters...

If twitter added a button that quoted a tweet (to save a cut-and-paste), and enabled the same link to original tweet that "reply" had, that would be great. That's not what they did.

Instead, they made it so that it looks like tweets are showing up from people whose feeds you're not subscribed to. "Who the heck is this person and why should I care what they have to say?" You have to look out of line, at small print, to see who you _are_ subscribed to who actually posted this. (There's a funky "recycle" icon in front of the tweet, which is light grey on a white background, looks a bit like an @ sign, and is barely noticeable when you're _looking_ for it.)

This is not an improvement over the conventional RT behavior where "@whoiknow RT @newperson" is right at the start of the message where I expect it, and the relationship is very clear.

Context is important with retweets: if @neilhimself is retweeting someone it's most likely a literary reference and probably interesting. If @simonpegg is retweeting somebody it's usually some local brit I've never heard of, or an actor on a show I never saw, and I'm unlikely to care about it. One of those is consistently worth a read, one of those is consistently barely worth a glance, and lumping them together is WRONG.

I'm bad with names. Every time I see a name in my list that I don't immediately recognize I have to stop and think "did I subscribe to this person and they just haven't posted in a while? Did I misread the name? Did some spammer crack into my follow list?" It totally breaks the flow, and turns twitter from something I can quickly scan and only have to stop and think about if a tweet is actually _worth_ thinking about, to something that regularly grabs my attention to resolve problems with no payoff. This change makes it harder to evaluate my feed. It becomes _work_. Half the _point_ of twitter is it's quick and non-intrusive.

So the old way of presenting retweets is superior to the new way, I greatly prefer the old way, the change is pure loss... but there's no way to disable it. I can't opt out. I CAN'T MAKE IT STOP. It would be trivial to make these show up as "@personiknow RT @personidunno" but they won't do it, they think their new presentation format is unquestionably superior, and they're going to cram it down my throat whether I want it or not. I would rather block _all_ retweets than have to deal with them in the new format, but they don't give me that option.

Add in the fact that the "retweet beta" introduced itself with pop-ups, to make sure I had the worst _possible_ first impression, and I couldn't opt out, and I couldn't figure out how to even send them feedback about it until Mark pointed out the link on the right (under "direct messages" and "favorites")...

And that's why I'm looking for a non-web UI for twitter. They broke the web UI. "Kill it with fire."

November 18, 2009

Hmmm, an email I got this morning points out that I need to write more documentation. My reply was:

On Wednesday 18 November 2009 08:43:41 drsunbo wrote:
> Hi, Rob,
>  I Have a router which with a busybox 0.61 pre fireware embeded.
>  I want to upgrade it to 1.* such hight version, But I don't know how to do
> . Could you help me something or give me some related links?Thanks . ^_^
> Yours, BOBS

The first problem is that busybox is just one part of your router firmware.

What you have in your router is a small Linux system, with three major components:

1) The Linux kernel

Usually a 2.4 or 2.6 version, "cat /proc/version" to see which.

2) A C library

Usually either glibc or uClibc.  Do an "ls -l /lib/libc.so*": an means 
it's uClibc, and an means it's glibc.  What that symlink points to 
should tell you the version number of the corresponding package.

3) BusyBox.  (Which you know.)

Busybox provides the posix command line utilities, your libc provides the 
system call interface, and your kernel handles resource management and 
interfacing with the hardware.

The second problem is that you need to be able to build code for your target 
hardware.  Your router's processor probably isn't an x86 chip.  It could be 
arm, mips, powerpc, or something else entirely.  You have to find out what, and 
then get an appropriately targeted cross compiler.  If you "cat /proc/cpuinfo" 
it should tell you what processor it's using.  (The uname -m command would 
also say.)

You might be able to get a cross compiler that can use your existing linux 
kernel and C library headers, but that's a fairly precise configuration.  If 
you can't, you'll have to replace the C library and the Linux kernel when you 
replace busybox.

The next question is 'what filesystem format is the router's root partition 
in"?  "cat /proc/mounts" should tell you that one.  You'll also need to figure 
out "how do I write to the router's flash"?  (Does it have a bootloader you can 
get a serial console for?  Or do you have a jtag?  This is another hardware-
specific question, it varies per router.)

Probably the best place to go to ask these sort of questions is the #edev 
channel on  That's where a bunch of embedded developers hang 
out.  There's also a mailing list, see for subscription information.


I suppose I could have pointed him to the OLF presentation (which Ben Zanin converted to html for me! Yay! I need to get those up), but it doesn't exactly answer his question. It doesn't even include the Introduction to Cross Compiling document I wrote way back.

Gotta integrate all this stuff into one big lesson plan, and then break it down into individual lessons that provide small enough chunks to deal with...

November 17, 2009

Ripping apart the website so it's stylesheet based instead of wordpress. Much construction dust at the moment, and possibly for the next couple days as well.

The Gentoo From Scratch page needs updating, its navigation was dependent on the wordpress stuff, and I dunno how to put that back. I'm reading the html source that wordpress used to put out, and it's full of javascript and css (included from directories that don't actually seem to exist, some kind of apache rewrite going on there).

For the moment I just put a really simple index.html up directing to the FWL, GFS, and toybox pages. But only one of those is actually navigable at the moment. Still, better than a raw directory listing. :)

Updated the nav bars of FWL to provide more info. Trying to figure out why the cross-static compiler is having build issues the native compiler isn't having. (Basically trying to deal with todo items faster than they accumulate.)

November 15, 2009

So Grzegorz's issue is that when you type "qemu" on ubuntu when it's not installed, it says:

The program 'qemu' is currently not installed.  You can install it by typing:
sudo apt-get install qemu-kvm
qemu: command not found

Except that the qemu-kvm package only includes the i386 and x86_64 targets, not any of the others (arm, mips, ppc, sh4...). It installs the man pages for the other targets, but not the actual executables.

Heh. If you run qemu inside qemu, both tell you to hit ctrl-alt to exit the mouse capture. (Luckily, the external one is transmitted to the internal one as well, so you can click on the outer one to give it focus again but then have your mouse back in the internal one.)

Ubuntu's "qemu" package is just an alias for "qemu-kvm". The package you need to install to get qemu-system-mips and friends is "qemu-kvm-extras", which is crazy because these are pure qemu packages that have nothing to do with kvm. The Ubuntu package managers are nuts here, this package name is actively misleading.

Ok, time to start a FAQ. I dunno how "frequently" these questions are asked, but they _were_ asked and I haven't got a better place to put this info, so...

November 14, 2009

Ok, what are my current todo items...

Marc's arm-eabi issue (the __aeabi_unwind_cpp_pr0 thing) is still outstanding. That one's complicated. (Well, the fix might be simple but understanding it enough to figure out what's _wrong_ isn't.)

Jean's ld search path issue is still outstanding, but I need more info from him before I can address it.

Vladimir submitted a redone snapshots page, based on stylesheet instead of tables. He's got some ideas that could significantly improve the thing, but since this is largely cosmetic I'm holding off until I get the first two fielded.

Grzegorz's issue: write up documentation on installing qemu, with a note about kvm providing the host qemu only. (I'm actually installing xubuntu 9.10 i386 in a qemu image so I can test installing the debian kvm package and then installing qemu over it.)

Coleman's issue is probably also documentation: he's trying to patch uClibc to backport shm_open from uClibc-git, and the patch isn't taking effect. (Did he put it in sources/patches? Dunno.)

Possibly I need a FAQ. The _volume_ of documentation I've already got is enough that people don't read it. The hard parts are keeping it up to date, and properly indexing it so you can find the answers to questions.

On an unrelated issue, I started doing a script to download the HTML version of the OLF presentation and do better navigation for it... And the first slide was 1.6 megabytes. There are 260 of these! Downloading all the slides is most of a CD image! For text slides with a background!

So yeah, looks like what I should do is just retype the text on those slides...

So today's accomplishments: make a big todo list of the stuff I didn't get any significant work done on, and then waste hours fiddling with slides that are already online. Wheee...

November 13, 2009

Twitter is definitely making my blogging somewhat intermittent. :)

So after shipping the 0.9.8 release, I got feedback that the c++ links were horked (turned into absolute host paths) by the Great Refactoring, which was easy enough to fix, and that arm eabi has something subtly wrong with it, which isn't so easy to fix.

The report of subtly wrong was while cross compiling dropbear, but I also ran into it while building current uClibc-git:

path-path-path/gcc/lib/libgcc.a(_divdi3.o):(.ARM.exidx+0x0): undefined reference to `__aeabi_unwind_cpp_pr0'
path-path-path/gcc/lib/libgcc.a(_udivdi3.o):(.ARM.exidx+0x0): undefined reference to `__aeabi_unwind_cpp_pr0'

I'm having trouble reproducing this in a hello world program. The _divdi3.o stuff _looks_ like the support code for doing 64 bit math on a 32 bit host, although maybe it's something looking for a soft-float support function when it should be using VFP? Except that I tried a half-dozen different division variants, and couldn't get the build to fail. (Unless gcc's optimizer is considering the return code of printf("hello\n") a constant and optimizing out the division via constant propagation even without any optimizer options? Maybe?)

At first I was blaming ccwrap, thinking it wasn't sucking in libgcc_eh.a properly (don't those missing functions live there?), but that turns out not to be the case. The library is getting linked in, and according to readelf -a, libgcc_eh.a seems to include an implementation of __aeabi_unwind_cpp_pr0, but it's marked hidden?

Relocation section '.rel.text' at offset 0x51cc contains 37 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
000002fc  00002c18 R_ARM_GOTOFF32    00000cb4   __aeabi_unwind_cpp_pr0

Symbol table '.symtab' contains 71 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
    44: 00000cb4     8 FUNC    GLOBAL HIDDEN    1 __aeabi_unwind_cpp_pr0


Google found a message saying this bug manifests when an EABI toolchain tries to use "-mabi=apcs-gnu" instead of "-mabi=aapcs-linux". So I'd like to confirm I'm using the right one of those, except I'm not actually passing those flags in at compile time, they come from those funky spec files where gcc invented its own language for no readily apparent reason. (I note that "gcc --dumpspecs" says "no input files"; you have to say "gcc -dumpspecs" with a single dash for it to be recognized. Ah, gcc...)

So what _should_ it look like? The above message links to Debian's EABI page, which describes the two ABIs thusly:



So it should be getting -mabi=aapcs-linux, -mfloat-abi=soft, and -meabi=4.

This page explains that -mfloat-abi=soft doesn't mean we're _not_ using Vector Floating Point (VFP), it just means that the compiler uses the same ABI for hardware and software floating point. (Possibly this means wrapper functions around the floating point instructions? Weird.)

So anyway, the output of gcc -v isn't feeding any -mabi option to collect2/ld, and the only specs snippet that seems relevant is:


Which looks wrong, and confusing. (The | means it'll accept two different ones and what the heck _is_ atpcs?) This string seems to come from gcc-core/gcc/config/arm/bpabi.h:

/* Tell the assembler to build BPABI binaries.  */
#define SUBTARGET_EXTRA_ASM_SPEC "%{mabi=apcs-gnu|mabi=atpcs:-meabi=gnu;:-meabi=4}"

How the HECK is that file getting #included? (What _is_ BPABI, anyway? The comments at the top of the file don't expand the acronym. It could be "base port", or it could be some specific system. I note that nothing anywhere seems to actually _include_ bpabi.h, so this gets sucked in by magic build infrastructure somehow, somewhere. Have I mentioned recently that I DESPISE the gcc build infrastructure?)

Hmmm, only gcc/config/arm has bpabi, so it's not like each architecture has a "base" file... But yes, this seems to be a base file for arm, because gcc/config/arm/linux-eabi.h says:

/* At this point, bpabi.h will have clobbered LINK_SPEC.  We want to
   use the GNU/Linux version, not the generic BPABI version.  */

How it got #included, I have no idea, but this gets #included after it and is already overriding bits of it for other stuff.

Ok, what _should_ SUBTARGET_EXTRA_ASM_SPEC look like? The acknowledged experts in this area are the codesourcery guys, so download their current prebuilt i686-hosted toolchain, make sure it can build hello world...

$ ./arm-none-eabi-gcc hello.c
/home/landley/firmware/arm-2009q3/bin/../lib/gcc/arm-none-eabi/4.4.1/../../../../arm-none-eabi/bin/ld: warning: cannot find entry symbol _start; defaulting to 00008018
/home/landley/firmware/arm-2009q3/bin/../lib/gcc/arm-none-eabi/4.4.1/../../../../arm-none-eabi/lib/libc.a(lib_a-sbrkr.o): In function `_sbrk_r':
sbrkr.c:(.text+0x18): undefined reference to `_sbrk'

At a guess, that needs to be installed at a magic absolute path, or needs a special --sysroot command line argument or some such. I suppose the docs would tell me, but all I really need is dumpspecs...

%{mabi=apcs-gnu|mabi=atpcs:-meabi=gnu;:-meabi=5} %{mcpu=arm8|mcpu=arm810|mcpu=strongarm*|march=armv4:--fix-v4bx}

Ok, I officially don't understand what's going on here. The Debian page is saying one thing, the code sourcery implementation is saying another. And thus we're back to brute force: take apart the build in question to see what circumstances reproduce the problem (and that the symbol being hidden is the root of the problem) and then trace through the gcc build and try to figure out WHY this symbol is hidden.

November 10, 2009

More or less finished the screenshots page. Looks reasonable-ish. I still need to fill in everything but ARM for architectures, and I need to finish history.html, and completely redo documentation, but hey: progress.

I'm also banging on sparc, which currently isn't booting with uClibc-git. No idea why.

November 8, 2009

So vapier removed the readelf.c Erik Andersen wrote for uClibc almost a decade ago, apparently because he couldn't be bothered to understand it. So I grabbed a copy from the last release, spent the 5 minutes necessary to clean it up so it built under glibc, and checked it in.

Now that I've gotten the 0.9.8 release cut and uploaded (if not yet the website changed so you can actually _tell_), I'm taking a break and banging on this readelf implementation.

So what you do is you run readelf against a binary, and it tells you what its headers say, ala:

Type:		EXEC (Executable file)
Machine:	AMD x86-64 architecture
Class:		ELF64
Data:		2's complement, little endian
Version:	1 (current)
OS/ABI:		UNIX - System V
ABI Version:	0
Interpreter:	/lib64/

It's got code to automatically detect the endianness... but not 32/64 bit word size. For that, it segfaults, which is bad.
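A minimal sketch of the missing check (the constant names come from the standard elf.h; the helper function is mine): look at the e_ident bytes before deciding which header struct to cast to, and bail on anything unexpected instead of segfaulting.

```c
#include <elf.h>

/* Return ELFCLASS32, ELFCLASS64, or -1 for a file that isn't ELF at
   all, based on the 16 identification bytes at the start of the file. */
int elf_class(const unsigned char *ident)
{
    if (ident[EI_MAG0] != ELFMAG0 || ident[EI_MAG1] != ELFMAG1 ||
        ident[EI_MAG2] != ELFMAG2 || ident[EI_MAG3] != ELFMAG3)
        return -1;
    return ident[EI_CLASS];
}
```

With that answer in hand the parser can pick Elf32_Ehdr or Elf64_Ehdr instead of blindly assuming one of them.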

The other problem is the function describe_elf_hdr(), which is pages and pages of this:

    switch (ehdr->e_machine) {
        case EM_NONE:        tmp="No machine"; break;
        case EM_M32:        tmp="AT&T WE 32100"; break;
        case EM_SPARC:        tmp="SUN SPARC"; break;
        case EM_386:        tmp="Intel 80386"; break;
        case EM_68K:        tmp="Motorola m68k family"; break;
        case EM_88K:        tmp="Motorola m88k family"; break;
//        case EM_486:        tmp="Intel 80486"; break;
        case EM_860:        tmp="Intel 80860"; break;

Those EM_BLAH constants live in a header "elf.h", which uClibc supplies a copy of, and glibc supplies its own copy of.

Why is EM_486 commented out? Because glibc's elf.h hasn't got it. Even though this is presumably a standard file listing constants out of the ELF standard, the two have fallen out of sync. (i486 binaries aren't specifically tagged as such in the ELF header; it's "i386" from the 80386 until you switch to 64-bit with the Athlon. Early on some people did tag i486 binaries, but did that mean with or without a math coprocessor? The practice fell out of use in the face of pentium, pentium 2, pentium 3, with or without mmx or mmx2 or 3dnow or... The ELF header just isn't that detailed.)

So uClibc's elf.h still has the old EM_486 entry (which you basically never encounter binaries for anymore), and glibc just dropped it and skipped that number.
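One way to paper over that kind of drift (a sketch, not the actual readelf.c fix; the value 6 comes from the old uClibc header, and machine_name() is my stand-in for describe_elf_hdr()'s big switch) is to re-supply any constants the host's elf.h dropped:

```c
#include <elf.h>

/* glibc's elf.h skipped this number; old uClibc defined EM_486 as 6. */
#ifndef EM_486
#define EM_486 6
#endif

const char *machine_name(int em)
{
    switch (em) {
        case EM_386: return "Intel 80386";
        case EM_486: return "Intel 80486";
        default:     return "unknown machine";
    }
}
```

The #ifndef means the same source compiles against either libc's headers without caring which one got to define the constant first.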

Note that the uClibc readelf.c hasn't built against glibc for years, due to this kind of mismatch (and some strange makefile magic that meant they didn't bother to #include the headers they needed at the top). What dropping "readelf.c" from uClibc implies is Mike couldn't be bothered to keep uClibc's _own_ elf.h in sync with its own readelf.c. That was too much effort.


November 7, 2009

So the release notes are finished, the repository is tagged, I've got about half the screenshots page done, and attempting to go to "" gets redirected to "". (The mirror on my site is still there so works, except that now the header and footer #includes have been taken out, that mirror has no navigation cues to get from page to page at all.)

I'm also adding an architectures.html page, and who knows what next time? Hmmm. Gotta get together with Mark and redo the site navigation entirely. Among other things, I'd like the left-side menu back. (Progressive disclosure is great if you want to hide information, but I'd rather not hide it just now. If anything I want it _more_ on display.)

Oh well, upload it and clean it up tomorrow. Only a little more than three weeks until I wanted to do the next release anyway, gotta get a move on as they say...

November 6, 2009

Working on the new snapshots directory, and it's slow going. The only sane way I know how to get it to look remotely how I want is tables, which don't wrap based on your screen size. (I suppose I could just have the whole thing stacked vertically with the links on the right side of the screenshot instead of under it? Hmmm...)

As always, writing documentation raises design questions. Next release should leverage the BINARY_PACKAGE_TARBALLS stuff to have lots of more granular tarballs, so should the root-filesystem tarballs still be the same as the system image tarballs, just packaged differently? Or should I not _have_ root filesystem tarballs anymore, and just include directions on how to assemble one by extracting the other tarballs?

The existence of the native-compiler tarballs suggests that root-filesystem should _not_ have a toolchain in it, but should just be the minimal base system you can extract the native-compiler into to get what we package up into the development system images. Except that sounded good back when root-filesystem could just be the minimal dynamically linked uClibc+busybox stuff, and then busybox switched to statically linked by default, so now this "minimal root filesystem" would just be the busybox tarball produced by BINARY_PACKAGE_TARBALLS. (Well, not _quite_ true. There's the empty directories (/tmp, /proc, /mnt...), and the contents of sources/native (the new /etc/passwd, /sbin/

It's a granularity thing. Right now, doesn't include things like "make" and "bash", meaning turning a minimal environment into a development environment takes more than the native compiler tarball, which implies the native compiler tarball as currently packaged is sort of useless. I need to fix that. How?

Possibly once I've got all the granular per-package tarballs, I should then write a HOWTO on assembling them into a system image. (It's almost "extract them all into the same directory", except that some of them assume things live under "/usr" and some don't. The ones that don't need to be extracted _into_ /usr. Is a HOWTO overkill?)

But having native-compiler as an analog to cross-compiler is appealing. And having a minimal chroot environment (which would be the sources/native stuff plus empty directories, and the reason the empty directories aren't just in sources/native is that mercurial only tracks files, not directories) would be nice. And the current root-filesystem tarball has users, people who chroot into a uClibc development environment instead of running a system image under an emulator...

Design issues! Always time consuming...

Anyway, this release is an interim release precisely so I can get something out and worry about this stuff afterwards, so I just need to finish up the release notes and the screenshots page, and write all this other complication _down_ so I can deal with it later. (You get a simple result by working _through_ all the complicated bits. It doesn't mean the path to _get_ there was simple.)

Hey, and Ubuntu lost its marbles again:

[18916.659218] iwl3945: Microcode SW error detected.  Restarting 0x82000008.
[18916.659231] iwl3945: Error Reply type 0x00000005 cmd REPLY_TX (0x1C) seq 0x0000 ser 0x0000004B
[18916.660006] iwl3945: Can't stop Rx DMA.

So my network is dead, no packets getting routed through the bricked wireless card. Ubuntu 8.10 used to let me rmmod and insmod to get my network back, but 9.04 will kernel panic if I do that. So I have to reboot again. And the sad part is I just had to reboot last night because of the X11 hang, which is a _separate_ bug but one that's bitten me something like a dozen times this month.

(So after years of mocking Windows, Linux needs daily reboots to keep a desktop system running. Our big argument in _favor_ of Linux is better code quality, and 18 years into Linux development every new release is still full of _regressions_ from the previous release. That's kind of sad.)

November 5, 2009

Last week I had a phone interview with a manager at Google Europe, and today I had a phone interview with a Google engineer (although he was in Oregon, not Europe).

So I've got all the binaries I want to release, but writing the release announcement turns out to be kind of time consuming. (Muchly documentation needed.)

This is basically a new entry in news.html, which involves going through hg commits 810 through 876. As usual, I keep thinking I haven't done much until I try to document it, and then there's so much stuff it gets lost in the noise. (New bug tracker with roadmap! IRC channel moving! Prebuilt static dropbear, strace, and busybox binaries! New sources/more and sources/sections directories! BINARY_PACKAGE_TARBALLS option. New /dev/hdc with /mnt/! Fixed powerpc and sh4 to run with stock qemu! New armv4tl-eabi target! Now with more exclamation points!)

Another thing is that Mark made the web page lots prettier, but I can't personally navigate it anymore. There's no mention there _are_ additional pages in the FWL project unless you notice two things that I didn't (Mark told me):

In addition, I have no idea how to _modify_ this setup. Now that the nav bar went away, there's no longer any mention of an IRC channel, or the RSS feeds hg produces for commits and releases. (And I'm adding a "history.html" page, which isn't done yet. And now I'm adding a screenshots directory.) How would I add any of those? Ask Mark to do it, I guess. Except if I didn't notice this hidden pulldown menu at the top, how do I expect anybody else to?

The new layout is _pretty_, but I personally have trouble navigating it, let alone modifying it. Not sure what to do about that. I _like_ the pretty, but...

Also, Mark got a new job. (The economy sucks, we haven't had enough clients to keep the company going as more than a part-time thing. I'm interviewing elsewhere myself. This is a great hobby project, but in a recession nobody's willing to pay for what they can get for free.) So I'm not sure how much time or interest Mark is going to have in managing the website once his new job gets up to speed and starts eating his time and energy. (I suppose the same is true for me, one of the Oxford gigs effectively mothballed this project for months because I just didn't have the energy to work on it when I got home from work. *shrug*)

Oh, I'd also like the news.html page to be more than _just_ release announcements. Possibly the release announcements need their own page? I've been posting [LINK] (oct 30, Oct 26, Oct 24, Oct 15, Oct 7 twice, Oct 3, Oct 1, August 26, August 21, August 20 (twice)) general status updates to the list. Those should really be in some kind of news blog with an rss feed.

This blog has a lot of me _working_ on stuff, but it's not a summary of _results_. Even if you focus on just the "span id=programming" stuff in this blog (someday I need to update the rss feed generator to be able to select just material in specific span tags), lots of it is ruminations about directions I don't take, vague blue sky plans I won't get to for months, debugging that results in a three line fix but takes weeks to work out. What the project needs in a news page is actual _news_.

I suppose what I really want to do in such a news thingy is have posts to the mailing list that _also_ get posted to a web page (and sent out to an rss feed), but which you can then go to the mailing list to reply to if you want. (I suppose I could set up a second announcements only mailing list and cc: that, but that seems kind of silly. These aren't necessarily "announcements" of any import, that would be the news.html page we've got now. They're news blog postings along the lines of "what's new this week".)

Eh, I'll figure something out.

November 4, 2009

Putting together a release. This script renames the cross-static tarballs to cross-compiler, including redoing the directory prefix in the tarball:

for i in ../cross-static-*.tar.bz2
do
  X="$(echo $i | sed 's@^../cross-static-\(.*\)\.tar\.bz2@\1@')"
  tar xvjf $i
  mv cross-static-$X cross-compiler-$X
  tar cvjf ../cross-compiler-$X.tar.bz2 cross-compiler-$X
  rm -rf cross-compiler-$X
done

I ran a buildall on securitybreach, and afterwards smoketest-all said:

Testing armv4eb:FAIL
Testing armv4l:PASS
Testing armv4tl:PASS
Testing armv5l:PASS
Testing armv6l:PASS
Testing i586:PASS
Testing i686:PASS
Testing mips:PASS
Testing mipsel:PASS
Testing powerpc:PASS
Testing sh4:FAIL
Testing sparc:FAIL
Testing x86_64:PASS

The sh4 test fails due to the serial port thing (that's a case of the test failing, not necessarily the system image or emulator failing). But the interesting thing is that sources/targets has 15 non-hw targets, and that's only 13. The m68k and powerpc-440 targets aren't building a system image.

I should upgrade smoketest to add a "MISSING" category, but Mark's mostly written a new expect based version, so I'll wait for that to go in. (Yes, the new version is written in Python, but I have different portability standards for a test harness than for the actual build. Especially since the test it's running is so easy to run by hand.)

November 2, 2009

So the reason the uClibc-git build has been dying all this time is that vapier decided readelf.c was just too much effort to maintain, and yanked it. Sigh. Forked a version to maintain myself, but I'm actually thinking this sucker might be best done in lua. (The bulk of it is the big long list of EM_BLAH values and corresponding strings, possibly it should just parse the comment fields from the /usr/include/elf.h header.) Eh, we'll see.

And now the i686 build is segfaulting. All right, what's going on... It's the static host toolchain, linked against uClibc-git and really not working... Because ccwrap.c is using the malloc functionality of realpath(), and despite arguing about it for a week they haven't actually applied the patch to the repository yet, and I didn't make an alt-symlink for it because I thought they had.

I expected "Ew, that's disgusting that the PATH_MAX=4096 size limit is still there even in the dynamic memory allocation version, you should fix it _properly_ so it can handle unlimited length paths which is the whole point of being able to pass in NULL and have it malloc the right length". I didn't expect a week of arguing about how to micro-optimize the hack. (THIS PATCH IS QUICK AND DIRTY. That's part of its _nature_. That's part of what it tries to _accomplish_. It's not an implementation detail, it's a "we've been missing this functionality for years and now SUSv4 requires it, this at least lets programs limp by". And yet, they can argue micro-optimizing the hack for days. I question the priorities here, but don't really want to argue with Mike about it.)

Darn it, ported the patch (yes, ported due to gratuitous churn du jour, in this case they re-wordwrapped the comments before the function so I had to fix it so the patch applied to the start of the function)... and it's still segfaulting. Diagnose it in the morning...

October 31, 2009

And xubuntu had the hang again, forcing me to reboot. It does this so often I don't always bother to mention it.

On a related note, Firefox is kind of stupid. It remembers the 8 gazillion tabs you had open before the last crash, but won't let you access or edit the list. All it lets you do is "would you like to re-open all these tabs now, or discard them forever, and no you _can't_ launch a separate instance of the browser even to look at something local without answering this question". When the current hotel wireless access has a login screen (itself stupid, but that's a side issue), this can get really unpleasant. Especially if xubuntu's wireless manager "helpfully" remembered the wireless association and brought it up when you thought the laptop was offline so they'd all redirect to "could not access the network" tabs.

In theory, if you're fast enough you can do a ctrl-alt-backspace to kill the whole desktop when you see it doing this. Except Xubuntu 9.04 seems to have removed this key binding; it no longer kills X11.

It's not that it's broken. It's not even that this OS is now 20 years old and still riddled with breakage. It's that things that _used_ to work regularly stop working. There's no positive direction, it's just circling the drain.

I have the xubuntu-9.10 iso downloaded and sitting there, and when I get home from Oni-con I need to install it... but I expect it to break at least as much as it fixes, because they always do. (And this is _my_ fault for not running the betas and experiencing daily _expected_ breakage on my laptop. I still need to buy a Mac.)

Speaking of which, xubuntu popped up a window saying my laptop battery's down to 47% at maximum charge, which is defective. This one lasted a lot longer than the previous nine cell battery did, but it's still getting into the "put more money into it, or buy a mac" decision time...

Spent the day at Oni-con with Fade and Beth, stalking seeing Randy Milholland and general anime-con-ness. Meant to get back in time for Halloween on 6th street, but it's coming up on 10pm and we're eating dinner before heading out, and it's a 3 hour drive. We'll get back around 1:30 am, not sure we'll be up for much by that point.

At a nice Empanada store. If to empower someone is to give them power, then presumably to empanda someone is to give them a panda. No wonder pandas are going extinct, if they're that tasty. They also have a machine that can turn an entire potato into one big long spiral french fry, which is quite impressive.

October 30, 2009

Drove to Oni-con with Fade and Beth. (Mark decided against coming, we meet up with him for trick-or-treating on 6th street saturday night.) Got here after all the interesting panels, but early enough to catch Randy Milholland's "Bunnies and Burrows" game.

We not only got the halloween candy the king wanted (most of which _wasn't_ covered in blood) but after the monster had finished off all the local humans we A) set it on fire with the gasoline, B) hit it with the grenade, C) hit it with the rat poison, D) had poison ivy in the fire so it breathed the smoke, E) the police who'd showed up to investigate the grisly murders saw the burning monster and shot it repeatedly. Pretty much all at once. (The frightening part? This is what I was _trying_ to do. It came together more or less by accident, but it was my announced plan and none of the party members died pulling it off! Although my rabbit was down to 2 hit points left at the end there.)

Yes, this is a Gurps game loosely based on Watership Down. No, he didn't mix in the Cthulhu gurps rulebook this time.

Good times, good times...

October 28, 2009

Went out with Beth to see the Toy Story double feature. (Fade was going to come but we missed the 3pm start time and she couldn't come to the 7:25 showing.)

It's kind of odd how primitive the Toy Story 1 graphics look by modern standards. The story holds up just fine (and it's hard to top "Wind the frog!" as a battle cry), but Toy Story 2 is actually the better movie.

Looking forward to the third movie.

The reason distcc was failing is that QEMU 0.11.0 switched its default network adapter type (for x86 and x86-64) to the intel gigabit ethernet thingy, which the kernel .config didn't have a driver for. Easy enough to fix... once you know what the darn problem is.

Went ahead and checked in the unfinished history thing just to get the hg diff noise down to a dull roar.

October 27, 2009

Heh, remember how I contacted Google Europe three months ago? Well, they got back to me, and want to schedule a phone call...

I'm trying to finish the little Firmware Linux History document I've been writing so it doesn't keep showing up as pages of noise every time I do an "hg diff", and that prompted me to track down the old series of initramfs columns I wrote back when I was at TimeSys. (I even managed to dig up the third one they never published; I'm not actually sure I completed it. I can think of material I'd want to add.) Anyway, they're mirrored here now.

The uClibc config rewrite mostly worked, except some platforms have DOPIC=y set and i686 builds fine but can't compile anything natively when that's set, and I don't know why. Oh, and I haven't written the USE_UNSTABLE support yet so my in-progress sparc tests of the current uClibc-git can't progress until I get that back in...

The "UCLIBC_HAS_MMU" and "UCLIBC_USE_MMU" dichotomy remains insane but the guy who wrote it defends it, and since A) he hates me, B) I think he has horrible technical judgement (I suspect the two are related, yes), it seems unlikely to change any time soon.

Ok, I'm officially sick of distcc:

  CC      procps/top.o
distcc[13524] (dcc_build_somewhere) Warning: failed to distribute, running locally instead
  CC      libbb/wfopen.o
distcc[13532] (dcc_build_somewhere) Warning: failed to distribute, running locally instead

WHY IS IT FAILING? Give me a HINT, guys? Was the network not set up right? Could distccd on the host not find the right gcc binary? What actually went WRONG?

And while you're at it, DON'T RUN LOCALLY. _FAIL_. This is an error, you stop working now because this should not happen and it needs to be FIXED.

October 26, 2009

Drew Cohan pointed out that ctrl-r will redo in vim, which is really cool to know. (Several ubuntu versions ago I'd looked up a vim redo command and it popped up a "this vim is compiled without redo support" message, apparently that's changed.)

(Christian Holtje suggested capital U, but that one doesn't seem to work.)

Didn't get the job I interviewed for last tuesday, but they say they might be interested in contract work.

In my career I've only really had three "permanent" positions that didn't start with a built-in timer. The first one (IBM) I left partly because I was fresh out of college and wanted to see the world, but mostly because OS/2 was dying and I couldn't get into Java or Linux development (except as a tester) unless I left IBM to do it. (I needed to wait 6 months and 18 months, respectively, for IBM to get religion on those two topics. Oh well.)

WebOffice I rode down into the ground, even going to half-time at half pay to stay on a bit longer in hopes of making it work, but their problems weren't something software could fix. And TimeSys I stayed through four managers, two CEOs, and longer than 90% of the other engineers.

It's not that I necessarily want contracting more than full time work, just that my skillset seems to be aimed in that direction. When things go well, I fix people's problems and then they don't need me anymore. (At the other end are the classic "consulting" gigs where their in-house efforts screwed things up so badly that they couldn't cope and called in an outsider to clean up the mess, or at least take the blame for it. A lot of times that's just a way of delaying making a hard decision or facing an unpleasant reality. Some consultants milk those for years but I've never had it in me to prolong a situation just because it pays well.)

Started collating together the uClibc configs for the various targets. I'm making one big baseconfig-uClibc common to all architectures, and adding the 2-3 lines unique to each one to the settings file of each target, so it can be appended and then the result expanded into a full-sized .config.

Small and straightforward enough that I should be able to get it stable in time for the release at the end of the month...

October 25, 2009

It's ironic that the vim "undo" functionality is the most dangerous feature in the program in terms of data loss. If you accidentally wander from insert mode into edit mode and hit u (a vowel), entire paragraphs of recently written text can suddenly go away... and there's no way to get it back. (There is a "redo". It's not implemented.)

The reason yesterday's fun little debugging session went so long is that distcc is fiddly. The new stuff doesn't manually take a path argument to where your cross compiler lives, instead it searches the path for $ARCH-cc and sets up its own set of names for it distcc can use. Specifically, distcc wants unprefixed names for all the tools, so the new code searches for $ARCH-cc out of your $PATH (the "which" command again), figures out which directory it lives in, gets a list of all the tools with $ARCH- prefixes in that directory, and populates a directory of unprefixed symlinks to them, which it then points the distccd $PATH at.

Churning through all the variants of "these symlinks aren't being made right", we then get to "the symlink is fine, it just won't run because I've violated the expectations of ccwrap.c".

The point of the wrapper is to rewrite all the paths to point to header and library directories relative to where the executable lives. But this new directory of symlinks lives somewhere _else_, so rewriting all those paths relative to the new symlinks will give nonsense. I need to resolve the symlinks until I find out what each one actually points to, and then make the paths relative to _that_.

The libc function to do this is "realpath()", but that function is deeply flawed in that its second argument (the new buffer to copy data into) hasn't got a length argument, and a path could be arbitrarily long. Once upon a time there was PATH_MAX, but these days that's only advisory.

The Linux guys fixed this long ago in libc5/glibc so passing in NULL for the destination buffer would malloc() a new one of the appropriate size, without which this function is insecure now that PATH_MAX is obsolete. Unfortunately, uClibc hasn't caught up.

So I sat down and spent the 5 minutes to come up with a quick uClibc patch to make the malloc behavior work, even if it is just a quick hack that does an alloca() of a PATH_MAX sized buffer and then strdup()s it on the way out. (The returned result can't be longer than PATH_MAX, but the bounds checking for that was already there.)

(What surprises me is that in the past 5 years nobody _else_ has bothered to do this. Oh well.)

I then spent a while debugging _that_, because although it seemed to work fine at first it turned out the new codepath wasn't triggering because the compiler was optimizing out the null test. (Wha...?) After much head scratching I traced this down to the function prototype, which had this strange little __nonnull((2)) annotation on it to let the compiler know this argument could never be null, although obviously it can. (At first I thought declaring it char blah[] instead of char *blah was doing that, but changing that didn't fix it. No, it's a funky __gcc_extension. Right.)
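A minimal illustration of the gotcha (my own toy function, not the uClibc code): with a nonnull annotation on the prototype, gcc is entitled to assume the argument is never NULL and quietly delete the in-function test at higher optimization levels.

```c
#include <string.h>

/* The annotation promises gcc that s is never NULL... */
int safe_len(const char *s) __attribute__((nonnull(1)));

int safe_len(const char *s)
{
    /* ...so at -O2 gcc may optimize this branch away entirely,
       exactly the way the realpath() null test vanished. */
    if (!s)
        return -1;
    return (int)strlen(s);
}
```

With non-NULL arguments it behaves as written; the surprise only shows up when you rely on the null check the annotation told the compiler was impossible.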

It's also strange that:

$DO_DEBUG PATH=/blah/blah:$PATH thingy

tries to run PATH=... instead of running thingy. Even when $DO_DEBUG isn't set and thus goes away, the resulting command line doesn't do the "set variables only for this run of the program" thing unless you remove the variable expansion, meaning you can't set it to "echo" to display the relevant commands instead of executing them. (Enabling tracing mode in bash isn't nearly so useful because it spits out 20 lines of garbage calculating port numbers and creating symlinks and so on, not the actual meat of what the script is doing.)

Must fight with that later...

October 24, 2009

Took advantage of the marvelous weather and biked to chick-fil-a, where I spent most of the day implementing and debugging the new ./ script plumbing. It's touched the top level (most of which went away, it's trivial now), and (which got largely rewritten), and I added a new adjacent to in each

I'm sometimes ambivalent about UI issues because every option sucks. Way back when I had three scripts which called each other and which you ran depending on what you wanted. (This script sets up distcc and calls this script which adds a home directory which calls this script to actually run the emulator.) I threw that out and smashed it all together into a single script with a bunch of command line options, which didn't actually turn out to be noticeably simpler.

It's also touched sbin/ inside the emulator, although that's mostly better status reporting, and it also makes sure that /home is writeable space (mounting a tmpfs on it if you haven't got an hdb). Although if hda is writeable, that's not really necessarily ideal behavior either... (As I said, "ambivalent". Checking for and handling a half-dozen different special cases is too complicated and makes the scripts brittle and hard to read, but _not_ doing it means they won't always do the right thing. Lots of sitting down going "ok, what do I want to accomplish and what steps cover the most ground for the least complexity". Yes, I often spend two weeks pondering before writing five lines of code, because otherwise they aren't the RIGHT five lines. Actually they usually still aren't, but they suck less.)

Met a blind guy on the way home, who was standing very still in the middle of the sidewalk facing a building, and then started slowly inching his way up a garage parking ramp. His name was Raymond, and it turns out he'd locked himself out of his place until morning (when the manager gets in), and was trying to make it to 6th street where he knew of a 24-hour convenience store. (This was 11th and Lavaca, and he was facing east.) His closest relatives live in San Antonio and Waco, so there wasn't anybody he really wanted to call on my cell phone, he was just waiting until morning.

Locked up my bike and led him to a coffee shop that was open another hour, then went home and got my car and drove him to a Whataburger on the bus route that would take him home. Got home around 2am.

Distcc is fiddly. The reason yesterday's fun little debugging session

Sat down and spent 5 minutes

October 23, 2009

Largeish ripping-a-new-one of the code underway. It's always been easy to build the appropriate target and then use, but it's been a bit of a pain to download the right system-image (and cross-compiler) tarball and hook them up in the right way to get a development environment. The command line for that is too long and fiddly and not properly documented.

So the NEW way to do it is to download system-image-$ARCH.tar.bz2 and extract it and run out of that. That gives you the 256 megs of ram and 2 gigabyte hdb.img. If you want the distcc accelerator trick, you also need to download cross-compiler-$ARCH.tar.bz2 and add its bin directory to your $PATH. (Or extract that tarball into the system-image-$ARCH directory the first tarball created, or into your home directory. It'll automatically pick it up from either location even if it isn't in the $PATH.)

So I'm redoing the script so it sets up distcc based on the $PATH, and adding the new wrapper script that defaults to 256 megs of memory and creates a 2 gigabyte hdb.img if it isn't there.

I'm also removing all the old command line arguments that used to just set environment variables. It's just as easy to set the environment variables directly.

The script also announces when it DIDN'T find the components for the distcc accelerator trick, and why. (Either it couldn't find distcc, or it couldn't find $ARCH-cc in the $PATH.) It should also indicate whether or not it's using distcc.
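Roughly the shape that detection takes, I think (the messages and variable names here are mine, not what the script actually prints): both pieces have to be findable or the trick is off, and each missing piece gets its own complaint.

```shell
# Sketch of PATH-based distcc accelerator detection. ARCH is a hypothetical
# target; the real script derives it from the system image.
ARCH=armv5l
if command -v distcc >/dev/null 2>&1 && command -v "${ARCH}-cc" >/dev/null 2>&1
then
  echo "Distcc acceleration enabled for $ARCH."
else
  # Complain separately about each missing component so the user knows why.
  command -v distcc >/dev/null 2>&1 || echo "No distcc in PATH, acceleration disabled."
  command -v "${ARCH}-cc" >/dev/null 2>&1 || echo "No ${ARCH}-cc in PATH, acceleration disabled."
fi
```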

(It would be nice if there was some kind of distcc smoke test it could do. I've had way too many "it thinks distcc is working, but it isn't really" occurrences. The static native build logs are a good way of noticing next time I break that, but it would be nice to get something properly automated.)

Heh. Over in another corner of the project, the reason that uClibc-git on sparc wasn't behaving like I expected is that the file you put in sources/targets/sparc is "miniconfig-alt-uClibc", not "alt-miniconfig-uClibc". Oops, my bad. That _would_ explain why the symbols in there weren't taking effect. :)

October 23, 2009

Sigh. 5pm came and went, no contact from Tuesday's job interview. Darn it, that was a really good one.

I think a better way to do the --with-distcc stuff would just be to grab the appropriate $ARCH-prefix compiler if it's in the $PATH. Then the script could just adjust the $PATH rather than passing in a magic command line option.

User Interface questions are always fiddly, there's lots of things that could work and you have to try to figure out what's easy to understand and remember. This has the advantage that you can set it up and have it work automatically. I should have it spit out a line saying whether or not it found the compiler, so you have some feedback. (It's not like there's a shortage of kernel boot messages at the moment anyway, although a --quiet option might not go amiss...)

October 22, 2009

I really hope the nice place I interviewed at on Tuesday hires me. I really don't like job hunting, and something stable would be really nice. (I was hoping Impact would be able to provide stability, or at least continuity of health insurance. But it's just more endless job hunting for the next contract.)

Last time I chose "stable, long-term position I should be able to spend 5 years at" as my primary job search criteria, and I got to watch Timesys melt out from under me. I went through four managers (the fourth of which I had to train) and two CEOs, and watched 3/4 of the engineering department leave without being replaced until I finally gave up and followed them out the door because I didn't even know what the company did anymore. (Answer: take a fresh infusion of venture capital and hire a bunch of recent college graduates. It sold expertise.)

But I still want stability. I get it from open source projects, and I've been living in the same condo since 2003. But Fade's got a family reunion coming up in January: Will I have time off? Can we afford to go?

It'd be nice to be able to make plans...

Haven't heard back from the PPC guys. (Well, benh said he'd look when he got back from Japan in a week, and Alan Cox's email address from the patch bounced.) He's at Intel now, but is he still interested in fallout from his TTY patches?

Last time I dug into the tty layer it took 3 days. Maybe I'll bang on it this weekend if nobody else gets to it before then.

Found my copy of "Wolf who Rules" and spent most of the day reading that.

Stressed. Hate job hunting. Got approached by two Indian companies (one pretending to be in New Jersey, the other pretending to be in Los Angeles) acting as recruiters for the Texas Water Board. Mark got approached by _four_ different recruiters for the same position.

It looks like a nice job, but I didn't apply for it and that's quite a feeding frenzy over one job. (Which might be three positions, but that's still more recruiters than positions.) And they want me to fax them a signature before they'll submit my resume, which sets off my "phishing" sense somewhat, especially since they approached _me_ rather than the other way around...

Wound up "not pursuing this opportunity", and feeling bad about that too. I should be job hunting more.

At least the place on Tuesday was only interviewing two other people for the position. (And this one's a nice enough one it would suck to _not_ get it, which is probably why I'm stressed.) They said they'd let me know on either Friday or Monday. I have various questions I should probably email them, such as what's the dress code like (seemed like shirt with collar and black jeans would pass?) and what's the health insurance like... But at this point either they're going to hire me (in which case I'll live with whatever their defaults are) or they won't (in which case it won't matter).

I have a book. It's a good book. I go read that. No programming today.

October 21, 2009

So the powerpc thing is somewhere between 2.6.28 (which works) and 2.6.29 (which panics as soon as interrupts are enabled). Unfortunately, in between is a long stretch of null pointer dereferences in the IDE driver, and after calling "git bisect skip" over two dozen times it's still noodling around in there.

Why can't git bisect skip chip off 1/3 from either end and look _there_? Testing commits adjacent to the one that didn't work is highly likely to find another one that also doesn't work for the same reason. Testing 25 adjacent commits in a range of several thousand is ludicrous. Pick a point 1/3 from either end instead of halfway between them, which should at least get you away from the error, and then if you continue bisecting from there it should introduce enough jitter to bypass it entirely if it isn't a particularly huge range. But having to rebuild dozens of commits is not helpful.

Also, when I "git bisect skip", the number of commits to test goes down by one. Honest and truly it does, because I TESTED that commit. The results may have been inconclusive, but that's not my fault.

Found it. The powerpc serial problem I reported turns out to have been introduced by commit f751928e0ddf54ea4fe5546f35e99efc5b5d9938 written by Alan Cox. If I check that out it fails, but if I revert the patch "git show" gives me, it then works.

If I revert that patch i686 builds and runs fine, but I haven't gotten powerpc building in-situ properly with the reverse of that patch in FWL yet. (I think I broke the .config going from 2.6.28 to 2.6.31.) Oh well, deal with that in a bit. First I need to report it to #mklinux on freenode and linuxppc-dev at (And no, google can't easily find either of those.)

October 20, 2009

Job interview today. Hope I get it, seems like a really nice company.

Interview went well but 2 other people are also interviewing for the position. They should let me know friday or monday. (Still better than the national average, which is six candidates chasing every job opening. I do not like this "recession" thing.)

October 19, 2009

And Eric went public.

Beth totally saved his ass...

Wrote up the whole story (from my point of view) and emailed it to Beth. If she wants to make it public, that's her choice. The balance of power's back in her court, where it belongs. (Set metaphor to "frappe"!)

So the 2.6.24 kernel doesn't have the powerpc serial panic, but 2.6.31 does. Time to bisect.

October 18, 2009

Laptop had the X11 hang again. Essentially lost the day's work in the reboot. (Not all the work, but context and what I was doing was all in the open windows.)

Reading my Lua book. (I've got a job interview tuesday morning for a company that A) looks cool, B) is local, C) uses Lua.) Yay! An excuse to give it the time needed to learn it properly.

What impresses me about this language is how stunningly minimalistic it manages to be while still accomplishing everything it needs to. It's so SIMPLE. For example, the book points out that "A and B or C" is almost equivalent to C's "A ? B : C". (The and gets evaluated before the or, so it becomes (A and B) or C. If A is true B gets evaluated. If A is false B doesn't get evaluated, but C does.)

The corner case is that if B is false, C still gets evaluated even when A was true; but in Lua "0" isn't false. Only nil and the special boolean typed value "false" are false, so "max = (x > y) and x or y" works as expected even when x is 0. Yeah, it's funky in that it _doesn't_work_like_C_. (The whole arrays starting from 1 instead of zero thing is also going to bite me a few times before I get used to it.) But by slightly adjusting the assumptions it's been able to jettison about half the complexity of other languages; it just doesn't _need_ a lot of the constructs you take for granted elsewhere...
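Shell's && / || chaining happens to have exactly the same shape and the same corner case, so it's easy to demo without a Lua interpreter handy:

```shell
# Same idiom, same trap: "A && B || C" falls through to C whenever B fails,
# even though A succeeded.
true  && echo "B" || echo "C"          # A true: B runs
false && echo "B" || echo "C"          # A false: straight to C
true  && false    || echo "C anyway"   # A true but B failed, so C still runs
```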

This is a noticeable step up from Python, which is now a completely different language than when I started learning. (Python 2.1-ish, circa 2001, didn't even have generators. I've picked up a few of the new constructs as it went along, but by no means all. By Python 2.5 we were already well into "new language" territory, and then Python 3.0 made it explicit. They could easily have called it "anaconda" or something, they just wanted to move their existing userbase to the new language. Yeah, I need to learn Python 3.0 at some point. Knowing "Python 2.3 or so" isn't the same thing at all.)

Meanwhile, Mark is learning Ruby. He seems to be enjoying it immensely. I should probably learn that one too, I've heard good things about it, and oddly enough today I've been reading about a Ruby book, if not actually reading said Ruby book...

I am totally going to have to read the Lua runtime source code. I have theories about how they implemented some of this...

October 17, 2009

Trying to prevent Eric from turning into RMS, via email. Long, draining, tiring thread I'd much rather not be involved in, but Beth called me last night all upset and had me confirm that the email he'd sent her was _actually_ from him (it was that out there), and it sort of snowballed from there.

I really hope this thread doesn't go public. It won't just wind up on slashdot, it'll wind up on _fandom_wank_.

Spent most of the day out with Stu's daughter Beth, and with Mark. Visited a toy store, a bead store, and a "general goods" store that sold rubber ducks in the shape of celebrities, candles scented like flowers I couldn't name (but Beth could), and enough 1950's diner style plates and utensils and syrup pitchers and such to launch your own diner.

The north drafthouse location has already stopped doing the Toy Story special feature in 3D, so we saw Cloudy with a Chance of Meatballs (me for the second time, Beth for the first). It continues to be awesome.

Then went downtown to meet Mark and see Master Pancake Theatre mock "Nightmare On Elm Street" (which oddly is _not_ to the tune of "21 jump street", nor to "Life in the fast lane"). Their theory that Freddy and Michael Jackson are related has some merit (both wore a fedora, were badly burned, had a trademark single glove, peaked in the 80's, had disturbing interactions with children, and are now dead). Good to see Owen back with the group, even as a guest. It's a family thing.

Since the downtown drafthouse is on 6th street, we all went to see 6th street. Mark showed us several bars. There's one called "Pure" whose theme is antiseptic sterility. The place is scented with various cleaning agents, and their tables are solid sheets of plastic lit from underneath to glow blue. (The ground floor is all blue-lit. The top floor is all red-lit. As far as we can tell, the building is heading downwards at a significant fraction of the speed of light, but it seems to be ok with that.)

And I found the coffee shop on 4th street where you can order S'mores. You pay the money and they bring to your table a tray containing a pile of marshmallows, graham cracker pairs in little plastic packets, hershey bars, little wooden sticks, and a metal pot containing fire. (Sterno or something, they light it at the table and it burns for 10-15 minutes.) You then toast the marshmallows and make smores, there at the table. It's a wonderful thing. The marshmallows, graham crackers, and chocolate were surprisingly well balanced but we almost wound up asking for more fire. (It was close at the end, we talked for too long.) I probably shouldn't have eaten the third S'more, but it was really good, and Mark drove Beth home so I got to walk it off.

Fun day. Spent most of it away from my laptop.

October 16, 2009

Ok, the next con I'm doing guests for I need to invite Diane Duane.

Over in FWL-land, I'm juggling lots of non-working platforms. Interest in sparc has resurfaced, and current uClibc-git has a lot of fixes. It also doesn't build busybox, because include/utime.h isn't getting installed. (I don't know why, I've got CONFIG_SCREW_SUSV4_CODE_ACTUALLY_USES_THIS set.)

The powerpc panic seems to be due to the Poppy Z. Brite PMACZILOG serial driver, which panics as soon as interrupts are enabled (I.E. right when the kernel launches init). The issue might be fixable by bisecting back in the history and finding the last time the driver worked, or it might be fixable by switching from qemu -M g3beige to the mac99 one, which is apparently less badly maintained?

No progress on Alpha or m68k recently. I think if qemu hasn't got a board emulation for it, then it's not a FWL 1.0 release target. That does mean I should dig up and dust off the xilinx/microblaze patches, though.

October 15, 2009

Last night, I went to an organizational (possibly too strong a term) meeting of the Texas LinuxFest. This is the second local austin convention that I could probably take over and turn into a "Linucon jr." if I really wanted to put in the effort, but once again starting over from scratch seemed like the easier approach.

The meeting was scheduled for 7pm, I arrived about on time and sat outside with the other people who had been locked out: the building we were meeting in locks at 7pm. Luckily the guy delivering pizzas had the number to call, so they let us inside where we ate the free pizza and sodas.

I believe the total attendance at the meeting was 6 people, if you include me. They had about as many again on speaker phone, although only one (David, from his Dell office up in Round Rock) actually spoke that I recall. (On the bright side, this meant there was plenty of leftover pizza and we all got to take some home.) Apparently the google group has about 40 subscribers.

They have no con chair, but believe that things can get done via committee or general acclamation. The guy whose idea it was attended and ran the meeting, but doesn't want to be con chair.

They're actually not very interested in a combo sf/tech event ala Penguicon or Linucon. They have an existing "Linux Fest" model that they're following, which I'm only passingly familiar with, largely due to a lack of interest on my part. (Note that actual technical events like Ottawa Linux Symposium don't fall under this model: that's not a "fest". Ohio LinuxFest counts, though.)

They hinted a few times that they were looking for people to step up and do stuff, but it really doesn't sound like my kind of fun, nor are they interested in me trying to add my kind of fun to their event. They know what they want to achieve (in an extremely vague sort of way), they just seem to want somebody else to do it for them.

They have no venue, no speakers, no preregistrations, but what they do have is a name, an event date, and lots of verbal agreements with sponsors. Apparently five figures worth of sponsorship if everybody they have verbal agreements with comes through. I didn't ask about bank account or incorporation details, but my impression was right now they haven't bothered to set up either. (They did mention two existing 501c3 entities that might want to handle money for them, if/when they actually get some.)

Despite their lack of venue, their event date is in six months. No venue candidates were presented during the meeting (they haven't visited any sites or gotten any of the little information folders the places give out). Finding some is the first item on their todo list, and they hope to meet again in a month to see if they have something by then. (They can't change the event date because it's one of the only things anybody knows for sure about the event.)

They were interested in the new facility UT built on MLK, or getting space at UT or St. Edward's, but have no students or professors involved in organizing the event. I brought up doing a call for papers to attract interest from professors (and corporate researchers), but I'm not sure there's enough time left to do one properly. (I tried to work out a rough schedule for them starting _now_, but nobody wrote anything down. Doing a call for papers raises a lot of complexity I don't think they're prepared to deal with anyway.)

My venue candidates don't really help, because people probably aren't going to travel to a one day (saturday-only) event, so the SF con trick of bartering a room block for function space doesn't apply: that schedule deprives them of any significant number of room nights to barter with. They also have to work setup and teardown into their day, or rent the facility for extra time on adjacent days, for vendors only, not attendees. (I suggested they ask armadillocon for a run-down of the local hotel function space anyway; even if they're paying cash it gives them more options.)

Admission is free, which explains why there is no preregistration mechanism. (Why would you preregister for an event that's free at the door?) The downside of this is nobody's committed to coming until the last minute, at which point it's an impulse decision either way and easy to decide not to bother because you have nothing invested in it. (On the other hand, since they're not using attendee registrations to pay for the event and aren't bartering a room block, on a financial level it doesn't really seem to matter whether anybody actually shows up or not.)

When I asked about the focus of their event, they used the phrase "big tent" to confirm an intentional lack of focus. They said their intended focus is "open source" in general, but I pointed out the name they've chosen is more specific (Linux only, and again they can't change it because it's one of the only concrete things you can say about the event, plus their "Linux Fest" model). When I specifically asked about open source on the Macintosh (in theory built on Darwin), or users of Firefox and such on Windows, or even things like Jamie Zawinski's struggles to get Dali Clock running on his Palm Pre, they said they didn't see those as significant sources of open source development effort. So they have a certain amount of focus due to both lack of interest, and to the unintended consequences of their name. (Apparently non-Linux can go hang, except for the BSD guys that slavishly chase every occurrence of Linux to say "me too" and the FSF guys who try to stick a GNU/Dammit on it. Both were brought up at the meeting, although not quite phrased that way.)

I asked about tracks (embedded, supercomputing, server, desktop, kernel, network security, desktop usability, programming, performance, debugging...), and they seemed to think that was a bad idea. They said having lots of different panels might make more people come. (For the one panel that interests them? Ok.) But also that panels aren't really of much interest, because it's not what the half-dozen people in that room (except for me) went to other LinuxFests for. The main draw for them seems to be hanging out with the vendors. (They did mention their distro sponsorships might lead to a lot of distro guys coming, so presumably they could put together a track about... repository management and package integration, maybe? Not sure. It was at least a theme.)

I asked "where's the shiny?" and explained I wanted to know what excited them about this thing, which they might want to communicate to their potential attendees. They couldn't come up with anything specific. The idea was that their companies had sent them to other shows around the country (as vendors), and they didn't like having to travel out of Austin to go to Linux shows when there's so much high tech in Austin. So they want lots of vendors hanging out with each other, here in Austin. (I note that Door64 has arranged two of that sort of event in the past 6 months, neither of which I went to. As far as I know, the Texas LinuxFest guys and the Door64 guys are not aware of each other's existence.)

They specifically mentioned basic introductions and tutorials when asked for example panels, and they're looking forward to giving out lots of distro CDs, which they saw as an asset forming one of the big draws of the event, and I see as the modern version of AOL coasters. It seems to me that the people who would go to a LinuxFest are people who are already using Linux, and thus not the target market for basic introductions and distro CDs?

I did get a _little_ accomplished. I pointed out that their google group was set so you have to log in to view the archive, and they confirmed it and promised to fix it. I explained about the "heartbeat blog" of Penguicon's first year, and they thought that was a good idea, and that they'd ask on the google group if anybody wanted to do that. (The guy currently doing the website was there and said he'd actually just volunteered to let them use his server, not to create and maintain the actual website for them. As far as I can tell, there wasn't a single position anybody there wanted to _do_.)

Oh well, at least they updated their twitter for the first time in almost 2 months, mentioning the organizational meeting less than 2 hours before it happened. (Wheee?)

I don't know what they want. I don't know what they're trying to do. I don't see how I can help, or why I'd want to put much effort into it. I note that Stu, Mark, and Linas all declined to come to this meeting, and they didn't need to see it to make this decision.

Sigh. If I want another Penguicon/Linucon-style event, it looks like I'm going to have to create one myself. Repurposing other events doesn't seem to work, whether it's a tired old event like Armadillocon actively looking for new blood and new ideas, or a barely-there event like Texas LinuxFest looking for somebody to take over and do it for them so they can be vendors at the result. It's just not what they _are_, it's too far from their self-image to bend that way without losing their identity. And maybe too weird an idea to explain until you see it in action.

Oh well. If it's free, I'll probably still go and see how it turns out.

October 14, 2009

Finished factoring out busybox and toybox, and checked them in. Several other small cleanups too. Not fully recovered, but definitely feeling better.

Trying to benchmark static vs dynamic busybox under qemu, using the good old make 3.81 "time ./configure" to do it. (Even in a host build of that package, the configure stage takes three or four times as long as the compile, tarball extraction, and install _combined_.) It's pretty much the pathological case for emulated native builds, and thus a good thing to try to optimize.

First run, dynamic busybox:

real	2m 35.02s
user	1m 49.66s
sys	0m 42.62s

Second run, dynamic busybox:

real	2m 36.14s
user	1m 50.54s
sys	0m 43.64s

First run, static busybox:

real	2m 6.95s
user	1m 26.56s
sys	0m 37.86s

Second run, static busybox:

real	2m 3.66s
user	1m 25.53s
sys	0m 36.18s

So it shaved off 30 seconds from a 2.5 minute job, for just under a 20% speed improvement. Nothing to sneeze at.

Alas, 99% of what this ./configure seems to be doing is lots and lots of little calls to gcc (which is still dynamically linked, and which brings in the distcc transaction overhead that's at its worst for really small compiles). You'd think the compiler itself would have some way to go "is this header there?", "is this function there?"... but no.
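To be fair, you can coax a crude "is this header there?" out of the compiler directly; a probe like this is roughly what each of those tiny configure-driven gcc calls boils down to (a sketch, not what this ./configure actually runs):

```shell
# Single-shot header probe: preprocess a one-line include and see whether it
# survives. Essentially an autoconf check, minus the caching and logging.
probe_header() {
  echo "#include <$1>" | gcc -E -xc - > /dev/null 2>&1
}

probe_header stdio.h  && echo "stdio.h: present"
probe_header nosuch.h || echo "nosuch.h: absent"
```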

Still, it does imply that if the _entire_ root filesystem was statically linked, we'd see more speed improvement. Worth a poke. (After all, BUILD_STATIC is there already... :)

First run, static everything:

real	1m 56.20s
user	1m 13.31s
sys	0m 40.79s

Second run, static everything:

real	1m 58.68s
user	1m 15.38s
sys	0m 41.61s

Right, so making busybox static is worth 20%, and making everything else static is worth another 5%. Good to know, but it looks like only busybox is worth building statically by default.

I should ask the list...

October 13, 2009

Busy day, very little of it programming.

On the programming front, I managed to tell people _twice_ to test code that I hadn't checked everything in for. (I really hate having a cold.) Got a little done on factoring out BusyBox, but my brains weren't quite together enough to poke much on even that.

Took a bus to pick up the car, which was fixed for under $250. (Yay Ike's automotive on Anderson.) Bought an energy drink along the way to be up for this (there was some significant walking involved, and it's not so cool anymore, and the liquor store on The Drag sells the orange mango passion fruit rockstars I like).

Said energy drink wore off while Fade and I were at Mark's playing The War on Terror. (Fade spent about 1/3 of the game as the Evil Empire, because Axis of Evil landed on her color the third time we spun it. I got a picture of her wearing the Balaclava of Evil, but it's still on my phone. She's warned of dire consequences if it's posted for general oogling.) Alas, the game has a mechanic that people can drop out at any time, which makes all their cities turn into terrorist units, and after somebody did so with 1.3 billion (because they had to leave) it was largely inevitable that the next player to turn terrorist (and thus control all those units) would win, rather than any of the empires. Oh well. (I was 2 victory points away, having grabbed Antarctica and Australia early and put 3 cities on each.)

As I said, energy drink wore off around the time a second player turned terrorist, and we let the terrorists win via general acclamation. (When an energy drink is wearing off to let a cold reassert itself, Penguin mints just don't cut it.)

Then it was "catch up on errands" time, now that we had the car back. I was only up for groceries, laundry will have to wait...

I hope the cold ends soon.

October 12, 2009

So the downside of the new online bug tracker is that when I'm not online, I can't check the roadmap to see what needs doing next. This is especially annoying when my laptop decides to do the X11 hang again (Ubuntu 9.04 is almost as stable as Windows 95 was; second forced reboot this week), so I had to reboot and lose all my state, and I don't remember half the things I was in the middle of doing because all the open windows are gone.

Oh well, I still have roadmap.txt.

Darn it, my repository is completely horked. I pulled from mirell's repository, and doing so reverted the last half-dozen commits of mine, because it had decided that "tip" was now mirell's branch without my changes. So I told it to merge, and it complained that I had local changes. What the heck does that have to do with anything, I'm telling it to fiddle with the _repository_, not with my locally checked out files. They're two separate things!

So I told it to "force" the merge, and this did something but didn't fix the problem. I tried doing it again and it said I had an outstanding uncommitted merge.

How the hell do you have an uncommitted merge? It's a merge commit! That's what a merge commit DOES, take two already committed commits and glue 'em together. Ordinarily when I tell it "hg commit" without arguments, it goes through and finds all my changed files and tries to commit them all, but I don't _want_ it to do that. I want it to commit this abstract "merge" object, which somehow didn't get committed when it got created for no readily apparent reason. How do I specify an abstract object that doesn't have a location in the filesystem? Why would I NEED to? What POSSIBLE use does having it in an uncommitted state have?

So I told it "hg rollback", hoping this would destroy the commit 844 from Mark and put me back at commit 843 where I understood the state of my repository. That's not what happened. Instead, the repository went away. Completely. When I go "hg log -v", I get _nothing_. The entire project history is gone.

This probably has something to do with branches, but "hg branches" says nothing either. I have once again gotten the repository into a state where the project's maintainers would probably go "how did you do that", and I want it to stop RIGHT NOW and I'm ready to take a hex editor to the repository to make it happen. (Luckily, I haven't rsynced yet.)

I'm never pulling from an external tree ever again. Mark's change was a one-liner, and I lost a day of work because of it (and confused poor wangji who was trying to use the new work I'd done that mark's commit erased), and I didn't even learn anything (other than "don't ever do that again").

Ah, figured it out. I'd forgotten that before I did the rollback (expecting trouble), I tarred up my working directory, blew it away, and did a fresh checkout. (In an attempt to make that darn uncommitted merge commit go away, and also to give me an empty working store to do the merge in.) The rollback didn't undo the last commit, it undid the _clone_. (Yay shell command line history showing the actual commands I'd typed, even when I don't remember them. I hate having a cold. I also hate having to try to figure out what the mercurial developers were thinking when they implemented a given non-obvious behavior. Still light-years ahead of git, but that's damning with faint praise in spades...)

October 11, 2009

Once upon a time, I had a cassette tape of Roxette's "must have been love" mixed without the stupid drum beat. It's a marvelous song, and the bass keeps the beat just fine without a monotonous drum going "thunk" every second and a half.

Alas, when your hearing starts to go, you don't just lose a range of frequencies, you have more trouble separating them. I used to be able to ignore this kind of thing easily, listening _past_ stuff and _around_ stuff all the time. But now noises blot out other noises, which is obnoxious. (Dear modern medical science: look into fixing this, please. By which I mean get out of the darn HMO nightmare that's glued everything together into one big money-gouging lump and go back to what you were doing in the 1970's and earlier. Or by all means go for the Canadian model. Something where the actual healthcare parts don't get crushed between insurance paperwork, malpractice insurance, massive corporate conglomerations, and the religious nutballs taking over all four hospitals in Austin so they can make sure none of them provide abortions.)

(I have no idea how to tag the above, it wanders between entertainment, health, and politics.)

Banging on powerpc. Laurent Vivier pointed me at a page with debian qcow images that run under qemu 0.11.0 just fine (and even manage to have -hda set /dev/hda). I grabbed the /boot/config-2.6.26-1-powerpc out of that, and did a "make oldconfig" under 2.6.31 to bring it up to date. Took a bit, lots and lots of new symbols.

If you try to run the debian image -nographic, the bootloader figures it out and talks to the serial console, but the Linux kernel does not. I fought with said bootloader for five minutes trying to figure out how to feed it a new kernel command line. If I type "help" it says:

Enter the kernel image name as [device:][partno]/path, where partno is a
number from 0 to 16.  Instead of /path you can type [mm-nn] to specify a
range of disk blocks (512B)

(Of course it also says that if I type "walrus".)

It doesn't say what my "device:" options are (or partno, for that matter). (DOS drive letters? hda?) If I hit enter it'll boot a default, but it won't tell me what string I would have typed in to _get_ that default. It does tell me that the kernel command line is "init=/dev/hda3 ro", but not how I'd go about supplying an alternative.

Not very impressed with this bootloader.

So I built a 2.6.31 kernel with the new config, using the powerpc cross compiler I've been using all this time (which built a kernel that booted up to userspace before), and it produced four different arch/powerpc/boot/zImage* files, all of which crashed instantly with an illegal instruction when I pointed qemu-system-ppc at 'em. A quick check of my sources/targets/powerpc/settings showed I was running the vmlinux (that's the unpackaged ELF image in the root directory), and sure enough that got as far as prom_init where openbios handed off to the kernel. Roughly where the original debian kernel was failing to talk to the serial console. Even when I append console=ttyS0. (It's looking like this kernel hasn't got serial console support built in? No, menuconfig says it does...)

Ah, there's more than one type, and the 8250 emulation appears to have been replaced by the zilog one. Right, I remember now, it's console=ttyPZ0 now. *shrug* As long as it works.

Ok, using Debian's kernel .config (more or less), and debian's root filesystem, and the exact same qemu, the kernel panics when using the serial console and doesn't when using the emulated graphics card. It's userspace trying to talk to the serial console that causes the problem.

October 10, 2009

It's always something.

So I've got the native static builds building, and I've got the timeout code working, and I decide I need one more cleanup because passing in QEMU_EXTRA="-hdc /path/to/hdc.sqf" has a whitespace problem.

An absolute path from root could have spaces in it. I don't know what your system's paths look like, you could have a windows background. Heck, Mark left spaces in the URL to the slides from OLF, because he's using a mac now and macs do everything through a pretty GUI. It's easy to let spaces sneak in, and I have to support them.

The problem is, shell scripts suck at dealing with spaces. They have the "reparse command lines" problem with whitespace the way perl has the "list context vs scalar context" re-parsing problem where _it_ loses track of similar grouping information unless you stand over it and hit it with a stick.

The -hdb "$HDB" (and -hdc "$HDC") argument(s) should be optional, therefore the easy thing to do is set WITH_HDB="-hdb $HDB" and let that resolve to nothing when it's not set. But that can't have quotes around it or -hdb and its argument become a single item that qemu can't parse. Unfortunately, if $HDB has a space, then _that_ becomes multiple command line entries. Doing WITH_HDB="-hdb \"$HDB\"" means the quotes become literal characters considered part of the filename, meaning "temp dir" would be the worst of both worlds: two separate arguments but each containing a quote character.
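Both failure modes are easy to reproduce in a few lines (the path is hypothetical; any path with a space in it does this):

```shell
# Hypothetical path containing a space:
HDB="/tmp/temp dir/hdb.img"

# Unquoted expansion: word splitting breaks the path apart.
WITH_HDB="-hdb $HDB"
set -- $WITH_HDB
echo $#      # 3 -- qemu would see -hdb, /tmp/temp, dir/hdb.img

# Escaped quotes: they just become literal filename characters.
WITH_HDB="-hdb \"$HDB\""
set -- $WITH_HDB
echo "$2"    # "/tmp/temp -- still split, now with a stray quote
```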

This constant wrestling with whitespace is pretty much the bane of shell programming, and the bugs it causes don't usually happen on _my_ system because the path from root to my firmware directory hasn't got any spaces in it. (Possibly I should move it into a subdirectory that _does_ have them, just so these problems are forced to manifest so I can fix 'em.)

Ordinarily I quote the heck out of everything to try to deal with this, but in this case I'm not quite sure how to go about it. Feeding -hdc "" to qemu is probably going to make it try to open an empty filename; even if I make $HDCARG a variable that doesn't need to be quoted and will thus drop out if empty, quoted arguments don't drop out when empty. So I want to group spaces, but not have empty arguments persist, and those aren't orthogonal in shell...

This is trivial to do in something like python. There's a _reason_ people don't try to do complicated stuff in shell anymore. But I'm not introducing more environmental dependencies.

What I really want is something like the "$@" behavior where an array turns into a block of arguments with its original grouping level (regardless of spaces). Unfortunately, "$@" is a magic one-shot special case the shell people put in to deal with command line arguments, not a general facility available to shell code in other contexts. (And dealing with arrays in shell is black magic anyway.)
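For what it's worth, bash arrays do generalize "$@": "${ARGS[@]}" expands each element as one word regardless of spaces, and an empty array disappears entirely, which is exactly the optional-argument behavior wanted here. It's a bashism (and arguably the black magic just mentioned), but a sketch of how it would look, with a hypothetical path:

```shell
# bash arrays generalize "$@"; run through bash -c since plain sh lacks them.
WITH=$(bash -c '
  HDB="/tmp/temp dir/hdb.img"   # hypothetical path with a space
  ARGS=()
  [ -n "$HDB" ] && ARGS+=(-hdb "$HDB")
  set -- "${ARGS[@]}"
  echo "$# $2"')                # word count, then the intact path
WITHOUT=$(bash -c '
  HDB=""
  ARGS=()
  [ -n "$HDB" ] && ARGS+=(-hdb "$HDB")
  set -- "${ARGS[@]}"
  echo "$#"')                   # empty array: the option vanishes entirely
echo "$WITH"      # 2 /tmp/temp dir/hdb.img
echo "$WITHOUT"   # 0
```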

I can re-quote the command line and then feed it through "eval", which always makes me think I'm opening a security hole, even though this is local code that doesn't take arbitrary third party input data. An advantage of that is I can echo the command line before running it. The result isn't remotely _simple_ anymore, but the echo might make up for that. (Cut and paste _this_ to run it yourself. Even if you have to run the script rather than reading the script to see how it calls qemu.)
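Here's a sketch of that eval approach, leaning on bash's printf %q builtin to do the re-quoting (the path is hypothetical, chosen to contain both a space and an apostrophe):

```shell
# Re-quote each argument with printf %q (a bash builtin), then eval.
# Hypothetical path with both a space and an apostrophe in it:
CMD=$(bash -c '
  HDB="/tmp/it'\''s a dir/hdb.img"
  C="echo"
  for arg in -hdb "$HDB"; do C="$C $(printf "%q" "$arg")"; done
  echo "$C"')
echo "$CMD"          # the echoable, cut-and-pasteable command line
OUT=$(eval "$CMD")   # eval re-parses it with the grouping intact
echo "$OUT"          # -hdb /tmp/it's a dir/hdb.img
```

(Here echo stands in for qemu so the result is visible; the real thing would build up the qemu invocation the same way.)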

Well, even the echo isn't trivial. I'd have to echo the un-evaled line in order to have the quotes show up in the output (without which cutting and pasting and re-running the line means things like -append's big long argument block won't be aggregated). But if I do that, the command line will get re-parsed based on spaces for echo, so any occurrences of multiple spaces between lines will get squashed to one space, tabs will get squashed to spaces, etc...

Sigh. The designers of shell really weren't thinking about general purpose programming when they did it.

I suppose I could put one big occurrence of single quotes around it and then use double quotes for everything I do inside, and hope none of the data contains any single quotes. No filenames with apostrophes. No, I'd have to process the data to escape them (and escape backslashes), and how do you process the output without echoing it through sed or some such? By using more bash extensions. Grrr...

Such a simple issue should not open a can of worms. That's poor language design, right there.

October 9, 2009

The door 64 meetup was a lot of fun. I only caught the last 10 minutes of it (car still at the vet and I underestimated how long the bus and walking would take), and then had to leave for an appointment, but I look forward to the next one, whenever they decide to have it.

October 8, 2009

A few days ago I posted a history of Penguicon, at least through the end of year 2. These days Penguicon is owned by Matt Arnold, who didn't even bother noting when the Year 2 chair had a heart attack (nothing about it on their livejournal or twitter). But oh well.

Yesterday I poked the qemu list about the curses problem again, and today it was completely ignored, zero responses and buried under a flood of new posts. (Just like last time.)

I also wrote up fairly extensive data dumps on why I expect to rename the project when the 1.0 release comes out (giving the reasoning and potential new names), and a way too long post about why using uboot under emulation is more trouble than it needs to be (by explaining how linuxbios, bochs bios, bootloaders like grub, and the early kernel boot actually work).

Unfortunately, I can't link to those because the mailing list archive is down. I've poked Mark, but he's not answering his phone at the moment. (It was working last night...)

Not a lot of programming yet today, I biked to the transmission place to help them reproduce the funky noise my car was making (it mostly only does it when it's cold, and it was 90 again today) and managed to make myself vaguely ill in the process (I'm not over this cold yet). Spent the rest of the day reading "Bloodhound", by Tamora Pierce, and doing laundry.

Exciting times.

Ah, I figured out what's going on. I stuck some printf() calls in the qemu source code to see why the curses code was triggering, and it isn't. The curses thing _was_ fixed during development, the current escape codes that were being so weird are busybox ash, trying to figure out how big the display attached to the serial console is. (Exporting LINES and COLUMNS stops it, but it doesn't actually cause the problem the curses thing did.)

Ok, that's a relief.

October 7, 2009

Mark and I are playing with a new bug generator. (Yes, I'm voluntarily submitting to one this time. Call it personal growth, call it "this isn't bugzilla".) I'm trying to beat behavior out of it that's an actual improvement over a text file roadmap. So far it seems nice, and in need of much configuration. Luckily, Mark is good at this sort of thing. (He can touch stuff without breaking it immediately!)

And Alpha needs the #ifndef inhibit_libc fix I've been applying to sh4. (The gcc-core-fix-inhibit-libc.patch thing.) GCC really isn't all that well-maintained, is it? The FSF doesn't regression test bootstrapping its old architectures... Hey, and the "-O2 causes internal compiler errors during the uClibc build" issue that m68k had _also_ applies to Alpha.

After you've gotten your first dozen platforms working, you really do start to spot patterns. "Anything maintained by the FSF is crap" is a pattern.

And I didn't get around to banging on the qemu -rc releases before 0.11.0 dropped, I'm hoping the curses issue got fixed. By default, qemu's new ./configure enables curses support if it detects you've got it on the host, and curses mode isn't scriptable. (It craps escape sequences randomly into the output, which confuses things like expect.)

During development, curses mode was enabled by default, but the current source code (in vl.c) looks like -nographic and -curses are unrelated. Here's hoping...

Nope. It's still broken. Time to ping the list...

October 6, 2009

The "um-fest" that is our Ohio LinuxFest talk is now online. We managed to compress an 8 hour talk into 54 minutes, which is about like doing the minute waltz in 11 seconds. We did it by eliminating all the hands-on parts, and the actual demonstrating the software parts, and skipping maybe 1/3 of the slides entirely, and skipping about half the material on the remaining slides, and then ZIPPING through the rest as quickly as possible.

The result was exhausting for me to listen to, and I _wrote_ it. It's fairly dense with insufficient breaks, but that's pretty much unavoidable when you compress a topic that much, even with such lossy compression....

The reason it turned into an "um-fest" is we didn't practice enough at the faster speed, so the normal delivery pace full of _pauses_ had to have all the pauses eliminated, meaning I went "um" between every point while waiting for the next slide because I _knew_I_couldn't_stop_talking_. (And the longer ums are us advancing past skipped slides. And I even got Mark doing it during one section...)

It's slightly less embarrassing if you think of "um" as the "advance slide" signal, but still... :)

The first minute with a DJ trying to be funny is just inexplicable, but he came with Ohio LinuxFest, and when he asked us for a summary of our company I gave him the "Impact is how you get embedded" line and... well, listen for yourself...

Oh, the "Max Headroom" delivery is because OLF bought eeepcs to record the talk, and apparently didn't take flash write latency into account (nor test long enough for it to manifest). It's in the original flac, not an mp3 encoding artifact, alas. (Yes, realtime issues affect the real world. Who'da thunk?)

But hey, the recording is up! (And the slides are here.)

On an unrelated note, it occurs to me that Barack Obama uses the Moist von Lipwig style of project management: "Do everything at once, get it done before they can object". (Napoleon also followed this path.) Too bad the federal government doesn't work that way...

October 5, 2009

Dropped off my car at the vet so they can keep it overnight and check the transmission in the morning (when it's cold).

Walked home (yay exercise, but Fade says I'm sunburned now), and banged on the no-hang code for a bit. No progress. I know a darn awkward way to do it, but there should be a more elegant way...

October 4, 2009

Beth Lynn Eicher (chair of Ohio LinuxFest) pointed me at the twitter of some organization calling itself Texas Linux Fest. That thing hasn't been updated in a month, and points at a google group with no public archive (you need to log in not only to post, but to read anything).

Another guy, Bryan Bishop, emailed me after my name was misspelled on slashdot. He's acquainted with the people running it, and pointed me at their new website, which is almost content free. (Other than that they've picked a date, but don't yet have a location... I thought it went the other way around?)

So I went to their contact page and tried to send email to the contact address it listed:

: host ASPMX.L.GOOGLE.COM[] said:
    550-5.1.1 The email account that you tried to reach does not exist. Please
    try 550-5.1.1 double-checking the recipient's email address for typos or
    550-5.1.1 unnecessary spaces. Learn more at
    550 5.1.1
    8si4971359vws.69 (in reply to RCPT TO command)

They have a wiki. It has exactly one page, containing their banner graphic and nothing else.

So much for my attempts to volunteer to help out. I think I'll wait and see what the first year is like...

October 3, 2009

I actually posted to livejournal for the first time in ages, for reasons explained in the entry.

Once again, I keep finding weird limits in the standard tools.

The bash "read" command has a timeout option, -t, but the error code it turns on a timeout is indistiguishable from the error code it returns on end of stream. So I can't selectively send a kill signal, not that targeting it would be easy if the end of a pipe is trying to send a signal to the start of a pipe.

This time, I want to control whether or not tee exits when one of its...

Due to a sudden interruption I actually posted to livejournal for the first time in ages, for reasons explained in the entry.

Where was I?

Right, what I'm trying to do is monitor the output of a qemu run, and exit if nothing's come out for 60 seconds (meaning it's hung). A bit like does now, only using the new -hdc /mnt/init logic to drive the emulator.

My first thought was to pipe the output into a function that does {while read -t 60 i; do echo "$i"; done} and thus exits, hopefully ending the pipeline. Except that this would batch the data into lines, meaning the "..." of tarball extraction won't progress, you'll get nothing until it's done and then the whole line at once. And in ./configure the delay between "testing..." and "ok" would display nothing until the whole completed line pops up at once. (May not seem like a big deal, but it's darn irritating when a human's looking at the output and trying to figure out what's going on.)
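The batching is easy to demonstrate: read doesn't hand anything over until the newline arrives, so incremental output shows up as one lump at the end. A toy version (run through bash, since -t is a bashism):

```shell
# The producer dribbles out a partial line, then finishes it later.
# The read loop shows nothing until the newline completes the line.
OUT=$(bash -c '
  { printf "extracting"; sleep 0.3; printf "...\n"; } |
    while read -t 2 i; do printf "<%s>" "$i"; done')
echo "$OUT"   # one whole chunk at the end: <extracting...>
```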

My second thought was to fix the display problem using tee, a la

"./ | tee >(while read -t 60 i; do true; done)". Long ago I fixed tee to pass along data promptly, without batching issues (and fixed User Mode Linux's tty layer to handle similar issues: short writes to stdout happen people. Yes busybox tee implementation, this means you, not bothering to check the return value of fwrite() to see if it spit back some of the data. You can make this happen with ctrl-z, or sending a kill -SLEEP to the output process. A loaded system will do it when it schedules the target process away long enough for the pipe to fill up, or swaps out one of the target process's pages so it blocks on I/O. I _really_ need to implement tee in toybox, without this funky FEATURE_TEE_USE_BLOCK_IO code duplication, or gratuitous mixing of posix and ANSI I/O idioms...)

Anyway, this still wouldn't address the "one big long line" issue, because partial input data doesn't reset the read timer. (Quick test: "while true; do echo -n .; sleep 1; done | read -t 5 i" times out after 5 seconds.) But it would be progress: if it worked.

However, tee continues on when one of its outputs ends, meaning the pipeline won't end. And more to the point, even if it did, the "hopefully" two paragraphs back turns out to be misplaced, because even when you output directly to a function, which exits, it turns out the pipeline doesn't exit until all the component programs do. The producer will get a SIGPIPE next time it tries to write to the consumer through a closed pipe, but if it's hung and not producing output this will never happen, so the pipeline won't exit, defeating the purpose of the exercise.

So what do I do? Add a ; kill $PPID right after the "done". A kill signal going to the parent pipeline kills it, but not the whole shell script, which is exactly the behavior I want. Control passes to the next line down (to kill the netcat+ftpd instance now that it's no longer needed), and life is good.

The remaining issue is the "tarball ... progress indicator" issue above: big ones like the 50 megabytes of bzipped Linux tarball can legitimately take a full minute when extracting under an emulator (eating CPU to decompress and then creating 30,000 little individual files takes some doing, plus it's blocking on a lot of disk access to flush all those dentries). The dotprogress function prints a period every 25 files, but the timeout is how long since we've seen the last newline.

So it's not ideal. What I really need is either a shell "read block of data but not necessarily until the next newline", which is unlikely to exist (and looping on single characters in shell is too CPU intensive), or a timeout in the tee command itself (of which there aren't any yet)...

Ah! But there _is_ a -n option to limit the number of characters it reads. So if I set that to say 32, I can still read large enough chunks that I'm not spinning doing a whole shell loop for each character of input, but it can return when the ... line gets long enough.
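A toy version of the chunked read. One wart to note: a final partial chunk hits EOF, which makes read return failure, so the loop body never sees that data:

```shell
# read -n 3 hands data over every 3 characters, newline or not.
OUT=$(bash -c '
  printf "abcdefgh" |
    while read -t 2 -n 3 c; do printf "[%s]" "$c"; done')
echo "$OUT"   # [abc][def] -- the trailing "gh" hits EOF and never prints
```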

Sigh. Except it's _not_actually_working_. Small chunks of it worked in my tests earlier, why is it failing when I try it on securitybreach? (Different version of bash, perhaps?)

October 2, 2009

Fade's had a sore throat since yesterday, and today I've got one. Not bad, but enough to notice.

Of course with the notices of Swine Flu (or the more politically correct term "H1N1" if you like: it's Swine Flu, deal with it) going around Austin, and the fact that it was at Ohio Linux Fest last weekend, it's only a matter of time until we get it. So far it seems about as deadly as SARS, but then that did actually kill a few people, and so has this. After the whole "Swine Flu at PAX" thing, it might be nice to get it over with.

October 1, 2009

Good blog entry from Mark about eglibc. (Yay Mark blogging again.)

I am sad that my "internet-through-cell-phone" doesn't seem to let me log on to freenode. (They seem to be blocking t-mobile? Dunno. Hangs attempting to connect.)

I hath posted a road map for FWL development, or at least the start of one. (Yes, it's a UHF reference.)

The OLF guys say the recording of our presentation should be up by the end of the week. Our slides are up, although the PDF is enormous (87 megs) and Mark is still redoing the html to be in a more friendly format than Apple's Impress produces by default (which really doesn't scale to presentations with >200 slides).

Once again, the fact that "Firmware Linux" is kind of a sad name for the project has come up, this time in the context of "what if we want to try a netbsd or darwin kernel?" The layers _are_ supposed to be orthogonal, although _how_ orthogonal is an open question...

September 30, 2009

Taking the UT shuttle bus up to Far West to pick up my car from the mechanics (the engine is fine, they're guessing the weirdness is the transmission). I like Lamb's automotive, they didn't charge me for something they couldn't fix.

I'm amused by the number of different wireless networks that xfce keeps popping up and then disconnecting from. (Keep in mind it only attaches to ones I've previously given it permission to use. I bike around a lot. :)

Haven't rsynced in a couple days, because Mark's redoing the website to look less horrible than my hand-edited html. (He's even adding a style sheet. Ooooh.) Which unfortunately means that if I just rsync before I check his changes in, I'll stomp right over 'em.

Alas the drop he gave me last night turns all the URLs everywhere into absolute URLs, so when you browse the local "www" directory in your downloaded source, all the links point to the live website. And he stripped the ".html" extensions off them all, for reasons I don't understand. Torn between just checking it in and washing my hands of the website (which probably means I'll never update it again), and drawing out a process that would be fairly simple if I wasn't in Mark's way... Hmmm.

Checked in the armv4tl-eabi target today, thanks to Marc Andre Tanner. I should rename the old armv4l to armv4l-oabi, since it's the only oabi target....

Continuing to bang on an automation script. Trying to build an expect in bash is frustrating enough I've given up for now, and instead I'm adding infrastructure to automatically mount /dev/hdc on /mnt if it exists, and then hand off to /mnt/init (if there is one) instead of firing up the shell prompt.

Alas, getting the data back _out_ (without requiring root on the host) is kind of tricky. It turns out that dropbear is really picky, it won't run if your current user isn't in /etc/passwd, and since there currently _isn't_ an /etc/passwd in the system images, it won't even work for root. So I can't just scp the data out like I'd wanted to...

Pondering. There are a half-dozen approaches I could do, but so far I'm not happy with any of them: The netcat trick's brittler than I like. Writing a tarball to the start of one of the existing virtual drives has several problems (hda and hdc are shared, read-only, and in use at the time I'd need to write the tarball. hdb is just /home and stops being used when we exit, but that's where the data I need to copy _from_ lives. I _could_ partition home, but how much space do I give it and finding the offset to start reading the tarball from is icky. Adding a fourth drive for output seems kind of silly, and I haven't even tested that all the targets I'm messing with have _three_ yet...) Catting data to stdout via uuencode or something and then intercepting it just won't scale. I'm not sure I have a second serial port to do the >(process) trick to. Exporting a directory via nfs from the host requires root access, and doing so via samba requires samba to be installed on the host (it went gplv3, I'm not building it from source in host-tools) and smb support in each target kernel...

I suppose there's busybox's ftpd on the host and ftpgetput in the client. Or I could just add a one line /etc/passwd to the system image and make another stab at getting dropbear to do this for me. (The downside is if dropbear doesn't build I don't get any output out, but I think I'm ok with that.) Hmmm, the ftp approach might be better...

I'll have to play around...

September 29, 2009

Taking the advice of the community building panel from OLF and writing up a Firmware Linux roadmap, so people know what the upcoming TODO items for the project are. (Mark and I know them, but who else does?)

Building static versions of strace and dropbear for mipsel, because somebody on the list asked. In order to make this part of the nightly cron job, I have to write an "expect" variant in bash. (I've already written one in python, back in the Timesys days, so it's not that hard. And I've implemented timeout code in

What I need to do is make a library function for sources/ that waits a specified number of seconds for a specific string (possibly a regex) to wander by. If the string happens before the timeout, it returns success. If the timeout happens first, it returns failure.

Fairly straightforward to specify, darn fiddly to implement in shell, but I've already got most of the pieces.
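For reference, a minimal sketch of what such a function might look like (the name wait_for is made up, and it cheats by using a per-line timeout instead of a total deadline; it also can't tell a timeout from EOF, per the October 3 entry):

```shell
# Everything runs via bash -c because read -t is a bashism.
WAITER='
wait_for() {  # wait_for SECONDS STRING: succeed if STRING shows up in time
  while read -t "$1" LINE; do
    case "$LINE" in *"$2"*) return 0;; esac
  done
  return 1    # timed out (or hit EOF) without seeing it
}'
FOUND=$(bash -c "$WAITER"'
{ echo "booting..."; echo "login:"; } | wait_for 3 "login:" && echo hit')
MISSED=$(bash -c "$WAITER"'
{ echo "booting..."; } | wait_for 1 "login:" || echo miss')
echo "$FOUND $MISSED"   # hit miss
```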

I may also need to split up sources/ into bits only useful to FWL (download, cleanup, setupfor, read_arch_dir, and so on) and bits generally useful to anybody building inside an emulator (killtree comes to mind).

Or perhaps I should split it into API functions meant to be called by users, and internal functions only meant to be called from the api functions...

Cleanup is a constantly running process, which can never catch up...

September 28, 2009

Harper Collins put the entire text of Terry Pratchett's new "Unseen Academicals" online. That's not fair. It means I have to go to the bookstore today if I want to get anything done on my computer.

(Ah, it's not the full text, just 70+ pages that cut off abruptly. Oh good, that means I don't have to go buy the thing to stop reading it online if I get hooked, and can thus put off buying it until I get around to it.)

Started a "history of Penguicon" in the airport. It's... long. (I'm up to year 2!)

September 27, 2009

Ohio LinuxFest was fun, and really busy. Tired. Airplane back now.

Doug McIlroy was cool (and so was his talk, I hope they post the video and slides). It was good to see Beth again (she was a co-worker of mine at Timesys, and the con chair of OLF this year). Mark's and my talk seemed fairly well received and several people poked us about getting the slides online (we told 'em to check the Impact Linux website wednesday). Didn't get to hang out with Peter Salus much (my talk was scheduled opposite his), but I bought his new book from the Barnes and Noble booth. Met Mackenzie Morgan. Suggested that Ubuntu call its new bootable USB hardware test image "magic smoke". Got tips on how to add Hercules/S390 support to FWL from David Boyes (who is not the lawyer, but apparently gets his airplane seat upgraded all the time because airlines _think_ he is). My name got misspelled on slashdot.

As I said, busy weekend.

Ohio being just south of Michigan, several people told me about how the technical programming for "Matt Arnold's Penguicon by Matt Arnold" apparently imploded this year. (I'd heard the con suite ran out of food, and the brazilian beef went so badly they decided to strike its name from the program book and never speak of it again. But this is the first I heard about the technical programming.)

For the record, A "call for papers" requires you to send out acceptance letters well ahead of time, and have people confirm they're coming. Failing to reply to the submissions at all until 3 weeks before the event, and then emailing submitters the the schedule grid with their talks already on it, really is not the same. (Politeness issues aside, submitting a paper is not the same as making travel arrangements.)

Charging admission to those who do show up, when this was never mentioned to them before, doesn't improve matters. (The whole handicap access issue is sort of a rounding error at that point, although not to the people directly affected by it.)

Still, attendees sitting in empty rooms with no speakers in them didn't seem to bother the organizers (just the presenters and attendees), so no harm done there then. (Then again this year was the first year I didn't attend, so I'm working on secondhand information delivered months after the fact. I didn't even hunt down and read lots of con reports like I did for all the previous years.)

Speaking of "for the record", I should write up a "History of Penguicon". I did co-found the thing, work rather extensively on organizing it for 5-ish years (1, 2, 4, 5, and a smattering of 3 and 6). Plus it's got a year and change of history that predates even Tracy, that _only_ I know about... :)

September 24, 2009

260 slides total.

Plane to Ohio LinuxFest takes off at 7:10 am, first thing in the morning. Hope I can get some sleep on it...

September 23, 2009

Spent the day at Mark's, preparing Saturday's talk for Ohio LinuxFest. We made 169 slides, and I'd estimate we're a little over 2/3 of the way through the material.

We have about 8 hours of material (already) and one hour to do the talk. If anybody would like to hire us to do the full day-long version (or any of the other talks we've got prepared), we'd be happy to quote 'em a price... :)

September 22, 2009

Now the general who wins a battle makes many calculations in his temple ere the battle is fought. The general who loses a battle makes but few calculations beforehand. Thus do many calculations lead to victory, and few calculations to defeat: how much more no calculation at all! It is by attention to this point that I can foresee who is likely to win or lose.

He who is destined to defeat first fights and afterwards looks for victory.

- Sun Tzu, The Art of War

The Republican party is not just the party of "no", it's the party of "stupid".

You can trace the "party of stupid" trend back to Ronald Regan, an Altzheimer's patient who Dave Barry described as "napping towards glory". But it got _bad_ under the man who put the "duh" in W.

George W. Bush was not a smart man. That was obvious (and widely remarked upon) during the 2000 election. He had a certain animal cunning, but that's not the same as intelligence. "The Decider" was not a thinker, he did not enjoy solving problems and seldom if ever pondered the ramifications of his actions. The entire Iraq War was condemned by the simple failure to even once consider the question "then what?" They didn't have a _bad_ plan, they literally had _no_ plan.

When he couldn't simply lash out at a problem with overwhelming force, his only other option was to hunker down and wait for it to go away, whether reading "My Pet Goat" to school children while the World Trade Center burned, leaving Bin Laden hiding in a cave through the end of his presidency, or refusing to cut short his endless vacations for Hurricane Katrina.

An old military adage is that amateurs study tactics, the experienced study strategy, and professionals study logistics. These people DID NOT STUDY. They waged two wars plagued by constant failures of "intelligence" in more than one sense, and kept threatening to invade more (Syria and Iran near the top of the list).

The self-styled "education president" championed "Intelligent Design" while dismissing climate change. (His legacy in education was the disastrous No Child Left Behind act, perhaps a response to the way education left him behind. Ask any teacher how slowing the entire class down to the speed of the slowest student worked out in practice.)

He was _viciously_ anti-intellectual. He didn't value the advice of smart people (nor did he want to be surrounded by them, choosing instead "Heck of a Job Brownie" and the "Duct Tape and Plastic Sheeting" guy). He especially distrusted science and scientists, ordering them to change or bury their conclusions when he didn't like them.

Back in the 1970's, the long-running BBC science fiction program "Doctor Who" contained the quote "The very powerful and the very stupid have one thing in common. They don't alter their views to fit the facts. They alter the facts to fit their views. (Which can be very uncomfortable if you happen to be one of the facts that needs altering.)" Bush was very powerful, _and_ very stupid. He put a man in charge of NASA, Michael Griffin, who not only eliminated weather monitoring programs because he (like Bush) didn't believe in global warming and didn't want to collect data that might contradict this belief, but who publicly _admitted_ it.

What he did respect was wealth and power: obvious, measurable, superficial signs of success. Thus his energy policy was written by oil industry lobbyists, even as the price of gasoline neared $4/gallon. His treasury secretary came from Goldman Sachs, and due to insufficient regulation and even less enforcement that same secretary wound up handing an enormous bailout to his former employer (and selected other companies, but _not_ to Goldman's largest competitor, which was allowed to go under) towards the end of the administration. But Bush thought corporations could do no wrong (since they were the ones who understood stuff he couldn't be bothered to ask about, yes even after Worldcom and Enron and so on), and thus everything must be deregulated and privatized from the military through the FDA, with disastrous consequences. (Anyone remember the massive pet food recall because nothing had been tested for contamination? We deployed _mercenaries_ in Iraq, from Blackwater to Wackenhut: forget about the Abu Ghraib "naked prisoner human pyramid" stuff for a moment, is it really a good idea to outsource the functions of the US military to organizations that answer to the highest bidder? It's like the companies that outsource all their employees to India and then are _shocked_ when overseas competitors emerge. The scandals ran together to the point it was hard to even remember them all.)

This man was in charge of the Republican party for eight years. This attitude shaped everything. Being smart and educated held no weight, science was a matter of opinion. And it left its mark. The people in charge now are trying to cash in, just about exclusively, on the Stupid Vote.

Half the "appeal" of ex-beauty-queen Sarah Palin is that she's similarly convinced that anything she doesn't understand can't really be all that important. (She has, of course, been replaced in the party's mind by younger, prettier, dumber beauty queen Carrie Prejean.) Does Joe Wilson really _believe_ Obama was lying, or is he simply trying to appeal to the willfully uninformed and easily confused? How is the "anti-Czar" hysteria _not_ an attack on Ronald Reagan?

Lincoln was the one who pointed out you can fool some of the people all of the time, but he didn't consider it a _good_ thing. His party's current leaders (Rush Limbaugh, Glenn Beck, and Bill O'Reilly) have made a _career_ out of it.

I really hope these guys go the way of the Federalists and the Whigs, and maybe the Blue Dogs can split off to become the new opposition party. Unfortunately, our system has a whole lot more inertia baked into it than in 1861, and Lincolns are hard to come by. Our winner take all voting system brings out the loonies during the primaries, and trying to fix the electoral college and such seems out of fashion again.

September 20, 2009

Bisecting kernels is a pain. Found the commit that made serial output happen again (a10b32db348), saved it into FWL as alt-linux-blah.patch, bisected back into the middle of the dead range, and applying the patch didn't produce any output. So the fix consisted of multiple patches, and I have to bisect for the other one.

Or I could do what I'm doing, which is finding the _beginning_ of the damaged range and seeing if I can revert _that_ patch. (And also confirming that the bug I'm looking for wasn't introduced before then, which would make it much easier to find.)
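The "apply the known fix, then test" dance can at least be automated with git bisect run and a wrapper script. This is a sketch, not what FWL actually does: the patch path, build command, and smoketest name below are all hypothetical placeholders.

```shell
#!/bin/bash
# Generate a test script for "git bisect run": apply the known fix patch,
# build, run the test, then undo the patch before the next commit.
# All filenames here are hypothetical stand-ins.
cat > /tmp/bisect-test.sh << 'EOF'
#!/bin/bash
git apply /tmp/known-fix.patch 2>/dev/null  # fix the already-found bug, if it applies
make -s 2>/dev/null || exit 125             # 125 tells bisect "skip, won't build"
./smoketest.sh
RESULT=$?
git checkout -- .                           # undo the patch for the next commit
exit $RESULT
EOF
chmod +x /tmp/bisect-test.sh
# Then, from inside an active bisect: git bisect run /tmp/bisect-test.sh
```

Exit code 125 is git bisect run's "can't test this commit" convention, which automates away the manual skip, skip, skip.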

And I immediately hit yet another bug. Let's see, the bug I'm looking for in -rc1, the no output bug, the two different build breaks towards the end of the no output bug, and now this one. Fifth bug. Wheee.

This bug (in 8a1ca8cedd) makes qemu exit immediately after launch with an "unauthorized access" and then

qemu: fatal: Trying to execute code outside RAM or ROM at 0xa0000000

Which is what the sh4 emulation does when trying to reboot itself anyway. But the point is, this is yet another different behavior, which doesn't prove that the one I'm looking for _isn't_ in this one, just that something else is happening. It could simply be preventing the problem I'm looking for from manifesting.

And since nobody ever tests platforms like sh4 in the presence of hundreds of random other changes raining down, git bisect skip remains useless because it tests adjacent commits and when there are sometimes 750 of them with the same bug (as there were last night with the first of the two build breaks)... Sigh, no better ideas, and I have a book. Skip, skip, skip, skip, skip (each one is a 5 minute build)...

Hey, it relented after only about an hour of building. That's pretty quick for using "skip" in a kernel bisect...

Interesting. Looking for the front of the "no output" range, I started hitting ones that behaved like -rc1 and ones that booted all the way to a shell prompt, so apparently the bug was introduced before the lack of output (yay!), but get this:

There are only 'skip'ped commit left to test.
The first bad commit could be any of:
We cannot bisect more!

The f199 commit (sh: remove old TMU driver) dies without any boot messages (it seems to hang a bit and then exit), the 8be5f commit (sh: Kill off the remnants of the old timer code) manifests the -rc1 hang. So the introduction of the hang and the failure to output anything were adjacent commits, both messing around with what looks like the timer code.

Poked the kernel and qemu lists about it, since I dunno which one needs the fix. I know the kernel is what changed, but it looks like they switched from one time source to another, and qemu may only have been emulating the first. Since the qemu timer stuff for sh4 was already kinda horked (or at least there were long inexplicable delays), properly emulating the new one might be an improvement? No idea, worth a ping. (This is, of course, assuming the sh4 guys tested it on real hardware. Which may not be the case.)

This is why I need to get the cron job stuff working, and running a _reliable_ smoketest rather than the current one with intermittent race-condition failures probably attributable to virtual serial port initialization delay jitter eating a non-constant amount of data when qemu gets unexpectedly scheduled. (The sad part is that the previous sentence _isn't_ Star Trek technobabble, it actually makes sense to me. I need to go do something else for a bit.)

September 19, 2009

I sometimes wonder if the people who invented "git bisect" ever actually use it? You'd think they'd _have_ to, but the "git bisect skip" option seems to assume that any broken commits are fixed almost instantly, and will happily test a dozen adjacent commits rather than going "splitting it in half didn't work, let's try lopping off 1/3 and maybe we can avoid the damaged range that way". No no, there's never a broken commit that nobody notices for hundreds of revisions, so let's make you test EVERY SINGLE ONE OF THOSE.

I'm trying to bisect for the sh4 hang in the 2.6.31 kernel (it's not getting to userspace, it's hanging during kernel bootup; 2.6.30 booted fine with the same compiler, config, root filesystem, etc). The hang was there in 2.6.31-rc1 and is still there because apparently nobody ever tests non-x86 targets in mainline. (I need to, that's why I'm getting everything else fixed so I can do so more easily.)

I've already had to change what I'm bisecting for once because most of the range between 2.6.30 and 2.6.31-rc1 gives no boot messages at all, rather than spitting out several dozen and then hanging. So now my definition of "good" is not getting any boot messages, so I can find the end of that range. But then I hit a build break, and of course skip skip skip and it's still there because git bisect skip is really stupid.

This is why I didn't spend much time on it in my old Git bisect HOWTO, because you wind up having to checkpoint and do speculative bisection all the time. You find stuff you can't test for whatever reason and have to change what you're bisecting for, generally to bisect for the patch that fixed whatever this problem is and then go back and apply the patch to the last one you wanted to test for the _previous_ problem and continue your bisection from there, patching as appropriate. (Or bisecting for the start of the range and reverting that patch, same difference.)

So now I remember that c4c5ab3089c8a794eb0bdaa9794d0f055dd82412 build breaks and 0dd5198672dd2bbeb933862e1fc82162e0b636be reproduced the "no output" problem, and of course that's what git calls them because it's not like humans use this stuff. Just kernel developers. And I change the definition of "good" again to call the problem I'm seeing "good". If the problem goes away but I see the no output problem, then it met the inner two definitions of "good" and I can continue bisecting for the patch that fixes the no output problem. If I instead get the original hang then I find the build break patch, back up to the "build breaks" commit above (doing a log dump and replay like I described in my HOWTO), apply the patch and test, figure out whether it reproduces the previous nested problem I was looking for, and go from there.
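For reference, the log dump and replay checkpointing looks roughly like this, demonstrated here on a throwaway four-commit repo rather than a kernel tree:

```shell
#!/bin/bash
# Checkpoint a bisect with "git bisect log" and resume it with
# "git bisect replay", using a disposable demo repository.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
git init -q .
git config user.email demo@example.com
git config user.name demo
for i in 1 2 3 4; do echo "$i" > file; git add file; git commit -qm "commit $i"; done

git bisect start
git bisect bad HEAD >/dev/null 2>&1
git bisect good HEAD~3 >/dev/null 2>&1

git bisect log > "$tmp/bisect.log"      # checkpoint the good/bad answers so far
git bisect reset >/dev/null 2>&1        # abandon the bisect completely
# ...later, possibly after editing the log to back out a wrong answer:
git bisect replay "$tmp/bisect.log" >/dev/null 2>&1
git bisect log                          # the replayed state is back
git bisect reset >/dev/null 2>&1
```

Editing the saved log before replaying is the escape hatch for "I called the wrong one good three steps ago."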

Yes, this is the HOWTO I deleted off because some idiot found it confusing that I didn't put much emphasis on "skip" and I didn't feel like arguing.

Hey, and once I got past the dma_sync_single break, the sh4 build breaks in a _different_ place. Most likely the first break was hiding the second, so I'll have to bisect for this one again, so I should remember that 650a10dc484f067883fc05a2d4116e1ee3f909c0 showed this second problem. In the meantime, bisect for the patch that changed the bug I was looking at, so this one would be "bad"...

Ooh, 09ce42d3167e3f20b501fa780c2415332330fac5 builds to the end (so the second break was introduced afterwards, I think?), and it shows the -rc1 bug behavior. Ok, that's still not the one that fixed the bug I was looking at, so call it bad. (Aren't these horrible names? It's "before" and "after". Note that good _must_ be before, you can't bisect for where a bug was fixed by calling the after "good". Git won't let you.)

*blink* Ok, d246ab307d1d003c80fe279897dea22bf52b6e41 has the -rc1 hang behavior, and 692684e19e317a374c18e70a44d6413e51f71c11 has the no output behavior, and both build, and according to git bisect they're something like 4 commits away from each other... Good sign? Sheer luck? Possibly git bisect doing something strange?

Um, let's see, that means I call the no output one... bad? Good? Darn it, I've forgotten. (Does bad mean before? No, git always assumes bugs are only introduced, never fixed, so bad means after, meaning I've done it wrong. Sigh. The bug I'm trying to FIX is "good", the behavior where it's fixed is "bad". Because it's git.)

September 18, 2009

Got the perl removal patches resubmitted, so the list can ignore 'em again. This is the... fifth time? Maybe?

Also finally got around to splitting up this year's OLS papers and uploading them to Properly indexed this time (ok, I cut and pasted the descriptions from the OLS website). I should go back and do that for previous years, but for some of them the database no longer worked last I checked...

No idea if there are slides or audio or video posted anywhere. I didn't go this year, and there isn't anything posted to their website about it. (Then again, the OLS website is kind of spotty. For example, their archives page doesn't list this year's proceedings yet, and the "past websites" list doesn't mention 2007 or 2008, even though they're there if you can guess the URL.)

September 17, 2009

It's been nice out, so we've had the windows open. This has its downsides. The neighbor who can't park, and who has an even dozen bottles of alcohol in his window (yes I counted, although the only two I remember were vodka and everclear), was outside with a friend of his being loudly racist. Apparently secret service people finding a guy with a sniper rifle near the white house is a laugh line, although a bigger laugh line was "What did they think he was going to do? Shoot the black guy!" (This was after he insisted to his friend that he was going to get an NFL contract because he had a dream about it. All done in a heavy rural southern accent, of course.)

This is the one neighbor who has remained constant since I bought the place. The others come and go, but this guy stays, because his mom owns the unit. (He was very proud of this fact back when he came over to yell at me for parking entirely within the lines of my space in a way that inconvenienced him.)

All this does make me seriously consider volunteering for one of Obama's health insurance reform phone banks this weekend.

The rebuild with the toolchain build scripts unified and the 2.6.31 kernel finished, and the targets to break this time are mipsel (but not mips), sh4 (build break in uClibc), and armv4l (but not v5l or v6l). Off to diagnose...

Ok, testing mipsel by hand, it worked. There's some kind of funky asynchronous failure in, which is probably a qemu thing and most likely has to do with catting a script into stdin without some kind of "expect" functionality. I wonder how hard it would be to write "expect" in bash?
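For what it's worth, a minimal "expect" in bash isn't too bad for the line-oriented case. This is a toy sketch, not FWL code: the background subshell stands in for qemu's serial console, and a real version would also need timeouts.

```shell
#!/bin/bash
# Toy "expect" in bash: talk to a child process through a pair of FIFOs,
# waiting for a pattern in its output before sending each response.
dir=$(mktemp -d)
mkfifo "$dir/in" "$dir/out"

# Stand-in for the emulated console: boot messages, a prompt, a reply.
( exec < "$dir/in" > "$dir/out"
  echo "boot ok"
  echo "login:"
  read -r user
  echo "hello $user" ) &

# Open the write side first, then the read side, so the FIFO opens
# pair up with the child's and nothing deadlocks.
exec 4> "$dir/in" 3< "$dir/out"

expect() {  # read lines from fd 3 until one contains $1, echo the match
  local line
  while IFS= read -r line <&3; do
    [[ "$line" == *"$1"* ]] && { echo "matched: $line"; return 0; }
  done
  return 1
}

first=$(expect "login:")
echo "root" >&4
second=$(expect "hello")
echo "$first"
echo "$second"
exec 3<&- 4>&-
rm -rf "$dir"
```

The same expect/send loop would replace catting a script into qemu's stdin, so each command only goes out after the prompt it's waiting for actually shows up.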

The armv4l problem seems the same, it worked for me building on my local laptop. The sh4 thing is that once again the asm types #ifdef guard changed (due to the sh 32/64 unification), so uClibc's bits/kernel_types.h needs updating. Whipped up a quick patch and submitted it to the uClibc list.

September 16, 2009

Slept 10 hours last night and managed to screw up my neck somehow. Ouch. Kinda distracting.

The gating issue on my todo list is getting The Great Refactoring to a good stopping point, so I can regression test the existing builds under it and then break for a bit to tackle the September 9th kernel release (and resubmit the perl removal patches while the merge window is still open), and then test the uClibc patches Bernhard sent me.

Next up after that, I borrowed Mark's BeagleBoard and I need to finish the HOWTO this week so I can get it back to him. Possibly this will involve getting an alternative build script (using gcc 4.4 with armv7 support) so I can build the x-loader and such for beagle.

Noticed that the mention of FWL on my resume was horribly out of date and did a newer summary, and filled in at least the Cisco stuff for the Impact Linux section. (It no longer fits in a page. It hasn't for years. More a CV than anything else at this point.)

Coming up on 2 weeks without caffeine. Still tired and unfocused, but more or less back to functional again. I'll probably go back on it when I leave for Ohio LinuxFest next week. Hopefully the recording of Mark's and my talk can go up on soonish. Possibly we'll also do an instructional video for the BeagleBoard once the HOWTO is finished.

September 15, 2009

My todo list runneth over, lemme see:

There's more, but that's the top of the todo list for now.

September 14, 2009

Today's musical selection at Lava Java appears to be some sort of novelty record. "What if indigestion could sing, and tried to be morose without actually sounding Russian?" And now it's drunk, gargling, and singing about "tabletop jones" in the presence of a disinterested piano. I'm fairly certain this is intentional; I'm just not sure why.

September 13, 2009

I spent rather a lot of last night trying to unify the binutils and gcc builds for The Great Refactoring, and I've made it up to the wrapper stuff. Now I'm trying to figure out if the wrapper and uClibc++ builds should be in a big binutils-gcc script, or if they should be split out.

The individual binary package tarballs complicate matters, of course. The package tarballs grab the new files between setupfor and cleanup, and since ccwrap.c isn't in packages (it's in sources/toys) it doesn't get packaged. (Currently the wrapper build is done before the gcc build's cleanup is called, so that's not an issue.) Plus building the wrapper shuffles around a lot of the gcc files, and there's no way to represent _moving_ files in a tarball. So packaging the wrapper by itself is problematic.

However, one of the goals of The Great Refactoring is being able to drop in replacements for various components: a glibc build instead of uClibc, a different compiler (tinycc, llvm/clang, pcc, or even a GPLv3 version of gcc) instead of the last GPLv2 version of gcc, and so on. The wrapper is currently tightly tied to the current gcc build, but other compilers might need it too. So if I want to do a drop-in replacement script to build binutils 2.18 and gcc 4.3 (for armv7l/beagleboard) obviously the script should only do that. But a binary package tarball of gcc that _doesn't_ contain ccwrap is kind of useless.

Perhaps I need to do the symlink approach, and do a null setupfor just so I can package the wrapper? I'd still need to move gcc->rawgcc, though. It MIGHT be possible to break off the wrapper build, but do I want to? Hmmm...

In any case, I should be able to eliminate the redundancy between and describing how to build the same packages in slightly different ways. That's the _other_ big goal here, not having to repeat the busybox build between and, not having almost the exact same the uClibc build sequence in both and, and so on.

And then there's the whole export of the funky static c++ support library so uClibc++ can suck it in and use it. That's a very ugly handoff between two different packages. I suppose I could just treat it as a very ugly prerequisite. And right now, uClibc is separated out from the toolchain build, so I suppose separating out uClibc++ makes as much sense.

Wheee. Design issues.

September 12, 2009

Much rain. Waited 2 hours for it to let up enough I could bike home from Einstein's Bagels this morning, and eventually biked home through the rain anyway. Autumn isn't supposed to start around here until mid-December, I thought? Climate's being weird. If an Indian summer is a warm week after the first frost, I guess a week of fall coming a couple months early would be... a pilgrim winter?

Another fun shell corner case, the "VAR=VALUE command" syntax works with source statements. Ala "VAR=VALUE ." will export VAR=VALUE, read in and execute in the current context, and then set VAR=VALUE back to whatever it had been before. Essentially, source statements have their own shell function context.
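A quick self-contained demonstration (the sourced file here is just a temp script created for the test):

```shell
#!/bin/bash
# Show that a VAR=VALUE prefix on a source statement is visible inside
# the sourced file and then restored afterward (in non-POSIX-mode bash).
tmp=$(mktemp)
echo 'echo "inside: $MYVAR"' > "$tmp"
MYVAR=original
MYVAR=temporary . "$tmp"
echo "after: $MYVAR"
rm -f "$tmp"
```

This prints "inside: temporary" and then "after: original", at least with the bash versions I've tried. One caveat: in POSIX mode the assignment persists after the source (since "." is a special builtin there), so the behavior is mode-dependent.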

Good to know, and quite useful for the refactoring I'm doing. I'm considering making the snippets be sourced so I don't have to say #!/bin/bash at the top of all of 'em. (I'm still ambivalent about the bash dependency and would love it if busybox ash grew the extensions I'm using. I need to test that, and possibly fix it.)

Also poking at the BeagleBoard HOWTO, hoping to get it finished this weekend.

September 11, 2009

Trying to merge the and scripts (at least the toolchain portions) is educational, even though I wrote them. I'm looking back at these flags and asking things, like why is the cross binutils build explicitly specifying AR AS LD NM OBJDUMP and OBJCOPY, but the canadian binutils build only specifying CC and AR? Why are they _different_? Or for that matter, why is cross specifying --with-lib-path=lib but the canadian one isn't? (I think it has to do with that "lib64" bug on x86-64, but don't quote me on that.)

I'm trying to squish this into one codepath, and it's _not_ a straight line. And binutils is only 1/10th as evil as gcc, the FSF hasn't _really_ maintained binutils for a decade now. (It was Cygnus, which got bought by Red Hat.)

Alas, this means I probably need to do more investigation. Possibly I need to look at my old blog entries, although I doubt I recorded what I was doing in enough detail. (For example, OBJDUMP all caps doesn't previously appear in this year, 2008, or 2007's blog entries.) I can dig through "hg annotate" to see where it was checked in, but most likely it was one big lump with a single comment. What I want to know is what happens when each of those isn't specified.

The thing is, over-specifying stuff can give me a build that works for all three of my use cases (creating a basic cross compiler from host to target, canadian crossing a native compiler for the target, and canadian crossing a cross compiler from an arbitrary host to an arbitrary target statically linked against uClibc on its host, ala i686->arm build on an x86-64 host). But it also accumulates cruft. How much of this is still needed for the 4.2.1 build, and how much was 4.1.2 only? What breaks if I try to spin a 4.3 toolchain (which, alas, armv7 needs, and that's what the shiny new Cortex A8 in the BeagleBoard is using)?

What I probably have to do is get the big overspecified one working, and then yank each symbol one at a time and run all three build variants without it to see what happens. And that's incredibly boring and time consuming. Oh well.
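The boring experiment is at least scriptable in outline. Everything below is a hypothetical sketch: build_variant just echoes what a real version would configure and make, and the ${TARGET} prefix is a placeholder.

```shell
#!/bin/bash
# Sketch: rebuild each variant with one overspecified tool variable
# removed at a time, to find out which ones are actually load-bearing.
ALL="AR AS LD NM OBJDUMP OBJCOPY"

build_variant() {  # $1 = variant name, $2 = variable overrides to pass
  # A real version would run configure && make here and return its status.
  echo "building $1 with:$2"
}

for victim in $ALL; do
  keep=""
  for v in $ALL; do
    [ "$v" = "$victim" ] || keep="$keep $v=\${TARGET}-$v"
  done
  for variant in cross canadian-native canadian-cross; do
    build_variant "$variant" "$keep" || echo "FAILED: $variant without $victim"
  done
done
```

Six variables times three variants is eighteen full builds, which is why it stays on the todo list.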

It's so easy to slap stuff together to make it work once, and always so much extra work to do it _right_.

September 10, 2009

Interesting post from a professional writer.

You are not owed a read from a professional, even if you think you have an in, and even if you think it's not a huge imposition. It's not your choice to make. This needs to be clear--when you ask a professional for their take on your material, you're not just asking them to take an hour or two out of their life, you're asking them to give you--gratis--the acquired knowledge, insight, and skill of years of work. It is no different than asking your friend the house painter to paint your living room during his off hours.

One of the rules about inviting convention guests is that you can't ask them to do whatever it is they do for a living for free. It's HORRIBLY impolite. To use an extreme example: if we invited Weird Al as a panelist, we couldn't ask him to sing. He makes tens of thousands of dollars for a concert, he worked years to achieve that, and no matter how much he may LIKE us, giving us even a little filk session for free would undermine that. Similarly, if we invited a professional author (defined as "one who makes their living by their income from writing"), we couldn't ask for a short story for our program book.

I vaguely recall that when we asked our web cartoonists to do covers for the program book, we bartered for it, giving them some advertising space elsewhere. (And we bought advertising on their website as well, so it was part of a larger commercial transaction where they got paid cash.)

Lots of hobbyist conventions don't understand this. Asking guests for their time so they can be admired and entertained is one thing (and it's entirely ok for them to say no simply because they don't want to, they don't _need_ a reason). Asking them to cheapen their professional services is something else entirely.

September 9, 2009

While I was at Starbucks re-reading a Tamora Pierce book, I thought of a solution to some of the FWL refactoring, and actually got enthused about working on it again. (For the first time in a month, not just vaguely poking at it out of a sense of obligation but _wanting_ to work on it.)


Got the uClibc build part reimplemented as a standalone script. (Well, kernel headers plus uClibc install. The two go together, no point in separating them. I'll probably do binutils+gcc the same way.)

I've noticed that Lava Java on 26th plays Zombie Sinatra, and the Thundercloud off of 15th plays a whole lotta Bob Dylan. Of the two, Dylan is a lot easier to listen to. (Sinatra wasn't a very interesting singer. Only reason I can think of for him getting big gigs is "mob ties". Then again, I don't understand the "beat poet" craze of the 1950s, either.)

And I attribute the continuing popularity of monotonic swearing (the C in rap is silent) to a cultural identity good from an oppressed and alienated minority youth demographic, and then a bunch of people romanticizing the mindless violence it portrays just like they do with pirates and vikings, none of whom you'd really want to meet (or smell) in person.

I'm all for a good beat. Lots of things from Queen's "We Will Rock You" to the Miami Vice theme to the Terminator II theme were pretty much just drumbeats. James Brown's career was built on nontraditional percussion, which I'm also a big fan of. (Drums are overused, and as good as "I Need a Hero" is, its use of a drum machine instead of a live drummer grates a bit after a few re-listens.) It can be done well, but just about all rap isn't.

Every time I criticize [c]rap, I'm worried that if I'd been born 40 years earlier I'd have been one of those people criticizing Rock and Roll. (Although I personally think Elvis is kind of boring too. Beatles yay, the Mamas and the Papas had some good bits, even the Beach Boys were accomplishing something interesting.)

But I suppose Frank Sinatra made even Elvis sound outright enthralling. (It's not like energetic music was _new_. "The Boogie Woogie Bugle Boy of Company B" is from World War II. They just replaced the small-scale troubadours with player pianos, and later phonographs, and scaled up the "big band" stuff to the point where everybody thought you needed 50 people to play an interesting song.)

September 8, 2009

The toolchain build that and do is close enough that I'm trying to split it out into a separate script I can call from both of 'em. But it's different enough that this is hard to do.

For example, the uClibc "make utils" stuff is horrible. (This is for ldd and readelf and ldconfig.) In the cross compiler case it's building a special magic "hostutils" target so it knows "no, you're not to try to link these suckers against the C library you just built, stupid", and then installing the results with a shell loop involving sed to rename the suckers (because the uClibc install for this is useless). In the native case, it calls make twice in a loop because making the library and making the utils at the same time isn't SMP safe (missing dependency somewhere I guess, I hate make). And then of course it does the utils install by hand, but in a different way.

Merging that together is ugly. Another bit of ugliness is the aforementioned "build uClibc before or after the toolchain" issue. The cross compiler _must_ build it after the toolchain or else there's nothing to build it _with_. The root filesystem would like to build it first so everything else links against the shared library that's actually installed (for all we know our cross toolchain is glibc based). But there's still the problem of telling an arbitrary toolchain to link against a specific C library, which is what the wrapper script does because gcc is built _entirely_ out of unfounded assumptions and overriding these assumptions from the gcc command line is possible but horrible.

If I assume the arbitrary toolchain is gcc, it gets slightly easier, but not by much. I can build the C library, then build a host version of the wrapper script, then run "gcc --print-file-name=libgcc.a" on the cross compiler to find the gcc magic internal library directory (and use the --print-file-name=include trick to find the compiler internal includes directory), then drop symlinks to everything to set up a cross compiler wrapper directory... It _might_ work, but would need testing. And of course it would only work with gcc.
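The probing step is cheap to sketch, assuming the compiler in question answers gcc's --print-file-name queries. CC below defaults to the host gcc just so the sketch runs anywhere; in the real case it would be the cross compiler.

```shell
#!/bin/bash
# Ask a gcc-style compiler where its internal library and header
# directories live, as a first step toward building a wrapper directory.
CC="${CC:-gcc}"
LIBGCC_DIR=$(dirname "$("$CC" --print-file-name=libgcc.a)")
INCLUDE_DIR=$("$CC" --print-file-name=include)
echo "internal libs:    $LIBGCC_DIR"
echo "internal headers: $INCLUDE_DIR"
# A wrapper directory would then symlink into those two paths.
```

If the file isn't found, gcc just echoes the name back, so a real script would have to sanity-check that the result is actually an absolute path.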

If our cross toolchain _isn't_ gcc... well, most likely the kernel and uClibc won't build with it. But assuming it's icc or some such, its static library may not be called "libgcc.a", and there's no generic "libcc.a" symlink the way there's libc.a pointing to glibc. (Note that gcc was personally maintained by Richard Stallman for longer than most other GNU projects, and thus its design is insular bordering on xenophobic. Dude does not like the concept of interoperability, he wants one way migration just like Microsoft.)

What's almost as ugly is that just about all builds are going to want a shared library, but not all want a toolchain. The forest of if statements the current root-filesystem has is horrible. I'm reluctant to break the script into a bunch of little scripts, which I did for the previous iteration of FWL (the pre-2006 one that used User Mode Linux instead of qemu and was thus x86 only), and that turned into a forest of complexity.

On the other hand, if I want to call the same snippets in a different order, having them in different files is pretty much the only way to go. Right now, it's easy to see what's happening in an entire build stage, it's still pretty much one big shell script per stage. There are setupfor and cleanup functions, but those are mentally easy to gloss over as "extract the tarball and cd into its directory" and "rm -rf the source we're not using any more and check for errors". (They do a bit more than that, but it's more convenient than important. Stuff to let multiple builds run in parallel, create stage tarballs, automatically patch the source as it's extracted and cache the extracted source so it doesn't need to be re-extracted, and so on.)
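In spirit (and only in spirit; the real functions also do the parallel-build, stage-tarball, and caching work mentioned above), the pair boils down to something like this sketch, where BUILD and the tarball layout are hypothetical stand-ins:

```shell
#!/bin/bash
# Minimal sketch of the setupfor/cleanup pattern described above.
BUILD="${BUILD:-$(mktemp -d)}"

setupfor() {  # extract a package tarball and cd into the result
  mkdir -p "$BUILD" &&
  tar xf "sources/packages/$1.tar.gz" -C "$BUILD" &&
  cd "$BUILD/$1" || return 1
}

cleanup() {  # check for errors, then delete the source we're done with
  [ $? -eq 0 ] || { echo "build of $1 failed" >&2; exit 1; }
  cd "$BUILD" && rm -rf "$1"
}
```

Keeping the pair this mentally small is what makes the per-stage scripts readable as a tutorial.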

I already have a top level wrapper that calls the other stages in order, so the whole thing isn't one _big_ script. I could turn those scripts into wrappers that call the individual package build scripts in order, assuming I can genericize 'em enough so things like the above uClibc build differences can be turned into a couple environment variables you pass in to tell it what to do.

What I want to _avoid_ is FWL turning into a mess like buildroot, where you can stare at the build for an hour and still not have any idea what's going on. Admittedly, that's mostly because it's a series of nested makefiles, which combines declarative dependencies with targets executing carefully sequenced imperative code, and most people these days give up and call makefiles recursively because they can't make this imperative/declarative mixture scale. Also, buildroot made the mistake of turning into a Linux distribution, building dozens and dozens of packages.

Right now, the FWL build scripts are close to a Linux From Scratch style tutorial: reading them teaches you how to build a cross compiler, how to cross compile a root filesystem, and how to package the result up into a system image. You only really _need_ to read three scripts to do this, plus to get the list of source URLs.

I'm reluctant to mess with that, but as projects increase you've got to split stuff into more modules. Figuring out the appropriate granularity is always tough, and shifts over time. :P

September 6, 2009

Ok, after about 3 days of sleeping 12 hours/day, I've at least gotten past the headache. Whether or not I can concentrate enough to do something useful is another question.

Arguing with the uClibc developers (mostly Bernhard). In the most recent release version, you selected your architecture in menuconfig using the "Target Architecture" menu. I've been using uClibc for most of a decade now, and that's always been the way you selected your target architecture. That menu is still there, it just no longer works, because commit 6625518cd68 broke it. The menu still saves all the symbols it used to, but now a _new_ symbol has shown up which can't be set from menuconfig.

The fact that this is a behavior change seems to have come as a complete surprise to Bernhard. He's maintaining the project. Fills you with warm fuzzies, doesn't it?

Hey, and kmail continues to deteriorate. I got 300+ identical spams (as you do), so I did the sort-by-title thing: click the start of the range, shift-click the end of the range, right click and move all to trash. With previous versions of kmail, this worked just fine. This time, kmail locked up for half an hour, rebuilding the indexes on every single mail folder I have. (It does this periodically, as a background process. There's no way to stop it that I've found.) Unfortunately, the two processes interfered with each other, and when it came back I found it had grabbed about 1500 random messages (scattered from 2005 to today) and thrown them in the trash too. (Yes, I'm sure I didn't mis-select the range in my caffeine deprived state. As far as I can tell, there _is_ no ordering in which I could have selected this particular swiss cheese arrangement.)

So kmail has random asynchronous cpu-eating background tasks (which you can't tell it _not_ to do. When it's been eating 100% of a CPU for more than 15 minutes I generally just "killall kmail" twice and re-launch it). Range selects are not safe to do while one of those is running, and not only does it not clearly notify you when it's doing this (it updates the little text bar at the bottom... after it's done), but it can choose to start doing them _after_ you start a range select action.

Isn't that short bus special? What a piece of garbage.

Ok, I need to find a new email client now. All things KDE are now too broken to live, and I need to make a clean break. I've heard good (well, not bad) things about thunderbird...

September 5, 2009

Still cold turkey on caffeine. Spent most of yesterday with a headache. I normally try to do this annually, but it's actually been about 3 years since I did this. Oops.

Fade was out visiting friends most of today, so I played Sims 3. (About all I felt up for.) There was a full tin of penguin mints on her desk the whole time, taunting me.

Of course I'd find a store on Guadalupe that sells the Orange Mango Passion Fruit Rockstar energy drinks I like _now_. Bought one. It's in the fridge, taunting me.

Gonna be a long couple weeks.

September 4, 2009

Looks like the sysdeps thing (which is probably an arm-specific problem) _didn't_ get fixed, it's just that git 6625518cd68 added new breakage that strikes first. Nice. Need to poke the mailing list.

I ran out of tea yesterday, and emptied my current tin of Penguin mints at the same time, so I haven't had any caffeine for over 24 hours. That would explain why I both went to bed before midnight and slept until 2pm.

Visited Mark and started on a BeagleBoard HOWTO, but I'm just too out of it to finish right now.

September 3, 2009

Well, Jonathan Corbet did do a write-up of the kernel thread. Unfortunately, he seems to have missed some of the subtleties in the RAID case.

The problematic paragraph is:

But what if more than one drive fails? RAID works by combining blocks into larger stripes and associating checksums with those stripes. Updating a block requires rewriting the stripe containing it and the associated checksum block. So, if writing a block can cause the array to lose the entire stripe, we could see data loss much like that which can happen with a flash drive. As a normal rule, this kind of loss will not occur with a RAID array. But it can happen if (1) one drive has already failed, causing the array to run in "degraded" mode, and (2) a second failure occurs (Pavel pulls the power cord, say) while the write is happening.

The "additional failure" isn't the same _kind_ of failure. You don't need an additional disk to fail, all you need is an unclean shutdown of a degraded array, because "dirty plus degraded" means one or more groups of stripes are probably unreadable and the data in them is lost. That means _any_ unclean shutdown with a degraded array that's being written to can eat your data despite journaling, and NOT JUST THE DATA YOU WERE UPDATING AT THE TIME. If your kernel panics or hangs with a degraded dirty RAID, you can lose random data from OTHER FILES, in a way journaling won't even _detect_ let alone fix.

Starting a paragraph presumably talking about that issue with "But what if more than one drive fails?" is extremely misleading. The second failure isn't an additional disk failure, it's an unclean shutdown while writing to the disk (or having recently written to the disk, since the data is buffered and could be written back at any time, and don't get me started about atime updates). That's a different issue.

Incidentally, my laptop panicked again last night, still running stock Ubuntu 9.04 with updates from Ubuntu's updatey thing. Probably a delayed reaction to having unplugged and re-plugged my USB bluetooth dongle earlier in the evening (the bluetooth stack really doesn't like controllers going away between reboots), but that's just a guess since it panicked with X11 up and all I got was a frozen display (including mouse pointer) and flashing keyboard LEDs.

The point is: the kernel _does_ panic and hang. That's why "watchdog" and "heartbeat" systems exist. If either of those triggers while a system has a degraded software RAID, you can get silent random data loss in parts of the filesystem that weren't recently updated. All the arguments about "use a UPS", "use ECC memory", "use dual power supplies with thermal sensors", "don't use PC hardware"... are a side issue to that.

Here's another bad paragraph from the article:

RAID arrays can increase data reliability, but an array which is not running with its full complement of working, populated drives has lost the redundancy which provides that reliability. If the consequences of a second failure would be too severe, one should avoid writing to arrays running in degraded mode.

Which is misleading, because redundancy isn't what provides this reliability in other contexts. When you lose the redundancy, you open an _unrelated_ issue of update granularity. A single disk doesn't have this "incomplete writes can cause collateral damage to unrelated data" issue, nor does RAID 0 (which has no redundancy in the first place). A degraded RAID 5 is _more_ vulnerable to data loss than RAID 0 is, which is not what people are taught in most classes about this, and is a point the article seems to have missed.

The fundamental issue is update granularity being bigger than the filesystem expects, which is a _software_ issue, not the standard RAID 5 purpose of protecting against physical disk failures (which is a _hardware_ issue).

I attempted to document this already. It'll probably be ignored.


Ooh, the biggest ball of twine in Minnesota is 61 miles north of where my sister lives, pretty much a straight shot from New Ulm to Darwin along Route 15.

According to Weird Al, since he did the song they've actually built a "Twine Ball Inn" which sells postcards that say "Greetings from the twine ball, wish you were here".

Next time in Minnesota, I may be required to go to this thing.

I wonder if sources/patches should be in packages/patches instead? Since they correspond directly to the tarballs. (Except that the patches are checked into source control, and all the files in packages are downloaded or generated.)

Still bisecting the uClibc thing. They had a "fix compilation" commit that made my revert of their original commit stop applying, but didn't actually fix the compile. Wheee...

Ooh! The filker at Confluence 2006 (back in Pittsburgh) who sang the marvelous song about the Mandelbrot set turns out to have been Jonathan Coulton, back when he did tiny little cons where the room holds maybe 100 people (if they stand) and opened for acts like Ookla the Mok. I found him performing the song on YouTube, and pulled up the old Confluence 2006 page to confirm he was listed among the filkers.

Cool. If I'd made the connection I'd have followed his "thing a week" when it was happening (instead of putting it on my todo list and never getting around to it), and paid a lot more attention to him when all my friends were pointing me at him in 2007. :)

Naturally, he teamed up with Paul and Storm, the two members of Da Vinci's Notebook who are still performing...

The uClibc bisect wandered past a fix commit that only fixed half the problem, requiring me to manually edit the source to test past it. (Luckily, I built the FWL infrastructure so that if you edit the build/packages/alt-uClibc directory after it's extracted, it'll just accept what's there, as long as sha1-for-source.txt still contains the sha1sums for the source tarball in packages and for each sources/patches/alt-uClibc-*.patch, without any extra lines; removing patches calls for a re-extract too. What you do after it's extracted is your problem, but when I do the next "git archive HEAD | bzip2 > ~/firmware/packages/alt-uClibc-0.tar.bz2" the setupfor function will automatically re-extract and re-patch the uClibc source.)
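The staleness check described above might look something like this (approximate logic, simplified paths; the real setupfor does more than this sketch):

```shell
# Rough sketch of the setupfor staleness check: keep the extracted tree
# if sha1-for-source.txt lists exactly the sha1sums of the tarball plus
# every matching patch (in order), otherwise a re-extract is needed.

is_extraction_current()
{
  local srcdir="$1" tarball="$2"
  shift 2   # remaining arguments: the patch files

  [ -f "$srcdir/sha1-for-source.txt" ] || return 1

  # Expected sums: tarball first, then each patch, one sha1 per line.
  local expected="$(sha1sum "$tarball" "$@" 2>/dev/null | awk '{print $1}')"
  local recorded="$(cat "$srcdir/sha1-for-source.txt")"

  # Any extra or missing line means the tree is stale.
  [ "$expected" = "$recorded" ]
}
```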

Alas, this got me to 4ae13b764d61fa9 which does:

  CC libc/sysdeps/linux/common/syscall.os
libc/sysdeps/linux/common/syscall.c: In function 'syscall':
libc/sysdeps/linux/common/syscall.c:11: warning: asm operand 1 probably doesn't match constraints
libc/sysdeps/linux/common/syscall.c:11: error: impossible constraint in 'asm'
make: *** [libc/sysdeps/linux/common/syscall.os] Error 1

Which is assembly screwed up in a different file, in a different way. So I have to fix this. Grr. Once again I'm changing what I'm bisecting for halfway through a bisect, and hoping I can keep it straight in my head, despite every cycle taking ten minutes between decision points and there being no easy way to check back at previous commit/result pairs to see what I was doing unless I write up notes for myself. Wheee. I do this for fun?
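For what it's worth, git does keep a record of the commit/result pairs: "git bisect log" replays every verdict so far. A throwaway demo (scratch repo built from nothing, so none of this reflects the actual uClibc tree):

```shell
# Demo of "git bisect log": build a scratch repo, start a bisect, and
# dump the recorded good/bad verdicts. Doesn't help when the symptom
# being bisected for changes mid-stream, but the history isn't lost.

demo_bisect_log()
{
  local repo="$(mktemp -d)"
  (
    cd "$repo" || exit 1
    git init -q .
    git config user.email you@example.com
    git config user.name you
    for i in 1 2 3 4 5
    do
      echo "$i" > file
      git add file
      git commit -q -m "commit $i"
    done
    # bad=HEAD, good=HEAD~4; the checkout chatter is noise here
    git bisect start HEAD HEAD~4 >/dev/null 2>&1
    git bisect log
  )
  rm -rf "$repo"
}
```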

Time to play Sims 3 for a bit, I expect.

So I tracked down the "impossible constraint" issue to uClibc git b1913a8760599 which moves an #include of and adds a symbol defined in that. Wheee. (The one right before that, 0784fb9ef, worked fine, at least with 5efcfdc5150 reverted.) Presumably the fix will be to unistd.h, which is a file uClibc provides.

September 2, 2009

I hate git. I really hate git.

My wwwsend script does a "git pull" on all the interesting repositories every time I sync my blog. (It also backs up some directories to a server, rsyncs my mercurial repositories, and so on. It would even update the page if I'd done anything to that in months.)

This means the uClibc archive I've been playing with got a "git pull" while I was in the middle of a "bisect". So when I did a "git bisect reset", it said:

Your branch is behind 'origin/master' by 5 commits, and can be fast-forwarded.

Here's why I hate git: it never says how you do a fast forward. I mess with most software the way I used to play "zork": I'd look for keywords in the text I could type, and if that failed I'd make a couple obvious guesses or go "git help" and see if that listed an interesting looking keyword. This approach works fine for mercurial, but for git it's useless.

Instead git has 8 gazillion man pages, without a clear list of what they are. (That's all the git help pages are; if you go "git help sodomy" it says "No manual entry for gitsodomy".) It's full of magic assumptions and keywords that you're just supposed to _know_, some of which are listed in the 859-line man page for the "git" command itself, but text searching that for "fast-forward" didn't find a hit.

So the way to deal with this is to connect to the internet (which my laptop is not always within arm's length of) and use google. In this case, the first hit is this which has an 'On merges and "fast forward"' section near the end that helpfully explains this is something the merge command does. I just did a git diff and there's no local changes in my repository, and I've only pulled from one upstream. What the heck is there to _merge_?

The third google hit is the "git pull" man page, which shows the --no-ff and --ff options. Here's the help text for the --ff option:

Do not generate a merge commit if the merge resolved as a fast-forward, only update the branch pointer. This is the default behavior of git-merge.

I want to "update the branch pointer" without actually A) generating a new and utterly meaningless commit in the repository that makes my repository differ from upstream, B) going out to the net and grabbing yet more changes from the upstream server as a side effect. Is there a way to do this? I honestly have no clue, nor do I know where to start looking, nor do I really care enough: it sounds like the next pull will fix this.
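Answering my own question after the fact, maybe: "git merge" with no network involved will fast-forward a branch that's strictly behind without creating any new commit, and current git grew a --ff-only flag that refuses to do anything else. Scratch-repo demo (assuming git is installed; the repo layout is invented):

```shell
# Fast-forwarding without a merge commit: a clone falls one commit behind
# its upstream, fetches, then "git merge --ff-only" just moves the branch
# pointer. rev-list --count confirms no extra merge commit appeared.

demo_fast_forward()
{
  local tmp="$(mktemp -d)"
  (
    cd "$tmp" || exit 1
    git init -q upstream
    (
      cd upstream
      git config user.email you@example.com
      git config user.name you
      echo one > file
      git add file
      git commit -q -m one
    )
    git clone -q upstream clone 2>/dev/null
    (
      cd upstream
      echo two > file
      git commit -q -am two     # now the clone is behind by one commit
    )
    cd clone || exit 1
    git fetch -q origin 2>/dev/null
    git merge --ff-only @{u} >/dev/null 2>&1   # @{u} = upstream branch
    git rev-list --count HEAD                  # 2 commits, no merge commit
  )
  rm -rf "$tmp"
}
```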

This is a piece of software where I have little or no control over what it's doing, I'm just doing cargo cult programming at it in hopes that its deep pool of black magic remains undisturbed and does not vomit forth a Cthulhian behavioral problem.

I don't feel that way about Mercurial. It's built on the same general principles but I understand what it's doing and have even tweaked its source code to get it to stop doing it at times. The mercurial UI is vaguely friendly, if a bit brusque in places. The git UI is a crawling horror requiring an extensive initiation before it will allow you access to its secrets. The two programs are doing approximately the same thing, but which one's popular in the kernel community? Three guesses...

Speaking of which, the interminable kernel thread I've been cc'd on continues, but I think I've run out of willingness to read it anymore. The private emails are fun too, such as the guy who had the (and I quote): "blatantly obvious realization that if you add an extra disk to gain safety from redundancy, then losing a disk eliminates the safety which that redundancy provides."

To which my reply was:

Except the safety you're losing didn't come from that redundancy, since 
journaling works just fine on a single disk.

RAID protects you from the physical failure of a single disk.  Journaling 
protects you from system failures leaving the disk in an inconsistent state 
requiring an fsck before you can safely write to them again without 
overwriting or leaking allocated storage space for data.  This failure mode is 
neither: it's a write granularity issue overwriting data in seemingly 
unrelated storage space to which a write transaction was not recently addressed.

Interesting thread.  On one side are people insisting it's user error for not 
having a deep enough understanding of how the implementation details of 
particular storage devices are leaking through the "block device" abstraction.  
On the other are people offering "blatantly obvious" analogies that only make 
any sense if you have no clue what's actually going on.  And both sides are 
arguing that this is why it _doesn't_ need to be documented.

Considering how I ended my most recent post to the list, and that I've already tried to shut up and show them the code, I think I'll just call it a loss and move on to something more productive even if I _am_ still being cc'd.

Otto von Bismarck said "Laws are like sausages, it is better not to see them being made." (Except he presumably said it in German. The fact that Wikipedia thinks he wasn't the one who said it actually makes me feel better about the attribution.) Occasionally I'm tempted to add open source software to Otto's list.

September 1, 2009

Fade was in class all day and her Big Shiny iMac Monolith was just sitting there unused, so I spent most of the day playing Sims 3. (My household now has a ghost baby, from a firefighter mother so she may have the hidden pyromaniac attribute letting her set stuff on fire when she gets older.)

Yup, the talk will be titled "Developing for non-x86 targets using QEMU", and the description I submitted is:

Emulation allows even casual hobbyist developers to build and test the software they write on multiple hardware platforms from the comfort of their own laptop. QEMU is rapidly becoming a category killer in open source emulation software, capable of not only booting a knoppix CD in a window but booting Linux systems built for arm, mips, powerpc, sparc, sh4, and more.

This panel covers application vs system emulation, native vs cross compiling (and combining the two with distcc), using qemu, setting up an emulated development environment, real world scalability issues, using the amazon cloud, and building a monster server for under $3k.

Spent most of the evening bisecting why uClibc doesn't build. Bisecting finds more than one bug, of course. Narrowed down the build break introduced by uClibc-git 5efcfdc5150 which sprinkles attribute_noreturn in a lot of places that (in my .config at least) it's never #defined in the headers.

There are actually several different things I should track down. I'm using a patch to prevent uClibc from throwing pointless absolute paths into its linker scripts (paths which prevent a toolchain using it from being relocatable, by which I mean installed at an arbitrary location such as a user's home directory without being recompiled from source). I submitted that patch upstream to the uClibc guys, but got sufficient pushback (along the lines of "gcc's --sysroot option can work around this in a very clumsy way and we all use that", even though what my patch does is _remove_ unnecessary code) that I thought it wasn't going to be merged. Nevertheless, somewhere between uClibc-git c54be380406 and 9889c8791b5 the problem got fixed upstream, and I should bisect to see where and what changed. (This is another example of "good" and "bad" being reversed in a bisect: the "good" fails and the "bad" succeeds. What they really mean is "before" and "after" the patch I'm looking for.)

But meanwhile, I should track down the "everything is x86-64" I posted about yesterday, because the bug I just tracked down was not that bug, merely a bug preventing me from _finding_ that bug.

August 31, 2009

And the qemu guys broke stuff again. Now the curses support is enabled by default, but doesn't disable itself when stdin isn't a tty. That means stdout is full of ascii escape crap even when run from something like "expect". Wheee...
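A wrapper can at least defend against the curses-by-default surprise by checking whether it's talking to a terminal before picking qemu's display arguments. -nographic is a real qemu flag; the wrapper itself is just a sketch:

```shell
# Pick qemu display arguments based on whether stdin is a tty:
# interactive runs get qemu's default display, scripted runs (expect,
# cron) get -nographic so stdout isn't full of escape codes.

qemu_display_args()
{
  if [ -t 0 ]
  then
    printf '%s\n' ""             # interactive: let qemu pick its display
  else
    printf '%s\n' "-nographic"   # not a tty: no curses escape spew
  fi
}
```

Something like `qemu -kernel zImage $(qemu_display_args) ...` would then do the right thing under expect.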

Mark and I will be speaking at Ohio LinuxFest in Columbus Ohio, September 25-27. I'll post details when I have 'em.

Mark and I currently have the following talks prepared, so it'll probably be one of these:

Cross Compiling:
  - Why it sucks, how to do it anyway, and alternatives (see QEMU panel).
  - Toolchain weirdness: Host vs target (vs build), canadian cross, leaks.
  - Lying to configure, pitfalls for the unwary, pitfalls for the wary.

Building embedded system images with uClibc and BusyBox
  - What Linux From Scratch doesn't tell you.
  - Smallest near-posix shell prompt is linux, busybox, uClibc.
  - Add gcc, binutils, make, and bash and you have a self-bootstrapping system.

Developing for non-x86 targets using QEMU.
  - Emulation allows even casual hobbyist developers to build and test their
    software on multiple hardware platforms from the comfort of their laptop.
  - QEMU is rapidly becoming a category killer in open source emulation
    software, capable of not only booting a knoppix CD in a window but booting
    arm, mips, powerpc, sparc, sh4, and more.
  - This panel covers application vs system emulation, native vs cross
    compiling (and combining the two with distcc), using qemu, setting up an
    emulated development environment, real world scalability issues, using the
    amazon cloud or building your own monster server for under $3k.

A survey of embedded build systems.
  - buildroot, openwrt, open embedded, gentoo embedded, Fedora for Arm,
    emdebian, openmoko, scratchbox...
  - Orthogonality, reinventing the wheel, accidental distros.

Firmware Linux and Gentoo From Scratch
  - The creators of two embedded Linux build systems explain their design
    choices.  What they are, how to use them, and why we did them that way.

From idea to device
  - What's involved in actually installing and running your software on a
    standalone piece of hardware?  Choosing a processor and development board,
    toolchain, kernel, root filesystem, jtag, bootloader, emulator, debugger
    UI issues for a headless box, the Hello World problem.

At a guess, probably the QEMU one.

August 29, 2009

Current uClibc-git dies trying to build for an i686 target with:

  AS lib/crt1.o
libc/sysdeps/linux/x86_64/crt1.S: Assembler messages:
libc/sysdeps/linux/x86_64/crt1.S:96: Error: bad register name `%rdx'
libc/sysdeps/linux/x86_64/crt1.S:97: Error: bad register name `%rsi'
libc/sysdeps/linux/x86_64/crt1.S:98: Error: bad register name `%rsp'
libc/sysdeps/linux/x86_64/crt1.S:101: Error: bad register name `%rsp'
libc/sysdeps/linux/x86_64/crt1.S:103: Error: bad register name `%rax'
libc/sysdeps/linux/x86_64/crt1.S:107: Error: bad register name `%rsp'

Yup, that's x86-64 assembly out of the x86_64 directory, when the .config says it's an i686 target. It's autodetecting what host I'm building on, and building for that despite what I told it. (Note that this is the exact same build script that cross compiled the last release version just fine.)

And it does it when building for armv4l too. Fired up menuconfig and confirmed it's got "arm" selected as the target...

It's official, even the uClibc developers can't keep host and target straight. Cross compiling _sucks_. And now I have to go bisect this to see when it was introduced, and see if maybe they have some new redundant way to specify the target a _third_ way. (They've got .config, they've got the toolchain #defines, but that's not _enough_. Although it was for

August 28, 2009

Visited Mark's; we worked out that the instructional videos should probably be audio with either slides or screen capture. (Coincidentally, MacOSX 10.6 came out today, and has the ability to make screen capture videos built into the new OS release. Convenient, that. SSH to a Linux box and type thingies live, while recording audio.)

Meanwhile, I'm trying to figure out why current busybox breaks the binutils build. The large range of build breaks doesn't help with this, my first bisection attempt insisted that the breakage was in a range of 8 commits I'd skipped all of because they didn't build. So I tweaked the trimconfig to remove the dhcp server and client, which seem to be what's breaking. (Ok, disable hdparm too. And mkfs.vfat...)

Ok, it's busybox commit 09e63bb81f127 that broke, which claims to be an attempt to fix "df /" by hardwiring in a string compare for the word "rootfs", rather than handling the general overmount case which my toybox df implementation was doing back in 2007. So the patch itself is the wrong approach to the problem it tries to fix, and contains lots of totally unrelated code that has nothing to do with its stated goal which is probably what's breaking my build. (Since tar timestamps are what's getting screwed up, fiddling with df ain't gonna screw that up.)

Nevertheless, it looks like I need to push my toybox implementation of df into busybox... Sigh. Dowanna. Not fun anymore.

Deal with it in the morning.

August 27, 2009

Fade spent today in class and I did _not_ play Sims 3 on her computer! (This is an _accomplishment_. I am PROUD!)

I got sucked into a Linux-kernel mailing list discussion again. (In my defense, I was cc'd.) It's about documentation of a fairly nasty bug, and it's reminded me of why I don't hang out on lkml anymore.

The bug is an assumption in most standard Linux filesystems (including ext2 and ext3). They all assume that the update granularity of the underlying block device they're writing to is greater than or equal to the filesystem's granularity. (I.E. that the smallest write you can do to the block device is the same size or smaller than a filesystem block.) Oh, and also that the alignment works out, so that data you're not writing doesn't get destroyed as "collateral damage" by a failed write operation.

This isn't true for flash, which has "erase blocks" up to a couple megabytes each. As with burnable cd-roms, you have to do an "erase" pass on an area of flash memory before it can have new data written to it. You can't erase just _part_ of an erase block, it's all or nothing.

Actual Linux flash filesystems (like jffs2) are aware of this. That's why you can't mount them on an arbitrary block device, you have to feed them actual raw flash memory hardware (or something that emulates it) so they can query extra information (erase block boundaries). They cycle through the erase blocks and keep one blank at all times, copying the old data to the blank one before making the old one the new blank one. (That's why mounting them is so slow, the filesystem has to read all the flash erase blocks to figure out which one's newest, and which order to put them in.)

But flash USB keys pretend to be normal USB hard drives, which you format FAT, or ext3 or some such. And when you write to them, what they're actually doing internally is reading a megabyte or so of data into internal memory, erasing that block of data, and then re-writing it with your modifications. Generally they'll cache some writes first so they're not rewriting a whole megabyte of flash every time you send another 512 byte sector (which would not only be insanely slow but would run through the limited number of flash write cycles pretty quickly).

This sounds fine... until you unplug one during a write and the device loses power between the internal "erase" and "rewrite". Suddenly, the sector you were writing to might be ground zero in the middle of a megabyte of zeroes. You can lose data before, after, or on both sides of the sector it was updating.
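To put a number on the blast radius: any byte offset you write lands in an erase block whose entire range is at risk. Quick shell arithmetic, assuming a 2 megabyte erase block (sizes vary by device):

```shell
# Which byte range is at risk when a write at $offset gets interrupted?
# Everything in the same erase block. The block size is device-dependent;
# 2MiB (2097152 bytes) is just an example value.

erase_block_range()
{
  local offset="$1" blocksize="$2"
  local start=$(( (offset / blocksize) * blocksize ))
  echo "$start $(( start + blocksize - 1 ))"
}
```

So a 512-byte sector write at byte offset 4096000 with 2MiB erase blocks puts bytes 2097152 through 4194303 at risk, not just the sector being written.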

Journaling doesn't protect you from this, because it was built on the assumption that data you weren't writing to didn't change. The "collateral damage" effect of flash erase blocks undermines what journaling is trying to do, by violating the design assumptions of conventional Linux filesystems. In fact if the journal isn't what got zeroed, the journal replay may complete happily and mount the filesystem and hide the fact anything went wrong... until you try to use the missing data, anyway. (Not that using a non-journaled filesystem would actually _improve_ matters, but it's less likely to give you a false sense of security.)

For further reading, see Linux filesystems expert Valerie Aurora's excellent post on why she personally doesn't want a flash disk in her laptop.

So a Linux developer named Pavel Machek tried to document this months ago, which is where I first heard about it. (It explained some failures the Neuros guys were seeing at the time, so I passed the info on.) The response from the kernel community was "USB flash disks that pretend to be hard drives are crap, don't use 'em."

A couple days ago Pavel came back because he found a _second_ case where the "update size <= filesystem block size" problem could cause data loss: degraded software raid 5. When a RAID 5 loses a disk, it reconstructs the missing data via parity checking, based on the "stripe size". According to this howto, a reasonable stripe size for raid 5 is 128k, times the number of disks in the raid (minus one) gives you the chunk size. So a 5 disk raid would give an update granularity of 512k.

Back to degraded raid: when it loses a disk it can read all the matched stripes and compare them to reconstruct the information, but it needs the relevant stripes from all the remaining disks to do this. It can't read just some of the stripes and get anything useful. Meaning when you're writing to it, if you write some of the stripes but not others (because the power failed, the kernel panicked or hung, a watchdog timer rebooted your machine, a load balancer killed the xen instance running your node...) you not only lost the data you were writing, you lost some or all of the data in the matched stripes. Collateral damage again, because the update granularity was bigger than the sector size.
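The granularity arithmetic above, as a sanity check (the chunk size and disk count are the example values from the howto, not recommendations):

```shell
# Effective update granularity of a RAID 5: the per-disk chunk size times
# the number of data disks (total disks minus one for parity). 128k
# chunks on a 5 disk array = 512k, matching the example above.

raid5_granularity_kb()
{
  local chunk_kb="$1" disks="$2"
  echo $(( chunk_kb * (disks - 1) ))
}
```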

This is more or less a second manifestation of the same problem. There's an assumption at the core of most Linux filesystems that's not necessarily true on modern hardware. (And as hardware gets bigger, the update granularity is likely to get coarser.) Once I've seen a problem twice, I expect to see it a third time, so I've been hoping this gets documented and possibly even addressed in new filesystems like btrfs (although how exactly to do that is an open question).

Unfortunately, the linux-kernel developer trend for the past few years is "if it doesn't hit big iron, we don't care". Developers keep bringing up the Google cluster, cloud storage, invoking full-time trained system administrators magically knowing all this already, insisting that nobody should ever use software raid when battery backed up RAID cards exist, and generally ignoring the existence of individual users, small businesses with less than 30 employees, embedded developers, and so on. So we talk past each other a lot.

(Not that I can really argue with the assertion that adding a mention of this to the kernel Documentation directory would be useless because nobody actually reads the kernel Documentation directory.)

Given the funding profile of Red Hat these days, the big iron tropism isn't really surprising, but it is discouraging. (The need for IBM's big iron systems to enumerate zillions of devices is the reason transport type is more or less hidden by the scsi layer. The fact your laptop has exactly one SATA drive and you haven't _got_ a device enumeration problem finding your root filesystem doesn't help IBM, therefore the kernel developers employed to solve big iron problems went to great efforts to make this _everybody's_ problem until either it was fixed for IBM or until nobody was using Linux on laptops anymore. Both seem to have occurred. Back in 2006 when I first put mdev into busybox, I was really annoyed about this.)

Of _course_ there's politics involved in what used to be primarily technical discussions. The fact that the guy who started this conversation, Pavel Machek, has apparently been making a huge nuisance of himself on other mailing lists came as a bit of a surprise to me. It doesn't change my position in the debate, but it does undermine it significantly. Oh well.

This is why I don't hang out there anymore. The signal/noise ratio is still reasonable, but the signal/pain ratio buries the needle. I don't _want_ to have to care about any of this. (I'm not particularly happy about needing to know it in the first place, I don't _want_ it to be my problem.) I've written and then _deleted_ about as many messages as I've actually posted. (I try to be polite, sometimes very hard, but the snark comes out eventually.)

Hopefully, this can become somebody else's problem and I can go back to happily ignoring lkml until they break something I care about again (I.E. the next quarterly release, like clockwork).

August 26, 2009

Fade's classes started yesterday. Today, we did 8 gazillion errands.

The most expensive was that we bought yet another printer, because the Epson CX8400 we had turns out to need its color cartridges replaced when you buy new black ink for it, even when it wasn't complaining about them _before_ I put a new black ink cartridge in, and now it's complaining about all three. (We've printed basically nothing in color so far.)

Called Epson tech support and the nice lady in India confirmed this is Broken As Designed (not _quite_ in so many words), and since this is too evil money grubbing bastard weasely for me to ever want to give money to Epson ever again in my _life_, and Fade needs a printer for class, we hit Fry's and bought a laser printer. It just does black and white, but even the cheesy "substandard free toner cartridge that comes with it" prints 1000 sheets. And a new toner cartridge is $50 and prints 3000 sheets. We should come out ahead in about two semesters.

I'd mention Mark, but apparently his relatives have found my blog and are using it to indirectly keep tabs on him. (I keep forgetting that people other than me read this. I also keep forgetting to update it. I'm spending a lot more time on twitter than here these days.)

August 25, 2009

Went with the delay. It's not a _good_ solution, but it's reliable and doesn't require tearing out huge amounts of infrastructure or making unportable assumptions about the host environment.

Also, I have to drop a timestamp file into the snapshotted source directory. I can't use the directory's own timestamp because anything written into it (which not only the make but the install as well can do) updates it: a directory is essentially a file of dentry records, so when one of the dentries changes, the timestamp for when the directory was last modified gets updated. (In theory I could try to compare creation times instead of modification times, but find doesn't give me that kind of control.)
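
A quick demonstration of the dentry behavior (assuming a stat that supports -c, which GNU and busybox both do; the directory name is made up):

```shell
# Writing any new file into a directory bumps the directory's own mtime,
# which is why the directory can't serve as its own "build started" marker.
D=$(mktemp -d)
BEFORE=$(stat -c %Y "$D")
sleep 1                   # get past one-second timestamp granularity
touch "$D/newfile"        # adds a dentry to the directory
AFTER=$(stat -c %Y "$D")
[ "$AFTER" -gt "$BEFORE" ] && echo "directory mtime bumped"
```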

And of course I had to fix yet another instance of bash doing:

while [ "$THING" == "$(thingy)" ]
do
  blah blah blah
done

Because apparently $(thingy) is evaluated once and never again. But if you change it to:

while true
do
  [ "$THING" != "$(thingy)" ] && break
  blah blah blah
done

That, apparently, works. Go figure.

Yes, this is the current bash in Ubuntu 9.04 that still has this bug. That's also the version with the bug where the current headphone volume when you suspend your laptop becomes the maximum volume when you resume your laptop, with the slider moving smoothly between those values. (And of course I feel nervous having the volume 100% up on my headphones because if it _does_ suddenly reset itself to sane behavior, my ears will ring for some time. Also, if I unplug the headphones the _speakers_ are really loud, but plugging the headphones back in it's really quiet again. Yes, I've made sure the little volume knob on the headphones themselves is all the way up, that's not it. It's that the sound driver has independent controls for the volume of the speaker and headphone output channels, but the Linux volume control tools present them as the same thing, even though it lets them get out of sync.)

And it's the same version of Ubuntu where kmail and firefox both regularly hang for 30 seconds at a time, refusing to even repaint until they've finished whatever strange internal thumb-twiddling they're doing. (I'm told firefox is garbage collecting. Any garbage collector that takes 30 seconds to run is broken. I have no idea what kmail is doing, it's had the CPU pegged for about 4 hours now. Even when it is responding to me, it's busy doing _something_ in the background. Badly.)

I've stopped even trying to complain about what they did to vi...

Linux on the desktop: smell the usability.

Gave up and did a "killall kmail" (you have to do it twice, the first one gets caught up in the endless wait), and re-launch it. Then it's responsive again. For a while...

Checked the binary package tarballs thing in and posted a message about it on the FWL mailing list.

August 24, 2009

So I'm poking at toybox again, because the "touch" command in toybox does a lot more than the "touch" command in busybox, and at this point it's easier to add -R and -d to toybox touch than to busybox touch.

I'm also looking at the principles behind the program and still happier with its infrastructure than I am with busybox's. I like having a help command, so you go "./toybox help touch" instead of busybox's strange mixture of "./busybox --help touch" and "./busybox touch --help". The "help" command is the bash builtin to tell you about shell builtin commands, it's a bit like man except it doesn't look for data files on the filesystem. I may eventually add some variant of man behavior to toybox (not troff based, though) and have it fall back to help for stuff it can't find. Dunno.

I think I need to write up fresh infrastructure documentation. It's not a big program, and I've still got the old design and implementation docs. But I need to write up a series of explanations about what problems I was solving and why I did it that way, because I keep hitting things in busybox that still do it wrong, which I assumed were fixed because _I_ fixed it. (Yes, "tar cvjfC filename directory" is still broken in busybox.)

If I picked toybox back up and made it my main full-time project, I could do a better job than what's out there. If I picked my tinycc fork back up and made it my main full-time project, I could do a _far_ better job than what's out there. Right now FWL is my main programming hobby project but it's just a hobby I poke at as the mood strikes me, and if I really buckled down I could be doing nightly tests of uClibc, busybox, Linux, and qemu on each target QEMU supports... Which would probably bring in a full time supply of bugs to diagnose and fix ad nauseum...

August 23, 2009

I've been taking notes on the health care thing, and it's to the point where I'm writing up a rant on it just to get it straight in my own head.

Obvious things, like: You're aware Bill O'Reilly and Rush Limbaugh and Glenn Beck and such are _actors_, right? Alan Alda can't perform surgery despite years on MASH any more than Johnny Depp is actually a caribbean pirate. These people are entertainers, who know perhaps as much about the functioning of government as Johnny Carson did when he joked about it in his nightly monologue. The difference is that Johnny didn't pretend _his_ opinions carried special weight, even when millions of people were listening.

Or noting that social security doesn't prevent you from having an additional retirement plan on top of that, (in fact the Roth IRA is even tax free,) so why would you expect social health care to eliminate the ability to buy private health insurance on top of it? (Or simply the ability to cough up your own money for extra attention from the medical profession, such as botox and liposuction.) If you stop and think about it, it's a really stupid assumption. Fundamentally, we're talking about extending medicare to cover everybody. Lots of people have insurance on _top_ of medicare.

(And yes, there is private health insurance in canada as anybody who bothered to type that phrase into google would quickly discover. Admittedly the debate here in the US is quickly poisoning Google's searches, but anybody with half-decent google-fu can throw the word "buy" on the front and similar tricks to filter out blow-hards recycling unsubstantiated opinions.)

The "Death Panel" idiots particularly annoy me. Today the feds provide health care to inmates on death row, and they've done so for decades. These are people the system is actively trying to kill, but they keep them healthy until then. If you think they'll do worse for you, you're NOT THINKING.

This isn't a "difference of opinion", this is one side explicitly _lying_ for political purposes, which is called "propaganda" (a textbook example of the phenomenon). A lot of people are believing stupid things that don't survive thirty seconds of rational thought, because they _want_ to believe them. You think Limbaugh's going to go "I want Obama to fail, but this health care thing actually seems kind of reasonable because reality isn't black and white"? Barney Frank's Godwin smackdown about "arguing with a dining room table" seems fairly accurate.

I'll stop before I type up the whole rant _here_. It's like the SCO lawsuit all over again...

Right, computer programming:

Today's design decision: when FWL snapshots a source tarball, should the snapshot be named alt-$PACKAGE or just $PACKAGE? The source it's snapshotting is alt-$PACKAGE, but the cleanup function would be simpler if the name was consistent. (Or if the value of $PACKAGE had the alt- prefix as necessary.)

I've done most of the refactoring necessary to create individual source tarballs. I haven't taken more than a quick glance at unifying the toolchain building code of and, because the "build uClibc first vs build uClibc near the end" issue remains outstanding. I should grab a toolchain from buildroot or something and see what it would take to get to build with that...

Heh. Fun little note about that "find $STAGE_DIR -newer $CURSRC" thing: for packages like toybox that can configure, build, and start the install in under a second, the find may not actually find anything on a filesystem that hasn't got nanosecond granularity. Oops.

Given that I can't require the host to support nanosecond timestamps, potential obvious fix #1 is to throw in a one second wait, which I really don't want to do. (Seems counterproductive to de-optimize the build like that.) Potential obvious fix #2 is to do a touch -R on the source directory to reset its datestamp comfortably into the past, except there's nothing safe I can set it to because there's no guarantee it's been a full second since the previous build finished installing files yet. Potential obvious fix #3 is to do a touch -R on the output directory to set all _its_ files one second into the past, which doesn't work because busybox touch doesn't support -R or -d.
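
Sketching how the stamp-file-plus-delay combination fits together (the directory names and stamp name here are made up for illustration; the real build scripts differ):

```shell
# Illustrative stand-ins for the real build directories:
SRCDIR=$(mktemp -d)      # the snapshotted source directory
STAGE=$(mktemp -d)       # where the install drops files ($STAGE_DIR)

touch "$SRCDIR/.stamp"   # record "the build starts now"
sleep 1                  # fix #1: wait out one-second filesystem granularity
touch "$STAGE/bin-file"  # stand-in for configure/make/make install output

# Everything newer than the stamp is what this package installed:
find "$STAGE" -newer "$SRCDIR/.stamp" -not -type d
```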

I think that's where I call it a night. I have all the infrastructure in place except for the little detail that it doesn't actually _work_. But I know why, which is something.

August 22, 2009

Despite Rule 34, the internet does not do requests. (I.E. Googling for "entertain me" is kind of disappointing.)

Finally saw "taken", the last movie left over from the "Fade doesn't care to see this" pile during her trip to Seattle. A decent semi-brainless action flick. Not particularly memorable, but kind of refreshing to see a good actor play a protagonist going on an actual, reasonably justified, unapologetically homicidal rampage for pretty much the entire movie.

By the time I managed to get the first pass cross compiler to generate and install libgcc_eh.a properly, it was an anticlimax. Still, finished it, tweaked the wrapper to use it, and checked it in anyway.

This opens the possibility of having one compiler building script, maybe even having call for that part of the build. In addition to beating enough granularity out of the build to perform a working canadian cross (now done in, the big difference was the --disable-shared vs --enable-shared thing, which meant one could build uClibc++ (which needed libgcc_eh.a) and the other couldn't.)

Removing code duplication is always appealing...

First I should teach setupfor/cleanup to make tarballs of installed binaries, but unfortunately if you write a file into a directory it modifies the timestamp on the directory itself, meaning a naive attempt to do this passes the whole directory as an argument to tar and everything's included. I need to filter out non-leaf directories, which turns out to be nontrivial.

I can use something like "find build/host -newer build/host-temp/squashfs -not -type d", and even though busybox find doesn't mention -not in its --help, it honors it. But then to add the sufficiently new leaf node directories (which some packages require) I'd need to do something like "find build/host -newer build/host-temp/squashfs -type d -empty", and busybox find doesn't support -empty.

Hmmm, the -depth option means find does a depth first search, meaning I can check each line to see if the previous line starts with that plus a slash, meaning this is a non-leaf directory and I should dump it. Except sed hasn't got any obvious way to compare the pattern space with the hold space, or substitute the hold space into an s/// comparison. Hmmm, it's got the "empty regex" thing that caches the previous search and repeats it, maybe I can do something with that? No, because I can't get the input data into the search and make it what I'm checking for, not without synthesizing a sed command line in bash. (There's probably a way to do this in awk, but I dunno what it is. It would be trivial to do in Python, of course. Or C.)
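
A sketch of that previous-line comparison in shell rather than sed (the function name is made up, and this isn't what FWL currently does): with find -depth, an entry is a non-leaf directory exactly when the previously printed line started with it plus a slash.

```shell
# Print only leaf entries (files, and directories with nothing in them)
# under the given directory, filtering out non-leaf directories.
leaf_entries() {
  PREV=
  find "$1" -depth | while IFS= read -r LINE; do
    case "$PREV" in
      "$LINE"/*) ;;         # previous entry lives inside $LINE: non-leaf dir
      *) echo "$LINE" ;;    # leaf: a file, or a directory with no contents
    esac
    PREV="$LINE"
  done
}
```

Since -depth prints a directory's contents before the directory itself, comparing each line against only the immediately previous line is enough.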

Sigh, I'm probably going to have to do it in shell, which makes string handling unnecessarily unpleasant. (Especially anything with whitespace in it.) For example, according to SUSv4's shell documentation, ${varname##pattern} should almost do what I want... except that ${varname##$PATTERN} expands $PATTERN and then treats any * or ? or [] in it as special characters... Ah, I need quotes, ala "${varname##"$PATTERN"/}".
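
A quick demonstration of why the quotes matter (hypothetical path names, chosen to contain a glob character):

```shell
PREV='build/[host]'          # a path containing glob characters
LINE='build/[host]/file'
echo "${LINE##$PREV/}"       # unquoted: [host] is a character class, no match,
                             # so this prints the whole string unchanged
echo "${LINE##"$PREV"/}"     # quoted: $PREV matched literally, prints "file"
```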

Of course even outputting the result is nontrivial, because if you have a top level file named "-n" or "-e" or "-x", then going:

echo "$FILENAME"

Produces no output (at least for "-n"), because the quotes get stripped before echo evaluates its arguments. The standard extension to disable command option parsing is --, but echo -- "$FILENAME" treats the double dash as a literal and outputs it. (Yes, both the gnu and busybox versions, and the bash builtin.) Of course I could work around this problem with:

echo "x$FILENAME" | cut -b 2-

Which is only slightly ridiculous. (My first guess was "sed 's/^x\(.*\)/\1/'", which was highly ridiculous.)
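
For what it's worth, printf sidesteps this particular trap, since "$FILENAME" is data for the %s conversion rather than a potential option (an aside, not something the build scripts use):

```shell
FILENAME="-n"
# printf's first argument is the format string, so the filename can
# never be mistaken for an option the way it can with echo:
printf '%s\n' "$FILENAME"    # prints -n
```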

Yes, shell programming is an area where Linux is every bit as crazy, brittle, and insecure as the Windows API. And you wonder why I've put so much effort into making sure the FWL build does not need to run as root? Just being _aware_ of all these edge cases is a challenge. (And yes, I'm intentionally ignoring embedded newlines in filenames, which the -print0 stuff is for, which isn't remotely universally supported either.)

August 21, 2009

Brown paper bag bug. Oops. My fault, of course. My testing of hg 803 didn't actually exercise the changed bit. As I posted to the list:

The fact that neither HOST_BUILD_EXTRA nor ./ --extract actually work in yesterday's release is a bit of a brown paper bag bug. Oops. Sorry about that.

Strangely, doesn't hit this because the current version doesn't ./ --extract anymore, but instead relies on building the host arch first to extract all the packages the other targets need. (And if you haven't set HOST_BUILD_EXTRA, it'll never try to extract qemu, which is what breaks.)

The big evil regex in the shell function "noversion", which tries to strip the version number off of the package name, is what's going wrong. Its idea of "version" must start with a dash, and then only accepts a letter immediately following a digit (ala bash-2.05b), so the name "qemu-2d18e637e5ec.tar.bz2" (which is the package name qemu, a dash, and then a git commit id) doesn't work because it has two consecutive letters at the end. If I strip off the "c", it works. (Cut it in the wrong place and never noticed.)
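
To make the failure mode concrete, here's a simplified stand-in for that regex (not the actual FWL function, just an approximation of the behavior described: strip the tarball extension, then a trailing dash-digits-letters version):

```shell
# Simplified stand-in for FWL's noversion shell function.
noversion() {
  echo "$1" | sed -e 's/\.tar\.[bg]z2*$//' -e 's/-[0-9][0-9.]*[a-z]*$//'
}

noversion bash-2.05b.tar.bz2         # -> bash
noversion qemu-2d18e637e5ec.tar.bz2  # -> qemu-2d18e637e5ec (unstripped)
```

The git commit id interleaves letters and digits, so the version pattern never matches all the way to the end of the name and nothing gets stripped.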

I could cut another release right now over this bug, but it's not exactly a show-stopper if doesn't hit it. It's far more _embarrassing_ than it is serious. I note that if this release hadn't been delayed, then the next release would be at the start of October a little over a month from now, so I'm leaning towards having the next release then. (But if anybody really objects in the next day or two, I can re-think that.)

August 20, 2009

And FWL 0.9.7 is out! There's news, a source README and a binaries README.

Huh, and it turns out QT is dual licensed under LGPL 2.1 and GPL 3.0, so it is still usable. That's quite nice.

I go do non-computer things for a bit now.

August 19, 2009

Fade's birthday. She already bought herself an iPhone, and nothing I get her could top that, so I cleaned the house a bit instead, got her up early (9 am, ooooh), we went for kolaches (with coupon!), hit the library, went and saw Ponyo at the Barton Creek mall (it's Miyazaki, of _course_ it's good), wandered around the mall a bit window shopping, came home and I baked her a red velvet cake (well, cupcakes), and then we watched the second half of the "Due South" netflix DVD.

Good day. You can actually see the kitchen and dining room table again, which is kind of odd. Didn't start banging on the FWL release notes again until after she went to bed, but I finally ground through all the commits since last time.

I've updated the news entry (postdated to tomorrow), but haven't actually uploaded the new tarballs yet. I'll go ahead and rsync it with html comment tags around the whole thing, and finish it tomorrow after my in-person interview for this potential job.

August 18, 2009

Poking at the release notes. Wow this is a big fiddly update. No real way to coherently summarize it, I did a _lot_.

Those podcasts Mark and I want to do should just be "how to use FWL", because although "download the source, run ./ to see the list of targets, and then run "./ targetname" on the one you like" is simple enough to say, actually understanding everything it can do and why probably takes hours to explain.

Fade bought an iPhone. I need to go to the t-mobile store and update my phone plan so her line isn't on it anymore. (Yes, the t-mobile guys asked me _six_ times whether or not I'd seen the new google phone. Not the g-1, the new "mycroft" or whatever it is. They still don't understand _why_ it's not the same thing. I tried asking them when Sims 3 would be ported to the google phone and they never acknowledged the _question_.)

August 17, 2009

Phone interview for a job doing processor diagnostic software this morning. It went quite well. It might mean I'd stop having time for embedded Linux development again for the foreseeable future, but oh well.

Then again, I haven't exactly been going at it guns blazing this past month, have I? I will never voluntarily write GPLv3 software, but I'm willing to be paid to write proprietary code. (Haven't decided yet if being paid to write GPLv3 code would be worse or the same as proprietary. Proprietary is at least _honest_ about its intentions.)

It was nice when I could be paid to write GPLv2 code, but the FSF put an end to that by releasing GPLv3. I'm not the only developer who lost enthusiasm after that.

The Open Source developers' policy towards Richard Stallman is similar to the United States policy towards Fidel Castro: isolate as much as possible, monitor crazy man's beard length from afar and wait for him to die of old age, then clean up afterwards. Unfortunately, while Castro no longer has nukes, RMS still exerts influence over things like Samba, QT, and GCC and can make them re-license with anti-tivoization and anti-application service provider provisions that pretty much render the code unusable even to hobbyists like me. Sad.

August 16, 2009

Vaguely suspecting I should have gone to Armadillocon. It sounds like there _was_ some good programming there (Elze's tweet about the orbital mechanics panel sounded like fun), just a lot of depressing stuff to wade through to find it.

Oh well, maybe next year.

The bisection continues, or would if I understood where I left off last night. I must have been _really_ sleep deprived, because when I woke up this morning afternoon I'd forgotten that in order to get 696 to build armv4eb you need to either patch or upgrade busybox, and then to get a usable qemu invocation to run it you need to fix the typo ala commit 734. I vaguely recall my tests were all forcing it to busybox 1.14.3 (the current version) and zapping local patches, and then I had a qemu invocation in the other window that I cursor-up ran, but now I can't seem to reproduce it. Even 696 isn't finding its hard drive, which is weird.

On the other hand, armv4eb wasn't in the last release, so it's not a regression for this release if it doesn't work. Except I thought I _had_ made it work last night, and I'm not seeing how, which is awkward...

August 15, 2009

As I posted to the FWL list yesterday, the only actual regression left from last release is that armv4eb doesn't work. It builds, and boots through the kernel message, but fails to execute even a hello-static init. The odd part is that hello-static runs just fine from qemu's application emulation, so something's funky here.

Of course there's new versions of the compiler, the kernel, and qemu to deal with here. The symptoms don't point to qemu, and I don't _think_ they quite point to the compiler either. (It's building a kernel qemu system emulation can run, and a userspace program that qemu application emulation can run.) The obvious thing to look at (uClibc) didn't change. I don't remember if last release was using squashfs by default yet, so it's possible that's the culprit as well.

For the moment, I've snapshotted a fresh copy of the build scripts and rolled it back to commit 748 which is right before the new compiler went in, to make sure it _did_ used to work and I'm not imagining things or remembering based on something that never properly got checked in. When faced with this many variables, the easy way to track it down is to focus on what actually changed since the last time it worked, find the change that broke it, and then figure out why...

Meanwhile, I've had the "Bruce Almighty" DVD (via Netflix) in my laptop for a week. It's here because while Fade was in Seattle I queued up some DVDs she didn't want to watch, and a week after she came back I'm finally forcing myself to watch it so I can return it.

I liked The Mask, The Truman Show, and found Liar Liar entertaining (if not rewatchable). And unlike those, this one did well enough to spawn a sequel. So I feel I should give it a chance.

The reason I never even considered seeing it in theatres is it seemed like the kind of movie I'd want to pause every few minutes to recover my interest in watching. True to my expectations, the movie spent the first 20 minutes torturing the protagonist. (Dull, mundane tortures that were uncomfortable to watch without actually being entertaining.)

It got much more interesting when Morgan Freeman showed up. (George Burns and Alanis Morissette also had great turns as the judeo-christian God. It must be an interesting part to have on your resume. I mostly think of it as playing Zeus in Clash of the Titans.) Alas, then Jim Carrey's character had a somewhat predictable "bad behavior" montage, albeit this time at least with mildly interesting special effects.

And now he's torturing another of his pre-power nemeses. (Nemesises? Nemesi? Oh all right, ctrl-alt-google: Nemeses or (rare) nemesi. Yes, I'm blogging about this rather than watching the movie. I've never found the Humiliation Conga school of comedy funny, and yes I just linked to tvtropes, so don't click on that if you value the rest of your day.) The people who wrote this movie obviously think watching bad things happen to people who don't necessarily deserve it is hilarious and I find it painful. (Weird Al pulled off a humiliation conga at the end of UHF that was funny, but it was so over the top you couldn't take it seriously and he spent the whole movie setting up a cartoonish villain who deserved it...)

So far the movie has been one long excuse for Carrey to mug, and unlike The Mask it isn't actually improved by this. He kept it out of the way in The Truman Show as well. There's a lot of effort going into _trying_ to be funny, without ever being surprising. (Again, I've even seen _that_ done well elsewhere. To me the funniest line in Monty Python's "The Life of Brian" is "he has a wife, you know", the entire _point_ of which is to telegraph what's coming. But that scene's all about building and drawing out comedic tension, the actual individual jokes are almost irrelevant, just excuses for the masterful timing and delivery. The Pythons played each scene like a musical instrument, but this movie's all "insert joke here" paint by numbers crud. It not only never builds, it would never occur to them _to_ build. Nothing's connected.)

I'm down to pausing the movie every 30 seconds and trying to figure out why it's ticking me off in a way "Liar Liar" specifically didn't. Perhaps because a lawyer protagonist is expected to be unsympathetic, so when he did something that annoyed me it worked with rather than against what the movie was trying to achieve at that point?

No, the main problem is that Carrey's character in the previous movies may have been annoying at times, flawed, shallow, but he wasn't STUPID. In The Truman Show, he had to detect, investigate, and then outthink a massive conspiracy. In The Mask he made extremely creative use of the powerful but limited supernatural element at his disposal. In Liar Liar he may have been a slimy lawyer but he was smart, trying to work around his problems and solving his final case around his handicap. But Bruce in this movie is a short-sighted self-centered idiot, a victim of circumstance given unlimited power out of the blue, and making plodding, pedestrian use of it with obvious (predictably bad) consequences.

It's not fun to watch a guy repeatedly step on land mines the audience spots ten or fifteen seconds before he does. (Ooh, and now he gets to commit adultery while his girlfriend walks in on them! Yup, she did. Yes I typed the first of those before she had, no the movie didn't make me wait long.)

It's bad enough watching incredibly predictable moves with predictable consequences backed up by fairly pedestrian CGI. (It's hard enough to do romance at the best of times, but here I don't have a clue what she ever saw in him in the first place.) But the people making the movie obviously keep thinking something about this is funny, and that's making it worse somehow. This movie is constructed out of cliches, starting with "be careful what you wish for", but it's not even plumbing the intellectual depths of Spider-Man's "with great power comes great responsibility". I've seen these cliches done better!

Ooh, rioting in the streets finally got his attention. Hey, he finally teleported. (Yes, he had omnipotence and commuted to work by car all this time. That's less disturbing to me than the fact he worked a day job, which is less disturbing than the fact he spent about half the movie _after_ receiving absolute power attempting to get a promotion. I mentioned the dumb?)

Ah, Morgan Freeman is back. FIXING THE MOVIE. With a mop. (There's got to be bonus points for being able to pull something like that off smoothly and with style.)

The last half hour or so was such an improvement over the previous mess that I'm not going to quibble. It even managed to be funny several times towards the end.

The arm big endian target's still broken though. Commit 748 showed the same problem. 737 was before unifying the armv4l and armv4eb kernel configs, but that didn't work either... Ok, 696 is where armv4eb was added in the first place: confirmed that worked. So it broke somewhere between 696 and 737...

Each build cycle takes just long enough for me to go do something else while it builds, which adds time to the build because I forget and don't promptly come back...

716 can't find the root filesystem, seems like it can't find the block device. 710 can't either. Nor can 700...

August 14, 2009

Armadillocon's opening ceremonies are in half an hour. Trying to work up the enthusiasm to go.

Although I volunteered to help out in the con suite, and repeatedly poked them about it, I just checked my email and the last response I have from the con suite people was two months ago. I kind of doubt they even remember me, let alone have any sort of "counting on me to show up" kind of thing going on.

There's nothing in the panel schedule I want to go see. The closest is the "is Dr. Who speciesist?" panel, which is at least about something I'm a fan of, but do I really want to see a property I like grilled for political correctness for an hour?

Their guests are supposed to be the big draw, but I'm not a current fan of any of them and only even recognize two (one because Fade read one of his books, although she didn't recognize his name either until she looked it up. The other because she's Vernor Vinge's ex-wife, and I know who he is).

In their list of 100+ panel participants I recognize six names. (Seven if you count the fact that Lawrence Person is listed twice.) That's two authors I haven't read anything by since before I met Fade and have already seen at previous Armadillocons, two people I remember from Linucon, a local journalist who I briefly followed on twitter but don't anymore, and the convention's founder who I met at a party once. That's it, out of a list of over 100 people. (Several other names look familiar, but I couldn't tell you who they _are_. Probably the same panelists I saw in 2005, 2004, maybe even 1998...)

Their gaming is all Steve Jackson Games stuff (remember that big bag of games I dropped off at the concom meeting I attended? Either it was useless because they have a MIB supplying them, or it's forming the core of their game room this year). They're good games, but I don't have to leave home to play them.

Eh, maybe I'll go tomorrow. I have a coupon for a $5 haircut that expires today, and to be honest when comparing which I'd rather not miss, the haircut is winning.

Ok, according to and, nine targets are working (armv4l, armv5l, armv6l, i586, i686, mips, mipsel, powerpc, and x86_64). Five are failing: armv4eb, m68k, powerpc-440fp, sh4, and sparc.

I dunno what's up with armv4eb, the kernel boots fine but then it can't run init. The uClibc config has ARCH_WANTS_BIG_ENDIAN=y, and this _used_ to work. If it was a toolchain issue I'd expect the kernel build to be horked.

The m68k problem is a known issue finding a good emulator for it. Charles Stevens is making good progress on that, but it's not going in this release.

The powerpc-440fp target is booting to a command prompt and running just fine, but my first attempt to compile anything (forgetting to cd out of the read-only /) went:

/ # gcc /usr/src/thread-hello2.c -lpthread
/usr/bin/../libexec/gcc/powerpc-unknown-linux/4.2.1/ld: cannot open output file a.out: Read-only file system
collect2: ld returned 1 exit status
invalid/unsupported opcode: 1f - 0e - 02 (7d20009d) 10007f9c 1
Illegal instruction

But when I did a cd /tmp (which has a tmpfs in it) and tried again, it worked fine and "hello world" ran. The odd part is that smoketest is adding -o /tmp/hello to its output, I dunno why gcc would want to create a temporary file in the current directory, or why its response would be unsupported opcode if it did. (Ok, ppc440 isn't _exactly_ a subset of powerpc, so hitting that eventually makes sense. This emulator's a very loose fit for that target. But hitting the problem _there_ is just weird.)

I just ran again on powerpc-440fp and it worked. Something slightly racy going on here, but since qemu couldn't boot it at all before, I'm not calling this a regression, just something that needs work next release.

The sh4 problem is actually that the target doesn't test well. It works fine from the command line but the serial driver eats unpredictable amounts of data during initialization (I think it continuously resets it until there's no more data coming in, so the build script I'm catting in to qemu doesn't get run), and compilation is so slow that even building hello world can time out. (I'm not sure if this is a qemu problem or a kernel problem; it's not that slow running actual code, but the waits it does for the emulated hardware devices can take a very long time.)

I suspect that the "cat a script into stdin/stdout of qemu" thing is going to need something more "expect" like soonish, to get the reliability up. Both seem to work fine when tested manually (albeit in the 440's case for a definition of "fine" that merely avoids hitting a reproducible bug of some kind). But they flake a bit from, and I admit dumping most of a kilobyte of data in the input buffer all at once and hoping it survives device initialization and gets read properly through /dev/console is a bit suspect. It needs more of a "wait for prompt, write response" cycle ala expect. Except expect is based on tcl and I'm not sucking in those requirements, next time around maybe I can write a shell script that does it. (I did this once before in Python, but I don't want to introduce that dependency either.)
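
A sketch of what that expect-like shell script might look like (the wait_for function and the FIFO wiring are hypothetical, not anything FWL does yet):

```shell
#!/bin/bash
# Minimal "wait for prompt, write response" loop in plain bash,
# with no tcl and no python dependency.

wait_for() {
  # Consume output one character at a time until it ends with $1.
  local BUF="" C
  while IFS= read -r -n 1 C; do
    BUF="$BUF$C"
    case "$BUF" in *"$1") return 0 ;; esac
  done
  return 1
}

# Intended wiring (illustrative): hook qemu's console up to a pair of
# FIFOs, then feed the build script one line per prompt instead of
# dumping it all into the input buffer at once:
#
#   qemu ... <in.fifo >out.fifo &
#   exec <out.fifo >in.fifo
#   while IFS= read -r CMD; do
#     wait_for '/ # ' && echo "$CMD"
#   done < build-script.sh
```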

And of course sparc never worked, although it's closer now than it was.

So the only actual regression in the above is armv4eb. I'll poke at it a bit more, but I'm not holding the release past the weekend.

August 13, 2009

Drove around a lot today, collecting books and supplies for Fade's ACC classes. Three consecutive office supply stores didn't have Moleskine notebooks, but Barnes and Noble did. (And a spanish dictionary. And a copy of "Endless Ocean", which would be a game about molesting fish if I was entirely convinced it was actually a game.)

Armadillocon is tomorrow.

At Elze's new year's party, I shared a couch with three of the Armadillocon organizers, who mentioned how "tired" the concom had gotten and in need of new blood. So I thought I'd volunteer, and put my energy into that instead of launching another con of my own (as Stu and I were toying with a year or so back).

The con chair this year turns out to be somebody I knew (she ran the Linucon art show back in 2004), so I emailed her about my desire to volunteer (and the _complete_ lack of contact information on the armadillocon website). She poked the appropriate person to post contact info on the website, but never got back to me about volunteering.

I tried again after a few months, volunteering to help out with programming, and exchanged a few emails with the head of programming (once I asked the con chair why he hadn't responded and she got his attention for me, anyway), but he never offered me any schedule slots to fill or agreed "that sounds like a good panel idea, go do it". We never came down to anything _specific_ I could do for him, and the discussion trailed off into unreturned emails.

About three months ago, I tried again by volunteering to help with the con suite. It's really really easy to come up with concrete stuff to do there, and you don't have the "coordinating with fog" problem. You bring ingredients, you prepare food, it gets eaten. Fairly straightforward.

My average was about three emails to one response, but I eventually did talk to the nice lady on the phone who's running it this year, and explained lots of ideas that she seemed to think were good ones. I arranged to go to a concom meeting to meet her in person, but when I went (last month), she didn't show up. I haven't heard from her since.

After the meeting we all went out to dinner at a restaurant across the street from the convention hotel. I talked a lot with the guy in charge of programming I'd been emailing earlier (who had of course done the schedule himself already, staffing it with pretty much the same panelists as last year, you can see it here), and he mentioned that nobody had submitted a bid proposal to become next year's con chair yet. I started speculating about it, and the two of us adjourned to the bar at the con hotel and spent hours speccing out a bid proposal. (Things like "this person will be the fan guest of honor anyway because the board already has its mind made up, so propose them now in your initial guest list to avoid an argument".) He was actually quite encouraging. I put together two complete guest lists (one a more ambitious set of selections, the other a fallback set of more conservative ones), and I wrote it up, and I emailed it off right before we went to not-sword-camp.

Two weeks later, when I got back to Austin, I hadn't heard a thing back about it. So I emailed him to confirm he'd _gotten_ the bid proposal, and asked how the meeting to discuss bid proposals had gone. The reply was that the meeting had been postponed, because someone already on the con staff who had been thinking of submitting a bid had decided to submit one, and they were waiting for that one. Apparently, it hadn't occurred to him to inform me of this.

So I told them to yank my bid. I didn't say it was because the person they've all known for 30 years is obviously going to get it and there's no use wasting time pretending otherwise. (Instead I made some excuse about having to make work plans for the coming year and being unable to hold off scheduling stuff any longer, which is also true.) I assume he got this email, I haven't heard anything from them since.

Not hearing back has always been the status quo in this relationship. On average I've had to make about three attempts to get their attention each time. I tried four distinct approaches (ask the conchair where they could use me, try to help with programming, try to help with consuite, submit a bid proposal to become next year's con chair), and got a nominally encouraging response at each stage. But it was never a "here's a concrete way you can help", and it immediately lapsed into silence as soon as I stopped poking them. I got exactly zero follow-up initiated by the Armadillocon concom or staff, to anything.

I wanted to volunteer at this thing, and I tried repeatedly to find a way to contribute, but new people expressing interest just isn't something they're equipped to deal with. They seemed to _want_ to use me... and didn't know how.

Fade and I are still thinking of going. (I may show up to the con suite and try to volunteer in person, although there's probably not too much I can do at this point. Last time I was there, which was probably 2005, they had chips, two liter bottles of soda, and some M&M's out.)

Fade looked at the guest list and commented that its author descriptions list what awards they've won without ever mentioning the name of a single book any of them wrote. (The awards are more important than the books?) And she was astonished they don't have their own website, just a sub-page under some other website. (It turns out they do have one, containing the same content; it's just the 10th hit on google. That's how much they promote it.) She said the schedule made the con seem like "a bunch of 50 year old men who worship Heinlein". She might stop by on Saturday anyway if I'm already going, but won't head there on her own.

Here's a review of last year's Armadillocon. It talks about how innovative they were in 1982. It interviews someone who's been a panelist for 14 years. It points out that Armadillocon "isn't past evolution" and gives as an example the new "Campfire Tales" event with two authors named Joe (one born in 1943 and the other in 1951) "swapping anecdotes for an audience". It closes by hoping that the man "scheduled to close Sunday's programming with a reading, just as he's done for 25 years" will recover from his bypass surgery in time to do so.

Armadillocon's always been a fun con, I'm just wondering why anybody under the age of 35 would even find out about it, let alone want to go to the thing. (Taken there by their parents, I expect. Or grandparents...)

Wikipedia thinks last year's attendance was around 400. The first Armadillocon I attended (in 1998) was at least twice that big. The first year of Penguicon got 500 and the second 800 (although that was up in Michigan and leveraging the attendee mailing list and timeslot of the defunct ConTraption). The first year of Linucon got a little over 300, in Austin, from a standing start. I believe Ushicon's hit triple digits its second year (again in Austin), mostly 12-18 year olds, although that's a bit of an apples-to-oranges comparison with Anime vs literary SF.

Eh, it's the convention they want to run, which is not what I'd prefer it to be. Oh well, accept it and move on...

In FWL, taught the powerpc-440fp target to use the kernel and qemu invocation from the normal powerpc build, since the 440 is more or less a subset of the full powerpc instruction set, and the 440 toolchain builds a root filesystem that'll run under full powerpc emulation. (A little like running 386 code on x86_64: a proper specific emulator would be nice because running under that doesn't guarantee it'll run on real hardware, but Mark's already booted it on real hardware. I need to switch to 64k pages for the new stuff, though. That'll probably require separating the kernel .config and tweaking uClibc again.)

I also got sparc to build a static version, which booted to a shell prompt under qemu (with busybox ash, anyway; bash hung), then gave me bus error when I did "ls -l", and then hung when I did "echo *". Still, progress! Checked it in.

And I checked in the old Tin Can Tools hammer/nail board config I had lying around as a hw-tct-hammer target. That's an armv4l derivative.

I'm trying to get the design flaw at least somewhat mitigated before cutting a release. Adding the $ARCH prefix to WRAPPER_TOPDIR should solve part of it.

August 12, 2009

Poking at some loose ends in FWL. Upgrading various package versions (kernel, busybox), plus it needs to build the weird QEMU snapshot out of source control that actually runs powerpc. (Sigh. I want to build a kernel that works with current qemu, but they're more interested in accurately emulating hardware than in having something you can actually boot Linux on. You'd think this wouldn't be a contradiction in terms, but no...)

August 11, 2009

Woah, water pressure today is up to full. I dunno if the city guy who came out did something, or if it's because I didn't get out of bed until 3pm. (No, I didn't sleep through Mark's morning call, I just went back to bed afterwards. He got 4 hours of sleep. Yay sleep.)

West campus is sort of dead right now since last year's students have moved out but the new ones haven't quite moved in yet. (Lots of ripping out of carpets, furniture winding up in dumpsters, and cleaning service vans lurking in alleyways looking suspicious.) The rezoning of everything from 3 stories to 5 stories back around 2005 means all the old houses got torn down and replaced by big apartment buildings, and I was assuming the water mains were overtaxed by the increased urban density. But according to the email I just got, they replaced the building's water meter, which was only putting out 20% of what it should have. (Yay, water pressure!)

Finally got armv4l building with soft float. THAT was a pain.

The script reorganizing thing was just too brittle and didn't fix the real problem, which was in gcc. So I dug through gcc to figure out how to get it to build libgcc_eh.a in --disable-shared mode, since .a files are static libraries, not shared libraries.

The control logic for this goes like this: when you run ./configure, it makes a file "Makefile" in the top level directory of the gcc build, more or less using Makefile.in as a template. You have to feed 8 gazillion command line arguments and environment variables into ./configure, which it records in the resulting Makefile. In theory you can then run "make configure-gcc" to perform configuration in the various subdirectories (the most interesting of which is the "gcc" subdirectory), and it'll pass along the gorp you fed to the top level configure. In reality, it doesn't always pass all of it along. Worse, that only performs about half of the necessary configuration, and that doesn't include the decision whether or not to build libgcc_eh.a.

No, that decision is made when you run the "make all-gcc" step, and it compiles lots and lots of stuff, and mixed in with that it eventually decides it needs the file "gcc/mklibgcc". This file is a shell script, which is created from mklibgcc.in by this chunk of gcc/Makefile (which was itself generated from gcc/Makefile.in):

mklibgcc: $(srcdir)/mklibgcc.in
        CONFIG_FILES=mklibgcc CONFIG_HEADERS= ./config.status

So right, we're running the file "config.status", possibly out of the gcc subdirectory or possibly out of the top level directory (depending on whether gcc/Makefile is a recursive makefile invocation or an included makefile segment), which generates a shell script. Here's the difference in the resulting shell script based on whether or not you passed --disable-shared or --enable-shared to the top level ./configure invocation:

--- enable-shared/gcc/mklibgcc	2009-08-11 04:41:41.000000000 -0500
+++ disable-shared/gcc/mklibgcc	2009-08-11 05:13:52.000000000 -0500
@@ -109,7 +109,7 @@
 # Disable SHLIB_LINK if shared libgcc not enabled.
-if [ "yes" = "no" ]; then
+if [ "no" = "no" ]; then

So various tests in the generated shell script are hardwired in by the process that generates it. (Wheee.) But they're still if statements.

So anyway, after the Makefiles invoked by the "make all-gcc" stage generate this shell script, they then run it, passing in over 100 lines of environment variables to tell it what to do. This script outputs a new file, "libgcc.mk". That's right, the configure stage creates a top level Makefile which creates a gcc/Makefile which calls another script to create a new shell script which when called generates a new makefile!

If this sounds like a horrible tangled mess of evil, welcome to the gcc build process.
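As a toy illustration of that last step (this is my own mock-up, not gcc's actual code): config.status behaves roughly like a sed substitution over a template, which is how the yes/no answers from configure end up hardwired into the generated script as if statements.

```shell
# Mock-up of how a config.status-style script hardwires configure's answers
# into a generated shell script. Template and variable names are invented.
enable_shared=no    # what --disable-shared would have recorded

cat > mklibgcc.tmpl <<'EOF'
# Disable SHLIB_LINK if shared libgcc not enabled.
if [ "@enable_shared@" = "no" ]; then
  echo "shared libgcc disabled"
fi
EOF

# The "substitute configure's answers into the template" step:
sed "s/@enable_shared@/$enable_shared/" mklibgcc.tmpl > mklibgcc.toy
sh mklibgcc.toy
```

With enable_shared=yes the generated test would instead read [ "yes" = "no" ] and the branch would never fire, which is exactly the one-line difference the diff above shows.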

After drilling through this crap for days, I found out that forcing it to always create libgcc_eh.a turns out to be this patch:

--- enable-shared/gcc/mklibgcc	2009-08-11 04:41:41.000000000 -0500
+++ disable-shared/gcc/mklibgcc	2009-08-11 05:13:52.000000000 -0500
@@ -223,8 +223,8 @@
   if [ "$LIBUNWIND" ]; then
+  libgcc_eh_a=$dir/libgcc_eh.a
   if [ "$SHLIB_LINK" ]; then
-    libgcc_eh_a=$dir/libgcc_eh.a
     if [ "$LIBUNWIND" ]; then

Note that the above gets it to build this library, but doesn't actually get the "make install" stage to copy it to the target. (No, that would be too easy.)

Now here's the fun part: that library doesn't contain the soft-float code. It turns out that for the "arm/t-linux" target, _nothing_ contains the soft-float code, because if you're running Linux obviously you have a floating point coprocessor! Nope, the target that contains the soft-float code is the "arm/t-arm-elf" target, I.E. the one that essentially says you're building code to run on the bare metal without an OS.

So I dug up the previous patch that applied to 4.1.2, and now with a better understanding of what it was _doing_ (I.E. convincing gcc not to make unfounded gnu-bigoted assumptions about Linux, with a hammer to the forehead if necessary), tweaked it to apply to 4.2.1. And lo, it worked!

Running a big test build to see what else this broke. I'm going to go whimper in a corner for a bit, and then I have to pick Fade up from the airport.

August 10, 2009

Up until 5am playing Sims 3, slept past noon, the water went out during my shower (to the point where the pipe was sucking air _in_, which can't be good), but the condo association people say that the city of Austin water thingy isn't currently having known issues, and that the city is sending someone out to re-enact Flanders and Swann's "The Gas Man Cometh", but I probably don't have to be here for that.

Still felt tired, lay down to nap for a bit. It is now 9pm. Feels like I'm recovering from a sunburn. Wonder how I managed that?

Fighting with armv4l. This would be easier if I could pick an approach, but there are at least three horrible ways to fix this, all of which suck, and each time I step back to summarize the problem the grass looks greener in one of the _other_ approaches. (Well, not kernel-side soft float, that code's slow and buggy. It's buggy because it doesn't get much testing, because it doesn't get much use, because it's fundamentally so slow. If it was _just_ slow I might go for it, but add in the buggy...)

But I keep going back and forth on the "patch gcc to do this _right_" vs "reorder my scripts and hope I can sequence my way around the sheer horror of gcc" approaches, especially since the second makes it harder to fix the design flaw in my scripts. (Each layer is supposed to be orthogonal, but I recently figured out that one of them has some fairly deeply baked-in assumptions that it's using a cross compiler with my ccwrap.c. Unless you BUILD_STATIC, anyway.)

Finally watching the most recent bleach disk (volume 18). This entire disk feels like filler. It's not that it's going slowly (this often happens in anime), it's that NOTHING IS HAPPENING. (Very energetically, of course, but there are reaction shots for everything. A scene went by where every single person in a group of about 8 people had a "stop and say somebody's name in shock" moment. I should fire up my cell phone and see if the "Overtaking the Manga" entry on tvtropes mentions this episode by name.)

Ah, an episode and a half in they finally get their theme music power up, and the Vampires of Gratuitous Renaming turned to dust about sixty seconds later. There were maybe fifteen minutes of plot between the two episodes, combined. Ok, episode three is back to being merely slow. The "long artistic pauses in semi-romantic dialog" kind of slow.

Ah, and the last episode on this disk seems to be the start of a new season in Japan. (There's no obvious correspondence between where Viz the importer split the DVDs and anything the Japanese are doing.) You can tell because the credits changed again, and now the opening theme music is in english. Must boggle.

The Japanese seem to treat english the way we treat latin and french. Throwing the occasional english phrase into Japanese produces a certain "je ne sais quoi", QED. This isn't just a phrase or two, though: the lyrics to this entire song are in english, and it's not exactly playing to their strengths. The singer seems to be doing some kind of vague Bruce Springsteen impersonation that winds up sounding like he has a mouth full of novocaine. It's english with an andalusian lisp (excuse me: lithp) and the occasional glitched bit of syntax, ala "to start up new day". (For some reason the fact the lyrics are nonsense like "tonight love is rationed" doesn't bother me much, because they always were. So are most of ours, really.)

I saw a chinese article the other day about their mars probe, which had the phrase "the mars and the earth", and I couldn't immediately explain why it's just mars and not "the mars", but it's ok to stick "the" in front of earth. I don't blame people for getting english wrong, we do it all the time and the british (who invented it) point and laugh at us even when we think we're doing it right. It's an ad-hoc language at best. Latin again.

That was a "<span id=entertainment>" tag.

Spent a couple hours on the phone with Mark (who is in California for the third time in a month). He called stressed out to the point he was unable to sleep. I had him email me a data dump of his complaints, and I'll try to sort through it and organize something he can present in the morning. I told him to call me when he gets up, and I'll give him a nice coherent summary of it. (Psychologically, this means it's now my problem, so hopefully he can stop worrying about it long enough to get some sleep.)

August 9, 2009

Wrote up a longish description of the current armv4l situation, then yanked it from here and am redoing it as a post to the FWL mailing list. I should really post this sort of thing somewhere people can _reply_ to it. This blog is really easy for me to do (especially when LJ is ddosed or I'm not on the net), but I've never sat down to do a reply infrastructure. (Heck, I've been putting "span" tags in the source for years, this paragraph is <span id=programming>, but I've never sat down to teach the rss feed creator to do searchable feeds. I know how, I just haven't done it yet. What was that about the cobbler's children having no shoes?)

Went over to Mark's today so we could work on podcasts. Wrote up a couple dozen potential podcast topics, and he set up his little camera thingy and we found out we're both quite camera shy when the little red light is on. (No, turning the little red light off would not improve matters.) What happens during the recording is _not_ natural conversation, and I'd like to credit Terry Pratchett for being able to blame "bloody quantum" for it. We need to try again with scripts.

Succumbed to the call of Sims 3. Pretty sure nothing else getting done today.

August 8, 2009

Biked to chick-fil-a again. Only brought one water bottle this time, and it was frozen. It was still just about frozen solid when I got here, which was somewhat unexpected.

It takes 80 calories/gram to melt ice, about as much heat as it takes to warm the resulting water from freezing to near boiling, so I suppose it makes sense it was slow. The fact it was carbonated, so I couldn't _open_ it until it was sufficiently melted without spraying water everywhere, didn't help. About 3/4 of the way here I got sick of the "twist the cap slightly, watch the air gap shrink, close it again" approach, which was accomplishing nothing, and opened it on the sidewalk and let it fountain for almost a full minute (quite impressive considering how thin the layer of water around the edge had been), after which there was essentially no liquid left in there, and it was still full of ice. I'm still not quite sure how that works.

Got here. Got a large diet Dr. Pepper. Proceeded to drain it and do an almost cartoonish flop-sweat, which probably means I did something unhealthy again. (I'm in air conditioning now! Stop with the sweating!) Oh well. Yay for refills, I shall go get my third now.

Charles Stevens says that the main problem with the FWL sparc target is dynamic linking, and if you statically link busybox you get a mostly working root filesystem (modulo some unaligned access errors and such which should be fixed upstream in busybox).

Unfortunately, setting BUILD_STATIC in sources/targets/sparc/details turns out to have a sequencing issue: STATIC_FLAGS is set at the end of the common setup script, but the architecture config file doesn't get read until after that script returns.

Currently only a couple of build stages pay attention to that flag, so if I move setting STATIC_FLAGS into the check_for_base_arch function things will work. But that means stages that don't check for an architecture can't have BUILD_STATIC support added in future. I think I'm ok with that, though.
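The move amounts to something like this (a sketch with invented details; the real check_for_base_arch does considerably more): compute STATIC_FLAGS inside the function that runs after the architecture config has been sourced, so BUILD_STATIC is actually set by the time anything looks at it.

```shell
# Sketch of the sequencing fix: STATIC_FLAGS gets computed after the
# target's config file has had its chance to set BUILD_STATIC, instead of
# when the common setup script is first sourced. Details are illustrative.
check_for_base_arch() {
  # ... existing architecture-detection logic would live here ...
  if [ -n "$BUILD_STATIC" ]; then
    STATIC_FLAGS="--static"
  else
    STATIC_FLAGS=""
  fi
}

# So a target config like sources/targets/sparc/details setting BUILD_STATIC=1
# before check_for_base_arch runs actually takes effect:
BUILD_STATIC=1
check_for_base_arch
echo "building with: $STATIC_FLAGS"
```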

Oooh, mood crashing. Yeah, I gave myself mild heatstroke biking here. (The "stop sweating until you drink more" thing was a bit of a hint. Tired, don't wanna do nothing, must nap now. I'm over an hour's bike ride from home, though. I have penguin mints to see me through this.)

The wireless here at chick-fil-a went down, so I fired up my cell phone internet, and it worked for about 20 minutes before Ubuntu's kernel went:

[269092.820941] kobject (ffff880041a71e18): tried to init an initialized object, something is seriously wrong.
[269092.820954] Pid: 32420, comm: bzip2 Tainted: G        W  2.6.28-14-generic #47-Ubuntu
[269092.820957] Call Trace:
[269092.820961]    [] kobject_init+0x9e/0xa0
[269092.820989]  [] device_initialize+0x2d/0xc0
[269092.820996]  [] hci_conn_add_sysfs+0x70/0xc0
[269092.821000]  [] hci_event_packet+0x70f/0x1430

And so on, and so forth. I need to reboot now, but I left my cell phone charging cable at home and I've got plenty of stuff to do that doesn't require internet. Oh well.

I'm always amused by the people who assume constant internet connectivity will make all apps migrate to "the cloud". Even when there aren't distributed denial of service attacks crippling half the social networking sites out there, and nobody's put a backhoe through a major fiber trunk this week, I'm usually happy to have a battery on my laptop (and the commercial guys use UPS battery backup thingies) because after 100 years we still can't reliably deliver electricity, let alone internet. How does adding a second point of failure to the equation improve matters? I've seen three datacenters flooded, and my local stuff still needs to connect to it so local potential for failure doesn't go away either...

And the Chick-fil-a is closing up, time to go home.

Biking home, I was stopped by two pretty and athletic girls who needed directions to their hotel. Luckily my laptop and cell phone were willing to talk to each other after a reboot, and Google coughed up an address.

Partly because they seemed unfamiliar with Austin, and partly because they were too drunk to walk straight (one of 'em clocked herself but good on a traffic signal box and didn't seem to notice), I escorted them the dozen or so blocks there. (I was out biking for exercise, doesn't really matter where specifically I go.)

From their drunken arguments along the way, I learned that they're both sergeants (army, I think; it explained the camo baseball cap the blonde was wearing), they're shipping out to Iraq for a year-long deployment in the next day or so, and one of them was trying very hard to hit on the other, to the point where even I could spot it. (Unsuccessfully, I think, but I didn't ask and couldn't really tell. They were _really_ drunk, and alternated fighting with each other and hugging.)

I left them at their hotel (this being me, I forgot to ask either one's name, but gave them both business cards anyway since I still have half of the box Kinkos printed for me 5 years ago), and biked home. On the way home I regretted not passing along Fade's advice about avoiding hangovers (drink lots of water and take 2 ibuprofen before going to bed).

Of course back at the condo, two unrelated buff naked women were frolicking in the pool, giggling loudly and continuously while their boyfriends (who were both wearing bathing suits for some reason) took pictures. Usually this doesn't start until around 2am, but it _is_ Saturday...

I of course went inside to be mobbed by four cats and watch Mythbusters on my laptop (escape from Alcatraz episode, working my way through unabridged first season instead of the subset available on DVD) while working on fixing native toolchain creation for armv4l soft float. (I've got the design elements worked out, but the sequencing of the whole script is getting kind of horrible and I need to back up and rethink it.)

Yeah, I'm a geek. I know. How we ever managed to reproduce frequently enough to get this far is a mystery. Recessive genes, I expect.

I am amused that Fade's two cats have figured out that small pieces of bologna are edible, but my two ex-feral cats sniff it politely and move on. It's not kibble. They've retired from the whole "eating things that are not kibble" business. Although Dragon likes tea, diet coke, and tomato sauce, and both will bug me every time I open a can because of the small chance I might pour tuna water in a bowl for them to drink, whether or not the can contained peas to begin with. But those are beverages, which are a separate issue.

I believe Fade gets back from Seattle on Tuesday.

August 7, 2009

So twitter still won't let firefox users post to it. (No, it's not just me, it's continuing fallout from the denial of service attack thingy.) You can try all you like, it just gets filtered and times out. There are a number of twitter clients for Linux, but I don't want one. I like using the web browser for this.

Home alone. Bored. Fade remains in Seattle. (She can post to twitter from her mac, though.) I want to go bike somewhere but it was 107 outside today.

Decided to invite a neighbor out to a restaurant, but the cute next door neighbor with the marvelous singing voice (Aiesha) is busy packing because she's moving out tomorrow. The next neighbor down (Nimra) isn't there. (She's also moving out this month, but her router still shows up in the local wifi list so apparently it's not quite final yet?) Two doors the other way, the two visiting german guys were apparently only here for the summer, and left while Fade and I were at not-sword-camp. The unit at the far end had a "for rent" sign on it until yesterday, and another unit is listed with real estate agents (and thus google maps brings it up every time I map this building). The unit that had the neighbors who kept jumping into the pool loudly at 3am has been dark for days when it used to be party central 24/7, I dunno if they're still there either.

That's a downside of living this close to campus, you never really get to know your neighbors. It's August, everybody moves. I've owned this condo since 2003, but the only other people I've known in the building (or any of the adjacent ones) were the other people living here in this unit.

Today I submitted a resume to a team lead job here in Austin that pays six figures, and looks like something I could pretty comfortably do. I also picked up an application to teach SAT/GRE prep at the place in the Dobie Mall which is unlikely to pay even half that. I believe my job search can now officially be considered "eclectic".

I _like_ teaching, and want to get back into it, and the commute would be about 3 minutes by bicycle and less than ten on foot. (One of these days I need to take another stab at grad school, so I can get a day job as community college faculty.) But I also admit a change of pace would be nice.

I have yet to search for actual embedded Linux jobs, because I want to take a weed-whacker to BusyBox and trim out massive overgrowth, I want to yell at the uClibc guys about still not having any kind of release schedule, and want to yell at the kernel guys for days on a number of topics (sysfs needs a stable API so userspace can do device enumeration without needing a kernel-version-specific tool, using in-band signalling is stupid and wouldn't be nearly as big of a problem if the scsi layer didn't intentionally discard transport type information, putting perl in the kernel build was just horrible). I have nothing left to say to the gcc developers and hope something comes along to obsolete that project, but unfortunately tinycc won't be it. I'm still sort of engaging with the qemu guys on and off, but it's frustrating and regression-filled and even THEY don't know what their license is anymore... (Does "gpl" mean v2 or v3? They want to take code from both the kernel and binutils, and hope that if they close their eyes and don't talk about it, the fact that the two projects' licenses are now 100% incompatible will just go away.)

It's too much, and it doesn't change anything. Remember the line in "Men at Arms" about Sam Vimes "vibrating with the internal anger of a man who wants to arrest the gods for not doing it right"? (Page 372 according to Google Books.) I feel a bit like that at times. I've spent years being driven by a frustration with the status quo and a vision of how it SHOULD work, but since I started things may actually have gotten worse. The uClibc project once had regular releases. Of all the things you could say about BusyBox back then, "it's too big and complicated" wasn't at the top of the list. Building your own kernel and running it on your system was a fairly low hurdle. Nobody expected the FSF to emerge from irrelevance like a thrice-killed horror villain and needlessly fork the community with the GPLv2/v3 split.

It's tiring. If there was one GOOD project I could point to and say "This I like, let's build out from there", that would be one thing. BusyBox used to be such a project, until Bruce Perens poisoned it by being his usual schlemiel self and drooling GPLv3 onto a list he'd never previously posted on. (Yes, I am still bitter.) I tried switching to tinycc but the existing community was too stuck where they were, and wouldn't come along, and no new community would form while the old one exerted its gravitational pull to "stay where we are, and add better Windows support to that".

So a vacation from all that might be nice.

"Change to what" is still a pointed question. Not a lot of Lua stuff out there, and most of the Python jobs want you to know .NET as well, which ain't happening. (I don't care how burned out on Linux I get, I don't do Windows. I'd become an accountant first.) I'm stale on Javascript and CSS but wouldn't mind coming back up to speed on that. Still haven't got a mac, so can't learn proper mac programming...

I know XKCD does a better job mocking the very idea of Linux on the desktop than I'm likely to, but I just hit one more little bit of fun and I need to vent.

I try to back up a couple of directories to a 16 gigabyte USB key every once in a while. I'm using rsync, which means A) it's ignoring the existing files completely and copying the entire contents of the file over again, even when they're mbox files that just got appended to, B) it's doing so at 83k/second for a file that's about a gigabyte. I went out to lunch and it wasn't finished with the file when I came back.

Apparently, the new version of Ubuntu A) broke rsync somehow (so it's copying the whole file and can't be convinced to append to the existing one), B) regressed the USB drivers so it's mounting it as USB 1.x instead of 2.0 (thus being insanely slow).

Wheee. More stuff that (I think) used to work, and doesn't anymore. (We regret regression test nothing!)

Of course the behavior of the Linux kernel when memory is filled up with disk cache that flushes to a slow device is "fun". And its decisions about how much memory it will allow each device to pin assume they all write at the same rate.
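Those pinning decisions are at least visible as tunables; a quick look at the knobs involved (lowering them bounds how much dirty data a slow device can tie up in RAM, though it's a workaround, not a fix):

```shell
# dirty_ratio is the hard cap (writers block past it); dirty_background_ratio
# is where background flushing kicks in; both are percentages of RAM, and
# neither accounts for how slowly the target device actually writes
cat /proc/sys/vm/dirty_ratio
cat /proc/sys/vm/dirty_background_ratio
```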

August 6, 2009

Today's twitter feed is full of comments about some kind of denial of service attack on twitter. I tried to respond to three different tweets, and post a new one, and none of them went through. Livejournal times out trying to post as well. (Both sites _read_ just fine, but I expect there's static caching going on in both cases.) Oh, and I had to substitute out Road Runner's DNS with the one I have memorized to get anything to resolve. The internet is unwell, it seems.

The new observation was merely a standard cat post: "Aubrey has recovered enough from her surgery to shed at full capacity again. Probably a good sign." I can write it here, in my little text file I edit with vi and rsync as I remember, and the modern high-tech sites will presumably recover and start working again at some point.

I also wanted to twitter a link to this marvelous interview with Adam Savage of Mythbusters, and couldn't.

Eventually, people will figure out that Windows being a horribly insecure pile of garbage that's the perfect incubator for botnets is a problem. Can we change the law so people can legally portscan computers for obviously known vulnerabilities, infect them, and shut them down? If your computer is a danger to itself and others, some kind of law enforcement effort to find and stop vulnerable machines becomes inevitable at some point. Can we skip to that part please? We don't allow drunk drivers on the road, but we do allow unpatched windows machines on the net. Amazing.

And yes, go after vulnerable Mac and Linux machines too. Not that either is a big enough target to make botnets out of, but the principle's the same. I'm not saying wipe people's data, I'm saying switch the machine off and have it boot up to a simple explanation screen when the human is back in front of it. You left a vulnerable machine attached to the net. It probably already has viruses on it. Go get it fixed before connecting it back to the internet.

We don't need laws passed to do this. We don't even need law enforcement to do this. We need laws to explicitly _allow_ this, so ISP people defending the net from flawed machines don't get prosecuted for doing so.

Biked to Chick-fil-A (oh _WOW_ am I out of shape). Stopped by the remaining bike store to get the seat replaced first. (It was broken when my bike got hit by a car; a little metal thing under the seat that used to be a loop no longer looped, and although I bent it back several times it rattled at best and didn't stay bent. The seat wanted to flip up like a trash can lid, which is unpleasant while sitting on it.) Dropped $35 on a new highly squishy one, which they installed for free. It is more buttocks-shaped than I'm used to in a bike seat, but I suppose that makes a sort of sense.

Made it here, with several stops to rest in the shade and drain a bottle and a half of tea (one of which was frozen when I started). According to the internet, it's 101 Fahrenheit. Welcome to Texas in August. (It's too hot to _swim_ when the sun is up. I've learned to spot the nauseous out-of-breath feeling that's the first sign of heatstroke for me, and means _stop_now_ for at least a couple minutes.)

The new "Fedora 11 doesn't work" bug seems to actually be "distcc and FWL don't play nice together", which actually makes sense if you stop and think about it. The clever things distcc is doing break when builds try to do strange things, and both busybox's and uClibc's builds are the height of strangeness in this regard. I'll see if I can work around it.

I seem to have actually recovered some enthusiasm for poking at toybox again, which is odd. There's no _point_ to it, I'm not working towards a goal. I just haven't fiddled with any actual programming stuff in C lately, and I'm feeling a bit rusty. It's like whittling a stick to make nothing but sawdust, or building a house of cards. It doesn't need a point, it's something to do.

I wonder if I should do the buffer applet I've been pondering for a while. (As with "count", which became busybox's pipe_progress when I submitted it, there wasn't one when I decided I needed this. Since then I've seen somebody write a halfway version that doesn't do everything I want it to do. A proper buffer command needs ratelimiting too.)
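As a crude stand-in until such an applet exists, dd with a big block size covers the buffering half; the ratelimiting half would still take pv -L or real C. A toy pipeline (seq and wc are just hypothetical producer/consumer stand-ins):

```shell
# dd with a large block size decouples bursts between the two ends of a
# pipe, which is the buffering half of the job; ratelimiting on top of
# this would need pv -L or a proper C applet
seq 1 100 | dd bs=64k iflag=fullblock 2>/dev/null | wc -l    # → 100
```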

Ok, this is one of the reasons I despise gcc. If you write a file containing one line, "#include <zlib.h>", and you write it as temp.c and run "gcc -E temp.c", it spits out hundreds of lines of the contents of zlib.h. Which it should, because -E says spit out the preprocessor output. But if you go:

echo "#include <zlib.h>" | gcc -E -

It outputs:

# 1 "<stdin>"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "<stdin>"
include <zlib.h>

Why is the behavior different? The file is coming from stdin instead of from a filename, and the preprocessor completely melts down and doesn't do its job.

I mention this because squashfs requires zlib to build, and I want to test whether it's installed and skip it cleanly (or die with a more explicit message) if it's not there. I also don't want to assume the host's headers are at /usr/include, and there's no obvious way to get gcc to dump its header search path. ("gcc --print-search-dirs" shows the library search path and the executable path, which should just be $PATH, but this is gcc so of course they reinvented the wheel. "gcc -dumpspecs" doesn't show it, and note that "gcc --dumpspecs" doesn't work, because it's gcc.) The uClibc build uses a dirty trick, "gcc --print-file-name=include", to get the gcc internal #include directory. Note that in this case "include" isn't a file, it's a directory, and it's only coincidentally installed in a directory that's in the library search path, but it prints it anyway and all anybody cares is that the trick _works_. But that's not the directory that would contain zlib.h.

Have I mentioned I hate gcc with the flame of a thousand suns? Recently, I mean? This stuff is SIMPLE if you understand it. I can only assume the people who implemented the gcc front-end don't understand what a compiler _does_. It's not brain surgery!

Ah, it turns out the twitter/livejournal/facebook attack was political, some Russians trying to silence a Georgian blogger, and attacking US services to do so. I wonder what the Feds will do in response? (Hands up everybody who says "nothing".)

August 5, 2009

Locked myself out of my car again, buying groceries from the 24 hour HEB on 43rd street. I feel _really_ stupid. The American Autoduel Association says it'll be an hour. I'm glad I didn't buy the ice cream.

(When it's going on midnight, realizing you haven't actually had anything to eat all day except a slim-fast and some penguin mints, going grocery shopping might not be your best move. Now arranging dinner on the hood of my car, rummaging around in the things I bought to find things edible in the near term. The bananas/carrot sticks/granola bars are ok, but these microwaveable bagel sticks are a lot better when you have access to a microwave.)

Before that, I went to Mark's and fed his cats again and changed the litter box so it wasn't completely disgusting, and wound up staying there downloading anime from his server until it got dark.

Before _that_ I was answering FWL bug reports. No, I haven't checked any job postings in two days. My definition of "job hunting" has not yet involved a lot of hunting for jobs, I need to get on that...

Got another report on the mailing list of somebody else trying to build on Fedora 11 and not being able to. Fired up qemu and trying to reproduce it. Building fine for me so far. (It's only up to binutils, but they said it was the same distcc problem. I thought I'd rsynced my changes up to the server. It's the same script that updates my blog, which has admittedly been a bit patchy lately.)

Somebody else tried to use the m68k target (for which qemu doesn't work), and then fell back to using the powerpc target (qemu again). I pointed 'em at mips, which is a nice big endian test environment.

Yay, the locksmith is here.

August 4, 2009

Saw Fade off to the airport for a week in Seattle, re-read half of "Wintersmith" by Terry Pratchett, then sat down to get some programming done.

Got the powerpc kernel built and calling userspace, but it hangs two letters into outputting "hello world" when it gets to userspace. Pinged the qemu list, but I haven't got a _clue_ what's up with that. (This is with today's git version. This is a kernel derived from the supposedly working debian .config, minus all the modules, plus squashfs.)

Heard back from Charles Stevens about the status of m68k, some discussion on the list about that. I need to ping the qemu list about the status of the m68k work, but I need to catch up on reading it first.

Taught the rest of FWL to build static binaries. The only dynamic binaries now are hello-dynamic and ldd (which is a uClibc build issue that seems hardwired, might have to patch the makefile). This could help sparc, since dynamic linking seems to be the main problem with it.

Still haven't tackled armv4l-soft yet. An externally imposed schedule on this project would be nice, my release date's slipped over a month already. (Feature creep is no excuse.)

August 3, 2009

Ok, job hunting. My resume is up to date, sent off a few emails to the recent "low hanging fruit" contracts, and pulled up several more things I haven't replied to yet... Now I'm at the "ok, what do I want to _do_" stage.

Obviously I could take any arbitrary 6 month contract that paid well enough, but I'm tired of the career equivalent of one night stands.

I may also be getting tired of Linux. Not sure yet. I know I'm _frustrated_ with it. I'm taking my various "how Linux on the desktop failed" snippets and writing a longish hopefully definitive rant to post on my livejournal account (where drama belongs), but the point is, do I want to focus on embedded Linux or try something new? Embedded Linux is still more fun than regular Linux, and I'm really good at it, but it's still less fun than it used to be.

It's clear where the most _money_ is. I wouldn't get anything like my normal consulting rate doing things I'm not an acknowledged expert in. But money has never been the primary motivating factor in my career, and I can always become an expert in new things. Learning and growing is better than being stuck in a rut...

I used to like gui development at Quest and WebOffice and such (and before that on OS/2, under DOS, even a little on the Sun workstations at Rutgers). I'd enjoy getting back into that. Plus I have a lot of potential enthusiasm for learning Mac programming, currently stymied by my continuing lack of excuse for buying my own mac. (I accidentally left my backpack with my laptop in it at the restaurant at lunch, but they had it behind the counter when I went back for it. I actually had mixed feelings about this. Yes, Fade and Mark have macs I've been able to use, but it's not the same as having your own dedicated mac laptop with you all the time.) A day job working on macs would be just about ideal for both goals, but there's a chicken and egg problem getting one.

I suppose I could look for a job working in Lua. Or Python 3.0. Those I've halfway learned already and would love an excuse to really bear down on. (I learn really fast when motivated, I've just been unmotivated for months now, due to falling out of love with Linux.)

The fun jobs tend to be the ones where other people find me, which is never entirely controllable (often coming in floods where I turn people down followed by droughts where nobody asks), and hasn't happened much the past few months (possibly due to the recession, possibly because I haven't been going to conferences and posting on lists much so I'm not out there for them to find). I've only even been updating my blog somewhat intermittently.

The downside of most of my technical contacts and resources being through the internet is that the work I know about tends to be happening hundreds if not thousands of miles away. I've moved for jobs (spent a year and a half in Pittsburgh for TimeSys, and Oxford tends to come up with per-diem contracts, which means you get a tax break for living way away from home), and had various other half-hearted offers I haven't really pursued in Canada and California and so on. I have cats, we can't just pick up and move easily.

I keep trying to get a master's degree (on and off, last time was 2005) because of the various types of work I've enjoyed, teaching seems to offer the best mix of fun, stability, and money. You don't have to worry about finding a new project twice a year, it doesn't have the starvation wages of most writing gigs, it doesn't go into overtime crisis and eat your life regularly... But it needs credentials, not just the ability to do the job. I need to sit down and spend two years of prep work, and it's hard to focus on that when there's so much else I could be doing. (If I enjoyed the classes I wouldn't mind so much, but so far graduate computer science puts me firmly in ivory tower academia where what I'm learning is _useless_ and I know it.)

I should get back to work poking at powerpc and fixing armv4l-soft for FWL, but as with toybox, it's increasingly hard to _care_. Bad sign, that. I need something that matters to _other_ people for a bit, so I can borrow their enthusiasm.

August 2, 2009

Poked at powerpc a bit, trying to get a kernel that works with the current qemu-git, rather than a months-old repository snapshot that fell between the bugs in different release versions. Aurelien Jarno was kind enough to send me the .config of a system that's booting for him, based on debian's powerpc kernel, but as with most distro .configs it's enormous, has 8 gazillion modules, and expects you to have an initramfs find the actual boot device so some fairly vital drivers aren't statically linked. (And for some reason, console=ttyS0 doesn't work with this even though I got at least that much working with current -git, now it's ttyPZ0 for some reason. *shrug* Ok?)

Progress has been made, but it's one of those whack-a-mole evenings. Might get more done at the laundromat this evening.

As of August 1st, I'm officially job hunting again. Monday I should actually _do_ some.

August 1, 2009

A day of fail. (I accomplished nothing, and will now describe it in detail. Skipping this entry might be a good idea if you value your sanity.)

My attempt to deliver Fade her morning can of coffee, heated up, resulted in smacking her in the nose with it and leaving a noticeable scratch and bruise. Then we went to the UT informal classes registration desk, and found they've adjourned for two weeks until the semester starts. (This is the class that had already filled up the previous three times we tried to register for it, and which I registered for on Thursday.)

Fade wants to learn Flash programming and the normal Flash authoring tools for mac are insanely expensive, so we went to the campus computer store to try to exercise a student discount. They're closed for the weekend.

Next up, trying to figure out why her white macbook's battery suddenly stopped working at sword camp. The battery tester button on the bottom lights up all five lights green, but the laptop itself is convinced there's no battery. So we went to "Happy Mac" on fifty-third street and they said that white macbook batteries commit seppuku all the time, and the problem was so bad they used to have a web page to get a free replacement, but it expired. He borrowed a battery out of another laptop he was working on and booted Fade's laptop with it fine, so the problem is clearly the battery, but he didn't have a spare to sell us. (He just does repair, he has to order parts.)

So we headed to the Apple Store, up in The Domain, for which the phrase "die yuppie scum" was invented. (It used to be IBM's main austin campus, back when things like PowerPC chips were manufactured in the US. These days, it's a bunch of retail stores underneath apartments that are _way_ too expensive to afford if you work retail. Doesn't make any sense to me either.)

It turns out that the pretentiousness of Apple Stores and the pretentiousness of The Domain stack, to the point where I wanted to beat the guy who asked us if we had "an appointment with a genius" to death with his own clipboard within ten seconds of meeting him. He actually did the looking down his nose thing at us, with a snooty accent. His performance literally rendered me speechless, which means I didn't tell him that according to the original Lewis Terman definition from 1916 I made it with 4 points to spare on the last IQ test I remember the result of, which was when I was 10. (I also made it into Mensa, which is where I met Kelly and Steve, although I haven't paid dues since the late 90's so my membership's long expired.) I also didn't say "I don't have an appointment with your trained monkey doing tech support in a retail location, but I just wanna either buy a battery or get in on the replacement program". (Not that either reaction would actually have been helpful, but DUDE. My reaction was strong enough I almost got PHYSICALLY ILL.)

The store was also overwhelmed. The guy broke character long enough to tell us the Barton Creek Apple Store is "closed for remodeling", so The Domain one was packed wall to wall, and we more or less ran for our lives.

I note at this point that Apple has some strange Hippie/Yuppie alliance going on. (The Happy Mac guy was a young not-quite-hippie, very friendly, quiet little shop full of electronic clutter and a dozen half-assembled machines including an old "Karate Champ" upright arcade console. Happily diagnosed our problem without charging us, and apologized for not being able to fix it. I was sorry we couldn't give him our business.) The thing is, the hippies provide the technical expertise and the yuppies provide the money. I think what I reacted to was snooty yuppies with pretensions of technical competence. (Yes, Apple has it, but not so much Apple _stores_.)

Next stop was Fry's, which had a nice little shelf of batteries with a neatly labeled empty space for the one we needed. They had every other type of Apple battery in stock, just not this one. And according to their tech, the respawns are fairly random. Fry's was also swamped, Fade's theory was back-to-school shoppers, but the demographics didn't match up. (By this point, Fade was Unhappy, and by the third set of synchronized slow walkers side by side blocking the whole aisle for no readily apparent reason, I was getting a bit annoyed myself.)

Stu called while we were heading to lunch (Chick-fil-A, also packed solid with a long line at 2pm, no explanation for that either) and suggested we try "batteries plus" on the northeast corner of Burnett and Anderson, and we tried that next. Their computer said they had the battery we needed ($150, ouch) but their actual inventory did not contain such a battery. They were apologetic. (They also have battery disposal/recycling bins, which is good to know.)

At this point, we cut our losses and went home. Six locations, three goals, zero progress.

July 31, 2009

Yesterday, twitter linked me to two articles. The first was about how Borders is forcing its employees to sign updated contracts forbidding them to blog. The second was about Barnes and Noble offering free Wi-Fi in all its locations. I find the juxtaposition interesting.

Fade thinks that the reason Borders has been imploding is because they lost their focus on books years ago by distracting themselves with music, and drove their book customers away. The rest of their flailing merely adds insult to injury. I personally see a pattern of pervasive stupidity, but then my friend Chip back in New Jersey used to work at a Borders and he was amazed they'd survived as long as they had.

Aubrey's surgery went ok, she should be ready to go home around 6:30 or so. (It's a 24 hour vet, so we can pick her up any time after that.)

Trying to associate my cell phone with my new bluetooth adapter. (Lost the old one, went to Fry's a day or two back to get a new one.) Hitting the "oh wow Ubuntu's bluetooth support sucks" event horizon again.

Attempt 1: it saw my cell phone, associated, but couldn't talk to the modem. Went into my phone and told it to forget all the old devices. Attempt 2 saw a half-dozen other devices in the area, but not my phone. Obviously it needed to re-associate with the phone, but the laptop thought it already was associated and the phone wasn't. Went in to the bluetooth GUI and found where the device association list was (this is progress; the Gnome and KDE guis for this didn't have such a list I could ever find), removed it, went back and ran the association wizard again: and now it's seeing no devices. Unplugged and re-plugged the USB bluetooth thingy, still seeing no devices. (No indication that it's scanning for them, no progress indicator, just a window that previously, after I let it sit for a while, had stuff magically appear in it. Now nothing is magically appearing in it.)

Let it sit there for five minutes, nothing appeared. I don't think it's scanning.

I need a mac.

Ah, if "visibility" is on in the laptop, it can't scan for devices. That's kind of stupid. Ok, now it's found the phone again, re-associated with it, and there's no old one of the same name to confuse things. So, only about three attempts to make it work, that's greatly improved, it merely sucks now.

July 30, 2009

The text mode dvd install went without a hitch. So the older infrastructure left over from the days before Red Hat spun off Fedora works fine, and the shiny new livecd which is the only thing the fedora project points you at doesn't work at all.

I note that the text mode install never let me set the system name or select packages to install, so it's lost some functionality since the old days, but oh well. It gave me something that boots. This is progress.

The Fedora 11 boot has three overlapping progress bars, presumably because it made the boot asynchronous to try to make it suck less. (Rather than trimming out gobs of unnecessary stuff, run 'em in the background. Then block and wait for them to finish. Right.)

The sad part is that all three progress bars have hit the right edge after 58 seconds, at which point the install sits there for multiple minutes, unchanging, before suddenly resizing the window to go into graphics mode and start bringing up X11 and the desktop. (Well, that's what the livecd does, the disk install is giving me a text mode login: prompt.)

Why even _have_ a progress bar that spends more time waiting at "done" than it did advancing? Rehan showed me that you can hit escape here to replace the progress bar with the older list of unnecessary things Fedora is actually doing, such as launching exim, complaining about the "system message bus", setting firewall rules... So once again they've replaced "ugly and overcomplicated but functional" with "pretty, hides the complication without actually removing any of it, does not actually work".

Anyway, I have an installed system. Login as root and see what's installed:

[root@localhost ~]# gcc
-bash: gcc: command not found

Of course not. That would have been too easy.

Wow, I'd forgotten how utterly crappy installing rpms from an iso actually was. It doesn't autoresolve dependencies, it says things like:

error: Failed dependencies:
	kernel-headers is needed by glibc-headers-2.10.1-2.i586
	kernel-headers >= 2.2.1 is needed by glibc-headers-2.10.1-2.i586

And then you just add more and more stuff to the rpm -i command line until it stops complaining. Each new thing you add has its own dependencies, which it'll complain about, and eventually, without warning, the whole mess will either just install or die with some strange internal error.

Luckily I remember from my Red Hat days (which ended back around 2003) that the naming is pretty consistent. For non-lib packages use kernel-headers-[tab], and for library packages strip off the lib, use the library name, add a dash, and tab complete. (Doesn't _always_ work, but it's a good first guess.)

I have no idea why it's complaining about:

warning: gcc-4.4.0-4.i586.rpm: Header V3 RSA/SHA256 signature: NOKEY, key ID d22e77f2

It's both complaining there's no key, and giving a key id? Um... Ok. (Only package it has problems with, and this is Red Hat's own DVD iso. Oh well. It's not like I'm trying to use this thing long-term, just reproduce a bug with it.)

Heh, I note after all those insanely verbose init scripts, it didn't bother to run "dhclient eth0". But it set up exim by default (I vaguely recall that's an email server or something). Ok, stop making fun of the insane defaults, point established, moving on...

Installed gcc and binutils (which sucked in a dozen other packages). Installed wget. Ooh, Red Hat hasn't got "which", and my scripts react badly to that lack. (Dumped symlinks to "as" and such all over the current directory, that's bad. It didn't abort the script because the call to which was embedded in a "$(blah)" string, so it didn't set an error code I could test for.) Maybe I can move the busybox build up so which is already there? (Presumably they test on Fedora already...) Of course I need to install "make" before busybox will build...

Huh, that could explain the bug Rehan is seeing. The symlinks to the host toolchain were missing and the build didn't abort, so once it adjusts the $PATH to exclude the host tools the compiler isn't there, and thus can't make executables...

Yup, that's it.

Unfortunately, Aubrey's 4pm trip to the vet to get the lump on the back of her neck checked involved the words "malignant", "aggressive", and at 7am tomorrow will involve the word "surgery". I'm still hoping to avoid the word "metasticize".

Not really in a programming mood this evening.

July 29, 2009

So Rehan Khan posted a bug report to the FWL mailing list, which boils down to failing under Fedora 11 because the distcc ./configure stage says gcc can't build usable executables (even though we're using the host toolchain at that point). I haven't used Fedora in years, so my first step trying to reproduce this is to install Fedora 11 under qemu.

Got Rehan on irc and confirmed that it's the 32-bit version, which seems to be the only obvious option on offer anyway. A livecd that acts as an installer. Ok, download that and fire it up under qemu.

Booting Fedora from the livecd takes about 7 minutes to give me a desktop. If you press escape you can watch it slowly set up LVM and user quotas, oblivious to being a livecd where such things are hilariously inappropriate. You can also watch the init scripts fire up the CUPS network printer daemon and NFS and so on, when the association with the wireless network is done by a GUI desktop app these days so it's launching servers before it has a network. This is a distro that clearly can't decide what it wants to be, so it tries to do everything, at once, very slowly.

Once you've got a desktop, you can double click on the installer whereupon it takes about 5 minutes to grind through painfully slow dialogs to the point it gives you a page which doesn't explicitly tell you it didn't find any hard drives, it just shows you an empty list and asks you to select from it. Some of the selections are what to _do_ with the nonexistent drive, such as "use entire drive", "replace existing linux install", "use free space", and so on. Note that you specify what to do with the drive before you select the drive to do it with, to add an extra little bit of confusion when the drive list is empty.

The reason it didn't find any drives is due to a qemu bug: current qemu-git only notices the first drive listed on the command line, and installing an OS requires both an -cdrom and an -hda argument. I git bisected it to 9dfd7c7a00dd within the qemu repository, mentioned it on the qemu list, and got pointed at a fix. This was quick and relatively painless: if I put the -hda before the -cdrom it either fires up the bootloader or it doesn't, within seconds of launching qemu, so reproducing is easy. And if you build qemu with ./configure --target-list=i386-softmmu it doesn't take very long, and you can even run ~/qemu/git/i386-softmmu/qemu without installing it first.

The next bug is that Fedora 11's mouse pointer wouldn't move in the current version, but in the one I booted earlier it did. I naturally assumed it was another qemu bug, but this was slow to bisect because it only shows on the desktop, so the test takes longer than the compile. Nevertheless, I very slowly bisected that one to a revision that couldn't possibly affect it (renaming a config symbol), and on a hunch booted the same qemu version three times and got the mouse frozen twice and working just fine the third time. So it's probably not a qemu bug, it's a Fedora bug. Of course.

It's not the only Fedora bug. Fedora seems to be constructed entirely out of bugs. Attempting to install Fedora from the livecd is an exercise in frustration. What a piece of garbage. I started with a 1 gigabyte virtual drive, and it complained it needed 3072 megs to install. Ok, make a 4 gig disk image (dd if=/dev/zero of=fedora-11.img bs=1M count=1 seek=4095) and reboot (7 more minutes, plus 5 to get back through the installer)... and the auto-partitioning ate itself. I think it wanted to create a swap file bigger than a gigabyte, when I was already feeding qemu 512 megs of ram to make sure it wouldn't have problems without _any_ swap. (For comparison, I first booted Red Hat 5.2 with 16 megs of ram. It ran fine.)

Checking the checkbox that I want to review and edit the partition layout apparently does nothing, you have to select custom partition layout from the pulldown. Ok, do that, new partition, mount at /, the default filesystem to format it with comes up as ext4 (huh, that's new) so accept that, hit next... and half a minute later I get a pop-up saying that the boot partition can't be ext4. (On irc, Rehan says grub doesn't support it yet.) Ok, back up, go through it again selecting ext3 this time, save it... and now it says that the root partition must match the "install image" type, which is ext4.

I hate Fedora.

Ok, presumably the "boot partition" above is /boot, and the root partition is /, so I need to make two partitions and format them with different filesystems to humor the hideously overcomplicated Fedora 11 installer. Go through the custom partition gui, make a /dev/sda1 that's 200 megs and ext3 and mounted as /boot, make a /dev/sda2 that's the rest of the space and ext4 and mounted as /, hit next, tell the stupid pop-up that I know I haven't got swap and I'm ok with this, tell the second pop-up that if I didn't want to write the partition layout to disk I wouldn't have hit "next"... It grinds for a bit... And it pops up an error message saying it couldn't save the partition layout. There's a "details" pulldown which says... that it couldn't save. The details provide no additional details. Great.

Ok, fire up a terminal window, become root (sudo /bin/bash doesn't work but su - does), run fdisk /dev/sda... It's managed to save the first partition but not the second. How do you do that? I created two primary partitions, and the primary partition layout is a single structure in the boot sector of the hard drive! It's less than 100 bytes for all four of them combined, how do you get something that small _partially_ wrong?

Ok, kill the installer, use fdisk to create the partitions by hand, fire up the installer again, 5 more minutes of grinding, tell it to use the existing partitions but format and mount each one appropriately. Re-confirm the various nanny dialogs (wow the Fedora UI is fond of pop-up dialogs)... Yay, it's started formatting. Five minutes later it's still formatting. Fifteen minutes later, it's still formatting. This is the 200 megabyte ext3 partition, even under qemu it should take _maybe_ 10 seconds to format.

Let's go back to that root window and run "top". That says 83% of the Fedora system CPU is being consumed by "user", and the biggest user is X11 with 4%, and just about everything else is idle. Lemme guess, selinux is enabled, isn't it? Of course it is, this is Fedora. So root is even more crippled than a normal user. (Top doesn't show any processes that don't belong to root, because selinux won't let it see them. Isn't that special?) Ok, new tab running as the install guest user, run top in that, and that can still see all those root processes but now top's fourth and fifth processes are running as the install user... and they're gnome infrastructure crap using about 1% of the cpu each at the moment. What's eating 80% of the cpu? Top can't see it.

Right, run ps and look at the list. There is a mke2fs process, which is a zombie. Is the formatting progress bar still there? Yup. Check again, the zombie is still there.
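(For anyone playing along at home: a zombie is a process that has already exited but whose parent hasn't called wait() on it yet, so killing the zombie itself accomplishes nothing; it's the parent that's stuck. Zombies show up with state "Z" in ps, so they're easy to list:)

```shell
# List any zombie processes: the state field starts with "Z".
# A zombie is already dead; the fix is to poke (or kill) its PARENT
# so it finally reaps the child's exit status.
ps -eo pid,ppid,stat,comm | awk 'NR==1 || $3 ~ /^Z/'
```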

I hate Fedora.

Ok, kill the installer AGAIN. Format the partitions by hand with mkfs.ext3 and mkfs.ext4 (takes about five seconds each)... And the gnome automounter pops up filesystem browser windows for them! I didn't ask for this, go away! Umount! Right.

On irc, Rehan points out a Fedora bug where certain types of video cards would cause the installer to hang while formatting drives. I stop to boggle: how do you get _that_ wrong? The video card is a piece of hardware the X11 server talks to, the install program is an X11 client. They're separate programs talking to each other through a library API. And the format itself is a child process of the X11 client. How do you build in an incestuous relationship which would _allow_ that to fail? The level of incompetence required to tangle those together and leak information between them so that a video card screws up your ability to format a hard disk is just AWE inspiring.

Ahem. Re-run the installer YET AGAIN. (By my count, this is the 13th time. Maybe it'll be lucky.) By about time 8 I was tempted to skimp on some of the details, such as setting the timezone to Chicago (there's no Austin setting) and leaving the system name as localhost.localdesktop instead of "tiffanyhat" (what better name for a nonexistent Fedora system than Tiffany Aching's virtual witch hat from Wee Free Men/Hat Full of Sky?), but I fill it all out anyway YET AGAIN despite the incredible slowness of this GUI. (Qemu was fast enough to play "Frozen Penguin" under a knoppix boot disk back in 2005 on my previous laptop. But it's not fast enough to comfortably navigate window dialogs in Fedora 11 running on a system that's more than twice as fast and feeding the emulator 4 times as much memory. Hands up, everybody who considers Fedora 11 an improvement over 2005-era Knoppix?)

At this point, the installer is no longer partitioning, because it can't competently do that. The installer is no longer formatting, because it can't competently do that either. (Here's hoping it can copy files! And install the bootloader, because I haven't installed grub by hand in a couple years and don't look forward to looking up how to do that again.) I still have to go into the manual partitioning thingy and tell it where to mount each partition, but I'm NOT MAKING ANY CHANGES so hopefully it... of course pops up a dialog warning me that it's saving the changes I didn't make. It's Fedora.

Yay, it's copying files! Leave it to that while watching "The Daily Show" webcast... And it died with another pop-up window. What this time? Couldn't mount /boot. It copied the files into root, and it couldn't mount /boot. The details say the reason was "input/output error".

Deep breaths. Deep breaths.

Rehan suggests yet again I use VirtualBox. This is A) not the problem, B) not happening. I'm not going to swap an emulator I'm familiar with (and can actually fix or get help with if something goes wrong) for one I've never used. Fedora has BOOTED TO THE DESKTOP with the livecd under qemu, if something was significantly wrong with the emulation that would have caused problems. No, these are Fedora bugs I'm hitting, switching emulators won't help.

Ok, screw this. The fedora install is too brittle to manually fiddle with anything. Every single step where it actually has to _do_ anything breaks. The only codepath that ever got any testing seems to be the fully automated one, and that needs a disk larger than 4 gigabytes. Ok. Kill qemu, zap the 4 gigabyte Fedora-11.img and create an 8 gigabyte Fedora-11.img. (I've got 40 gigs free, and it's a sparse file anyway.) Reboot (7 minutes), grind through the install to the part about hard drives (5 minutes), tell it to autopartition and "use entire disk", sit back...

And it's formatting /dev/sda1. It's been formatting that partition for 15 minutes. For a variant of "formatting" that does not involve the disk usage light coming on at any point. The barber pole progress bar (no idea how long it'll take, but here's a spinny thing!) keeps twirling, but nothing is happening. Been there, done that, not a new bug at this point. The autopartitioning codepath _doesn't_work_either_.

At this point, Rehan tells me that he couldn't get the livecd to work and he installed from the DVD image, using the text mode installer. Of course the download page makes no mention of a DVD iso, but it's there if you know to look for it. Apparently, the DVD has more install options than the livecd does, and doesn't need to talk to the network while installing.


I really, really, really hate Fedora. I'm going to bed now.

July 28, 2009

Back in Texas, but not home yet. At another chick-fil-a in... "Destin" or some such, just past the Oklahoma border and Lake Texoma. I am uncomfortable with most things ending in -oma, probably a holdover from when I was pre-nursing in college.

Once again, there is a lack of wireless internet here. Austin spoils us so...

Another fun little bug in xfce's Terminal emulator: when you "detach tab" from an 80x25 terminal, the new one ain't 80x25. It's usually more like 77x21. (Took me a little while to figure out why terminals were spontaneously resizing themselves so small that menuconfig would refuse to run.)

I've mostly stopped noting the random bugs/regressions/UI deficiencies in Linux/Ubuntu/Xfce anymore. Yesterday it froze and I had to hard reboot it and it was just another day. Windows users don't really comment on it either. Linux will never have a significant number of nontechnical end users, just a few holdouts like me remembering the glory days of when it coulda been a contendah, just like OS/2 or the Amiga. Linux runs the servers and routers and so on that Unix has spent the past 40 years running, like Solaris and AIX and Xenix before it, and the desktop-inclined of us turn our attention to MacOS X.

Pondering yesterday's chicken-and-egg problem with uClibc, it occurs to me that the library search paths don't get modified until _after_ the new compiler gets built. So currently, building uClibc first doesn't do any good: the one from the supplied compiler toolchain is the one that gets used.

In theory the reason for building it first is because the supplied cross compiler toolchain could link against a different libc. But in practice, the way we redirect the library search paths is to set an environment variable that points the wrapper script at a different directory, and if the supplied toolchain isn't using our wrapper script doing that early isn't going to help. (Plus I _think_ you can feed it multiple directories with a colon separated search path, but it's been a while since I tested that. It needs to check the new directory and then fall back to the old one, and throwing canadian cross stuff in here with two compilers, potentially both using our wrapper script, gets ugly. I'd need to put the architecture name in the environment variable name to distinguish between them, and since you never need to canadian cross to build anything _other_ than a compiler, I'd rather not add that complexity to the infrastructure for this one case.)
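The colon-separated fallback idea is simple enough to sketch in shell (the real FWL wrapper is considerably more involved, and the function and variable names here are invented for illustration): each directory in the list gets checked in order, and ones that don't exist just fall through to the next entry.

```shell
# build_lib_flags: expand a colon-separated directory list into -L
# flags for the compiler, silently skipping directories that don't
# exist (so a new libc directory can be listed ahead of the old one
# and the old one acts as the fallback).
build_lib_flags() {
  flags=""
  oldifs="$IFS"; IFS=:
  for dir in $1; do
    [ -d "$dir" ] && flags="$flags -L$dir"
  done
  IFS="$oldifs"
  printf '%s' "$flags"
}

# A wrapper script would then do something like:
#   exec "$REAL_GCC" $(build_lib_flags "$CROSS_LIB_PATH") "$@"
build_lib_flags "/tmp:/no/such/directory"   # only /tmp survives
```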

The alternative is to accept that the canadian toolchains we build will be built with the --disable-shared toolchains we build. I.E. the uClibc we build for the canadian toolchain will have the same binary API as the uClibc in the target toolchain supplied to the canadian build. That's true now, and if it becomes an assumption we can _rely_ on, then the uClibc build can move back to after the binutils/gcc build...

Darn it, except there's NO GUARANTEE THAT NEW TOOLCHAIN RUNS ON THE HOST. (Whole point of canadian cross.) So building a new toolchain that's --enable-shared and thus has soft-float support doesn't help us then build a uClibc that depends on soft-float, because we can't run it as part of the build!

I continue to hate the gcc developers. And cross compiling in general. I'm motivated to provide a sufficiently generic solution to the cross compiling problem that it GOES AWAY and nobody has to do any more of it. I'm aware this is difficult, but cross compiling is an inherently horrible problem. I've been doing this for years and tracking the multiple contexts still bites me repeatedly during the design phase, especially after a couple weeks away from the problem space.

Darn it, I know how to do this, and it's a special case. I don't want it to be a special case. Is there any way to make it NOT be a special case? I want a simple API, insane amounts of granularity, and a calling script that's simple to the point of being disposable. These goals conflict a bit. Hmmm...

Morning rush hour's over, time to hit the road again. Dallas should be navigable now. (Yes, I said that with a straight face, but I'm just trying to get through it and out the other side.)

Back home. Dinner with Mark and Fade. Feeling somewhat unwell now, no beverages in the house, off to The Donald's for beverage and programming time. I should download my mail and a fresh Fedora 11 image to reproduce that bug report with first... Ooh, and the bbc news podcasts.

July 27, 2009

I have half a can of "bad girl" energy drink left in my car. It's pink, and tastes like bubble gum. I suspect my sister would rather I _didn't_ mail a case of it to my niece.

Firing up Firmware Linux and looking at the code again. Where did I leave off? Fixing armv4l, that's right.

To get armv4 working again (an example soft-float target), I've been poking at adding the soft float stuff to libc.a like I was doing for 4.1.2. This is because gcc only puts soft float support in the shared libgcc and libgcc_eh.a, and for some reason it doesn't build libgcc_eh.a for the --disable-shared version. Yes gcc is clearly broken, but patching each new version is a pain. The patch I have for 4.1.2 doesn't apply to 4.2.1, and trying to unwind its pathological makefiles already ate about three evenings with no obvious progress. (Horrible, horrible code.)

I've been looking at genericizing the uClibc miniconfig stuff, comparing the various target configurations. The uClibc configurations are mostly identical, which makes sense since they started out with a single hand-rolled one and got modified for each target. However, they do vary a bit.

The i586, i686, and x86_64 targets use the small MALLOC version, the others use the full one. The armv4eb target has UCLIBC_LINUX_MODULE_24 switched off, which is the old 2.4 kernel module support that doesn't apply to the 2.6 kernel anyway. The armv5l and armv6l targets have several additional symbols switched on: DOPIC, LDSO_LDD_SUPPORT, UCLIBC_HAS_ADVANCED_REALTIME, and DOSTRIP. Mark added those targets. And sparc has FORCE_SHAREABLE_TEXT_SEGMENTS, I remember needing to add that but forget why.

Other than that, the TARGET_blah symbol for each target (and corresponding CONFIG_586 and similar sub-symbols), ARCH_WANTS_LITTLE_ENDIAN, and UCLIBC_HAS_FPU vary per target, as you'd expect them to.

There are also ARCH_HAS_MMU and ARCH_USE_MMU variations per target, which is weird because none of the targets are NOMMU yet. Possibly some of that is miniconfig filtering out symbols that can't be switched _off_ for various targets, and thus there's no reason to record the symbol because it doesn't matter for that config. (I'm not quite sure what the difference between the two is, actually. I can understand the kernel caring about the difference between having an MMU and using it, but I don't think userspace does. The test/build/ script has a test "Make sure nothing uses the ARCH_HAS_MMU option anymore", so apparently USE_MMU is the only one that matters? Except it hasn't been removed from current -git... I'll ask on the list.)

The actual breakage in FWL is in the first uClibc build. Right after building a --disable-shared compiler, it builds uClibc with that new compiler so it has a C library to make a toolchain that can build actual executables. Building uClibc with a compiler that hasn't got _either_ hard or soft floating point support breaks, because the uClibc config for each target enables UCLIBC_HAS_FLOATS (and even DO_C99_MATH), which doesn't work on a no-FPU platform without soft float.

It looks like the _easy_ way to get armv4 working again is to build a stripped down uClibc just for the --disable-shared cross compiler, one which doesn't have floating point support enabled. If I factor out the target-specific bits of each uClibc miniconfig and append the generic ones to them, I can supply a slightly different generic bit based on context, hence looking at what would be involved in factoring all this stuff out. Then you can canadian cross an --enable-shared cross compiler right afterwards, and build the rest of userspace with it. (This would make armv4l require the canadian cross toolchain step to build its root filesystem, but it would avoid patching gcc.)
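Factoring the common lines out of a pile of per-target miniconfigs is mechanical enough for sort and comm to handle. A sketch of the idea (the config symbols and filenames here are toy stand-ins, not the real target files):

```shell
# Two toy files standing in for per-target uClibc miniconfigs.
cat > target1.config << 'EOF'
UCLIBC_HAS_FLOATS=y
TARGET_i686=y
DO_C99_MATH=y
EOF
cat > target2.config << 'EOF'
UCLIBC_HAS_FLOATS=y
TARGET_armv4l=y
DO_C99_MATH=y
EOF

# comm needs sorted input; -12 keeps lines common to both files,
# -23 keeps lines unique to the first.
sort target1.config > t1.sorted
sort target2.config > t2.sorted
comm -12 t1.sorted t2.sorted > common.config       # the generic bit
comm -23 t1.sorted t2.sorted > target1-only.config # target-specific bit
```

Appending a context-appropriate generic bit back onto each target-specific file then reconstructs a full miniconfig.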

Darn it, no it wouldn't. When you're doing the canadian build of the --enable-shared compiler, it tries to do the uClibc build with the existing cross compiler, _before_ doing the new --enable-shared gcc build. (The canadian shared cross compiler build follows a different sequence than the simple cross compiler build, and it builds more stuff.) It _has_ to do this, so gcc has a libc to link against. So you still need soft float working before you can build uClibc, and you need uClibc built before you can build the shared compiler, and soft float is only enabled in the shared compiler. Throwing in the canadian cross stuff just makes the dependency _circular_. Great.

I really, really, really hate the gcc build. This would be _trivial_ if I could tell it "build libgcc_eh.a but not the shared libgcc", but of _course_ that's not orthogonal. Sigh.

I'm at a Chick-fil-A in Kansas City. It has the same layout as all the others (so I reliably know where the electrical outlets are, and can recharge my laptop a bit), and due to some promotion they're running they even gave me a free extra chicken sandwich. But although there's a linksys router here it's not actually routing. Oh well, can't have everything.

Kansas City is actually in Missouri, which is abbreviated "MO". Presumably either Michigan, Minnesota, or Mississippi got MI, but using MO for Missouri seems unfair to Montana. So far, based on my general experience with Kansas City (until finding a national chain restaurant I happen to like, anyway), they should have used "FU" or "NO" as the abbreviation, or perhaps the kind of punctuation associated with swearing in newspaper comic strips. It wasn't rush hour when I arrived in town, but it became rush hour an hour or so later, while I was still stuck in town, and I'm happily sitting here in this chick-fil-a waiting for rush hour to subside.

July 26, 2009

I meant to leave either last night or this morning, but I wanted to help my sister clean her apartment first. (Her kids helped her "unpack" by dumping the contents of boxes on the floor when she moved in a few weeks back.) It's hard to keep _up_ with the mess made by 4 kids, let alone tackle unpacking, and this weekend her husband had the kids so we could clean.

Yesterday we both slept late (we needed it), then picked up Ian from his father's and drove to Minneapolis for the D&D game, and got back late (it's about an hour and a half each way). So we didn't get any cleaning done.

Today, Steve called and said he was having an attack and couldn't take care of the kids, so Kris had to pick them up around noon. Shortly after the breakup Steve's new girlfriend Lara got him to go to the doctor and get diagnosed with Multiple Sclerosis (a mostly non-progressive variety of it that he'll have for life but which is very unlikely to be what kills him first). He's also on anti-depressants, so whether it's MS acting up or depression because his girlfriend is out of town this weekend is an open question.

The fallback plan if Steve wasn't feeling well had been that he could take them down to the public pool and watch them while they swam, but he wasn't even feeling up to that. Kris had to come rescue him from his kids, so we didn't get time off to clean. (Did I mention that if they divorce, he wants custody of the kids? They live with their mother almost all the time and he physically _can't_ watch them even for a weekend, but that doesn't enter into it apparently. They didn't even get fed until Kris picked them up at noon. I told her to email her social worker with today's story. Steve still doesn't seem to believe she's going to divorce him.)

Thus my departure was delayed. In the hour or so we had, I helped Kris reorganize her storage closet so she can Ebay her sewing pattern collection for extra money, then Kris went and picked up the boys (finding out when she arrived that the kids' local grandparents, Steve's parents, had already come and picked up the girls, who are generally much better behaved and easier to watch). Ian went off to his corner with his X-box, and I took Sean down to the dollar store.

Ian saves money to buy big things; he still has $20 his grandfather gave him which he was saving towards an Xbox, and is now saving towards Halo 3 if he can ever convince his mother to let him get a T for Teen game. Sean immediately spends any money he gets at the dollar store, and he'd accumulated $3 in change that was absolutely burning a hole in his pocket and needed to be spent NOW. Kris gave me directions to the dollar store we'd been to earlier in the week (where Carrie got her "Princess" balloon), but Sean wouldn't ride three blocks without his car seat and it couldn't be installed in the front seat (the back is full of travel stuff), so he got his bike and I walked while he rode. It was good exercise. Boy am I out of shape. (A block is a bit longer in New Ulm than in Austin.)

This gave Kris time to clean, and I did what I could to help when we got back. The living room floor is now clear, and the hallway to the bedrooms, which is _progress_.

Finally got on the road at 6:15 pm. Sean came out to hug me one more time on the way out. I love the little... ahem. I totally don't understand him, but he's my nephew.

Minnesota is full of corn and soybeans. The entire state. The entire time I was here I saw exactly one wheat field (and it was some kind of dwarf wheat a foot high but fully ripe and ready to harvest) next to my sister's public storage unit. (The big one offsite, not the closet.) Everything else: corn and soybeans.

Kris says that the money's in corn, most profit per acre (as cattle feed and due to really stupid ethanol subsidies), but crop rotation is important and soybeans are nitrogen fixers so they're the most profitable other crop you can rotate in. The entire state appears to be owned by farming conglomerates that min-max everything. The munchkin approach to farming. (Oh, and it's all genetically engineered stuff with little labels at the edge of the fields saying what it is. They come right up to the edge of the road, too. Not a square foot wasted.)

The other thing Minnesota is full of is anti-abortion billboards. Apparently you can't get one around here, but they still feel the need to rail against it about twice per mile. (In New Ulm the blonde, blue-eyed inhabitants also refer to President Obama as "the nigger", which takes rather a lot of the small-town charm out of the place. Kris didn't choose to move to this place, Steve did, and now the kids' friends and school are there. Sigh. This is the kind of place Sarah Palin and co were attempting to appeal to.)

Passing the border to "I'm from Iowa, I only work in outer space", and the terrain suddenly changes. Now there's some actual nature mixed in, and crops other than corn and soybeans. Still lots of anti-abortion billboards, and ones claiming to be from God (who buys billboard space, apparently, and needs that advertising channel to communicate with people).

The "you're about to run out of gas" light came on during a stretch of I-35 surrounded by corn and the smell of cow manure, and I pulled off at the next exit (something like "Smithville") to find what claimed to be a town but wasn't really, a mile off the interstate down a small unlit road. The guy in the first mobile home I came to ("Nascar fans here", the sign said) said that the town didn't have its own gas station, and the closest gas was either 15 miles down I-35 the way I was going or 7 miles back the way I came. Fifteen seconds later his brain caught up and he it suddenly occurred to him "or you could use our gas station", and gave me directions to it along various dirt roads. I thanked him, and got back in my car. (Nice guy. Very polite, in a "yes officer I am sober" sort of way.)

Said gas station was 5 miles away (in the next town over), the car made it, I got gas, heaved a huge sigh of relief, retraced my steps, and headed back down I-35 with a full tank. The next exit along the highway, about 3 miles into the "15 miles that way", was labeled with the name of the town I'd just been to, and had a sign for gas of the brand I'd just gotten.


Found the Iowa "welcome center" sometime later. It's a little over 1/4 of the way into the state according to the map, which seems a strange place for a welcome center but it has wireless internet and an electrical outlet so I'm not going to quibble.

July 25, 2009

Spent most of a week at my sister's new place in Minnesota. Today's entry is entirely about that, I haven't seriously poked at any programming things since leaving Austin.

Due to her impending divorce, my sister recently moved into subsidized "Section 8" housing (yes I keep thinking Klinger from MASH), and just got approved for food stamps. This is my sister. I need to go back to full-time work and earn a LOT of money and send it to her.

The backstory is she got postpartum depression after her fourth kid, and instead of getting her any sort of help her husband found a new girlfriend, the wife of some guy who's in the military deployed to the middle east. When Kris (my sister) dug her way out of it (after about a year), she told Steve (the husband) to break it off, but he wouldn't, so they're separated.

As a stay-at-home mother living in a tiny rural town an hour and a half from Minneapolis, surrounded by corn and soybean fields, Kris' career prospects kind of suck at the moment, even without taking the need to care for 4 kids into account.

He does not expect her to divorce him, what he really wants is one wife to take care of his kids and another for fun. If she does divorce him, he wants custody of the kids and not to pay any child support.

Anyway, I have two nieces and two nephews, which makes four "relatives there is no convenient grouping term for". Kris suggested "Siblings' offspring".

The oldest is Ian, and the next oldest is Sean, and they have a dynamic almost identical to me and my younger brother. Geek vs non-geek. Except where I was hyperactive with borderline Asperger's and went largely undiagnosed and unmedicated, Ian's probably got full-blown Asperger's. (The drugs his school prescribed made him a zombie, so he's off 'em again. I tried caffeinating him, which hits the exact same neuroreceptors as ritalin but is milder and easier to control the dose. It seemed to help, but he can taste caffeine the way I used to smell celery from the next room. He thinks chocolate penguin mints are nasty. During today's D&D 4E game, after about the sixth time the GM reminded him that it wasn't his turn (which was around the 20th time they asked him to quiet down), I found out he'll drink black tea with enough sugar. Admittedly then his system's full of sugar and the result isn't exactly "calm", but afterwards he became about the third most frequently admonished of the group of five pre-teens, rather than outdoing the rest of them combined by about a factor of 2.)

Oddly this makes Ian the nephew I understand the most. He started my visit by trying to get me to read a Star Wars expanded universe book, then telling me at great length about the fanfic he was writing (for a definition of "writing" that does not involve actually writing any of it down) based on the video game "Halo". (Ok, that's an oversimplification: there's a website full of Halo fanfic which Ian's read all of. He's writing fanfic for his favorite stories in that. So technically, he's writing fanfic for fanfic based on a highly derivative video game, and it is TOTAL mary sue.) Then it segued into the "who would win" questions about his lego figures recreating various halo units. He was jumping up and down in place pretty much the entire time. That was just the first hour of my visit. The bouncing and yelling continued pretty much constantly for about three days, except when he target locked and paid such close attention to something (lego catalogs, the eeepc I sent so Kris would have something to web surf and read email with) that you couldn't get his attention without physically grabbing him.

Ian's 10th birthday is on the 29th, so I got him an Xbox 360 and Halo Kitty "Halo Wars", which is apparently Command and Conquer with a shorter story but significantly better graphics. (It's what he wanted. I feel dirty for voluntarily giving Microsoft money for the first time since 1992, but oh well. I note that the Xbox 360 at home was A) used, B) purchased by Fade not me. It's a subtle distinction, but I stand by it.) Ian mostly spent the rest of my visit playing Halo Wars, either bouncing up and down in his chair and chanting about what he was doing, or target locked so strongly he was barely breathing and didn't notice the rest of us. Aah, memories...

Sean I don't understand, but then I didn't understand my brother either. He's a social butterfly always surrounded by a group of friends, and he plays sports. And he seems to be the emotionally neediest of the four by a large margin. He's deeply jealous of Ian's Xbox (birthday present status notwithstanding), and spent the rest of the week either crying about it or trying to tease Ian with various things he bought at the dollar store. (Sean's 8th birthday is in November sometime, I'll have to send him something just to balance the scales.)

Sean doesn't read yet. He apparently can (at least at the sounding things out level), he just shows no interest in it. I need to send him my copies of Robert Asprin's "Another Fine Myth" series, which are easy reads (ok, total popcorn: I finished three of them in an afternoon when I first got them, I think I was about 10) and are what finally got my brother reading for fun.

The next down the list is Carrie, who is 3 but comes across as 5 or 6. (She's _enormous_ for a 3 year old. Ian and Sean look a year or two older than they are but Carrie's twice the size and maturity you'd expect. She may grow up to be Sigourney Weaver.) She wears pink, everything she owns is pink and half of it proclaims she's a princess (most recently a mylar balloon she absolutely adores), and she manages to vamp and flirt at age 3. It's kind of impressive. Otherwise, she's a complete tomboy constantly covered with dirt and engaging in various athletic activities. So basically, she's a tomboy in a pink dress that says "princess". Kris expects Carrie to grow up to own a pink motorcycle, with pink leathers. (Possibly an 18th birthday present.)

I tried to spend time with her, but she's the most self-sufficient of the lot and my nephews outright demanded my attention the whole time. Sean is emotionally very needy and Ian's just plain a handful. And then, of course, there's the baby, Sam.

Samantha is no longer a baby, and hitting the far end of "toddler", but this is the first time I've made it up there since she was born. She's 1 and change (her second birthday's at the end of October I think), but entered her Terrible Twos shortly before I arrived. ("Do you want to say no?" "No!") She's been speaking in full sentences for months now (although it's hard to tell sometimes because her pronunciation's on par for her age, not much you can do about mouth shape). In general she seems every bit as precocious as Ian or Carrie, but she didn't remotely trust me until about the third day so I couldn't tell much about her. She warmed up to me on the third day at the park when she brought her half-deflated kickball (it's _her_ ball) and wanted to be goalie in the little league soccer net while I threw the ball at her so she could block it. (Sean insisted she couldn't use her hands, because it was soccer. He's 7, she's maybe 1 and 3/4.)

Sean is apparently the least smart of the 4, which makes him about average, possibly a little above. Ian is VERY smart and bored out of his skull by school (which is painfully familiar to me). Carrie isn't in school yet, but easily could be. Sam managed to enter the Terrible Twos several months early, speaking full sentences and holding reasonably coherent conversations (although she has to stop and think before answering a lot of questions and you can see the "wow, I never thought of that!" look go across her face quite frequently).

This is probably a contributing factor to Sean being... well... like my brother. I don't want to say "emotional black hole", because that's not it and it's pretty much expected of 7 year olds anyway. He's different from me, and like my brother: he attracts groups of followers from the local children, and he picks and chooses who is "in" and who is "out", and they do what _he_ wants to do (which in a couple explicit cases during the week weren't what they wanted to do, but he won the arguments because his approval was the goal.)

It's not just that all of Sean's social interactions are amazingly shallow and petty, either, because he's 7 so that's normal. Ian's excitement over his mary sue fanfic is about what you'd expect from a 10 year old ("she has a 100 mile long ship that can blow up ANYTHING... well, except suns and bigger"), and that doesn't bother me because he's enjoying it and it's a stage you have to go through to get better at writing anyway. (I did encourage him to actually write it down, although that was partly self defense so he'd stop telling me about it.)

Possibly it's that Ian and Carrie and even Sam have their own little projects to pursue, but everything Sean does is relative to "beating" other people. For example, they all share one computer (the eeepc I sent, although they may get one of Steve's castoff desktop systems soon). I got it for Kris, but Ian tends to spend more time on it than anyone else. (I meant to teach him C, but there just wasn't time.) Sean and Ian share a bedroom (the same way Carrie and Sam do, it's a 3 bedroom apartment with 5 people in it when I'm not there.) The point is, everybody tends to wake up at the same time.

One morning, Sean grabbed the computer as soon as he and his brother got up. I noted that Ian had been reading "order of the stick" (a webcomic I introduced him to the first day), and we'd accidentally lost his place yesterday, so could he open a new tab? Sean immediately closed the window and giggled. Then he insisted he needed to look something up on Google but "couldn't remember what", and needed help. He sat there at the Google screen for several minutes, holding the computer but not using it, while talking about how he was using the computer. He then went to the bathroom as soon as Ian was done with it, forgetting to take the eeepc with him, and Ian grabbed the unattended computer off the kitchen table and started going through the archive to find his lost place.

When Sean got out of the bathroom he threw a screaming fit. He'd been USING the computer (even though he hadn't been able to come up with anything specific to do with it), and Ian taking it away from him was unfair. So I offered my laptop to Ian so he could read Order Of the Stick on that, and gave the eeepc back to Sean. At which point Sean went into a full blown crying tantrum, and his mother called him on it: he hadn't wanted the computer to use it, he'd wanted it to keep it away from Ian, because that was the way to "beat" him. When having it didn't stop Ian from web surfing, he didn't want it anymore.

This was by no means the only example of this, it was just the most obvious one. He did it to Carrie too, and it was an important part of the "in crowd/out crowd" dynamics of his circle of friends. My guess is that Sean's real complaint about Ian's Xbox was that it was something Sean couldn't take away from him. (During one proto-tantrum I asked Sean if he wanted an Xbox of his own for his birthday in November so they could play games through the network, and he went "Noooooooo!" and started crying because I just didn't understand how UNFAIR it was that Ian had an Xbox that he wouldn't let Sean use.)

It's possible Sean will do quite well in life. I'd point him at a business degree, and see if he makes it as an executive. The point is, I think I understand my other three niecephews, at least well enough to relate to them, but I just don't understand him. He's like my brother, the one who went into sales, which is a profession I also just can't wrap my head around. (I know what they do, and why, but I've never understood how.)

July 20, 2009

It is morning, and it is Wisconsin, and I have locked myself out of my car in yet another McDonald's parking lot. Waved at my keys in the ignition so they didn't get lonely. Called the American Autoduel Association and they're sending somebody out to let me back in. Attempting to enjoy a Sausage Egg and Grease McGristle in the meantime. I remain confused by the concept of "hash browns". Why would anyone do that to a poor defenseless potato?

Ooh, there's internet here!

July 20, 2009

So not-sword-camp was fun. (Officially this was the "Summer Weapons Retreat 2009", since Sal broke up with Heather and took Aegis with him when he moved out, and Heather's new organization is Polaris. Remember the kitten not-george back in Pittsburgh? Yeah, like that.)

Very long and tiring. I have many bruises. It started with a blow to the head that had me out most of Wednesday, then I screwed up my right leg on Thursday by lunging over uneven ground, then I landed on my right wrist on Friday and that was it for the rest of the day. Saturday was sort of cumulative bruising and fatigue...

I still don't know what to do after legging opponents (getting under the shield to stab opponents in the leg so they can't stand anymore), but I'm getting fairly good at that and apparently I flail memorably. I'm now a danger to myself and others on the battlefield, which is progress. (For example, I got Eric with a beautiful shot to the throat that would have been truly spectacular if it had been _intentional_, instead of just the result of keeping my sword between him and me while I retreated wildly and having him misjudge a flail and walk right into it. It also would have been caught on tape if the recorder hadn't been pointing at the other half of the room at the time, but oh well.)

As far as I can tell I'm at a strange skill level where I'm almost as likely to take out skilled as unskilled opponents, simply because _nobody_ has any idea what I'm doing, including me.

Camp wrapped up yesterday, and we all saw "Harry Potter and the Half-Caf double with a twist, flavor shot, and sprinkles", which Fade called "The Empire Strikes Back" of the series. (Quite good, not remotely standalone.) They should probably have split it into two movies, like they're doing with the last one. The casting is truly impressive, how could they have known that Draco would grow up to look like Grand Moff Tarkin's nephew?

Tracy couldn't make it to the movie last night because she had to work a night shift, but Fade and I managed to meet her for breakfast this morning. She remains quite possibly the coolest person in the entire state of Michigan. Then I dropped Fade off at the airport and started driving west towards my sister's place in Minnesota.

I type this from a McDonald's in Chicago, largely because I don't want to go back out and drive through any more of Chicago just now. (It's 9 pm and it's still rush hour. That's INSANE.) I've managed to avoid paying a toll on any of the roads so far, through sheer perverse determination, but apparently I have to follow 94 to 41, take that to Rosencrantz, then get on Guildenstern or something and then some other road I don't remember to bypass a gratuitous little toll section they put in there JUST TO MESS WITH YOU.

There are six wireless networks visible from this McDonald's, all requiring WEP keys. I still haven't found my bluetooth dongle to use my cell phone internet, and didn't make it to Fry's to buy a new one on the way out. Oh well.

Only fired up my laptop twice all week, during the quick trips to Wendy's when I caught up on twitter and downloaded my email.

July 13, 2009

Off to Michigan this morning, so no time for a real blog entry.

Instead, here's the big long rambling stream of consciousness cover letter I sent to the Google guys in London, just before driving off to be incommunicado for 3 days.

I resisted, repeatedly, saying "I assume you're going to Google all this anyway, you being you, but here are a few high points so you don't have to spend quite so much time fighting off Sturgeon's Law". It's true, but apologizing for writing them the cover letter seems counterproductive, somehow. Their web form _asked_ for one...

I also tried to resist mentioning that I know it would take twice as long to edit this mess down into something coherent than it did to write it, but it's now 5am and I have to hit the road in a few hours. Coherence is sort of an optional extra under those conditions.

To be honest, I don't really expect them to express any interest, because Google's contacted me on and off since 1999 but never really seemed to want to do anything specific. (Possibly this was because I didn't want to move to California, which put their recruiters off script. They also contacted me maybe three times about a potential Austin Google campus that never materialized, which is hard to explain as anything other than spite, but oh well.)

I don't see how me initiating the contact is actually likely to help matters, but still, it was just an email (ok, web form), I type fast, it would be fun, and it's not like sending it is worse than not sending it. Besides, having recently submitted an Armadillocon proposal bid thingy which (if I can get the guests I'd be going for, all of whom I've successfully invited to previous cons) might be able to double their membership in a year, I could see Google moving me to London instead. If I'm going to set myself up for failure, "possibly getting a cool job which would preclude this other opportunity" seems like the way to go about it.

Yes, I want contradictory things. What else is new?


Hi, I'd like to work on Google Chrome OS.

I started my professional career working on OS/2 for the PowerPC at IBM, and later did about half the installation software for OS/2 4.0. Then I bounced off Java (even spending 6 months on a strange JavaOS port to PowerPC at IBM) before switching my system to Linux in 1998. (Initially I waited for Java to follow me over to Linux, but as I learned more of Sun's internal and external politics I switched to a combination of C, Python, and bash for most of my heavy lifting, with excursions into other languages as necessary. I really need to get up to speed on Python 3.0, but currently I'm learning Lua instead.)

In the Linux world I wandered into embedded development more or less by accident. In part it's because I break everything, and wind up debugging down into the low layers of the system anyway. (It's not just that I've broken simple commands like "echo" and "cat", but I've had other bugs that seemed to be in those commands but turned out to be in procfs and the Linux tty layer and such. And compiler bugs. And library bugs. Among mountains of bugs in my own code, of course.)

I also have a tropism for simple code: I started programming on a Commodore 64 when I was 12 and still remember how much you can fit in 38911 basic bytes free. I don't mind using more, but you should get a decent bang for the byte and most software really doesn't. After I found out the gnu version of "cat" was 833 lines and the busybox version was 65 (half of which was the license notice), I wound up doing enough work on busybox to accidentally become the project's maintainer for a couple years. I also extended the sucker to work as part of a development environment, and replaced all the gnu tools of "Linux From Scratch" except the compiler toolchain itself with BusyBox and uClibc. (The smallest self-bootstrapping Linux system is seven packages: linux, busybox, uClibc, gcc, binutils, gmake, and unfortunately bash 2.x although busybox ash might finally be almost able to replace that now. Someday I hope to get that down to busybox, uClibc, linux, and probably tinycc, but at least busybox and tinycc need serious work to make that happen.)

What originally attracted me to Chrome was that it was based on webkit, which was based on the hacked up version of Konqueror that Apple did for Safari. Konqueror was my main browser for many years (until I had to leave KDE behind to avoid KDE 4 a few months ago). It's about 100k lines of code, and Firefox is millions. Even though it's in C++, it's simple. (Ok, that and the Scott McCloud comic. I still can't believe the Penguicon 3 guys didn't invite him to be a Guest of Honor after Neil Gaiman recommended our event to him and vice versa. Sigh.)

I've done lots of other things: I co-founded two different "combination Linux expo and Science Fiction Convention" events (Penguicon and Linucon). For Linucon I even implemented an online registration system (credit card transactions and a very simple registration database) in python, which might actually be relevant here. I do most of my cgi in Python given a choice, because filling out a dictionary and then doing sys.stdout.write("""lots of html""" % thedictionary) is just so easy, and I could actually get the web people to look directly at my code without their eyes rolling up into their heads.

I wrote a stock market investment column for The Motley Fool for 3 years, still not entirely sure why. My favorite series is linked from the end of (with a mirror of the rest of the stuff I wrote not very well hidden under there). I actually enjoy writing documentation. I am weird. I have to learn a topic to write about it, and often I write about a topic as I'm learning it by expanding my "notes to self" into a tutorial. (I then refer to my own tutorial repeatedly.)

I used to teach night courses at the local community college, haven't found the time in a while. (My retirement plan is to get a graduate degree, make teaching my day job, and turn open source programming back into a hobby, although returning to ivory tower academia can be a bit of a gear shift after you've been doing it professionally for a while. Moore's law is often unkind to theoreticians.) In the meantime I give talks and tutorials at various conferences, some of which have been recorded and put on the web. I collected the ones I could find at .

I've done GUI programming, and admit I miss it a bit. More than one job of mine involved taking a Dreamweaver mockup page or a hypercard stack of drawings of what java applets should look like, and implementing that. (Back on OS/2 there were dialog editors, and editing IDL by hand.) Haven't had time recently, although part of the appeal of Mac and iPhone work would be its GUI nature...

I've been -> <- this close to just giving up and buying a mac for a couple years now, since sometime after I co-authored (I can even tell you what's wrong with that analysis: the ratio between high and low end of the memory range started stretching around the time of the Celeron back in the mid-90's presumably due to the 56k modem bottleneck, and it continued to stretch until today low end systems have 1/16th the memory of high end ones instead of the 1/4th they had for the first 20 years of the analysis. I missed that the first time, and it fuzzed up the timing of this transition by over a year. Eric and I started writing a follow-up paper in 2007 with an updated table, but there didn't seem to be much of a point because nobody'd really listened to the first one and what would they do about it anyway?)

My wife converted to Macintosh rather than Linux, because Ubuntu wasn't a good fit for her and Linux had nothing else on offer. I'm the one who suggested the Mac laptop, and later bought her the big shiny iMac monolith. Today I use xubuntu (with XFCE and a drop of Retsin) because KDE 4 was unusable, and a month back on Gnome reminded me why I left it years ago. But that means I'm using the third most popular desktop on the third most popular OS, which is a rounding error. As with OS/2 and the Amiga before it, I'm essentially alone in my workstation choice again.

I had such high hopes for Ubuntu, but as with Open Office and Firefox it's spent years struggling to be an also-ran and still doesn't understand _why_. Death of a thousand cuts, one of the larger ones being that "separating user interface from implementation" and delegating policy to nobody is exactly as much of a virtue as chopping up your kernel into a thousand microkernel processes and hoping you can abstract the hardware away until it disappears. Sounds great in theory, DOES NOT WORK in practice. Plus any time "shut up and show me the code" is _not_ the correct response, such as user interface issues reported by nontechnical end users, the open source development process tends to turn on itself and projects fork to death. Maybe this could be overcome with strong enough leadership from somebody with impeccable aesthetic judgement, but Linus doesn't do UI and Steve Jobs doesn't share.

My current main hobby project (when I'm not wasting time reimplementing various SUSv4 unix command line utilities from scratch, maintaining a compiler fork when I know _nothing_ about compilers, doing weekly summaries of the qemu mailing list when I'm out of my depth there too, and so on) is Firmware Linux.

Technically this means I maintain my own (yet another) Linux distro, and have for a long time now. (I started writing up the history once, see for an unfinished mess, but I didn't expect anyone but me to care. Yes, computer history is another hobby of mine, see and someday I may write a book on it if I ever get hit by a meteor made out of pure free time.)

But the real point of the project is to make cross compiling go away, by providing a native development environment for various hardware platforms which can then be run under an emulator (such as qemu).

I have a whole talk on why cross compiling A) doesn't scale B) never will, C) is largely unavoidable anyway. But there are alternatives, and the one I've pursued is doing the absolute minimum amount of cross compiling necessary to bootstrap a native development environment, then running that environment under an emulator. Instead of cross compiling (with two contexts mixing together and a ./configure stage that asks questions about the host it's building on and uses the answers for the target it's building for which is just conceptually wrong, plus libtool failing to do nothing correctly, plus...), you instead compile natively under emulation where there's just one context.

(To make the speed vaguely feasible, you can then have distcc call out to your cross compiler through the virtual network, moving some of the heavy lifting of compilation outside of the emulator and taking advantage of SMP or even clustering for part of the build, _without_ re-introducing most of the cross compiling complexity to your build process. But really, just throw hardware at the problem instead of engineering time, an 8-way server with 32 gigs of ram and a raid-0 mirrored terabyte of disk is about $3k these days, or at least that's what my friend Mark paid for ours and that was months ago...)
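For the curious, a minimal sketch of what that distcc hookup might look like. This is illustrative, not FWL's actual configuration: the port is distcc's default, and 10.0.2.2 is just qemu's standard user-mode-network address for reaching the host from inside the guest.

```shell
# Hypothetical setup inside the emulated native environment: point distcc
# back out at the host through qemu's virtual network.
export DISTCC_HOSTS="10.0.2.2:3632"   # 10.0.2.2 = host side of qemu usermode net
export CC="distcc gcc"
# then run the build with it:        make CC="$CC" -j8
# ...while outside the emulator, distccd fronts for the cross compiler:
#                                    distccd --daemon --allow 10.0.2.15
```

The point is that only raw compilation escapes the emulator; configure, preprocessing, and linking all stay in the single native context, so none of the cross compiling confusion comes back.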

My friend Mark bootstrapped gentoo inside the native environment, so you can leverage an existing distro package management system if you want. Mostly a proof of concept so far, although we can scale that up if anybody cares. I've been mostly focusing on getting an identical environment built for x86, x86-64, arm, mips, powerpc, sh4, sparc, m68k, and all the other targets qemu supports, and then automated regression testing of each new nightly snapshot of linux, busybox, or uClibc on each of those hardware targets. (Not quite there yet, and more breaks than you'd think, because very few people have these testing environments and those who _do_ use real hardware wired up to lots of serial consoles and a big switch, which is darn expensive and hard to share between multiple developers. This is a pure software testing environment, anybody can regression test the lot of it on their own laptop.)

Public examples of my code can be found in busybox, toybox, firmware linux, a few quick and dirty build scripts at, and other places. In the past year I've poked at (and at least submitted bugfixes to) the linux kernel, uClibc, and qemu. (Probably others too, but that's what comes to mind...)

I'll stop now. Yay Chrome OS. Looks like fun.


July 12, 2009

Attended the Armadillocon concom meeting today. Dropped off the big bag of games Fade culled from her collection (which still takes up an entire bookcase to itself, she mostly got rid of ones she playtested so many times she's sick of them). The con suite person wasn't there but I talked to the guy who was, who might be her brother? Nothing new there, but I showed the flag.

Then we went out to a restaurant and they started talking about how nobody's put in a proposal to chair next year's Armadillocon, and one thing led to another and I now have about half of such a proposal written up, with guests to invite and bits of a budget and list of positions that need to be filled and so on.

This may not be entirely advisable. (It certainly wouldn't mesh with spending the next few years in England, although I doubt that will actually happen...)

July 11, 2009

Went to a condo association meeting today. Apparently, of the twelve units in the building, one is being foreclosed on (either 103 or 104, I forget) and two others (102 and 201) are for sale. The one comparable to my condo is only going for $119k instead of the $145k these places were going for two years ago, but it's a weak real estate market and that's still $30k more than I paid for mine. (If any developer wanted to buy the whole building, now would be a good time. Doubt they will.)

I also have to cough up an extra $800 or so for a "special assessment", to clean/repair/replace the siding on the building (which was disgraceful when I moved in back in 2003, and has not improved). It needs to be done, but I should really start looking for a new contract when I get back from Michigan.

July 10, 2009

Fade rescheduled her flight to be a return flight instead of a flight out, so I can visit my sister after sword camp instead of before. This means we don't leave until monday, so I have today and the weekend to work on stuff, and can attend all the things scheduled for this weekend. Yay!

I feel so much less stressed now...

The cron job thingy seems to trigger when _any_ unstable package is selected. The "none" build works fine, but busybox, uClibc, and linux unstable builds all fail at the same place (binutils getting confused about makeinfo).

This is progress: it means it's not a busybox issue. (So it's not that the unstable busybox in build/host is screwing stuff up.) It also explains why I've been having trouble reproducing it: it doesn't happen on the first build. I have no idea WHY it doesn't happen on the first build, but I have a new place to look.

July 9, 2009

Google is hiring in London, and they're hiring for people to work on Chrome OS. This seems interesting, both because I've vaguely wanted to spend time in England since I first became a Dr. Who fan in the early 80's, and because the OS itself sounds interesting. I've never particularly had the patience to jump through Google's various recruiting hoops (such as when they sprinkled puzzles all around Austin, and Mark solved 'em for fun and then never applied to the recruiting contact info at the end), but I admit this is tempting, and apparently just requires a resume.

That said, I didn't send them a resume because I have to go drive up to Minnesota when they'd be replying. Fade doesn't want to visit my sister after sword camp, so I have to do it now.

I also really wanted to go to the Condo association meeting on Saturday, but I can't. I was traveling last time they had one, too. I haven't been able to go to a condo association meeting since before we got back from Pittsburgh. The signs giving the towing information for the parking spaces got graffiti'd in the past couple days, and one of them is wrenched off the wall and lying in the flower bed. It might also be nice to ask them about the constant 3am pool parties that have been keeping Fade awake. But I can't, as I won't be here.

I can't go to the Armadillocon meeting on Sunday either. This is the first one I've known about since I tried to get involved again. I could talk to the programming guy, talk to the con suite guy I've volunteered to help out, drop off that big bag of games Fade culled from her collection, and generally meet the concom. This is the last meeting before the actual convention. Oh well, maybe next year.

Mark and I supposedly hear back about the Linux plumber's conference talk proposals on the 15th, and presumably he'll phone me if there's anything I need to do because I'll be on the road. I can't get the smashed taillight fixed on the car before I leave, because they have to order the part...

It's gone beyond bad timing into crit-fail timing. Getting a bit stressed about it.

July 8, 2009

So the cron job on securitybreach isn't working:

WARNING: `makeinfo' is missing on your system.  You should only need it if
         you modified a `.texi' or `.texinfo' file, or any other file
         indirectly affecting the aspect of the manual.  The spurious
         call might also be the consequence of using a buggy `make' (AIX,
         DU, IRIX).  You might want to install the `Texinfo' package or
         the `GNU make' package.  Grab either from any GNU archive site.
mv: cannot rename '.am11916/': No such file or directory
make[3]: *** [/home/nightly/firmware/build/temp-i686/binutils/bfd/doc/] Error 1

I don't know why it's having this problem. It only crops up with the alt-busybox build, not with stable busybox, but when I run the alt-busybox build on my laptop, or from the command line as my normal user on securitybreach, it doesn't reproduce.

It's not a specific version of busybox-git with a transient error, because it's happened three nights in a row. It's not something to do with gentoo because the one without alt-busybox builds fine. It might have something to do with being the second build to run (after the build with all stable packages), but I don't see how since I move the build directory to triage.$STAGENAME after copying out the needed stuff so any leftover debris from a previous build would be in an entirely different directory (and the build does an rm -rf on the temp directories anyway). It might have something to do with running from cron context, since it's not happening when I run it from the command line, but what would that be, tty initialization causing a busybox command to abort? Seems unlikely.
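One way to poke at the cron-context theory: cron starts jobs with a nearly empty environment, which "env -i" approximates well enough to diff against an interactive shell. The file names here are made up for illustration.

```shell
# Dump the interactive environment, then a cron-like stripped one, and
# compare. Anything only in the first file (PATH, HOME, locale vars...)
# is something the nightly build could be silently depending on.
env | sort > /tmp/interactive.env
env -i /bin/sh -c 'env | sort' > /tmp/cronlike.env
diff /tmp/cronlike.env /tmp/interactive.env || true
```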

Sigh. I have to head to my sister's soon. No way I'm getting a FWL release out before then, and my next two weeks are spoken for. This entire trip has utterly horrible timing.

I can't even debug it from the road because I can't find my bluetooth USB dongle, so can't use my cell phone internet. And of course this problem only manifests on the server, not on my laptop...

July 7, 2009

Went to Mark's yesterday and got the cron jobs working. Only the "build all stable packages with the most recent scripts" build is doing anything useful, the unstable busybox, linux, and uClibc builds are all horked for various reasons. Need to fix that.

Being upset at XFCE because middle-click on a Terminal tab closes it, and there doesn't seem to be any way to stop that. My laptop has two mouse buttons and hitting them both at the same time emulates middle click, but they're right next to each other and meant to be hit with the thumb so accidentally clicking both happens a lot, such as when I'm trying to switch tabs and accidentally close the tab I'm trying to switch to instead. I want it to stop. It never occurred to the XFCE designers that this might be a problem. (Or the Mozilla designers either. There's already a little "X" icon on every tab to close it, but they had to make another "kill my program with no warning" option. That's just brilliant.)

I think this might be a window manager option, except the window manager thing only lets you change keyboard shortcuts, and specify what double-click does. It won't let you specify what anything _else_ does.

It's very Unixy that even after fifteen minutes of Googling I still can't figure out whether fixing this problem would require a change to X11, to the window manager, to the desktop program, to the actual Terminal program, or whether it's controlled by something else entirely like the touchpad driver or something. "Delegate it until it goes away" is still the Linux desktop motto, and I still need to buy a mac.

For some inexplicable reason, the busybox developers reverted netcat to the old upstream version in 2007. This means that -f went away, -l no longer announces its dynamically allocated port number, and so on. No idea why they did this, but my plans to switch from using toybox netcat to busybox netcat have been derailed by the fact that busybox netcat can't do the "tar c blah | netcat $(netcat -l tar x)" trick anymore. They broke it for no apparent reason.

July 5, 2009

As Weird Al said, Happy Fifth of July!

Yesterday the fraternity across the street was having a big party, lots of splashing and girls screaming (they've either installed a pool or a slip and slide behind their fence), with an enthusiastic but somewhat off-key band doing covers of songs like "Sweet Home Alabama". The frat guys regularly broke out into chants of "USA, USA", which is not actually normal behavior even for them. (Ordinarily it's "CHUG, CHUG"...) Fade and I didn't immediately figure out why they were doing that, because we'd honestly forgotten what holiday it was.

Busy night last night. I was up until almost 6am, and didn't get up again until 1:30. But I got a lot of work done, and it's mostly working. And now I've got to disentangle it so I can check it in.

The overall point of the changes is to update so it can do static cross and native compilers via canadian cross. Previously could do that, but couldn't, which was wrong because really should be calling (Implementing two versions of the same thing in two different places is wrong because it's unnecessarily complex, but it's sometimes hard to figure out how to simplify it.)

So once this is in, you should be able to do:


Or even:

./ i686 && STATIC_CROSS_COMPILER_HOST=i686 ./ powerpc

So all ./ has to do is call ./ for the host first (i686 or x86_64, or heck if you really _are_ building on a powerpc host...) And then do the FORK=1 thing to build for all the _other_ non-hw targets in parallel, and then build the -hw targets.

But the changes to are the _last_ checkin in the series. The other stuff is infrastructure changes to prepare for that. So what comes first:

Lots of the work was to teach the various build stages to use $STAGE_NAME as the target directory for the build. (Mostly, which gets called repeatedly to create the static and native compilers as well as the root filesystem.) Also update check_for_base_arch and create_stage_tarball to use STAGE_NAME for the tarballs, and change the === announcement lines to use STAGE_NAME.

I also taught to check for both prerequisite compilers for the canadian cross stuff.

I removed the NATIVE_TOOLDIRS and /tools support, because it's just not being used (and updating it for the rest of this is too fiddly). The isolation purpose it serves for Linux From Scratch is done by here, so it's somewhat redundant. I replaced it with a ROOT_NODIRS flag that just controls whether or not to create the full set of directories (/usr, /tmp, /etc, /mnt, /home, /proc, /sys, and so on), or just let the install create what's actually used and leave the rest out. (Disabling that behavior is good when canadian crossing the static compilers, the result is simpler. The compiler directory doesn't have to pretend to be a complete root filesystem.)

The changes to and are still underway. (By the way, one of those files having a dash and the other not having a dash is ugly, but I'll worry about it later.) I broke up the old doforklog so the forking and the logging behavior are orthogonal. Added a maybe_fork to, and a maybe_quiet that you pipe the output through to filter out everything but the === lines. (Alas, this bit's still fiddly because I want to log the output _before_ maybe_quiet, but that's just getting the sequencing right: tee to the log first, then pipe the output to maybe_quiet.)
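The tee-before-filter sequencing looks something like this sketch. The function name maybe_quiet matches the text above, but the QUIET convention and the sample output are guesses at the script's conventions, not the actual code.

```shell
#!/bin/bash
# Pass everything through normally, but only the === announcement lines
# when QUIET is set.
maybe_quiet()
{
  if [ -z "$QUIET" ]; then cat; else grep '^==='; fi
}

LOG=/tmp/build-log.txt

# Log first so the file always gets the full output, then filter what
# reaches the screen:
RESULT="$(printf '=== native-compiler\ngcc: some build noise\n=== root-filesystem\n' |
  tee "$LOG" | QUIET=1 maybe_quiet)"
echo "$RESULT"
```

Run that way, the log file keeps all three lines while only the two === lines make it to the console.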

Rushing to get this done before heading to Mark's, so I can get the cron job installed. (Step 1 is moving it over to the user so I have a clean environment with a known hook I can hang functionality on...)

July 4, 2009

The main reason isn't calling directly is that doing the canadian cross stuff has prerequisites that need to be completed before later stages can continue. Specifically, you need the two source toolchains (host and target) built before you can canadian cross a static uClibc toolchain.

However, one possible way around that is to call for the host first, and then call for all the other targets afterwards.

Passed a dozen or so college students setting off firecrackers. They gave me one. Fun. (I meant to go down to Zilker Park to watch the big fireworks, but I thought it was starting at 9:30 instead of 9:00, and then I thought it was only a half hour instead of an hour. Should have just gone. Oh well. Next year.)

Note to self: using the "spray" instead of "stream" option on the paint stripper squirt bottle is a bad thing when standing in a niche with a lot of wind. That stuff _stings_. On the bright side, I got the graffiti outside the window at Lava Java mostly removed. Ran out before I could quite get it all.

Biking to Whataburger late in the evening, I passed an impressive broken water main (North loop just east of Lamar) with two 10 foot long cracks with water shooting up in inch-high fountains. Stayed until the fire truck arrived.

Redoing to integrate the extra features had. I've wanted to do this all along, just couldn't figure out how at first.

Unfortunately, doing this has triggered a largeish untangling of things like NATIVE_TOOLSDIR. Right now's a fairly bad time to do this, of course, because I'm meeting Mark tomorrow to get the cron job stuff installed and I have to drive up to Minnesota on the 8th (although there's a homeowner's meeting on the 11th... Timing. Grrr. I couldn't go to the LAST one because I was out of town too. Highly tempted to just bump the visit to my sister to after sword camp instead of before it, but then I'd be dragging Fade along because she's flying up and driving back with me...)

Ok: combine $CROSS and $NATIVE_ROOT into STAGE_DIR. Make and use it. Then decouple the two things NATIVE_TOOLSDIR is doing: moving the top level directory into a subdirectory (changing the dynamic linker path and such), and populating a full set of directories (etc, tmp, proc, sys, home...)

What I really need for the static cross compiler is to not do _either_ of those things. No full set of directories, and no relocating the top level directory either.

Ok, replace NATIVE_TOOLSDIR with ROOT_TOPDIR and ROOT_NODIRS. Replace $TOOLS with ROOT_OUTPUT... Sigh. Do I really want to keep the old Linux From Scratch /tools directory support? If you really want one of those, you can build it yourself. And if you do a static busybox and static toolchain, you can move it anywhere you like. (Well, modulo the boot scripts saying #!/bin/bash instead of #!/tools/bin/bash...)


Possibly ripping out more functionality than I really should, but it got too complicated. I want simple. (You can tell it's 4am, can't you? Technically this should be the July 5th entry. I go pass out now.)

July 3, 2009

Took my last antibiotic pill last night. That's round 6 for this sinus infection. The fact I had a sore throat before I'd even run out of pills isn't filling me with confidence, but oh well. Hopefully it's an unrelated (viral) cold.

Woke up tired. Took a nap for 4 hours. Woke up tired again, but this time covered in cats. So there's that.

This episode of Bleach is trying to be comedy, and none of the cultural references translate unless you're japanese. I can tell they're parodying specific things, just not what. The entire episode is designed to be an aside and not affect the main plot at all. (I think it's a special for their 50th episode.)

Hmmm, episode 52 introduced new credits again. Ah, "if you continue you may die before you reach $PLOTPOINT". It's Ichigo being addressed. Hands up everybody who remembers his limit break.

Ooh, did red haired guy pick up Ichigo's limit break? Nope, apparently not... Ah, I see, he's channeling Ichigo. Well, that works. No, it didn't. I'm starting to think a flashback montage right before an attack is a bad sign in this universe.

Do Japanese people really throw random english phrases in conversation whenever they want to emphasize something? I suppose it's a bit like english speakers with Latin and french...

So where did I leave off yesterday? I found out that the reason the "FORK=1 ./" script was failing was that formatting a dozen virtual partitions (to create the "hdb" images to mount on /home inside each qemu instance) was causing so much I/O contention that the timeout was elapsing before any of the actual tests could even start. At a guess, mke2fs has some fsync() calls in there, and having an 8-way server with 32 gigs of ram doesn't help if flushing to disk is the bottleneck.

The easy way to deal with this is to add an option to _not_ create the hdb images, because in this case they're not needed... Ok, done.

Ok, I like this president. Watching the video of Obama recording the voice for his Disney hall of presidents incarnation:

Obama: Are these like wax figures, or holograms?

Disney guy: Audio animatronics.

Obama: Animatronics.

Disney guy: Robots.

Obama: They're robots. Well that's kinda cool.

Geek president!

Hmmm... Ok, when I run "./ --extract" I want to log both the stdout and stderr output, but for I only want to _display_ the stderr output. I can merge the streams to put them into the same log ala "./thingy 2>&1 | tee blah" easily enough, but can't separate the streams again afterwards to display only one.
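One way around it is bash process substitution: tee each stream to the log separately, and only pass stderr on to the terminal. A sketch (thingy and thingy.log are stand-in names, and the streams can interleave out of order in the log since the tees are separate processes):

```shell
#!/bin/bash
# Log both stdout and stderr, but only display stderr. Each >(...)
# is a separate background tee process appending to the same log.
thingy() { echo "stdout line"; echo "stderr line" >&2; }

thingy 1> >(tee -a thingy.log > /dev/null) \
       2> >(tee -a thingy.log >&2)
sleep 1   # crude: give the background tees time to flush
```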

July 2, 2009

Oh wow. Carnegie Mellon actually has a process improvement process, with presentation slides. That's so meta it's almost recursive. (I'm reminded of Dilbert, "you can't just jump into the pre-meeting meeting cold".)

Came up in the context of a not-job-offer. (Recruiter sending me a posting to see if I'm interested in interviewing for it. I should do this thing.)

Ok, Obama's speech at the Radio and TV correspondents dinner was totally, totally upstaged by John Hodgman's follow-up.

There really, really, really should be an easier way to do this:

function genocide()
{
  local KIDS=""

  while [ $# -ne 0 ]
  do
    KIDS="$KIDS $(pgrep -P$1)"
    shift
  done

  KIDS="$(echo -n $KIDS)"
  if [ ! -z "$KIDS" ]
  then
    genocide $KIDS
    kill $KIDS
  fi
}
I suppose I should probably call it something other than "genocide", even though that's what it does to a process tree.
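For what it's worth, when you can arrange for the subtree to run in its own process group, there is an easier spelling: kill the whole group with a negative PID. A sketch, not a drop-in replacement, since it assumes you control how the subtree gets launched (setsid at spawn time):

```shell
#!/bin/bash
# Start the subtree in its own process group via setsid, then kill
# the entire group at once by passing kill a negative PID.
setsid bash -c 'sleep 30 & sleep 30 & wait' &
PID=$!
sleep 0.5                                  # let the children spawn
PGID=$(ps -o pgid= -p "$PID" | tr -d ' ')
kill -- "-$PGID"                           # whole group, one syscall
```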

On an unrelated technical note, I wonder how close wifi is to taking over? Nobody offers ethernet jacks you can plug your laptop into anymore. (Well, a few hotels and such have them left over, but nobody's installing new ones in public places.) Homes wired up for ethernet don't seem all that common either, most people just get a wireless router, plonk it next to their cable modem, and call it a day. They don't bother running cat5 through the walls.

But offices are still full of ethernet. Servers need the bandwidth (both in raw speed terms and for density). Wireless is also less attractive when you're worried about security.

But the question is, where does the money go? Who buys so much of a technology that it becomes a ubiquitous dirt cheap commodity? I don't have enough information to know what the financial incentives are for upcoming generations of technology. The existing wired stuff is already amortized: 10baseT is essentially disposable, 100baseT is dirt cheap, and Gigabit is less than a dollar per device. There's 10 gigabit in the wild, I haven't really heard much about it in the past couple years but I assume it's still doing its thing.

Usually I can predict where a technology is going over the next few years, but here it just hasn't come up in a while. I haven't seen a lot of demand. For normal consumer use 802.11g (54 megabits/second) is fast enough for half a dozen people to watch HDTV resolution video simultaneously, and that seems to be about where demand stops at the moment. It's "good enough".

Gigabit ethernet is a little over 100 megabytes per second. My laptop hard drive is doing about 60 megabytes/second. So the cheap wired consumer networking is faster than cheap consumer storage devices, and although you can get a RAID NAS cheap these days, it'll be a while before anybody outside of a data center can really make use of 10Gig ethernet.

Meanwhile I vaguely recall 802.11g is 54 megabits/second, which is a bit more than 6 megabytes per second, although possibly a router retransmitting it in between means half of it's devoted to sending it to the router and half of it's devoted to receiving it from the router; it should be possible to send it point to point but I dunno if that's what normally happens. I should run some benchmarks...

On the one hand, my laptop hard drive is almost 10 times as fast as the theoretical maximum of 802.11g. On the other hand, my extra-fast cable modem (for which I pay extra) maxes out at a megabyte and change per second, so the wireless is at least 3 times faster than that. So it's a bottleneck going between two local computers, but not going to the internet.

The upcoming 802.11n standard claims to be 600 megabits/second, with typical throughput around 144 megabits/second. (That's from wikipedia so take it with a grain of salt.) So somewhere between 100baseT and gigabit.
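All the arithmetic above is just bits-to-bytes division (ignore protocol overhead and divide by 8):

```shell
#!/bin/bash
# Back-of-envelope conversion: network speeds are quoted in megabits,
# disk speeds in megabytes, so divide by 8 (hand-waving away overhead).
mbit_to_mbyte() { echo $(($1 / 8)); }

for rate in 54 100 144 600 1000
do
  echo "$rate Mbit/s is roughly $(mbit_to_mbyte $rate) MB/s"
done
```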

When I was over at Mark's cat-sitting and grabbing the Mythbusters episodes and kitten porn anime (which has more than one episode devoted to litter box training) it was only doing a megabyte or two per second. Downloading a whole directory of episodes took an hour, while plugging in gigabit ethernet would have given me up to ten times that speed. But I just didn't bother to rustle up a cat 5 cable and find a place to plug it in; I wasn't in enough of a rush to make it worth the effort. 802.11n would make plugging in gigabit cat5 even less attractive, although I'd have to buy a new laptop and a new router to make use of it, and would that come with 10gigE for its wired option?

My question is, will wired ethernet stop getting sufficient financial incentives to keep ahead of wireless? I'm not interested in what you can do by throwing money at the problem (I could wire up my home with fiber if that was the case). Will 10gigE get commoditized to the point it displaces gigabit the same way 100baseT drove 10baseT to extinction and gigabit is now doing to 100baseT? Or will 10gigE remain a premium product costing extra, and not installed by default in new budget laptops? In which case, 802.11n will outcompete gigabit cat5 in just about all consumer installations.

Dunno. I haven't got enough information, and am not entirely sure where to go to look it up...

July 1, 2009

The xfce terminal program lets you change font size (either with the preferences toolbar button, or the edit->preferences menu entry), and when you do this it affects every open terminal. Both KDE and Gnome's terminal programs let you change this per-window.

Having a dozen different implementations of each major piece of functionality isn't much use when they're not designed to mix and match. I'm using kmail with xfce but when I click on a link it pops up abiword, not a browser. (The way you change kde file type associations is through the web browser, konqueror. No, it doesn't make any sense, but it's what you do. Unfortunately, if you install Konqueror on a non-kde desktop, it crashes when you try to bring up its preferences dialog.)

Oh, and the first time I tried to change the font size, all my open terminal windows (with multiple tabs) closed instantly:

[145829.419872] xfce4-terminal[3417] general protection ip:7f7d8a9d4e9d sp:7fff93cb7700 error:0 in[7f7d8a9b6000+11f000]


June 30, 2009

I learned a new thing about how shell scripts work! (Translation, I spent a couple hours trying to understand a really strange bug that bit me.)

In shell script, a for loop works like:

for i in one two three
do
  echo $i
done

The problem with that is if you want multiple related arguments (for i in "one 111" "two 222" "three 333"), you have to split them yourself, which is fiddly. (You can do it with awk, with cut, and a couple other fiddly ways that tend to be brittle.)

The dirty trick to get around it is to do this:

echo -e "one 111\ntwo 222\nthree 333" | while read one two
do
  echo $one $two
done

The problem comes when you try to set a variable in the loop for use after the loop, or return from a shell function inside the loop. The second half of a pipe runs in a subshell (it's a separate process), and the entire body of the loop is considered the second half of the pipe. So you're setting variables in the subshell or returning from the subshell, and that has no effect on the parent process.

That's subtle and fiddly and evil. I understand why it's doing it, but it took me by surprise anyway.
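The standard workaround (in bash, anyway) is to feed the loop through a redirect instead of a pipe, so the loop body stays in the current shell and variable assignments survive:

```shell
#!/bin/bash
# Redirecting from a process substitution instead of piping keeps the
# while loop in the current shell, so COUNT is still set afterwards.
COUNT=0
while read one two
do
  COUNT=$((COUNT+1))
done < <(echo -e "one 111\ntwo 222\nthree 333")
echo "$COUNT"   # prints 3; the pipe version would print 0
```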

Watching mythbusters season 1. Highly entertaining. (One of the episodes you can get on dvd through netflix mentioned the rocket car, but the dvds didn't actually contain that episode. Crazy, eh? It's such a chopped down subset, with the individual episodes edited differently than they actually aired. You have to go to bittorrent to get the real ones.)

June 29, 2009

When it gets to be 5:30 am, and meanings get reversed and then unreversed in the same bisect, it's hard to work out which of good or bad your next bisect should be. Screw up once and your result is useless.

Resuming the next morning isn't a whole lot of fun either. Where did I leave off? (And now you know why I keep this blog!)

For bonus points, the squashfs bug occurs intermittently. I've run a kernel with it twice and had it run to completion the first time and die with the squashfs bug the next. That's going to be fun to track down if 2.6.30 still has it once I've got the boot bug out of the way.

Tracked the bug down and posted about it on the list. It turns out that there's a new config symbol I need to switch on (which is described in the commit message). So all that work was to change a single config symbol. (Not even a code patch.)

Par for the course, pretty much. If it seems like I'm doing a huge amount of work for very little progress... It's pretty much all like this. (Pretty masochistic hobby if you stop and think about it. Oh well.)

June 28, 2009

It occurs to me that I'm primarily blogging about what happens when I'm on the computer, which probably gives a distorted account about what I do during the day. I don't blog about reading books (finished Guns, Germs, and Steel, rereading Wee Free Men) because I'm not doing it with the computer in front of me.

Several people pointed out to me in email that I can go "git archive HEAD" to export the current version. The reason that didn't work when I tried it? Because "head" doesn't work, it must be all upper case, even though "master" is lower case. (Yeah, that's git.)

I also read the man page for git bisect. (git help bisect == man git-bisect, therefore if the git man pages aren't installed, which they aren't by default, git has no help.) Anyway, it turns out there is a way to specify a skip result from the run script, which is to exit with the magic error code 125. No rhyme or reason behind it, because this is git, but it's nice that there is a way. (I'm still not using it, since it's trivial to do by shell script.)
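A sketch of what a "git bisect run" script using that magic exit code might look like. (build_kernel and boot_test are hypothetical stand-ins, stubbed out here so the skip path is visible; the actual convention is exit 0 for good, 1-124 for bad, 125 for skip.)

```shell
#!/bin/bash
# Return codes for "git bisect run": 0 = good, 1-124 = bad,
# and the magic 125 = "can't test this commit, skip it".
bisect_step() {
  build_kernel || return 125   # build break: untestable, skip
  boot_test    || return 1     # built, but the test failed: bad
  return 0                     # good
}

# Stub out the stand-ins: a broken build should map to "skip".
build_kernel() { false; }
boot_test()    { true; }

bisect_step
RC=$?
echo "bisect result code: $RC"
```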

The part that isn't trivial is that my script has so far done bad, good, bad, skip, skip, skip, skip, skip, skip, skip... Is it going to go through all 1398 remaining revisions individually? It's done it twelve times now. It takes several minutes per run; I'm watching Bleach while it does this...

Bleach continues to be weird, not that I expected different. Up to disk 12 now. Episode 46 was a giant gratuitous flashback for secondary characters, and now 47 is apparently internal politics on the part of the antagonists, and now we're back to protagonist boy Ichigo dueling with the anthropomorphic personification of his sword. (Yes, really. While werecat woman watches.) Episode 48, more internal politics with long dramatic swaths of recycled footage and the same three songs, and now the catgirl is wondering why Ichigo isn't levelling up during training (because she hasn't pushed him to limit break yet, you've got to beat him silly before he levels and the current session hasn't even drawn blood yet). Well of course they shortened the deadline, that's plot 101. And now episode 49, and it's flashback land again. "Don't worry, it's just a reconnaissance mission". Oh yeah, to quote ghostbusters "this chick is toast"...

And the video froze again... And the disk icon won't let me eject it because "mount volume" is where eject normally goes. (Why would that be related? I want to eject the unmounted volume, why does the gui not give me that option? Make it orthogonal, guys. Of course the eject button on the actual drive still doesn't work.)

I still need to buy a mac. It might also be buggy crap, but then if I complained about it there would be someone to complain to who wouldn't imply its lack of quality was somehow _my_ fault. Also, the mac has had an impressive improvement trajectory since the introduction of MacOS X, while Linux has pretty much been treading water on the UI front.

Back to Bleach. Ah, whatserface didn't die, just the red shirts. Ok, hang on, she's possessed? And slaughtering everybody. It's kind of odd, the huge power disparity among the soul reapers; the nameless ones are one hit kills. ("If we help him, what will become of his pride?" Sigh, they've gone all insanely japanese again. And he's dead because of it. Oh _now_ you're willing to kill him. Sigh.)

Ok, this bisecting isn't getting anywhere, it's gone through well over a dozen iterations of skip and every one it looks at still build breaks with:

/home/landley/firmware/thingy/build/temp-powerpc/alt-linux/arch/powerpc/include/asm/highmem.h: In function 'kmap_atomic_prot':
/home/landley/firmware/thingy/build/temp-powerpc/alt-linux/arch/powerpc/include/asm/highmem.h:98: error: implicit declaration of function 'debug_kmap_atomic'

What I probably need to do is grab the current revision it's looking at and the last known "bad" one (which reproduced the error I'm looking for, meaning _this_ error was fixed), find the fix, make a patch, and bisect between the last known "good" and "bad" applying the patch to each iteration that's horked by this bug.

This brings up another annoyance about git: "good" means the old version for the bisect and "bad" means the new version. If you're looking for where something was _fixed_ so you can apply the patch yourself, you have to call the broken one "good" and the fixed one "bad", which is not only really easy to confuse but tends to screw up the next few bisections you do, because you can't keep track of what "good" currently means when the definition keeps changing every couple of searches. (And if you answer a single question wrong during a bisect the result is useless. You have to remember to test the last one yourself to make sure it reproduces the problem you're looking for, and test all the parent commits to make sure they don't.)

Since you have to _start_ a bisect with one good and one bad, git should be able to tell from that which one means "before" and which one means "after". But that would fall under user interface issues, meaning it's the user's problem. They give you 160 bit hexadecimal hash codes you can cut and paste from window to window, that should be enough for anybody, right? After all, needing to understand how a piece of software was implemented in order to use it builds character. Sigh.

I'm sure at some point in the future the git developers will silently change all this. Knowing them, instead of autodetecting before and after they'll add a --reverse-bad-good flag you have to pass in by hand. Either way it'll be my fault for not keeping up with all the random changes they make to the user interface.

Anyway, back to _this_ bisection. I've got the last commit bisect found that manifests the problem I'm actually interested in, and I can grab any of the commits it's been stuck on which show the problem I'm _not_ interested in (but which git bisect skip is too stupid to get past), and then bisect between those two to find the patch that _fixed_ it, then apply that patch to re-test the broken versions in the original bisect and hopefully actually make _progress_.

This is another fun case of "good" and "bad" having to be redefined, since each is showing a different bug:

git bisect good 18b41f1cd537168a886c43237297692ba8d0a143 # shows build break
git bisect bad a0e0404fb06164100991cacf8e055f6b30f87cc9 # shows hang

So here, "good" means "the bug I'm trying to find the fix for in the short term", and "bad" means "the bug I'm ultimately trying to find a fix for". Wheee... I hate git.

And hey, one of the intermediate commits shows the squashfs bug, but not the build break. (It's e0cf8f045b2023b0b3f919ee93eb94345f648434 and yes, that's how git expects you to refer to it. You can use any arbitrary consecutive subset of those digits, but ensuring you're using enough to avoid collisions with any of the tens of thousands of other hash codes the tree is made up from is your problem.)

Anyway, that commit shows _NEITHER_ of the bugs I'm looking for (it instead shows a third bug unrelated to either; did I mention that the general testing level of non-x86 kernels is crap?), and that's progress! The last "good" showed the build break, so this is after that was fixed. The last "bad" showed the hang, and this is before that was introduced. So depending on what I answer here, I could track down either the fix for the build break or the introduction of the hang, but since what I want is the hang and this proves that it wasn't introduced in the range of commits covered by the build break, I can answer "good" here and change what I'm looking for (mid-bisect) to track down that hang!

Did I mention git's user interface sucks rocks and it's essentially unusable if you don't know how it was implemented?

Late. Tired. Cranky. Feeling kind of nauseous actually (15 penguin mints in one night may be a bit much, even of the chocolate ones). Finish up in the morning...

June 27, 2009

Bisected a powerpc problem and reported it upstream. Found out I'd bisected it wrong, and that the list I was supposed to report it to is A) not on but on ozlabs instead, B) unlisted in their index.

Wrote a script to automate the bisection. Somebody pointed out to me that git's had yet another random feature bolted onto the side of it in a haphazard fashion that I might have been able to use instead. Decided I didn't care. (And I decided that _before_ I found out that there's no way to indicate a "maybe" response to git bisect run, which makes it kind of useless because I've hit two unrelated bugs already in the bisect. One build breaks the kernel, the other one screws up squashfs some time after the root filesystem is mounted. The one I'm looking for is distinguishable in the kernel boot messages, before init runs.)

Ok, current Ubuntu's uselessness factor continues to rise. Attempting to suspend my laptop, I get the pop-up "Request to do policy action: multiple applications have stopped the policy action from taking place. vlc: Playing some media. vlc: Playing some media."

First, no running software should be able to stop the system from suspending. Second, there are no copies of vlc running anymore; they exited before I tried to suspend but this status _leaked_.

Right, BIG HAMMER TIME. Ooh, it' s been a while, what do I need to do... switch to a text console, become root, "echo -n mem > /sys/power/state"...

Back from suspend. (It worked.)

Ok, here's my objection. It's not that the system was just complaining about nothing, it's that a sus

The status leaking might be because doing something simple like extracting a linux tarball paralyzes the rest of the system for a minute or more at a time. (For example, in the above sentence, vi hung after the "m" in system and nothing else I typed displayed until the period after "time". I could have typed _way_ more in that gap, but I stopped because cursoring back up to fix typos three sentences back is annoying.)

This means that a bisect script which uses "git archive" to feed a tarball into my build system can't just run in the background and be ignored, it makes the rest of the system INSANELY unresponsive. Firefox also hangs for upwards of a minute at a time (not only can I not scroll the page, but if I move another window over it it won't _redraw_). And it hangs like this about every 3 minutes. On a dual core 1.7 ghz laptop with 2 gigabytes of ram.

Yes, the "Linux on the Desktop" people have been trying for 15 years now, and this is what you've come up with. Moore's Law says they've now got about 2^10 (a thousand) times as much memory, processor power, and disk space as they started out with, and this is still the best they can do. (How many years did they waste insisting that the display driver (X11) needs to run as a normal process, so normal system load can freeze your mouse pointer, making it hard to even queue up actions for the system to do when it stops playing with itself?)

June 26, 2009

An invocation I've found useful:

git archive $(git show | head -n 1 | awk '{print $2}') --prefix=linux/ | bzip2 > ~/firmware/thingy/packages/alt-linux-0.tar.bz2

It would of course be nice if "git archive" could just spit out the current version without having to be told what version that _is_, but this is git we're talking about. There's probably a way 300 lines into a seemingly unrelated man page, but you'll never figure it out yourself.

In that way it's very much in the spirit of vi or emacs. You don't learn how they work by playing around with them. You learn by grabbing a spellbook and reading through hundreds of pages of incomprehensible jargon in ancient dead languages to become worthy of discovering the mystic incantation you will NEVER figure out otherwise. And in reality, you learn about 5 functions it does and ignore the other million settings on the Sonic Screwdriver unless you're the guy who _built_ it.

And this is why I vastly prefer Mercurial.

June 25, 2009

I am way, way, way off a day schedule again.

The reason powerpc is giving me so much trouble is that too many different things have changed since the last time I had it working. The qemu version is different (largely because of the laptop hard drive failure and subsequent reinstalls), the linux kernel version I'm building (and thus trying to boot) is different, the compiler I'm building it with is different (4.2.1 instead of 4.1.2), and my build scripts have changed. When I run across any particular symptom, _any_ of that could be the cause. (Or it could be one of the other things that changed, like the busybox tools developing a subtle sed bug; been bitten by that before.)

Dug up the mailing list thread from last time a random openbios change broke powerpc, and it says svn 6657 used to work. That translates to git 2d18e637e5ec, so let's fire that up and build it verbatim... And it works. Yay.

But in this case "works" means it boots the _old_ binaries. The ones I'm currently building don't give me any output at all, which implies the serial console is horked. Hmmm...

Now I'm iterating through different versions of my build scripts (much faster) to see where I introduced whichever tool broke it. (That way I'm only testing along one axis.) The 0.9.6 release worked. The last version right before I switched to gcc 4.2.1 worked, so it wasn't the infrastructure changes since then. Commit 752 (first "expected to work" version after the compiler switchover fallout had shaken out) worked. So it's not the compiler. Now we go more slowly. Next commit (753) updated busybox version... and that worked, so it wasn't busybox screwing up the build tools. The actual kernel switchover to 2.6.30 was 754... Which doesn't build because the powerpc config needs the gratuitous new symbol to avoid build breaks. Ok, tweak that, and...

invalid/unsupported opcode: 00 - 12 - 00 (00009024) 0000938c 0

Now it's failing with a _reason_, which is an improvement. But it's definitely the kernel update that's doing it.

Of course now we're back to "two changes which can't be separated" territory; is the new kernel horked or does the config need to be changed more? (These are sort of aspects of the same question, but still...)

Now I need to bisect through kernel versions, of course. (Bisecting through things that take several minutes to build is way slower than it seems, because I switch to other windows and get distracted and don't come back promptly. It can take me an hour to go through three versions if other things are interesting enough that I forget what I was doing...)

June 24, 2009

Non-computer things today. (Ok, read through some webcomic archives.) Did rather a lot of laundry.

June 23, 2009

Ok. Dug up the release binaries of system-image-powerpc, which was known working with some previous version of qemu as of 3 months ago (albeit with some patching). The two new problems (the -hda argument actually setting /dev/hdc, and init segfaulting with the kernel trying to access memory it shouldn't) are both in the old binary, which means qemu is what changed.

Bisecting through qemu revisions that need to be patched to have any chance of booting anything sucks. Especially when different ranges of the history need different patches...

Ok, 179a2c1971 can't find _any_ hard drive with this bios, 2fbc409571 can. Looks like 513f789f6b18 might be the relevant change between those... And it is. That's where qemu switched from using hardwired layout info to querying openbios (and hence the device tree) for it. The new layout has hda/b and hdc/d swapped (the primary and secondary IDE controllers, apparently), and the panic due to accessing some invalid resource seems similar. (It's not the kernel, I'm testing a binary that used to work.)

June 22, 2009

Went to the doctor and got another cat scan (we have four at home, but the one they have is specially trained) which showed that I still have a sinus infection but A) not as bad, B) in a totally different place. I'm calling this progress! Got another 10 days of antibiotics. Feeling better already, although that's probably the placebo effect.

Went over to Mark's in the evening and fiddled with getting the nightly cron job stuff working. Played whack-a-mole with things that almost, but not quite, actually work. (I hate git. The tarballs git archive creates contain no version information, ala mercurial's .hg_archival.txt file. For years the busybox tar command complains about some weird comment record when I extract these tarballs, but when it comes to actually trying to _get_ version info, nothing.)

I asked on the busybox mailing list, and they suggested they could tweak their cron script that creates 'em to put a version string in the directory --prefix. Better than nothing, I suppose. My build scripts are carefully designed to _ignore_ this information at the moment, of course. Sigh...
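Their suggestion would look something like this (a sketch; the throwaway repo setup is just so the example runs standalone, and "project" is a stand-in name):

```shell
#!/bin/sh
# Throwaway repo purely to make the example self-contained.
cd "$(mktemp -d)" || exit 1
git init -q .
echo hello > file
git add file
git -c user.name=demo -c user.email=demo@example.com commit -qm "init"

# The actual trick: since git archive tarballs carry no version info
# of their own, bake a version string into the --prefix directory.
VERSION="$(git describe --always)"
git archive --prefix="project-$VERSION/" HEAD | gzip > project.tar.gz
```

Everything extracts into a project-$VERSION/ directory, so at least the version survives in the directory name.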

Mark had a series of videos called "Chi comes home" (and a second season called "Chi's new address") which are hard to describe. "Kitty porn" comes pretty close. Kitten porn in this case. Or perhaps "Kawaii diabetic attack". I DARE you to get through even the opening credits with a straight face, let alone a full episode. (There are 104 of those in the first season. That's episode 2. Yes, it's a series of 3 minute cartoon videos about a kitten, and it is horrible yet awesome. I have not yet shown Fade because she might go into shock.)

The first episode isn't quite representative of the series, since it's the kitten wandering away from its mother and hence three minutes of way too cute _angst_ instead of pure uncut refined cuteness like the others. Although if you want angst done inexplicably well, there's a blog telling the story of a homeless family of sims. It's a father and daughter pair of sims set up on an empty lot with no money and nothing but 2 park benches as "home", allowing Sims 3's "free will" option to dictate their behavior as much as it can (everything but aspirations and opportunity pop-ups), and writing up their story with screenshots and prose. It is FAR more emotionally involving than it has any right to be, although you really have to start at the beginning to appreciate it.

There's a weird sort of yin/yang in the above two paragraphs, now that I come to think of it.

June 22, 2009

Felt totally out of it this morning. Slept through an hour of Fade and two cats trying to wake me up. (Well, sluggished and didn't get out of bed, anyway. Only actually _slept_ through part of it.) Went out and got rather a lot of Fade's old games mailed (she's culling the herd of her gaming collection, I've got a big bag of stuff to donate to the next convention with a gaming room, too). Then we went to the Kolache factory and I smashed a taillight in my car against a dumpster backing out. (That was enough to convince Fade to drive, despite her massive aversion to driving.)

Went home meaning to nap, but by that point I was up. I work based on momentum, even when "work" isn't really an operative word. Lurch, perhaps.

This was, however, enough to get me to make a doctor's appointment for the darn sinus infection that's been screwing up my sleep. Maybe the _sixth_ round of antibiotics will do something useful.

Got the perl removal patches resubmitted for 2.6.31. (It's technically still the merge window, let's see if they make it in this time.)

Even though the changes from 2.6.29 were one minor cosmetic tweak and rediffing so the offsets matched up, it took hours to do this because I'd totally forgotten how (and I had to write up new descriptions). My own script seems to have gotten lost sometime in the past 3 months, so I had to write a new one. (It puts a patch in the canonical format for linux-kernel. Yes, it has my name and email hardwired into it, it's my script.)

Now at Starbucks. Still bisecting the qemu powerpc stuff. Building qemu is slow...

Hey, my daily google alert thingy just pointed me at my Firmware Linux page on the impactlinux website. Um, yay?

So the bisect narrowed down to 9d479c119b42b8a548f8d79a8e5a1c1ce2932d91, which was the openbios update that made the sucker start booting again, but which was _after_ whichever change swapped hda and hdc. (And presumably whatever caused the powerpc image to start panicking with a null pointer dereference right after userspace got control even when you work around that.)

It's possible the panic is a kernel change and not a qemu change, but lemme try 0.10.1 with the other openbios and make sure I can reproduce the problem here before trying to bisect the kernel build.

Mark's epic adventure in iPhone recovery made slashdot, and from there has been picked up by several other sites. If you're going to carry around an orwellian tracking device with an ambient audio pickup (hello, speakerphone mode and no LED to tell you when the microphone's powered up?) you might as well get some use out of it.

Speaking of slashdot, I see that blu-ray still sucks. Apparently 9% of Americans own a blu-ray player, but 11% have HD-DVD (a format that's been dead for a year). The really odd part of that survey is that only 7% of respondents said they owned a blu-ray player; apparently the other 2% didn't even realize (or care) their PS3 plays the things.

Of course 83% of the population owns some kind of DVD player, so conventional DVD sold 6 times as many disks as blu-ray last year, and that's with the blu-ray people rebuying all their _old_ videos in blu-ray reissues.

Yeah, I stand by what I said the last few times I mentioned this topic: digital video files trump this. The CD doesn't matter much once you've got an ipod or similar, and attempts to introduce a new physical media music format wouldn't get much attention these days. Video is at most 10 years behind audio, the writing's been on the wall for _years_. Even DVD barely made it in time.

(A digital disk you could burn 250 gigabytes of arbitrary files on for under $1 might be interesting, but by the time any of the formats supply that capacity or that cheapness, flash will have won.)

June 21, 2009

Apparently Kwaj has a web page now. Doesn't say anything about visiting, but it shows pictures of George Seitz middle school and such. (And says it handles kindergarten and first grade now. Ivy closed? And goes up to 6th grade, when it used to only go up to 4th and then you went to the high school. I'm trying to figure out if I just don't _remember_ Macy's west, but I think it wasn't there yet...)

Last night my laptop did one of those "everything freezes but the mouse pointer" things so I had to power cycle my laptop and lost the debugging context for powerpc. (Ubuntu 9.04 is like Windows 95 when it comes to reliability. Oh well.)

This morning (up before noon!) I'm rebuilding powerpc with gcc 4.1.2, and if the behavior differs I'll build the various packages with each toolchain and see _which_ package is having a behavior difference. (It's also possible that it stopped working when I reinstalled Linux at the start of the month, which means the qemu it's using now and the one it used to have are very different. This is really to see which of those two ratholes to go down.)

And it's the qemu one. Ok, fire up git... And I have to go --disable-werror now because the Werror disease has spread to qemu. (Different gcc versions produce different warnings. Differently patched distros produce different warnings, such as not checking the return value of asprintf(). Apparently that's a warning now. So is ignoring the return value of chdir("/"). Maybe that's an Ubuntu thing, I dunno, but in general -Werror is _useless_ these days.)

Repairing my desktop again while it builds. I don't know why some of the things I add to the xfce taskbar survive reboots, but others don't. This is the... 5th time I've added the CPU governor plugin, I think? (The battery icon doesn't show you current CPU speed, or let you change it, which is one of the main ways to control battery life. You have to go dig up a separate control for that.)

I suppose I don't really mind having a separate little fiddly plugin for everything. What is it about volume controls that Linux desktop developers have such trouble with? The KDE one was good: that had a little speaker in the tool bar that gave you a slider pop-up when you clicked on it, and went away when you clicked on anything else. The Gnome one had a slider pop-up that never went away unless you explicitly clicked on the speaker icon a second time to dismiss it, and displayed above everything else. That was annoying.

The Xfce one doesn't have a pop-up, instead it creates a new window every time you click on it. Did I mention that creating a new window takes about 5 seconds, during which you have no acknowledgement that anything is happening until the new window suddenly appears? Did I mention that you can have a half-dozen of these old volume windows lying around if they fall to the back of the z-order instead of getting closed? Did I mention they take up almost 1/4 of the screen for what is essentially a single control slider? Yeah, I know. Oh well.

I want the xfce general desktop with the kde volume control. Even if I could mix and match them (they don't seem to have a common docking API), the KDE thing would suck in 100 megabytes of shared libraries to twiddle the volume. The "let 1000 flowers bloom" approach only works if there's some way to integrate the result, and after 20 years of X-based GUI development we've got nothing in that department.

Ok, so the first problem is that -hda is showing up as /dev/hdc. That's odd. Is this because I'm using the -git version of qemu? I think I was using the 0.10.1 release before, lemme try building that...

Great, 0.10.1 dies in the bios (I traced this down once; doing so sucked), and in current -git the bios is fine but the emulated hardware has at least two obvious things wrong with it. Maybe I can git bisect a sweet spot where everything worked...

Meanwhile, upgraded the kernel to 2.6.30. The perl removal patches are unchanged except for applying at an offset now. I am officially nervous about this, but can respin and resubmit them Monday morning.

June 20, 2009

Biked to The Donalds, smelled the usability upon stepping inside, went across the street to Wendy's where I can breathe. I need to find a good regular work environment, I miss Metro and at home I'm mobbed by cats. (After about a half hour, a homeless guy wandered through Wendy's and hovered over me for several minutes, possibly watching the Bleach DVD over my shoulder, but for some reason despite being visibly filthy to the point of looking shiny and sticky, he didn't have a noticeable smell. Neither did Expletive Woman half an hour later. *shrug*)

So, the build server is reporting that armv5l, armv6l, i586, i686, mips, mipsel, and x86_64 are passing. This leaves m68k, powerpc, powerpc-440fp, sh4, and sparc reporting failure.

Sparc's been broken forever, and m68k isn't supported by qemu yet. I checked sh4 by hand and the problem is it's so slow the new timeout logic is triggering (which is really a minor qemu problem, very slow emulator for that target). Run by hand with no timeout, it works. It just takes over a minute to boot up and compile "hello world". (It seems to be some kind of timeout in the emulated hardware, actually. Once hello world is built, it runs essentially instantly.)
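The timeout-enforcement pattern in question can be sketched in shell. (A hedged sketch, not the actual FWL script: the function name is mine, and newer coreutils ships a `timeout` command that does the same job.)

```shell
# Sketch: run a command, kill it if it runs longer than $1 seconds.
# coreutils now provides "timeout SECONDS cmd..." for exactly this.
run_with_timeout()
{
  local limit="$1"
  shift
  "$@" &                                        # launch the command
  local pid=$!
  ( sleep "$limit"; kill "$pid" 2>/dev/null ) & # background watchdog
  local watchdog=$!
  wait "$pid"                                   # collect exit status
  local rc=$?
  kill "$watchdog" 2>/dev/null                  # cancel the watchdog
  return "$rc"
}
```

Then something like `run_with_timeout 120 qemu-system-sh4 ...` would let a slow emulator finish while still reaping a hung one.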

The powerpc target seems to be building fine, it just won't run. Or more accurately, the kernel boots but it can't find the hard drive to mount the root filesystem. This sounds familiar. But how did changing the _compiler_ break this? (Build break due to pointless extra error checking rejecting a construct that used to compile just fine, I can at least understand. But this is a change in _behavior_.)

Hey, the sound just died on my DVD, 17 minutes into playing it. Backing up a bit didn't help, fiddling with the volume control didn't help, killing and relaunching VLC didn't help... Sigh. Time to invoke knowledge a non-technical end user wouldn't have: dmesg has no clues, check and see if the youtube videos I predownloaded have audio... And the flash plugin rectangles are grey and unresponsive... but the netscape plugin isn't dead. Ok, killall npviewer.bin, relaunch vlc... yup, and the sound is back.

When a single failed program can stop any other program from producing sound (but the volume and mute controls insist everything is fine), it's not really a "mixer", is it?

Hey, after half an episode of running around Ichigo got stabbed and is near death. Finally. That means he gets to use his limit break, which is to level up. This is why he constantly takes on purples. (And now a dream sequence training montage on the importance of levelling your equipment and not just your character. Yes, I'm up to disk 10.)

Beautiful, now most of an hour later the VIDEO died. (Frozen and not repainting; when I drag the window off the edge of the screen and back, the obscured portion comes back solid black.) But the sound is continuing fine! I'm actually giggling from the MST3K-style horribleness of this crap. I mean honestly, it's staging its failures to cycle through every aspect of its functionality. Luckily a simple restart of VLC fixed that one...

June 19, 2009

Slept horribly last night. My sinus infection is making me tired all the time (it's like swelling in my head is cutting off circulation or something), but when I lie down it hurts so much it wakes me up. I've been through a half-dozen courses of antibiotics on this thing already over the past couple years, but it looks like I need to go back and be insistent about it because this is ridiculous.

Biked out to Whataburger for exercise shortly before midnight. Apparently when my bike got hit by a car, the seat broke so the little metal tab holding the front down came unhooked, and now it flips up to the back, which is disconcerting when you're sitting on it and lean back slightly. Grabbed a pair of pliers and got it semi-fixed for now, but I need to take it back to the bike shop to get something more permanent.

Counted about 15 graffiti tags on the ride, pondered taking a spray bottle of paint stripper, gloves, and a scrub brush with me in a plastic bag next time, but that would pretty much take up the night because I'm not really _looking_ for them. I'm sure I'd find a lot more if I was. (I'm also slightly concerned that if the spray bottle leaked the paint stripper might eat through the plastic bag and get on other stuff in my backpack, such as my laptop. I don't THINK it would, but...)

Got here and realized I left my little bluetooth USB key at home. (Yes, this laptop still doesn't have built-in bluetooth. Insert $NEEDTOBUYAMAC rant, I know.) So no internet at the moment. My plan to start posting on livejournal again doesn't mean I'd stop posting here, just that the longer coherent-ish pieces like the history things should have a place I can link to 'em instead of buried in a long file with #anchor tags.

Right before I left home, I fought with my laptop for 20 minutes to get it to suspend so I could head out. It kept timing out the freezer after 20 seconds and booting back up to the desktop, first complaining that the nspluginviewer.bin wouldn't freeze. (Well, actually some white text flashed by on the display too quickly to read it, and then gave me the desktop back with no explanation. As with any user interface issue on Linux you have to know to fire up a shell prompt and run dmesg.)

So I figured out what dmesg was trying to say (the multi-page stack dumps didn't actually help here, they were just a distraction from the list of processes it couldn't suspend at the end) and killed all instances of nspluginviewer. Next it complained that cifsd wouldn't freeze, so I tracked down a dead samba mount that hadn't prevented it from freezing when I was at Mark's place earlier today, but was suddenly a problem now. (At a guess, the samba thread only blocks uninterruptibly when it doesn't hear back from the server, so when I was behind Mark's firewall it happily suspended because the server said it was ok, but when I wasn't it wouldn't. I gloss over 5 minutes of trying to get it to umount when one of my open terminal tabs had a shell prompt open with the current directory under that mount point. Well, actually the terminal tab had a root prompt at /, but if you exited _that_ the parent process was a shell prompt with its current directory under the /mnt in question. Linux: smell the usability.)

Fade has suggested that rather than buying my sister a new laptop, we just send her Reepicheep the eeepc, freeing Fade up to buy an iPhone. She's got the eeepc for school but hasn't actually needed a laptop in any of her classes (literature major), and has wandered back to using her mac laptop the rest of the time she's not on The Monolith.

The fact that there exists a "Sims 3 for the iPhone" (which she's been told is "the best Tamagotchi ever") might be a factor in Fade's decision. I suspect she'd plead the 5th if I asked.

Her buying an iPhone would continue the trend of Fade having way cooler hardware than I do. She gets all the new tech toys, while my laptop is on its second set of ram and third hard drive. (Speaking of which, I suspect that the "Ubuntu eats hard drives" issue is not limited to the hitachi drive the laptop came with. I should set the hdparm -B 255 thing in the init scripts again. Er, make that the acpi resume scripts, since just putting it in the init scripts isn't enough. SMELL THE USABILITY.)

Next time I'm connected to the internet I should probably track down and install the tool to tell me whether or not Linux is slowly destroying my hardware again. I think it was smartctl or some such? I've upgraded ubuntu three times since this issue first surfaced, but it's not like I can expect Ubuntu to actually fix something as trivial as a literally hardware-destroying bug given a mere 2 years to do so. (I suppose on twitter this would be "#smelltheusability".)

I'm gradually wandering back into BusyBox development, although this currently consists of posting on the list a lot and getting into arguments with the other developers. (Yeah, par for the course I know.)

Some prominent TODO items I have, off the top of my head:

There are 8 zillion other things on the todo list (biking here was "get exercise"), but those are the "might feel like working on them if I sit down to do something now" set.

So, attacking the FWL broken architectures. x86, x86_64, and mips (but not mipsel) are working. Let's see what's wrong with armv5l, that's usually a fairly reliable one:

  CC      networking/dnsd.o
networking/dnsd.c: In function 'process_packet':
networking/dnsd.c:428: error: lvalue required as unary '&' operand
networking/dnsd.c:430: error: lvalue required as unary '&' operand

Oddly enough, I'm pretty sure that's _not_ the same error that happened on securitybreach (the 8-way gentoo server), but let's get it working here first before debugging it there.

Step 1, try to build busybox from the command line with the cross compiler. So:

cd ~/busybox/git
mkdir ../temp
git archive master | tar xvC ../temp
make allyesconfig KCONFIG_ALLCONFIG=~/firmware/firmware/sources/trimconfig-busybox
PATH=~/firmware/firmware/build/cross-compiler-armv5l/bin:$PATH make CROSS_COMPILE=armv5l-

And it... built to the end just fine. Ok, what I built was more or less current -git, rather than the release version. (I only have one outstanding patch, for pgrep, and that's almost certainly not relevant here.) So maybe they fixed it, and maybe this just isn't reproducing it. (I note that git tag -l doesn't even show a 1_14_0 tag, the newest is 1_13_4, but I think that's an artifact of the svn->git conversion. I should ask on the list.) So, try again with the busybox 1.14.0 tarball the build was using, and...

Yes, it reproduces the error, and the git version of busybox doesn't. Do a diff between the git networking/dnsd.c and the one in 1.14.0 and... they're identical. So something in some header changed. The error is happening on a call to move_to_unaligned32(), and that's defined in include/platform.h so let's see what the diff is between those two...

And the file is _full_ of unnecessary whitespace changes. BRILLIANT. Ok, try again with diff -b...

Well, that explains why x86 and x86_64 work:

#if defined(i386) || defined(__x86_64__)

Now I'm wondering why mips worked. (Presumably the guts of gcc 4.2.1 didn't complain about the lvalue thing on mips.) Which brings up the question of why mipsel _didn't_ work... yup, it's complaining about the same lvalue thing.

Ok, all architectures brought down by the same busybox bug, which seems to be fixed in source control. It's git 3be23086 in the busybox repository thing, applied on April 17th. (I note that "git blame" is kind of useless if something else, like a pointless whitespace change, has touched the file since. With mercurial you can subtract one from the commit number you want to see before and do an "hg annotate -r commit filename". But you can't do previous/next math on git commit hashes. Smell the usability.)
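(For what it's worth, git does have partial workarounds here even without hash arithmetic: a trailing ^ names a commit's parent, and "git blame -w" skips whitespace-only changes. A hedged sketch against a throwaway repo:)

```shell
# Sketch: HEAD^ (or <hash>^) addresses a commit's parent without doing
# math on the hash, and "git blame -w" ignores whitespace-only churn.
dir=$(mktemp -d) && cd "$dir"
git init -q repo && cd repo
git config user.email you@example.com   # throwaway identity
git config user.name you
echo 'int x;' > f.c
git add f.c && git commit -qm 'add f.c'
echo 'int  x;' > f.c                    # whitespace-only change
git commit -qam 'whitespace churn'
git rev-parse HEAD^ > /dev/null && echo parent-ok
# With -w, blame attributes the line to the original commit:
git blame -w --line-porcelain f.c | grep -q 'summary add f.c' && echo blame-ok
```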

June 18, 2009

It's possible that twitter is reducing my incentive to blog. But where else do you learn that Adam Savage of Mythbusters appeared in a Billy Joel video when he (Adam) was 16?

My problem with pkill's weird error message was that it goes through the normal sources/ setup and thus sets the $PATH to use the build/host tools if they exist, meaning it was the _busybox_ version of pkill spitting out that error message, and that indeed didn't implement -P. Went and implemented it, then caught up on the busybox list (since the start of the month, anyway).

Now I'm plotting and scheming how to migrate busybox to the toybox infrastructure for early setup and option parsing, since if I'm going to do significantly more work on busybox I don't want to _know_ there's a better way to do stuff and not use it.

Toybox let me work out some significantly better infrastructure for doing a swiss-army-knife program, and proved that whatever claim Bruce Fscking Perens may have had to early Busybox no longer matters (even if Red Hat's Nash didn't do exactly the same thing, even if he hadn't clearly gotten the idea from gzip without attribution, and even if there wasn't any of his code left by the time I bothered to check; well, he'd pissed me off, but he's good at that). Anyway, toybox proved that I can, and did, reimplement at least that much from scratch. (And properly from scratch rather than merely gluing together applets other people wrote.) So I no longer need to worry about that Schlemiel.

And without the _aversion_ of Bruce Disease, doing toybox as a separate project doesn't make all that much sense. BusyBox has a ten year headstart, a couple dozen reasonably active developers, and is easy to incrementally improve.

That said, I did a lot of good work in toybox, and a lot of what it _does_ do it does better than busybox. I want to fold that back into busybox, and that's a pretty big task in and of itself.

That said, I need to get the perl removal patches for the linux kernel updated to apply to 2.6.30 and submitted during the current merge window (which is a week old already). And before I do _that_ I need to test it on all the FWL arches, which means I need to figure out why only x86, x86_64, and mips are currently building with gcc 4.2.1 and fix it. And to do _that_ I needed the build to stop hanging on sparc, which is why I needed pkill to work to asynchronously enforce a timeout on qemu-system-sparc.

Got a bit frustrated about the stack getting longer as new tangents kept inserting themselves in a "getting tomato juice to Colonel Potter" fashion. (MASH reference, don't worry about it. "Swallowed the spider to catch the fly" would also work.) But this has been my normal working methodology for as long as I've been working on Firmware Linux. (Since my initial research for it involved pestering people in person at Atlanta Linux Showcase in 1999, you could say I've been doing it for a decade now.)

June 15, 2009

Huh, that'll probably confuse people's rss readers. Yesterday's entry was mis-dated (today's the 15th) and I just fixed it.

So yesterday I got x86_64 building all the canadian stuff with gcc 4.2.1, and checked it in. Then I told the 8-way server to build all the targets and went to bed, and today I find that half the architectures went boing.

Sigh. I _really_ need to get linux-2.6.30 building today so I can update whatever may have changed in the perl removal patches and submit that upstream, but this is being stroppy. Armv5l died with this during the cross-static build of uClibc:

libc/stdio/_READ.c: In function '__stdio_READ':
libc/stdio/_READ.c:39: error: 'LONG_MAX' undeclared (first use in this function)
libc/stdio/_READ.c:39: error: (Each undeclared identifier is reported only once
libc/stdio/_READ.c:39: error: for each function it appears in.)
make: *** [libc/stdio/_READ.os] Error 1

But usr/include/limits.h has it? So what exactly is it complaining about?

But first, I need to get it to kill its child processes. I want a "kill --recursive 12345" that takes out all the child processes and their children and so on, but apparently I've hit yet another thing that the standard unix tools never bothered to implement.

I am so _sick_ of fighting the shell. The sparc target hangs when you try to test it, so the build gets hung up on that one. I can run a background process that does "kill $(jobs -p)" after a timeout, or does a "pkill -P $$", I can set a trap "pkill -P $(jobs -p)" EXIT, and so on. But so far, nothing but "killall qemu-system-sparc" actually kills the qemu instance.

I'm running a script that's running a script that's calling qemu, and the child processes don't get killed when the parent process goes away. After the qemu instance hangs it doesn't get killed when its parent shell script gets killed by a signal, and the traps I'm putting in won't propagate down to it. (The buggy error messages don't help: "pkill -P pid" complains "no signal P" when the pid in question has already exited, which is _not_ helpful...)
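A hand-rolled recursive kill would look something like this. (A sketch: the function name is my invention, and it assumes the real procps pgrep with a working -P.)

```shell
# Sketch: kill a process and all its descendants, children first,
# since "pkill -P" only reaches the direct children.
killtree()
{
  local child
  for child in $(pgrep -P "$1"); do
    killtree "$child"
  done
  kill "$1" 2>/dev/null
}
```

Then `killtree $(jobs -p)` in the trap would take out the hung qemu along with the intermediate scripts.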

Such crap. Such total crap. This is not the bug I'm trying to fix, I keep hitting OTHER bugs that won't _let_ me get to the bug I'm trying to fix. And then I hit other bugs when trying to debug that.

Linux is buggier now than it was under Red Hat 9. Why am I bothering?

June 14, 2009

Friday I bought Mark a steak dinner as a bribe so he'd get my cell phone internet working again. I also copied zillions of bittorrented files he had lying around, partly because there's no other way to get _all_ the episodes of mythbusters or MST3K: the first only put out a random subset of the episodes on DVD and the second had licenses expire between the initial broadcast and the invention of DVD. (I have netflix. I'd happily get this stuff through there, the way I'm doing with bleach. I just can't. And since Hulu stopped being fun due to stupid watermarking and ads over the video, I'm fine with that.)

I spent the whole of yesterday watching Fade play "Sims 3" and reading through the archives of A Girl and her Fed.

Today I made it out of the house by 5pm to head out with my laptop and finally get some work done. The 2.6.30 kernel is out (and yes I typed "0.9.30" on my first attempt, but at the rate uClibc gets releases out and the kernel _does_, that won't be a problem for long), so I need to try it and see what needs updating in the perl removal patches and the miniconfig stuff. (I need to fix miniconfig to open menu symbols when I switch on menu contents. Actually I need to write up a _list_ of all the projects I'd do if I had more time and energy, and hope somebody else finds them interesting and does them for me...)

Wandered to McDonalds, but barely made it in the door before two different homeless people passed me (one waiting in line for a drink refill, the other just passing the line at right angles) and I was driven back outside by the smell. It's June in Texas, not bathing or even changing clothes for several days results in _weapons_grade_ stench.

The smell wasn't a problem in previous years, I guess the facilities Rebecca told me about offering free showers to homeless people have finally been overwhelmed by the increasing numbers of them in this city. Rebecca sells flowers on Guadalupe, she's been there since the dot-com crash. Last I talked to her she had a special low-income housing apartment, but for most of that time she lived in various homeless shelters. That's what being homeless in Austin used to mean, it meant you spent your nights in a homeless shelter. But the shelters seem to have been overwhelmed by sheer numbers.

I'm aware the concept of homelessness isn't new, even to Austin. Heck, a homeless crossdressing man named "Leslie" ran for mayor the first couple elections after I arrived, and came in third. He was a fixture around town, and got free drinks on 6th street. (I met him, but never smelled him.) Back when I lived on Far West and bought the second condo next to me, I rented it out to a guy who used to be homeless but had put his life back together. (Admittedly he totally trashed it and it took $20k worth of work to make the place sellable again, but that wasn't _entirely_ his fault. Long story.) Reese was homeless for a while, and I let her live rent free with me for most of a year while she went through a 12 step program. (Last I heard she was properly medicated, employed, and getting married.)

But it used to be an issue like flooding that the city could deal with, not a problem that had overwhelmed the city's coping mechanisms. Now there's bands of homeless people camping on Guadalupe. They hang out at the south edge of the parking lot at 25th street (up to half a dozen at a time there), and another group in front of the closed "Intellectual Property" bookstore at 24th street (usually not more than 4 at a time there), and a large group of a dozen or more on the front steps of the church across from the Dobie Mall. And those are just the ones I pass regularly on my way around campus. (Oddly, the Guadalupe "drag rats" didn't used to be _homeless_. Most of them were _students_ who panhandled for extra money, and as a way of being edgy without dressing goth.)

I gave up on The Donald's and am across the street at Wendy's now. Not quite as fond of the food (their dollar menu leans very heavily towards the breaded and fried, and they don't have diet doctor pepper, but the #5 combo is decent and frostys are a nice change). But I can't breathe in the McDonalds and for some reason this Wendy's hasn't been nearly as heavily colonized by homeless people presumably looking for someplace air conditioned to wait out the heat of the day.

There used to be a homeless man called "Jennifer" at this Wendy's, all day every day for about a year. (Imagine Richard Simmons with long hair, a dress, and a pair of the fake rubber breasts british soccer players seem to find funny, although in this case they were probably implanted prosthetics rather than strap-on ones.) I stopped coming here because I couldn't work, because he NEVER STOPPED TALKING. To everyone and anyone in here, or failing that talking back to the TV (and managing to be significantly louder than it).

According to the Austin Chronicle, he died over the winter. Apparently, he was a homeless rights and transgender issues advocate constantly poking the Austin city council, so his death was newsworthy enough for the Chronicle to devote a column to it. (I didn't actually know he was homeless, just lonely. It's hard to get a Michael-Jackson level of screwed up by plastic surgery without having had money at some point, and he never had trouble buying food. I also don't remember ever smelling him, and he could get pretty close when he was talking at you.)

I also want the city council to do something, I'm just not entirely sure what, or how to go about asking them.

Homeless shelter showers apparently aren't the only part of Austin infrastructure to get overwhelmed recently. I've heard that Austin has a "graffiti unit", and Google came up with this, but I've never seen them actually do anything.

Biking here to Wendy's (less than 10 blocks) I counted around 35 separate instances of graffiti tags. And that's _after_ I decided to keep it out of the alley behind my condo. (Noticed another one there today, although I think I just missed it last week rather than it being fresh.) They're on edges of the sidewalk, lamp posts, walls, dumpsters, the back of real estate signs, the front of traffic signs... All over really. None of it "artistic", just tagging: each instance is some clueless twit needing to pee on a fire hydrant to mark territory, and being unequipped by nature to do so.

The switch from small amounts of artistic graffiti (heck, property owners commissioned it as advertising and the people who did it had _art_shows_) to large amounts of random tagging started shortly after Hurricane Katrina, back when the feds bussed several thousand displaced people here and dumped them. Perhaps that's just a coincidence, but whatever the reason it didn't used to be a _problem_. I didn't notice it at _all_ my first few years in the city, and it still wasn't significant when I bought my current condo in 2003, but now it's everywhere.

Anyway, now I'm trying to actually use the cell modem, and I'm intermittently getting this:




That was from, but I got the exact same error from a URL on, differing only in URL. At a guess, t-mobile is using some kind of funky squid-like web caching proxy thing, which is overloading. This isn't the _only_ strange web-only error I've seen, another page spontaneously lost its mime type and asked me if I wanted to download "next.php" for what was happily recognized as an HTML page when I hit reload. But this lets the cat out of the bag about WHAT the problem is: t-mobile has a piece of software trying to "help" rather than keeping its fingers out of my packets. Wheee...

I can probably work around this by tunneling to another machine and convincing my network layer (or at least Firefox) to use it as a socks/proxy thing (ssh -Nf -D). It's sad that I have to go to extra lengths to get a dumb TCP/IP network to _act_ like a dumb TCP/IP network, and just transmit packets without trying to understand their content. But that's the modern ISP for you. Oh well.

June 11, 2009

I dug up my old copy of neverwinter nights, but character generation was such a slideshow (mouse position updating about once a second, which is kind of hard to use with a touchpad) I gave up. It worked fine on my previous laptop, which also ran Linux. (This game is so old that it was back when people bothered to put out Linux versions of games. World of Warcraft even had a Linux client during its beta. Back when people thought Linux on the desktop might actually matter someday.)

Alas, without 3D working, even with the faster processor, it's just unplayable.

So back to watching "bleach" dvds and banging on FWL, at The Donald's (where the apple pies seem to have fixed themselves, dunno if it was just a bad batch, if I was consistently getting them dehydrated, or if corporate changed its mind).

The prerequisite being checked for is the existence of the root-filesystem-$ARCH directory, and that's not a particularly useful test. Sure, if the cross compiler didn't build, that shouldn't exist, but if the root filesystem build died halfway through...

The problem is, what specifically can I test for in the root filesystem directory that says it completed successfully? I _want_ to be able to pack up fairly arbitrary contents, so setting a flag file at the end of the build is exactly the kind of dependency between stages I _don't_ want them to have.
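One conventional way to sidestep this is to build under a temporary name and rename on success, so the directory's existence _is_ the completion marker and the contents stay arbitrary. A sketch (the build function here is a made-up stand-in, not anything actually in FWL):

```sh
ARCH=armv4l

# Stand-in for the real root filesystem build: it either runs to
# completion or returns failure.
build_root_filesystem() {
  mkdir -p "$1/bin" && touch "$1/bin/sh"
}

# Build under a temp name, rename only on success. A later stage can then
# test for root-filesystem-$ARCH and know the build didn't die halfway.
rm -rf "root-filesystem-$ARCH.tmp"
if build_root_filesystem "root-filesystem-$ARCH.tmp"; then
  rm -rf "root-filesystem-$ARCH"
  mv "root-filesystem-$ARCH.tmp" "root-filesystem-$ARCH"
fi
```

The rename is effectively atomic, so an interrupted build leaves only the .tmp directory behind rather than a half-populated real one.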

The script has a natural dependency: it can't build without the appropriate cross compiler. It doesn't care where that compiler came from, just that it's in the $PATH. The system images don't. I could add a flag to indicate that it _failed_, but that's a bit too non-obvious for comfort. (When it's not asynchronous, return codes work fine, which is why this hasn't come up before.)

I'm torn between trying to properly fix this (although I have no idea how) or just hacking temporary tests into the scripts (yet again) to abort at the point I want to see the output. I guess it depends on what I'm trying to fix next. Eh, bop it on the head with a brick for now, get back to beating the behavior I want out of my old enemy, gcc.

So gcc/config.cache needs to contain "gcc_cv_as_hidden=yes" and "gcc_cv_ld_hidden=yes". I could just prepopulate the cache with this, but _why_ is the test failing? (It doesn't seem to log that part.) Hmmm...
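Prepopulating the cache would look something like this (the build directory layout is illustrative, not FWL's actual one):

```sh
# Seed gcc's configure cache with the answers its visibility tests
# should have produced, before configure runs and misdetects them.
mkdir -p build/gcc
cat >> build/gcc/config.cache << 'EOF'
gcc_cv_as_hidden=yes
gcc_cv_ld_hidden=yes
EOF
```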

Ah. It decided the assembler name it's using is "". That's just BRILLIANT. Right, explicitly set AS_FOR_TARGET, which for some reason 4.1.2 didn't need but 4.2.1 does. You'd think there would be a way to just specify the darn prefix for all these tools, but no...
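Spelling the tools out explicitly ends up looking something like the following; the $CROSS prefix is a hypothetical example, and you do it per-tool since there's no single prefix knob:

```sh
# Hand the build an explicit target assembler (and linker while we're at
# it) so it can't decide the name is "". CROSS is a hypothetical prefix.
CROSS=armv4l-unknown-linux-gnu-
make AS_FOR_TARGET="${CROSS}as" LD_FOR_TARGET="${CROSS}ld"
```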

June 10, 2009

I suspect I should dust off my old livejournal account and move things like yesterday's "I got hit by a car and am grumpy, so here's why I gave up on desktop Linux" rant there. And the big long posts about how knowing computer history makes the FSF seem far less sympathetic. (Basically drama. Livejournal's good for drama. Perhaps that's why the Russian mafia bought it?)

Kenneth Hunt corrected me that Tim McInnerny played Percy in Blackadder, and Hugh Laurie didn't show up until the third series. We watched the first series through netflix a year and change back, but it's been probably a decade since I've seen the rest, which are still in the queue...

Banging away on FWL, moving everything over to gcc 4.2.1 (last GPLv2 release). Been needing to do that forever, yes it breaks stuff, trying to fix it. Need to refactor the hell out of the stage intro code to make debugging easier, though...

When did all this code become so _complicated_? It's just a darn build script. Where can I simplify...

June 9, 2009

Breakfast at Einstein's, then decided to bike south to the Chick-fil-a location near SJG to try to work off some of the calories from the con this weekend. (Making lots of peanut butter and jelly sandwiches, cutting them into quarters, and filling aluminum trays with them? Marvelous idea. A german restaurant with a live polka band indoors? Not so much.) Going down The Drag I got hit by a car that turned right out of the Schlotzky's parking lot without looking, but luckily I'd seen the "how the stuntmen do it" things (my behind-the-scenes-at-hollywood hobby came in useful) and rolled up on the hood like you're supposed to, and my laptop landed on top of me. The bike, alas, wasn't so lucky: the front tire was all wobbly afterwards, and I had to spend $60 getting it replaced. Didn't actually make it out towards Chick-fil-a until around 5:30 pm.

Reason #9437 why Linux on the Desktop sucks rocks: every couple hours, the wireless network will suddenly stop working, with no warning. Generally, it means this happened:

[15163.962304] iwl3945: Microcode SW error detected.  Restarting 0x82000008.
[15163.962326] iwl3945: Error Reply type 0x00000005 cmd REPLY_TX (0x1C) seq 0x0000 ser 0x0000004B
[15163.964337] iwl3945: Can't stop Rx DMA.

And under Ubuntu 9.04 _that_ means you're going to reboot soon. In previous Ubuntu versions, you could "modprobe -r iwl3945" and then insert the module back in, and you'd get your wireless back. But in 9.04, if you try it your kernel will panic. New and improved!

This is why I've given up on Linux on the desktop. Each new release breaks things that used to work, and this has been true for TWENTY YEARS now. On a personal level, why should I "get out and push" if the people steering wind up in a ditch or going off a cliff every single time?

Open source does not competently handle user interface issues because "shut up and show me the code" is not a useful response to aesthetic issues or usability problems. It's an inherent structural problem with our development model. Suppose a user notices something obviously STUPID, like the fact that the default action on battery drain is "shut down", so quickly suspending when you get the "system is about to shut down" pop-up just means that when you plug it into AC power and resume, you get to watch it shut down anyway. They complain about it. We filter out that kind of complaint as noise, and tell them "you fix it". Become an expert in our little area, then submit your fix to us and spend months iteratively convincing us to take it. Or else leave us alone, you useless freeloader.

These problems are built-in to our development model. Coding disputes can be handled via objective tests, benchmarks, code size metrics, and so on, which can conclusively end arguments with numbers and proofs and logic. Dealing with more abstract issues, we fork ourselves to death with a thousand competing implementations, each of which does a couple things better than anyone else, meaning it does everything else worse.

This feeds into the way "this is a horrible default that bit me" is met with "here's how you can customize your personal copy, to fix it _for you_". No attempt at a general solution is made, because maybe somebody WANTS that behavior even if most people get bit by it. Creating a usable environment requires huge customization effort, but isn't it great how customizable our system is? You can build your own desktop. You must do so, since the default environment sucks so badly you really have no alternative. But there are so many different ones to choose from, if you just spend three years studying your available options you'll eventually find that special one built just for you. Don't like Totem? Try VLC! Don't like Pidgin? Try Jabber! Don't like Thunderbird? We've got _hundreds_ of alternatives...

So it's easy to _leave_, but hard to actually get anything _fixed_. Users who can't or won't back up their reports with scientific metrics are ignored. Or those who don't regularly perform regression testing, or who aren't going to spend six months following up on an issue, or who can't swap out individual components of their system with ones they built from source (after having upgraded everything to the latest version because fixes aren't made to the old versions; you need to be exposed to all the new bugs at once to test any fixes for your old bug)...

And even if you _do_ try to fix these things, you can't keep up. Pointing /bin/sh at dash was _new_ breakage that older Ubuntu didn't have. KDE 3.x->4.x was new breakage. Losing syntax highlighting in vi and the ability to cursor around in insert mode was new breakage. Kmail getting eaten by Kontact and then its user interface changing so hugely in the most recent release I spent 2 hours finding the "sort by thread" option: new breakage. The same network hardware this laptop has had for 3 years now breaking in new and exciting ways: new breakage. The fact 3D support no longer works for my intel graphics card (it worked fine in 8.10): new breakage. (Although a 64-bit dual core 1.73 ghz processor can just about keep up with software rendering to display video, as long as the system's otherwise idle and the window size is small enough. Even then youtube stuff flickers like mad with the "hidden" under-layer of youtube navigation buttons showing through intermittently due to the compositing happening in a single live buffer, and it tends to go all slideshow at the drop of a hat anyway. Oh, and swapping windows it sometimes leaves uninitialized memory garbage on the screen for several seconds, and sometimes the event that would redraw over it _times_out_ and it's left there in the window's buffer. Even if I drag another window over it, the garbage comes back because the redraw is from a double buffer that's got uninitialized memory in it. Kmail does this all the time with its toolbar buttons.)

There's no "steady improvement". There's no indication of clear agreed-upon _goals_, just random lateral thrashing. I keep having to either abandon existing infrastructure that stops being supported, or personally go to great lengths to keep the older stuff working in newer contexts.

Because of what I do, I mostly notice this with technical things that end users only ever notice as "black magic that failed". On a technical level it's epidemic, such as the way the IDE devices became part of the SCSI layer so device ordering is now fundamentally horked in a way it didn't used to be. (SCSI throws all devices in a big bucket and gives 'em a stir, so your USB devices and your SATA devices draw from the same pool of device names, in first come first serve order. Yes really. Back under the IDE world, /dev/hdc wouldn't move without a screwdriver.) Or the emergence of unnecessary layers like the whole HAL/dbus thing, so that your web browser goes "HAL says I'm not connected to the network, so I can't let you talk to that web server you've got running on the loopback interface". (Why does it CARE? The TCP/IP layer has had a "no route to host" error for 30 years now, so if there's no network this fact should be instantly apparent when you try to use it. Adding a separate layer capable of vetoing network access is damn stupid, and whether it actually works or not is a side issue to it having no reason to _exist_ in the first place.) And don't get me started on how hideously unstable and undocumented the sysfs exports have been over the past 20 kernel releases. So on a purely technical level, the people steering Linux haven't filled me with confidence in the soundness of their judgement.

And of course there's social/process things too. The GPLv2 vs GPLv3 split didn't help anybody do anything: we're still paralyzed by infighting against our zealot arm which undermines everything the rest of us try to do, in the name of "Purity". (Similarly, I stopped cooperating with the SFLC after the FSF hijacked them.)

On the social front, the people who insist that openness is inevitable and in future all scarcity will vanish have apparently never studied history. Capitalists try to corner the market on things that were previously free _all_the_time_. Capitalism is a marvelous mechanism for managing scarcity, and in the absence of scarcity to manage, capitalism creates it artificially so it has something to manage. For example, Austin is now ringed with privately managed toll roads. I hate this, but they're still there, and they're building more. Individual instances come and go as people play whack-a-mole with the latest incarnation, but there isn't going to be some magic day when people stop TRYING to corner the market and make a killing.

But technical and social issues aside, an awful lot of the time I notice that my Linux desktop simply doesn't work at an end-user level, and then have to invoke programmer-fu to try to work around it.

My desktop is Linux. My desktop doesn't work. Most of the time I just want to USE the thing to surf the web, watch videos, read email, edit text, and it breaks repeatedly and I have to duct tape it back together to get minimal functionality out of it. (I STILL haven't gotten the bluetooth association between my laptop and cell phone set back up. On Fade's Mac, it took about five minutes starting from scratch, most of which was figuring out where in the config menus to find the appropriate settings page.)

These little "end user issues" come from everywhere in the infrastructure. The network spontaneously dying so only a reboot can fix it is a kernel issue. The 3D support not working anymore is an X11 issue. When I press the eject button on my DVD, it gives me a pop-up message 'Failed to eject "/org/freedesktop/Hal/devices/storage_model_CDRWDVD_DW224EV". Given device "/org/freedesktop/Hal/devices/storage_model_CDRWDVD_DW224EV" is not a volume or drive.' and that's HAL/dbus breakage. All of those are brand new failures in 9.04, on the exact same hardware that originally came running Ubuntu 6.10 from the factory. (It has more ram and a bigger hard drive since then, but that's it.) Installing a new version of ubuntu breaks stuff that used to work, every time.

The largest chunk of infrastructure I've had to abandon recently is KDE. I'm now on Xubuntu because the KDE developers did one of those "throw out all the progress we've made and start over" things that resulted in a KDE 4.x so hideously unusable that even Linus Torvalds (who has mocked the gnome developers for years as Just Not Getting It) has abandoned KDE and gone to Gnome. (In Kubuntu 8.10, KDE 3.x was no longer an option, and 4.x simply did not work.) I tried Gnome for a bit (couldn't stand it, and then the system's network configuration spontaneously ate itself so badly I had to reinstall to talk to the network again, and I went back to 8.04 for a while), and now I'm using XFCE instead.

This is just within Ubuntu, the only distro that's still _trying_ for the end-user desktop. I used to use Red Hat, but they abandoned the desktop and went off to the enterprise market. (When the call went out for a company to stand up and defend decss so Linux could play DVDs, they yanked MP3 playback support from Red Hat 8 and bogged off to the enterprise workstation market, discontinuing their consumer version entirely one version later.) I didn't so much abandon Red Hat as get abandoned by it. Then I used Knoppix, but while its "boot from CD" gimmick made it a great platform to demo Linux on various hardware it never actually transitioned to running from the actual hard drive at all well. So I abandoned Knoppix in favor of Ubuntu, then Kubuntu, then Xubuntu, which are all more or less from the same group. There are other distros I could use if I tire of Canonical's offerings, but that's just another manifestation of "all of these suck, so we keep trying new ones".

This means I'm now using the third place desktop (gnome, kde, xfce) on the third place operating system (windows, macosx, linux). My configuration's market share doesn't even register, and NEVER WILL. I'd say XFCE was even in _fourth_ place behind the desktop the subnotebook distros (like Easy Peasy) are using for eeepc clones, except that we lost that niche (as it scaled up past rounding error, all the new seats it added were Windows) so it's comparing one flyspeck with another and the result just doesn't matter either way.

Since I've never been able to recommend Gnome with a straight face, and can no longer recommend current KDE, I'm now back in the OS/2 position where the desktop I use isn't one I can honestly recommend to anyone else. I use a weird custom system nobody else has ever heard of, and I may never actually meet another user of this particular setup in person. I might as well still be using an Amiga.

Embedded Linux is still moderately fun, but I used to do Java under OS/2 the same way. What the code you produce runs on, and the development environment you produce it in, are two different things. Linux itself will be around as long as Java (which is the new Cobol), but the Z80 processor lasted 30 years and DOS could do just about everything embedded Linux does. Cisco still sells IOS, and Solaris and even Xenix still have their users. In a way, the embedded niche is where old technologies go to die. (And every supercomputer worth the name is a hand-rolled bespoke job.) Maybe the openmoko Google phone will stop being "technology in search of a use case" and start to matter as much as the iPhone or blackberry one of these days. I'm not holding my breath: the cell phone companies have no incentive to give up control when they can sell _ring_tones_ for profit and people buy them.

It's been "the year of the Linux desktop" every single year since 1998. We are the boy who cried wolf, and even if it did start to get interesting in future nobody will ever care, or pay any attention.

This is why I need to buy a mac laptop to do my embedded Linux development on.

June 8, 2009

We stayed an extra night at the Hypericon hotel, which meant we got an extra evening with Fade's friends from the Road to Amber mush.

I'd write a con report, but it would be about hanging out with Fade's online friends IRL, and not about the con at all: Ann is _totally_hot_ and apparently moving to Siberia for the next few years, Em is a Linux security geek whom I could have happily geeked out with at great length if either of us had wanted to talk shop instead of getting away from it all, I keep forgetting the name of the bouncy lady I fed half a tin of Penguin Mints to (I mean the thin bouncy lady, the large bouncy lady showed far less enthusiasm for them and only got 2 or 3 mints although I've forgotten her name too), the radio announcer guy was fun but _loud_ which came up when he was GM-ing the unrelated Amber LARP (I was a pseudo-NPC in charge of Arden, and helped 4 people get their victory conditions although only 3 of them kept them through to the end of the game) and I was trying to discuss secret plans with him, there were two bald guys who were both fun to hang around but I couldn't tell 'em apart (Fade insists one of them is black, but I honestly couldn't tell; no really)... Despite several threats to shave Fade's head (a plan Fade was not opposed to when she got sufficiently drunk) it didn't happen because nobody had the right equipment with them...

Part of the problem figuring out who was who (despite name tags) was that my previous vague acquaintance with these people was via the characters they play online (I'm not on the mush, but I watch over Fade's shoulder a lot), and each of them has about 3 characters, including Fade. (One reason I can remember Ann more easily is that she only has one character, although in all honesty the "totally hot" part helped too.) I did notice that all the male characters are played by women, and all the female characters are played by men. (I believe I spotted one exception all weekend, out of over a dozen characters.) Oh, and the in-game marriages don't line up with real-life marriages, and one guy was married to two or three women (by which I mean his various female characters were married to different male characters belonging to different female players)...

So yeah, people. Fun group. I now know what it must have been like for Fade to get dragged along to Ottawa Linux Symposium. Smile and nod. :)

Driving back to Austin. We hit the road around 10am, might get home by midnight.

Can't do much compiling because the laptop battery won't stand for it, but I've confirmed that the move from gcc 4.1.2 to gcc 4.2.1 is what broke the static toolchains. Which means I have no clue WHAT is wrong because too much stuff changed. Much debugging to come, but now I must drive again...

June 7, 2009

Fade went off to the Hermitage, which was Andrew Jackson's house, to see how they spun our most bigoted genocidal president to date. (Em read the book about Andrew Jackson that was covered on the daily show a few months back.)

I was vaguely interested in coming along, but slept instead. The hotel room has this MARVELOUS bed, I've been taking longish naps all weekend. It is fluffy. Didn't actually get up until almost 2, by which time the convention was winding down.

Poking at the FWL script. It's broken in several different ways, but the most fundamental one is that its debug output is horrible.

The main problem is that does things that doesn't, because doing a canadian cross requires multiple architectures, and builds one architecture through from start to finish. This means if I want to test any of the stuff that only builds, I need to test it in a different build environment that produces horrible debug output.

The horrible debug output is because it has to be able to background a lot of processes (to support FORK=1), and doing anything asynchronously makes logging and flow control fundamentally harder. Also, buildall shouldn't stop the whole build script because one architecture fails, and you have to go out of your way to make it build just _one_ architecture.

There are several ways I could address this. I could teach to handle the special case of building this extra stuff for the host architecture, which is the one instance where you don't need two architectures because the two match. I could add the option to to die after the first error. But ideally, I want to stop blanking the temp directory for later stages after earlier stages fail.

I taught the individual stages to exit when they haven't got their prerequisites, and even added an option to suppress that error. But they still blank the temp directory from, before I can make specific prerequisite tests. Moving the cleanup out of common code to solve the flow control issue would require me to duplicate it in every stage. It sucks either way, really.

The reason all this is important is that the canadian cross stuff (or an equivalent set of prerequisites) is required to do an --enable-shared compiler, which is required to build uClibc++. (It's an unpleasant set of circular dependencies, based on the fact that is a completely horrible idea. The compiler should not add shared library prerequisites to its output, especially not for a library that links against It wouldn't have been so bad if --disable-shared built libgcc_eh.a, but that would have been _smart_ and we're talking the gcc developers here.)

It also turns out that the reason I've been having such trouble with soft-float on arm is that they decided to add the soft float stuff to libgcc_eh.a, so it only gets built for --enable-shared. (I.E. the --disable-shared compilers have been gradually crippled by the gcc developers ever since the start of the 4.x series. Wheee...)
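So the configure invocation that finally cooperates ends up shaped something like this; everything except --enable-shared is an illustrative stand-in for FWL's real flags:

```sh
# --enable-shared is what gets you libgcc_s.so, and (as of 4.2.x) a
# libgcc_eh.a with the soft-float pieces in it. The target, prefix, and
# language list here are illustrative, not FWL's actual settings.
../gcc-4.2.1/configure --target=armv4l-unknown-linux-gnu \
  --prefix="$PREFIX" --enable-shared --enable-languages=c,c++
```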

June 6, 2009

The new kmail continues to deteriorate in the name of "helping". Now it's replacing _this_ with this, and I don't want it to, and there's no way to switch it off that I can find in the config menus.

I continue to be annoyed by the perceived need to replace :) and similar with a graphic of a smiley face. I await software that replaces the word "blue" with a color change and the word "house" with a picture of a house (or perhaps a picture of Hugh Laurie, depending on context. Yes, lord Percy Percy from Black Adder got a lot smarter and more sarcastic and began walking with a cane as he got older).

The hotel internet has melted down. It was having dhcp issues (the lease timeout is 4 days, well duh), each floor has its own router and it's not always recognizing returning computers... And now the whole thing's gone catatonic and you can't even assign yourself an IP and ping the router with it. (You can still associate with the access point, but there's nothing to ping.)

Luckily, I have many windows open with previously loaded material to read. So far, I'm really liking this blog, which slashdot linked me to a couple days ago.

June 5, 2009

At a convention, and what do I do? Program on my laptop, of course.

Figured out why gcc 4.2.1 was going nuts (if you pair gcc-core in one version with gcc-g++ in another version, it gets unhappy). Now building everything with that, so I can beat soft-float out for armv4.

The mercurial git->hg converter is still broken (at least with the busybox repository). Need to bug the list about it...

And once again, is going "boing". Sigh. It's sad how brittle binutils and gcc are, what I'm trying to do is fairly straightforward and I have to supply a dozen separate overrides of its assumptions to do it.

June 4, 2009

Driving to Hypericon, in Nashville Tennessee...

June 3, 2009

Since the mercurial git->hg converter Ubuntu 9.04 ships is toast, I'm trying to build the current hg release from source, and install it. I did the standard "make" and it complained, you have to explicitly say "make all" (which is stupid). Then I did a make install and the result couldn't find its libraries. Ok, export PREFIX=/usr and make install... and that doesn't work. The mercurial makefile explicitly sets PREFIX=/usr/local so you have to override it on the make command line. Ok... And it still can't find its libraries.
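The PREFIX runaround is standard make semantics: an assignment inside the makefile beats the environment, but loses to a command-line override. A throwaway demo (Makefile.demo is made up just for illustration):

```sh
# A makefile's own PREFIX= assignment wins over an environment variable,
# but a command-line variable wins over the makefile. Makefile.demo is a
# throwaway created just for this demo.
printf 'PREFIX = /usr/local\nshow:\n\t@echo $(PREFIX)\n' > Makefile.demo

PREFIX=/usr make -f Makefile.demo show   # prints /usr/local
make -f Makefile.demo show PREFIX=/usr   # prints /usr
```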

Ok, did it hardwire in a path? Do I have to set the PREFIX before the build and not just the install? Try that... Nope.

The build is dying because it can't find "asciidoc" to convert documentation with. It says that's optional, but the make is exiting with an error, perhaps it didn't build those libraries because they come after that? Ok, try installing asciidoc... And aptitude wants to install 36 packages to give me that. Sigh. 133 megabytes, decompressing to 277 megs. Ok, on Central Market's internet connection, this is going to take a while...

And _that_ wasn't it either. Right, go look at the darn python path...

Ok, Mercurial installed it into /blah/blah/python2.6/site-packages, and ubuntu has /blah/blah/python2.6/dist-packages. Sheesh. Symlink time.
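In miniature (using a throwaway directory tree here rather than the real /usr/lib paths):

```sh
# Ubuntu's python searches dist-packages, but "make install" dropped
# mercurial into site-packages, so symlink one into the other. This demo
# uses a throwaway "demo" tree; the real directories live under /usr/lib.
site=demo/python2.6/site-packages
dist=demo/python2.6/dist-packages
mkdir -p "$site/mercurial" "$dist"
ln -s ../site-packages/mercurial "$dist/mercurial"
```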

Tomorrow Fade and I head to the wilds of Tennessee for Hypericon.

June 1, 2009

Wow, the mercurial git->hg converter is kind of unpleasant. I just re-converted the busybox git archive, and the resulting archive has commits in it twice, the log of commits isn't remotely chronologically sorted, annotate shows that every line belongs to the same commit... The git user interface is horrible, but I'm not sure a better UI into a fundamentally worse archive is an improvement.

Possibly I can write my own converter that won't suck? (I wonder if I can sort the commits by timestamp and apply them in that order?)
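The ordering step itself would be trivial; with fake sample records standing in for real commits, it's just:

```sh
# Given "unix-timestamp commit-id" records (fake sample data, not real
# busybox commits), emit the ids oldest first, the order a converter
# would want to apply them in: c1, then c2, then c3.
printf '1244000000 c3\n1243000000 c1\n1243500000 c2\n' |
  sort -n | cut -d' ' -f2
```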

Part of the problem is that the archive has the files moving several times, moving everything out of the "busybox" subdirectory up one level. (Sounds like the initial svn->git conversion was broken.)

Oh wow, the repositories aren't the same. I did an "hg update" to get the most recent version of the files in the mercurial archive... and it bears no resemblance at all to the ones in the git archive. These are _years_ old.

Time to go grab the development version of mercurial, build it from source myself, and then bother the mercurial developers if this is still happening in the current release...

Sigh, in the meantime I need to work with git, so I have to go look at the git howto I wrote where I worked out how to do more or less everything I needed to. (I removed it from because some idiot complained at length that the stuff I documented wasn't the bits he'd have bothered to document, and I didn't feel like arguing with him. But I dug up a copy for my own use because _I_ still need the thing, I don't use it often enough to remember this stuff and the documentation has that "elaborate yet impenetrable" quality of 1990's man pages...)

May 31, 2009

Random bug in VLC: if you enable deinterlace, it kills the old window and pops up a new window. The new window has no subtitles in it, you have to disable subtitles and then re-enable them to get them back.

I'm used to Linux desktop software being buggy. This is normal.

Broke down and set kmail back up, and caught up with the accumulated week of back email. (Mostly spam, of course.)

And kmail has lost the ability to thread messages in 9.04? There used to be a folder->thread messages option in the pulldown menus, and now it's gone. And I can't find any replacement. That's just brilliant.

Reinstalling various things. I have no idea why the "hg convert" extension spontaneously disabled itself. I had the "hgext.convert=" line in my .hgrc, and it got commented out somehow. Apparently, installing mercurial from a dpkg goes through your existing .hgrc and comments out extensions you have enabled. That's deeply stupid; if I have an .hgrc possibly the data in it _means_ something, eh? Once again, Linux developers are _sure_ they know what I want to do better than I do. I need a mac. (They're probably as arrogant, but far more likely to actually have good UI suggestions.)
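For reference, the line in question lives in the [extensions] section of ~/.hgrc, something like:

```ini
# ~/.hgrc -- the line the dpkg install apparently commented back out
[extensions]
hgext.convert =
```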

Looking at the busybox "patch" command to see what would be involved in replacing it with the toybox one. (Either as a patch I maintain locally, or something I can push upstream to the busybox guys.) Their version is 254 lines, and mine is 361 lines. On the other hand, mine does a heck of a lot of stuff theirs doesn't (starting with supporting offsets, which is why I need it).

Ideally I'd make it config down to be as small as the old one, but it wasn't really designed that way. Hmmm...

Another thing is that after using the toybox infrastructure for a while, the busybox infrastructure looks HILARIOUSLY clunky. Put a prototype for the applet's main function on the line above the declaration of that main function. (Why?) And of course pass in argc and annotate the fact it isn't used. (Um... Ouch?)

May 30, 2009

So I found a new way to break toybox, and I don't want to put out yet another single patch bugfix dot release. If I don't want to bother with it anymore, then I need to push stuff upstream into busybox.

I pretty much just need patch, oneit, and netcat. I need my version of patch, because the busybox one can't apply anything at an offset. I need my version of netcat because the one they've got doesn't seem to have working -l support. And I boot the resulting system under oneit, which I suppose could be standalone.

The big part is that if I start working on busybox again, I'm going to have a huge urge to clean stuff up...

I tried to build qemu from source, but the version of the source I had died with a build break halfway through. (I have horrible luck with random source control snapshots, part of the "I break everything" schtick.) So I thought "ok, git update time"... except that it hung. Why did it hang? Because is down. That was yesterday. It's still down.

Luckily, the mailing list archive is still up (yeah, my personal email still isn't reinstalled yet. I need to break down and go back to kmail...) and there's good news. Now that the project has a new maintainer, not only did they migrate off cvs but they may finally leave savannah as well. Yay!

Watching Obama's weekly address. (He's really going to the well for his supreme court nominee.) What I'm really noticing is the little flag pin on his suit and tie, and can't help but wonder if he's wearing bright red tennis shoes along with it to complete the ensemble. (Or is that just british?)

Yesterday's visit to McDonalds has turned me off to The Donald's for a while, not least because I had to move to the other side of the restaurant to get away from the impressively intense smell of a homeless guy when I first arrived (I could still smell him on the other side of the building, just not as _much_). Then a second homeless guy visited the bathroom three times (passing right by me each time, and he didn't smell any better). Then a third guy struck up a nice conversation, which remained interesting for almost ten minutes until he took the obligatory right turn into crazy when I mentioned Warren Buffett and he started telling me how the bible insisted nobody would ever be smarter than King Solomon and that some Rozencrantz family secretly ruled the world. (At which point I went home.)

Today I'm at Big Bite, and their strategy for driving off homeless people is the same as Jimmy Johns' (which was the same as Metro used to use): be very, very loud to at least prevent them from sleeping in here after midnight. This one's a "sports bar" (a bit like Cain and Abel's), showing a basketball game on five different large screen TVs. I found a corner booth where I don't have to look at any of 'em, but it's still loud enough that my noise cancelling headphones aren't cutting it.

Attempting to listen to a podcast where Seanan McGuire (the lady whose livejournal I started following after she wrote a marvelous Barry Ween fanfic) is interviewing Jenn Brozek (the ex next door neighbor of Andy Weir, who wound up in his now completed webcomic as an international jewel thief with a time traveling daughter). Both Jenn and Seanan are also published science fiction authors. Jenn's an editor. Seanan's a filker with several albums out, and draws an intermittent webcomic. Interesting people, generally.

The two of them wound up at baycon together, and BayCon Podcast #4 is Seanan interviewing Jenn. It's a small internet, innit?

On an unrelated note, an interesting thing about VLC is that its starting size is the actual width and height of the source video, in pixels, with 1:1 scaling. (The other players I've used just remembered the previous size relative to the rest of the desktop.) This makes it clear that the _reason_ Obama's podcasts look way the heck better than the msnbc ones is the white house podcasts take up the whole desktop, and the msnbc ones are a tiny little window in the corner. (DVDs take up about half the screen.)

May 29, 2009

Mark poked me last night about several FWL todo items, so I guess I should tackle them...

Let's see, I got the nightly snapshots building with uClibc-git, I need to test those to make sure they're all working, then upload 'em to the binaries directory on Once I've got those working, I need to figure out which ones I can switch to nptl, probably adding new targets for those (ala armv5l-nptl).

And I should really deal with sparc. And I should switch over to gcc 4.2.1 as the stable version and make it always build libgcc_eh.a even when it's --disable-shared (which would be a _better_ way to fix the arm soft float issues, and also fix...

Ok, THAT was annoying. The display froze. The mouse still moved, and the audio of the video I was watching continued to play, but the video itself was frozen and nothing on the desktop responded to the mouse pointer or being clicked on, the keyboard was dead (ignoring ctrl-alt-F1 and ctrl-alt-backspace), and the power button didn't suspend. Luckily, holding the power button down for 10 seconds powered the sucker off, since that's not under Linux's control. And thus I was able to reboot.

That was crazy, the kernel was obviously still working because the audio playback would have stopped if it wasn't (and the mouse would have stopped moving), but all user I/O was frozen. Why on earth did this get redesigned? What idiot decided to route the power button (acpi event) and the desktop (X11) through a single point of failure? (Sounds like the darn "input core".)

Great, now my laptop can choose to just hang on me at any time. "Xubuntu 9.04: there are no nontechnical end users I hate enough to inflict it upon."

I need to buy a mac.

On the bright side, the desktop icon for the bleach DVD has an "eject" menu item if I right click on it, so I can workaround the fact the button on the actual drive doesn't work. (I note that the button on the drive was yet another thing that caused no reaction from my frozen desktop before the last reboot, but then if it had and the graphics were horked how would I know? The mouse pointer is a hardware sprite, just like the commodore 64 used to have...)

Sigh. McDonalds has been having serious trouble with inflation eating away at the "dollar menu". At the start of the Bush administration, a dollar was something like twice what it is now, and in 2007 and 2009 inflation was double digits annually. (It's slowed down quite a lot since the election, although the banking crisis and resultant collapse of gas prices had something to do with that.)

So a few months back, they replaced the double cheeseburger with the "McDouble", which only had one slice of cheese but was otherwise the exact same thing. (And left the double cheeseburger on the menu at about 40 cents more, getting lots of people to order it and pay extra until they noticed the change.) But the McDouble is still a heck of a deal, and my only complaint was the unannounced bait and switch.

But now they've screwed up the apple pies, and I am sad. They've been offering them 2 for a dollar for a while now, or 98 cents for one. (I'm aware this makes no sense.) I'd be fine with dropping the two pies for a dollar thing and the dollar item becoming a single pie. But that's not what they did, instead they took some of the filling out of each pie (not sure how much, somewhere between a third and a half), making them look deflated and bowed on the bottom, and screwing up the ratio between filling and crust so they now taste like crust without enough apple in 'em.

Sigh. They can't add 5 or 10 cents to the cost of the thing because a dollar is a big psychological price point and it would no longer be a "dollar menu" item so they'd lose an extremely successful marketing group. But quietly making the pie INEDIBLE is not a solution, people. Grrr.

(I dunno, maybe I'm wrong and the last few times I've been into this McDonalds every pie I've gotten has been seriously dehydrated or something, in a way that doesn't make the crust crunchy. But it takes the fun out of ordering the pies.)

May 28, 2009

Ooh, vlc does go forwards and backwards with the left and right cursor keys, you just have to hold down shift, alt, or control to indicate whether you want a 2 second, 9 second, or 1 minute jump. (The hotkeys menu under preferences lists all of this. So far I just remember shift and hit the appropriate cursor key several times.)

Of course what I really want to do is find a way to tell it to lock the aspect ratio on resize, so that when I grab the bottom right corner I don't have to fiddle with it to get the black bars on the edges to go away.

So according to Wonkypedia (I.E. I wonder if it's true), the Model T could use both gasoline and ethanol for fuel, but ethanol was knocked out as a viable fuel by prohibition. So the dominance of gasoline as our current automotive fuel is yet another side effect of prohibition. (Makes you wonder what the side effects of the War on Drugs are. Other than the obvious ones like the new Al Capone types currently taking over Mexico, the price of paper going through the roof because we can't make it out of hemp anymore, and so on...)

One of the main failures of Xubuntu, or possibly Desktop Linux in general, is that you can't easily pick and choose which apps you want to run. Yes, that's "failures" and "can't" rather than "features" and "can".

Xubuntu comes with "movie player", which is actually a program called "totem", although it never anywhere tells you that. (You have to use ps to see the name of the running process to figure that out, and this is intentional. That's a bit like saying "web browser" and never revealing whether you're running IE, Firefox, or Opera. Did I mention the Gnome developers consistently make horrible user interface decisions and then defend them to the death? Why the xfce guys are humoring the Gnome guys' poor decision making skills is an open question...)

If you press the eject button on the dvd drive, it pops up a window saying it failed to eject "/org/freedesktop/Hal/devices/storage_model_CDRWDVD_DW224EV", and tells me that name is not a volume or drive. I have to go to a console and run "eject /dev/sr0" as root to get the disk back. This is unrelated breakage, it's because the "hardware abstraction layer" was a horrible, horrible idea Linux copied from Windows NT (see Gnome == stupid, above). But I can't tell modern software _not_ to use it.

Totem is an inferior dvd player to Kaffeine, the one Kubuntu used back in 8.04. (So the rest of what I'm talking about is a regression vs what they got working back in KDE 3, which was rendered useless when they moved to the unusable KDE 4.) Totem auto-launched when I inserted a DVD (not even popping up a "what do you want to do" menu), and then told me it couldn't access the dvd player. Turned out to be just a race condition, and on relaunch it accessed it fine. (Kaffeine did not have this race condition.)

Then it turns out that it doesn't seem to deinterlace the dvd by default. (Note I already went through the magic dance to get decss support working, which is non-obvious and something the mythical nontechnical Linux desktop users would have to phone a friend to get to work.) Google implies that back under 7.04 you could find a deinterlace option under the "view" menu, but there isn't one there under xubuntu 9.04. That menu has "fullscreen", "fit window to movie", "zoom in", "zoom reset", "zoom out", "aspect ratio", "switch angles", "show controls" (the play button and such at the bottom, no deinterlace option among them), "select text subtitles", subtitles (a sub-menu, not only redundant but it isn't aware of the actual subtitles this bleach disk has, which I navigated the DVD menu to switch on), and "sidebar" (which is read-only, showing things like the duration, resolution, and frame rate). Kaffeine puts the duration as a label on the slider at the bottom.

Kaffeine not only had "interlace" as an actual labeled option in the menus, but when I hit "i" it said it was toggling interlace. I _think_ I enabled some kind of deinterlacing by pressing all the letter keys in sequence (accidentally fullscreening and quitting and such along the way), but A) it never notified me I was doing so, so who knows what key it was, B) the deinterlacing algorithm it's using is crap. It's better than nothing, but still full of artifacts.

Of course the problem with installing Kaffeine is it's bolted to the KDE desktop, using the Qt widget set and kobjects and so on. We have so many redundant implementations of exactly the same infrastructure, and they're all slightly incompatible so can't be substituted for each other. It's like windows "dll hell", only built into the OS.

So I tried installing "vlc", another dvd player that google recommended, which presumably doesn't have these unnecessary dependencies on pieces of infrastructure it shouldn't care about. It installed fine. It seems to play the dvd. But when I put in the dvd, totem still pops up. I uninstalled the package "totem", which made no difference. There's 8 gazillion other sub-packages (totem-common) and such, several of which imply it's tied into the guts of mozilla. (This seems odd since mozilla's using the binary-only flash plugin to display essentially all video, because nothing else really works.) Ok, remove totem-common, let it randomly rip out three other packages to "resolve dependencies"...

Ok, now it does nothing when I insert a disk, and I can pull up vlc by hand from the menu to play the disk. I can live with this. (This has no menus; the whole window displays the video. You right click on it to get options, and video->deinterlace has seven options under it.) It defaults to "disabled", yet even that seems to be deinterlacing the picture slightly better than totem was. I set it to "blend", although what I really want is "just deinterlace properly, don't bother me with implementation details since I'm not a DVD playback codec designer". (This is a really common failing of Linux software: you have to know how it's implemented in order to make it work.)

Except that with both kaffeine and totem I could use the cursor forward and back keys to skip ahead or go back in the disk, and also got a slider bar along the bottom. With vlc, I can jump to the start of episodes, and that's it... Ah, I see. There's a progress bar on a separate window. I guess that's reasonable...

I still need to buy a mac, so I don't have this much upgrade hassle every 6 months. I have to make the "to kmail or not to kmail" decision again, and until then I don't have an email client...

May 28, 2009

How do the ubuntu developers get stuff this simple wrong?

When I fire up totem and tell it to play a podcast, it says it hasn't got the plugin and offers to search for it. It comes up with a window that says "GStreamer ffmpeg video plugin" and "GStreamer plugins for mms, wavpack, quicktime, musepack". Ok, fine. There's the pointless step of clicking checkboxes to select them, which pops up a dialog about legal disclaimers (clicking a checkbox should not pop up a dialog, that's horrible UI design), and then I tell it install, it grinds for a bit, and then says "Software installation failed". It won't tell me what the problem was, and there's no way to get extra information. The three buttons are "retry", "add/remove more software", and "close".

Oh, and did I mention it won't tell me the package _names_, just the description fields? If I want to do this from the command line (and thus have at least a _chance_ of better error messages), I have to figure out that "GStreamer ffmpeg video plugin" is _probably_ the "gstreamer0.10-ffmpeg" package. I still have no idea what the _second_ package is.

You could write a "user friendliness fail howto" out of this. Automation that completely hides how the underlying thing works is not useful if it means your car runs out of gas and it refuses to say so (or even that gas is a thing it might ever need).

May 27, 2009

So Xubuntu 9.04 is _almost_ usable. But not quite.

My hard drive developed a bad sector yesterday (with the block remap list full), so I took advantage of the extended warranty I bought and had Fry's replace it. Fresh hard drive, so I thought "might as well try 9.04".

Although the /etc/acpi scripts are all there, it doesn't seem like xfce actually uses them. Instead, I had to root around in applications->settings->power management->general to find a pulldown menu to change the behavior. Ok, fine, I can live with that. Except when it resumes, it asks me for a password. I don't want this, and a half an hour of searching leads me to conclude the config menus haven't _got_ an entry for this. So how do I turn it off? Google suggests that I install the gnome configuration editor (for xubuntu?!) and edit apps->gnome-power-manager->lock. Except that neither aptitude nor their gui thing will install "gconf-editor". The gui thing finds it, but then says it's not supported on amd64. (Brilliant. What, I'm supposed to build this from source? I grepped in /etc/gconf for the word "power" and it wasn't under there, so this apparently isn't a hand-editable .rc file.)

Ah, aptitude remove gnome-screensaver seems to have done the trick. (Well, actually it left me with a totally black screen on the next resume until I found and killed the gnome-screensaver process that was left running after the uninstall. But I found the thing to blow off with the shotgun.)

Oh yeah, "ready for the end user desktop". That's Linux. (I really need to buy a macintosh. I'm now using the _third_ most common desktop because the other two suck too badly for words. This one _mostly_ has the advantage of simplicity, except when it doesn't.)

So to make xfce/xubuntu resume without a password, uninstall gnome-screensaver and then killall gnome-screensaver or reboot. Right.
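
Distilled into a script for next time (package and process names as above; the DRYRUN knob is made up here purely so the thing can be sanity-checked without actually ripping packages out):

```shell
# The whole workaround in one place (Xubuntu 9.04, run as root).
# Set DRYRUN=echo to print the commands instead of running them.
fix_resume_password() {
  ${DRYRUN:-} aptitude remove -y gnome-screensaver
  # The uninstall leaves the old process running (hence the black screen
  # on resume), so kill it by hand too.
  ${DRYRUN:-} killall gnome-screensaver || true
}
```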

Computer tied up most of yesterday backing stuff up, and today restoring it. Being able to do 10 megabytes/second through a USB cable isn't that impressive when you're backing up and restoring 300 gigabytes to a terabyte drive.

May 26, 2009

Sigh. Every few years, they do it again. "There's too many divergent implementations of X out there, I'll write my own and assume everybody in the world will switch to its obvious superiornessity." The result is n+1 implementations of whatever it was. (Might be nice to test against, but just testing against 3 or so existing implementations accomplishes about the same thing anyway.)

So this "linuxcon" thing up in portland, which the Linux foundation put together? It turns out it's "colocated" with the plumber's conference, but is a separate registration. Plumber's conference attendees get a 30% discount... off of a $300 registration fee.

That kind of discourages me from attending either one, to be honest. If Mark and I hadn't already submitted a talk proposal to the plumber's conference, I wouldn't bother with it, and now I'm kind of hoping they turn us down. (I'm a hobbyist, and they're double-dipping. I've only ever bothered to attend any of these things as a speaker. I pay my $50 at the door for things like Penguicon and armadillocon, but even the "discounted student price" of $100 is more than I'd want to pay for a hobby thing. I'm already spending my own money to travel there, and for lodging. If you don't know how to trade a room block for function space you're not really a hobbyist event, are you?)

Found the problem, it was a bug in toybox; if I export $CC it'll use it, overriding $CROSS. Oops. I need to push stuff upstream into busybox if I'm not going to properly finish this sucker...
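
The bug boils down to a one-line precedence mistake. Hypothetically reconstructed (function names are mine, not the actual toybox code), it looks like this:

```shell
# Buggy behavior: if the user has $CC exported, it silently wins
# over the cross compiler prefix in $CROSS.
pick_compiler_buggy() { echo "${CC:-${CROSS}cc}"; }

# What it should do: an explicit cross prefix takes precedence,
# and $CC is only consulted for native builds.
pick_compiler_fixed() {
  if [ -n "$CROSS" ]; then echo "${CROSS}cc"; else echo "${CC:-cc}"; fi
}
```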

May 24, 2009

A cc wrapper that cherry picks arguments to send on to sub-commands turns out to be a bit fiddly to implement if I don't want to duplicate the same option parsing code in two places. (Luckily the rathole I went down after breakfast turns out to be a red herring: all the linker options are in a separate namespace, cc needs -Wl, to pass them through, so the wrapper doesn't need to care.)
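
To illustrate why the separate namespace helps, here's a toy version of such a wrapper as a shell function (the real wrapper is more involved; REAL_CC and the --toy-sysroot flag are invented for the example):

```shell
# Toy cc wrapper: rewrite the options we understand, pass everything else
# through verbatim, and never parse -Wl,* at all since those belong to
# the linker's option namespace, not cc's.
cc_wrap() {
  args=""
  for arg in "$@"; do
    case "$arg" in
      -Wl,*) args="$args $arg" ;;                           # linker options: untouched
      --toy-sysroot=*) args="$args --sysroot=${arg#*=}" ;;  # an option we rewrite
      *) args="$args $arg" ;;                               # everything else: pass through
    esac
  done
  ${REAL_CC:-cc} $args
}
```

Setting REAL_CC=echo makes it just print what it would have run, which is handy for checking the cherry picking.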

Something broke x86_64, the "strip" command isn't recognizing runnable binaries that the host "strip" command can strip just fine. Oddly, I tried the binary in the release directory and _that_ didn't work either, despite the release building. I have no idea what's up with this, trying to track it down now. (Once again, I broke something _retroactively_.)

May 23, 2009

A little while ago I promised to explain:

So if open source used to be the norm back in the 1960's and 70's, how did this _change_? Where did proprietary software come from, and when, and how? How did Richard Stallman's little utopia at the MIT AI lab crumble and force him out into the wilderness to try to rebuild it?

Two things changed in the early 80's: the exponentially growing installed base of microcomputer hardware reached critical mass around 1980, and a legal decision altered copyright law to cover binaries in 1983.

Increasing volume: The microprocessor creates millions of identical computers

The shortage of customers was changed by the microcomputer, which was based on the microprocessor. Intel invented the microprocessor around 1971 (largely by accident; here are interviews with the engineer who designed it, the one who implemented it, the customer they did it for, and their boss at the time who is most widely known for having a law named after him), but it took Intel three revisions (4004, 8008, 8080) to come up with a usable chip, which was at the heart of the MITS Altair, the first microcomputer (and the original home of the S-100 bus and CP/M operating system so widely cloned later, including somewhat indirectly by the IBM PC).

The first computer to sell 1 million units was Commodore's VIC-20, introduced in 1981 and beating the Apple II to the million unit mark by a few months. So within five years of the Apple II's introduction, you could potentially sell a million copies of the same software. This allowed the concept of "shrinkwrapped" software to emerge, which you could write once and then sell many identical copies of without even recompiling. Initially this supported small companies consisting of a programmer and perhaps a small support staff (WordStar, Richard Garriott's "Akalabeth"), but exponential growth in the hardware market led to matching growth in the potential software market.

The first "killer app" (a software program compelling enough to drive sales of the system it runs on) was Visicalc, the first spreadsheet. It was produced by a two person company, Dan Bricklin and Bob Frankston, operating out of Bob's attic. The first version shipped in October 1979, and was the main reason Apple's fewer than 50k units sold in its first two years were followed by a million more. (Another fun quote in that article, by Bob Frankston: "it's worth noting that back in 1979 people viewed the keyboard as an impediment to using computers. After all, only secretaries could type...") Oddly enough the book "On the Edge", which is about commodore, gives great behind the scenes information on early Apple and Visicalc stuff. The fact commodore bought the company that made the processor inside early Apple machines has a little to do with it, but mostly it was a darn small industry at the time and everybody knew everybody else...

So the reason something like Visicalc couldn't happen before 1980 is there was no customer base to sell shrinkwrapped software to, not even enough to support individual programmers operating out of their bedrooms unless they had a day job doing something else. You could make a living as a consultant writing bespoke software on commission, but your customers told _you_ what to write, and owned the result which of course included the source code.

The other thing about microprocessors is they were more uniform. The old "wire up a breadboard" computers weren't just produced in lower volume, they were more easily modified at a fundamental level.

The MIT AI lab where Richard Stallman learned to program was started in the 1950's with a pair of prototype computers, built by a recent graduate named Ken Olsen when he interned at Lincoln Labs in Boston. Both were unique, each the only one of their kind ever built. The first was the MTC (Memory Test Computer), which had only four instructions and was built for a single purpose: to stress test early "core memory" and prove it could work reliably under load. The second was the TX-0 (Transistor Experiment 0), Ken Olsen's pet project to prove that transistors could be made reliable enough to build a computer and memory out of them. (Early transistors were so sensitive to static they were even less reliable than vacuum tubes; Olsen came up with a way to buffer them so they could survive normal handling and operation.) Both prototypes were heavily modified and upgraded by the students at MIT, because although they weren't very powerful they weren't shared, either. Each belonged entirely to the AI lab 24/7, so the students there didn't have to make an appointment to use it or get permission to modify it.

Olsen went on to found Digital Equipment Corporation to commercialize his new transistor memory, and hired a bunch of his fellow MIT graduates to do so, finding the most useful ones came from the AI lab that had received his cast-off prototypes. (The book "Hackers" by Steven Levy covers the founding of the MIT AI lab. This marvelous interview with Ken Olsen covers the TX-0 from the Lincoln Labs end, plus early transistor memory experiments, and goes on to chronicle the rise of DEC.) DEC also sent one of the first PDP-1 systems to the MIT guys, to train future graduates so Olsen could hire them too. (The MIT guys soon wrote the first shoot-em-up video game, "spacewar", for the PDP-1, which became DEC's main sales tool for showing off the new system. DEC eventually sold around 50 PDP-1 systems.)

When DEC built its first mainframe (the PDP-6, which shipped a total of 36 machines in its entire production run), MIT of course got one. When DEC upgraded the PDP-6 to the more powerful PDP-10 (about 700 total ever made), MIT's system was upgraded. That PDP-10 was the machine Richard Stallman did all his work on at MIT.

All of these systems, from the TX-0 to the PDP-10, were heavily modified by the MIT guys, to add new instructions and capabilities. The fact that MIT not only wrote its own operating system from scratch but customized its computer hardware to add extra instructions and registers, really wasn't that unusual under the circumstances; everybody did it back then. The computers were made by wiring together individual diodes and resistors and such on a bread board. Wiring in a few more wasn't a big deal. But it meant that programs written for a single machine ran _only_ on that machine, or perhaps one or two others that had been similarly hand-modified.

Microprocessors were different in that you couldn't customize their circuitry the way you could a breadboard full of chips and wires. There weren't just more of them but they were also more uniform, thus software didn't need to be ported between different machines of the same model, it could run out of the box. That was something new.

Only with the arrival of millions of identical microcomputers, based on mass-produced microprocessors instead of expensive hand-wired breadboards, (see the books "Soul of a New Machine" and "A few good men from Univac" for more descriptions of that era) could you write a program once and then sell the exact same program to lots of different people without porting (or even recompiling) it for each one of them. That's why the software in the 1960's and 1970's was either custom stuff individually commissioned from consultants, or it was produced by hardware manufacturers in an attempt to sell their hardware, or it was produced by the people who had the computers for their own use, and shared through user groups like DECUS and the CP/M Users Group Northwest. All this stuff came with source code.

But in 1980 Apple had a record-breaking IPO (the largest in history at the time) based entirely on sales of the Apple II. This triggered IBM to rush out its own microcomputer (the IBM PC) in hopes of squashing Apple before it grew too big to contain. (That's an interesting story in its own right, but a side track to the rise of proprietary software. Robert Cringely's book about it, "Accidental Empires", is fairly accessible and was made into an even more accessible PBS documentary called "Triumph of the Nerds".) In 1982, "The Personal Computer" was Time Magazine's Man of the Year. Exponential growth that continues long enough shakes the entire world.

Changing the law: Apple v Franklin extends copyright to cover compiled/assembled binaries instead of just source code.

The copyright issue changed in 1983, when the Apple v Franklin ruling extended copyright protections to binary code. (The writing was on the wall in that case over a year earlier, but the decision had to be upheld on appeal before commercial interests could seriously act on it.) Before that decision, source code was copyrightable but binaries weren't, so companies shipped source code to _increase_ their ownership of the code in the eyes of the law. If you just shipped precompiled binaries, you had no rights the law would recognize.

The first real "voice in the wilderness" about proprietary software was a kid named Bill Gates, with his 1976 "letter to hobbyists". Except that nobody took it seriously. (Richard Stallman claims not to have even heard about it at the time.) People only saw it at the time because MITS paid to run the letter in a dozen magazines, and he wrote the letter because the world _wasn't_ the way he was describing it: he wrote it the way he wanted the world to be, so he could profit from it.

William H. Gates III ("Trey" to his friends) was a rich kid and the son of a lawyer who dropped out of Harvard (pre-law) to start a software company with a techie friend of his (Paul Allen). Gates's Micro-soft was founded (in 1975) in hopes of not merely selling software to the emerging microcomputer market, but explicitly cornering that market. Except that their tiny company could only afford starvation wages, to the point that both of its co-founders, Paul Allen and Bill Gates, took day jobs at their main client, MITS, for the first few years of the company, and their third employee (the guy their "letter to hobbyists" credits with actually writing Altair BASIC, although their official history now differs) was a part-timer they had to lay off.

They referred to their strategy as "riding the bear": attaching themselves like a remora onto one dominant hardware manufacturer after another, grabbing the software crumbs falling from that hardware market and switching to the next when it lost its position. First they rode the Altair's manufacturer MITS, then switched to Tandy as their meal ticket when MITS went under. Finally they landed a lucrative contract with IBM (because Gates' mother was on the board of directors of the Red Cross with IBM's CEO, and thus IBM could do business with "Mary's Boy" even though IBM wouldn't ordinarily approve such a tiny company as a vendor because it might not stay in business long enough to complete the contract; see the book "Big Blues" for details). Along the way, they sold their software (mostly BASIC) to Commodore and every other company that would buy it.

Ironically, the first software product of micro-soft was Basic (a language developed at Dartmouth college in 1964 and freely appropriated by micro-soft). Because it was the only high level programming language MITS offered for the Altair, it quickly became the lingua franca of early 8-bit microcomputers, starting with the Altair clones from companies like Imsai but spreading to the Commodore 64 and Apple II and even IBM's PC. The thing about Basic is it was an interpreted language, where the source code _was_ the program and you ran it directly. Books and magazines distributed BASIC programs by printing source code you could type into your own computer. Again, all code written in early 8-bit BASIC was open source, because _not_ being open simply wasn't an option. The language itself enforced it.

In his 1976 letter to hobbyists, Gates was incensed that microcomputer users mostly wrote their own software, and shared their programs freely with others. Users saw Gates as the type of person to not only see a market opportunity in bottled water, but then consider drinkable tap water as a threat to his business and complain about it. Back in 1976, he wasn't laughed off the stage: he was completely ignored.

It's a bit like the recent Time magazine article about how newspapers must start charging for online content by a guy named Walter Isaacson. It's considered a laughable idea today, when the norm for internet content is free access. But if that changes, Isaacson will look like a prophet.

What users didn't expect in 1976 was that Gates and others like him would respond not only with FUD (such as "shrinkwrap licenses", which were laughably unenforceable at the time) but by trying to change the law. Apple's 1980 IPO attracted Wall Street's attention to the booming computer industry, the sharks smelled money in the water, and proprietary interests have lobbied continuously to change the law for their own financial gain ever since. (Note it's still ongoing: the software patents blowup happened in the 1990's, and the Digital Millennium Copyright Act at least claimed to make shrinkwrap licenses enforceable in 1998. Today Rupert Murdoch (owner of Faux News) is trying to implement Isaacson's ideas.)

The thing about Gates is repeated failure didn't discourage him. Here's a 1980 phone interview with a young Bill Gates (audio, transcript, context) who was incensed that a recently published book about TRS-80 programming had printed an annotated disassembly of the TRS-80 ROMs, so hobbyists could understand their machines better. Gates insisted those ROMs were his property, and nobody else had the right to look at them (let alone print out annotated disassembly and sell it for profit), and he lobbied congress to that effect. The entire interview is Gates talking about his attempts to change copyright law, back in 1980, into what he wanted it to be.

Similarly, when Microsoft belatedly noticed the internet in 1995, it tried to replace it with MSN, the Microsoft Network. (Remember that in Windows 95, the "unremovable" icon wasn't Internet Explorer, it was MSN. When that strategy didn't work, they licensed the Spyglass Mosaic browser, renamed it Internet Explorer, and made _that_ the unremovable icon, in a second slightly more successful effort to at least partially de-commoditize the internet.)

But back in the early 80's, it wasn't Microsoft that succeeded in changing the law, it was Apple. The Apple v. Franklin lawsuit was Apple's response to Franklin Computer (which later became Franklin Electronic Publishers, makers of the Spelling Ace) cloning the Apple II. Franklin's computers ran Apple software, and they used Apple's ROM code verbatim in their competing product in order to do that. The courts decided that Franklin had gone too far, and that copyright must extend to cover binary code to prevent that sort of thing.

Part of the reason AT&T agreed to be broken up in 1982 (implemented in 1984) was to get out from under the antitrust regulation imposed on it by a 1956 consent decree, forbidding it from entering any business beyond its telecommunications monopoly. It nominally owned Unix, by then a widely used operating system, but couldn't enter the booming computer industry as a heavily regulated telephone monopoly. When Unix first leaked out of Bell Labs in 1973, there was nothing to commercialize because there was no market. In between, the computer industry _changed_. Unix was open source from day 1, and developed a vibrant community in the 1970's based on this. (It didn't have a name for open source because it didn't need one; there wasn't any non-open-source software to speak of. When copyright first became an issue for software, the term commonly used for unencumbered software was "public domain".) When AT&T suddenly started enforcing its newly acquired copyrights, and charging _extra_ for source code, it was an enormous shocking change.

AT&T wasn't alone in this, of course. IBM's Object Code Only (OCO) policy was announced February 8, 1983 (more on this), but software predating that commonly came with source code, and often even that source wasn't copyrighted. This is why the Hercules S/370 emulator has plenty of old public domain software to run, including things like the MVS operating system (see also here). Yes, even IBM's mainframe operating systems worked by different rules back in the 1970's.

RMS tells his own story (in his own words, or the biographical version) about a Xerox printer he couldn't get the driver source for. This came as a shock to him because previously, sharing the source code to any software was a matter of course. The world was changing under him, and he didn't like change.

So when Richard Stallman announced the GNU project in 1983, the same year DOS 2.0 came out (the Free Software Foundation itself followed in 1985), he was reacting to the recent and ongoing rise of proprietary software. Microprocessors created an exponentially growing hardware market, the Apple v. Franklin decision gave software authors vastly expanded copyright powers, and a gold rush ensued (similar to the dot-com boom of the 90's) that hired away all the "hackers" from the MIT AI lab to work in a pair of start-up companies attempting to commercialize LISP. While his fellow students graduated and took jobs out in the booming commercial computer industry, Stallman stayed in academia as a perpetual grad student, and hid away from the problem as long as he could.

Kicked out of the nest: cancellation of the Jupiter project obsoletes the PDP-10 at the heart of the MIT AI lab.

The big change that forced Stallman out of his comfortable nest at the MIT AI lab was the obsolescence of the operating system he used, which was tied to specific hardware that was discontinued.

Over the course of three decades (1950's, 60's, and 70's), the MIT AI lab upgraded from the tiny Memory Test Computer all the way to a PDP-10 mainframe, keeping ahead of Moore's Law (which Gordon Moore first documented in 1965). Just as the PDP-6 became obsolete and was upgraded into the PDP-10, the PDP-10 itself was over a decade old by 1980, and ripe for replacement.

To replace it, DEC prototyped a new machine called the "Jupiter project", which was roughly compatible and would thus allow PDP-10 software to be easily ported to the new system. But there were only 700 PDP-10 systems ever sold, and their newest computer (the VAX, an upgrade to the PDP-11) was selling thousands of systems per _week_.

The VAX was completely incompatible with the PDP-10. The PDP-10 (like the PDP-6 before it) was a 36-bit machine with 6-bit bytes; the VAX (which grew out of the 16-bit PDP-11) was a 32-bit machine with 8-bit bytes. Porting assembly software from one to the other essentially required a complete from-scratch rewrite.

By 1980, Ken Olsen's DEC had been producing and selling its own computers for two decades, and could hire programmers familiar with them from anywhere, so he no longer needed the MIT AI lab as a source of labor. The industry was standardizing on 8-bit bytes and leaving 6-bit systems behind, and the 700 customers with PDP-10 systems were already a rounding error compared to the 600,000 PDP-11 systems eventually sold, and the VAX looked to be even more successful. Thus DEC decided to focus on the PDP-11 and VAX lines, told its PDP-10 customers to migrate to a VAX instead, and cancelled the Jupiter project in 1983.

With the cancellation of Jupiter, the operating system the MIT guys had written over the years, a giant pile of hand-coded assembly proudly named the "Incompatible Timesharing System" (which gives you a good idea of the AI lab's "hacker" culture), was now tied to a dying hardware architecture which would not be upgraded. The writing had been on the wall for years; no new students had bothered to learn the obsolete ITS or PDP-10 programming for a while, and the booming computer industry in Boston hired away the graduates who had been maintaining ITS, until eventually Richard Stallman was the last holdout clinging to the old ways as a perpetual grad student. When MIT announced they were retiring the now-obsolete and unmaintainable PDP-10/ITS system, Stallman's carefully insulated little bubble of ivory tower academia popped, forcing him to respond.

Richard Stallman is not and never was a visionary. He's a reactionary conservative trying to cling to the past, and when the rug was finally yanked out from under him he set about rebuilding that past as best he could. Some of the things he fought to preserve were indeed worth preserving, but looking at him as any sort of guide to the future is ludicrous. He's never been about the future, except to warn against it.

There are of course more questions (why would an ITS guy switch to Unix, why was the FSF as successful as it was, how do lisp and microkernels show the dude is totally totally not a visionary, who _else_ was doing this, and where did it all go wrong), but again they'll have to wait because this is already long enough.

May 22, 2009

Fade and I met Mark at the insurance office on Far West this morning (for a definition of morning that occurred after noon, but still), to finally get health insurance through Impact Linux. Apparently, getting insurance for two employees and a spouse is collectively about twice as expensive as getting insurance for three employees. THIS MAKES NO SENSE. (Yes, I'm sure there are actuarial tables involved, but it's still crazy.)

Wondering if hiring Fade to do some web design for us counts as gaming the system.

Went swimming afterwards (Fade's new bathing suit continues to work), and then finally painted over graffiti in the alley. (They got it again last weekend. Watched three Hispanic teenagers wander through the alley the next day, gleefully pointing out to each other instances of graffiti I hadn't even noticed and slapping each other on the back, but of course that proves nothing...) The actual paint job on the Quatros parking wall went fine, but I ran out of paint stripper for the wooden fence, telephone pole, and dumpsters. Need to get more...

Listening to the mp3 of Barack Obama's phone call with the space shuttle Atlantis up in orbit (linked from the whitehouse twitter). Early on Obama said "You've excited my 10 year old and my 7 year old", and I decided that was about right. After listening to a sitting president talk to astronauts in space, I'm wondering if I can track down a similar mp3 of a racecar-driving cowboy talking to firefighters who are in the process of fighting dinosaurs, just for comparison purposes, to calibrate the "we have cool jobs that kids want to grow up to do" level of this conversation.

Ooh, and this one is an mp4 file (video, not audio). Darn good resolution on the video. (In general I like the whitehouse twitter, but they have huge problems with the 140 character limit...)

So I wrestled with the design of the new qcc wrapper for a while, unsure quite how I wanted to proceed, but now it's turning into a variant of ccwrap that breaks down stages. It calls cpp to do preprocessing (and that's the stage that gets

The advantage of doing it this way is it both partitions the task nicely (eliminating duplicate code, so multiple different stages don't each have to understand library paths and such) and lets me test against the existing gcc stuff before I've built the new replacements. (Writing a wrapper I can't _test_ makes plugging in other functionality a real pain, because if it doesn't work it's not immediately obvious _why_. It's also no fun preprocessing C code you can't compile, or building .o files you can't link...)

May 21, 2009

The Chocolate Penguin Mints arrived! (We ordered 36 tins.)

I have eaten too many of them already today. (They're caffeinated, although apparently not as much as they used to be.)

Poking at the UT non-credit course schedule. They've got a pdf version of the current catalog.

So I'm designing the top level "cc" wrapper to call cpp, since the -M options are all aspects of cpp, not cc1. At the moment, I'm just trying to get it to call all the gcc tools in order (cpp, cc1, as, ld, strip) without telling one of the multi-function tools to do anything one of the single-function tools can do. (So even though cc1 can preprocess, and ld can strip, I'm having the specific tools that only do those things do them.)

I'm also writing it to not pass on any option it doesn't understand, so if I haven't implemented support for "-M" then it'll barf when given -M, instead of passing it on to cpp.

Then I can swap out this wrapper for the one in FWL and make sure it can build the whole system, and maybe gentoo from scratch on top of it.

At some point in the future, I'll probably want to coalesce stuff back together. It's possible to implement the compiler as up to six processes (cpp | cc1 | as | ld | strip, all called by cc) but passing preprocessed data from cpp to cc1 might not be a win vs just keeping each translated line in the processor's cache and dealing with it immediately, and outputting assembly source and then assembling it is _definitely_ slower than generating the machine code directly (although the _capability_ is still needed to support -S, which the kernel needs in order to create some of its header files).

But that's all future optimizations, which can at least wait until I get around to actually implementing the code that needs to be optimized. (I'm not too worried about anybody calling cc1 directly, and if calling cpp directly was particularly common Ubuntu wouldn't have split it out into its own package which I've never needed to install to build anything I've tried. Builds do call ld directly and expect it to strip, but that's a question of outputting _less_ stuff.)

May 20, 2009

Still vaguely uninspired about programming. Poking at qcc, but admittedly somewhat listlessly. (Currently Linux isn't seeming like a fun hobby at all; I deleted my old git howto rather than defend it against some idiot who insisted any portion he didn't understand needed to be removed from it. Apparently "I don't understand what you're saying, not that I've actually tried the technique you described to see if it worked" means "you must be wrong". *shrug* Deleted the whole thing, which seemed the easiest way to give him what he was technically asking for. There are at least three other git howtos out there written by people who care. Talk to the 404.)

Followed Fade to the library this morning and got a book, Keith Olbermann's "Truth and Consequences". I was somewhat surprised he'd found the time to write a book, but apparently it's mostly transcripts of his special comments with introductions. (About 2/3 of the way through now.)

Various other errands during the day. Went out with Fade to buy a bathing suit, hit a mall.

Up to DVD 5 in Bleach, which is the one where they lose track of astral form vs physical form. (Or at least he spends most of an episode with his body being doctored for wounds received as a soul reaper, despite the series never having established the Matrix-like "your body makes it real".)

Not that this is the first series that's needed a continuity advisor. (Red Dwarf's books tried to justify Lister having his appendix out twice, but gloss over the fact that the first joke Lister makes upon being revived is "Three million years? I've still got that library book!" while later in the first season he claims "I've never read... a book." Yeah, it's a comedy and these are jokes of opportunity, possibly self-deprecating humor on Lister's part, and the MST3K mantra certainly applies to most of the series. Still...)

Bleach remains fun, when it doesn't go into DragonballZ territory. (Watching people yell at each other for 10 minutes isn't fun, and these people would do well to study the video game concept of "logarithmic difficulty increase", although World of Warcraft has some similar funky balance issues. I can't be _too_ upset, though: if individual members of the strongest 10% of Soul Reaperdom could each slaughter the remaining 90%, why the heck are they sending _those_ guys out into the human world to fight hollows? Once you stop and realize that a single Stormwind guard could solve all of Westfall's problems in an afternoon, "it's required by the plot" is really the only explanation required, or offered.)

There are a few tricks to understanding the series. They level by something like an XP system (or as they say, "spiritual power increases fastest when the soul is in danger"), but Ichigo's limit break is to level, so when he's almost killed he goes "ding" on his next attack (which heals him out of limit break territory). Orihime's power is based on the MythBusters mantra: "I reject your reality, and substitute my own" (which makes her potentially even more of a walking deus ex machina than Ichigo, although she levels more slowly). When cornered and otherwise close to levelling, all of 'em do a big long "bribe the GM with a backstory soliloquy to get character development XP" bit to push 'em over the edge. Don't expect to get any of the jokes; the entire cultural background they're based on is different (although with the ones parodying other anime series/tropes you at least have a chance).

Currently watching the episode where Protagonist Boy regains soul-reaper-dom after starting his heartless hollow transformation. His limit break kicked in and screwed up the cutscene, and he wound up with the "hollow" and "soul reaper" flags both set. :)

May 18, 2009

So I'm writing a fresh implementation of ccwrap.c for qcc.

Wow, the gcc option parsing is crap. -static and --static do the same thing, "gcc -cv hello.c" says that -cv is an unrecognized option but "gcc -c -v hello.c" is just fine... "-Lpath" and "-L path" are synonyms, but "-lpthread" and "-l pthread" are not...
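For the record, the first couple of quirks are easy to reproduce (behavior observed against gcc; details vary between versions):

```shell
echo 'int main(void) { return 0; }' > opt.c

gcc -c -v opt.c -o opt.o                          # separate -c and -v: fine
gcc -cv opt.c 2>/dev/null || echo "-cv rejected"  # bundled: unrecognized option

gcc -static -c opt.c -o opt.o    # -static and --static both parse
gcc --static -c opt.c -o opt.o   # (compile-only, so no static libc needed)
```

So unlike classic getopt-style tools, the gcc driver never bundles single-letter options, but does accept one- and two-dash spellings of the same long option.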

Rather a lot of "hmmm" moments. I'm starting by reading ccwrap.c and writing a new front end "qcc" command line interpreter, which creates appropriate command lines to call cpp, cc1, as, and ld as necessary. (This means none of those commands need to know about standard paths for include files or libraries.) Note that the existing ccwrap.c calls the gcc front-end, which does a lot of command line rewriting of its own. This is a harder problem: breaking up the command into stages and calling the sub-commands directly. Another issue is whether those sub-commands should ever call each other, or whether the caller should string together "cpp | cc1 | as" manually.

The issue I'm facing right now is that since none of those sub-commands deal with standard includes or standard libraries, it seems silly to feed them -nostdinc and -nostdlib. (The qcc versions won't _have_ the logic for that stuff.) But the gcc version of cc1 does do that unless called in such a way that it doesn't have anything to do, which gets us back to the "cpp | cc1 | as > out.o" thing. (On the bright side, that would distribute the work over more processors on smp systems. Is there enough work in preprocessing for this to have any benefit? The existence of ccache implies so. But the communications overhead might make it a net loss anyway. It's easy to program and keeps things nicely modular, but it's not that hard to keep it in one process, either...)

Anyway, from a command line option parsing standpoint, cpp needs to understand "-D -U -I -o -M* -d* -nostdinc" (and possibly also "-include -iprefix -iwithprefix -isystem"). And ld needs all the stuff from "-static -shared -L -l -Wl," and such. So presumably if qcc is dispatching right, the cc1 command line parser doesn't need to know about either...
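A toy sketch of that dispatch, sorting an incoming command line into per-tool option lists (the variable names and the sample command line are illustrative, not qcc's actual code):

```shell
# Pretend command line: route each option to the stage that owns it.
set -- -DFOO=1 -I/usr/include -M -static -lpthread -Wl,-s hello.c

CPP="" LD="" REST=""
for i in "$@"; do
  case "$i" in
    -D*|-U*|-I*|-M*|-d*|-nostdinc|-include) CPP="$CPP $i" ;;  # preprocessor
    -static|-shared|-L*|-l*|-Wl,*) LD="$LD $i" ;;             # linker
    *) REST="$REST $i" ;;                                     # everything else
  esac
done

echo "cpp gets:$CPP"
echo "ld gets:$LD"
echo "unclaimed:$REST"
```

Anything landing in the "unclaimed" bucket is either an input file or an option the dispatcher hasn't implemented yet, which is exactly where the "barf instead of passing it through" behavior hooks in.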

The question is, of course, how to come up with command lines that gcc would understand, which don't require the same code to be written twice in qcc, and which don't impose unpleasant communication overhead between processes. (Partitioning the task! A design issue...)

May 17, 2009

Lunch with Mark, who pointed out a couple of obvious bugs in FWL so I could fix 'em. (Canadian cross broke the ability to override CROSS_TARGET. Ah. Fixed now, I think. And I need an /etc symlink to /usr/etc, because putting /etc/resolv.conf in sources/native winds up adding /usr/etc/resolv.conf (or /tools/etc/resolv.conf with a different config).)

The PPC board cisco sent him has a dead debug adapter (maybe another bent prong? Hardware problem, not my area...) and although qemu has a bamboo board emulation what it doesn't have is a ppc 440 processor emulation. I tried to walk him through how to tweak qemu to possibly get the bamboo board to do _something_ (plug in a ppc 405 processor emulation, which is a bit like plugging a 386 into a 16-bit PC, but it should run 440 software). But it turns out it's not loading the device tree blob properly, which means I have no idea how anybody's ever managed to use the bamboo board emulation to do anything. He said he'd ask on the list...

Walked home from up near Mark's. Six hours of walking. I'm out of shape.

Got a design document posted to the qcc mailing list. It's rambling and a bit incoherent in places, but it's a start.

May 15, 2009

I asked Mark to create a new qcc list so I can post design documents and such from the new project. Have not yet posted such a design document, but wrote a longish notes file yesterday.

Currently frowning at ccwrap.c and rewriting it from scratch in preparation for slotting the options.c material into place. There needs to be a cc applet that rewrites the command line (so "cc hello.c" becomes a much longer command line with -I and -L entries and mentions of crt1.o and such) and then calls a cc1 applet that does actual compiler thingies. But whether it should literally create a new command line and call another main function with its own command line parser, or whether it should just set up data structures internally (which the other command line parser could also do), is an open question. (Leaning towards the first. Simplicity vs code duplication is one of those tradeoffs that makes you want to step back and think about the problem at a higher level, because there should be a way to get them to not fight like that...)
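A back-of-the-envelope sketch of the first approach, as a shell function that just prints the rewritten command line instead of running it (the paths are placeholders, not the real FWL layout):

```shell
# Expand "cc hello.c" into the explicit command line the wrapper
# would construct: standard include path, crt startup code, libc.
PREFIX=/usr   # placeholder prefix, not the real install location

cc_expand() {
  echo gcc -nostdinc -I "$PREFIX/include" -nostdlib "$PREFIX/lib/crt1.o" \
       "$@" -L "$PREFIX/lib" -lc
}

cc_expand hello.c
```

Printing the result rather than exec'ing it is also a cheap way to diff the rewritten command line against what the real gcc driver does (gcc -v shows its expansion).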

Part of it is "should this wrapper still be able to call out to an external compiler, ala gcc?" The current wrapper tweaks $PATH because gcc won't be able to find its own sub-commands otherwise, which clearly _isn't_ required for a qcc thing calling another applet internally via a busybox built-in command style multiplexer pseudo-exec. But if the other 90% of it _does_ work that way, is losing that functionality good? (Especially if I'm going to be writing low hanging fruit like ld and ar and such first, and possibly handing off to some other cc for test purposes...)

Another thing is that ccwrap.c has c++ support, and qcc doesn't (and probably won't any time soon, since tinycc never did). If it's doing c++, it needs to call out to an external c++ compiler. (At least something like cfront.)

Eh, worry about it when I've got more code written...

May 13, 2009

Twitter is down for maintenance. (Fidget. Fidget.)

I admit the #fixreplies thing is an impressive #twitfail. They spun it as a user interface improvement, except that it used to be a configurable setting which already defaulted to "off". They removed the _option_, with no warning, and called it a UI improvement. That's what we call rolling a crit-fail on your intelligence check: When you have Neil Gaiman retweeting Stephen Fry pointing to an article about how stupid your policy is (both of those people are in the 10 most followed twitterers list), you have officially screwed up.

It turns out that the change was due to a technical limitation in database scalability or some such (which means they can't put it _back_ easily), which in some ways makes it worse. This was actually a _marketing_ crit fail. If they'd said at the start "we had to yank it because it didn't scale, the code's unmaintainable or it bogs the servers, we're working on a new implementation", that's one thing. But the actual quote from their "small-settings-update" announcement was, "Today's update removes this undesirable and confusing option." I never even _used_ the option that was removed, so my config hasn't been affected by this, and I'm still kind of annoyed at being patronized like that. I don't want them fiddling with the UI "for my own good", nor do I want them insisting they know how I'd prefer to use their service better than I do. They earned the wall of flame coming back at them right now.

They've been dancing around an apology ever since, but have yet to actually do a mea culpa for treating their users like idiots. Nor have they promised not to crash around like tone-deaf elephants in the future, perturbing the UI of a service millions of people use with no warning or beta test site. That's an easy way to get millions of people to find or set up other microblogging sites, which would mean I'd need an rss aggregator to follow microblogs. Once upon a time, the livejournal friends list was all you needed. Now, as Jeph Jacques says, "Can you believe people still have Livejournals?"

It's inevitable that the idea of microblogging will diffuse beyond a single website, the way free website hosting isn't just "geocities" anymore and free blog hosting no longer means "livejournal". In a lot of ways livejournal was doomed by RSS aggregation. But right now, there _isn't_ a way to do @replies across sites, and until we come up with one and spam-proof it, twitter still has a structural advantage. But making their entire userbase see them as poor custodians of the service probably just shortened their time of dominance by about a year.

Elsewhere, I'm trying to remember when reading Linux Weekly News became a chore. About a year ago, I think. My May 9th entry isn't a sudden development, more a buildup of annoyance leading to a narrowing of interests (or at least avoiding the bits I don't want to deal with), which isn't healthy long-term. Life's too short to spend it doing things that aren't fun (at least more than necessary to get back to the fun bits; yay delayed gratification, but not 40 years worth). But I used to read linux-kernel for fun.

Sigh. I've done this before. I need a project. I get enthusiastic about _projects_, not about abstract technologies. I'm still banging at qcc but it hasn't "caught" yet (mostly because I'm still in the research phase instead of writing code). FWL isn't quite a chore yet but "stuck with gcc 4.2.1" and "caught between busybox/toybox" are a couple largeish design flaws...

Oh well. Back to turning QEMU into a C compiler...

May 12, 2009

It's been less than a week, and I forgot the horrible password UT insisted upon, which needs numbers and punctuation. Thanks for restricting the keyspace in the name of improving security. You realize that windows viruses will keylog it from everybody still running Windows, right? Admittedly this being a college (with an apple store in the student building and everything), that may be a statistically small enough percentage that they don't care...

Tomorrow: get it reset, and then use standard password security enhancement technique #1: write it down. (Admittedly, using a post-it note on my monitor is less convenient with a laptop, but I suppose that's merely advisory. Like the duct tape in the carrier pigeon IP rfc...)

Reading the tinycc source, collecting bits easy to yank for qcc. Got curious and browsed alloca.S. Wow, I need to brush up on my assembly.

I was a bit confused why "alloca.S" didn't compile from the command line, but then I wandered through the makefile and noticed there's no rule to build it. So I looked at the git version of tcc-grischka, and although they've got an alloca.S there too _it_ never gets compiled either.

So my original plan of "tweak and test build" to see if I can "pop %eax" instead of "pop %ecx" then "mov %ecx,%eax" and so on... Kind of derailed by the fact that the original code I'm reading doesn't build, and the last time I actually tried to _author_ x86 assembly code (rather than figure out what debugger output is doing wrong) was... 1994, I think? Yeah, time to brush up...

(Hence my desire to get online and download an x86 assembly tutorial. I've got books on it at home, just not with me. And a reference != a tutorial...)

Oh _that's_ cute. Although gcc won't compile alloca.S, tinycc itself _will_. Ok, at some point I need to figure out what's wrong here, but for now...

And it died because I haven't got the 32-bit development files installed. Right. Come back to this when I have internet access...

May 11, 2009

Why am I *ahem* "not a fan" of the FSF? Ok, I'll explain.

I'm not sure you can be a computer historian and not hate the Free Software Foundation, because they're not who they say they are, they claim credit for things other people did, and these days their agenda has become counterproductive and harmful.

A revolution to preserve the status quo

The US was founded by revolutionaries, hence the name "Revolutionary War". The founding fathers tried out all sorts of new ideas, and often took multiple attempts to get it right: replacing the Articles of Confederation with the Constitution, then adding a big batch of amendments to the Constitution in the Bill of Rights, swapping in the electoral college when the first method blew up in their faces, going back and forth about whether or not to have a central bank, eventually needing a civil war to finally resolve the whole slavery issue... The US founding fathers would be the _first_ to say they were imperfect people just doing the best they could.

These days, the founding fathers are primarily invoked by ultra-conservatives, often while fighting against things like desegregation, women's rights, gay marriage, bikinis, and so on. They're about as far from revolutionaries as you can get. (As Mark Twain said, "Conservatism is the blind and fear-filled worship of dead radicals".) If you couldn't do it in their grandfather's day, you shouldn't be able to do it now (so throw out that cell phone and don't use antibiotics). You wouldn't think this attitude would come up much in the computer industry, but human nature is the same everywhere, and thus we have the Free Software Foundation.

The main reason I can't stand the FSF is they're ultra-conservatives masquerading as revolutionaries. While that's always a possibility when "the revolution" drags on for 30 years and the ideals they're fighting for fossilize in their proponents heads, that's not what happened here. The FSF actually started that way.

The FSF was founded as a conservative reactionary movement. Back in 1983 Stallman's goal was to stop (or at least personally avoid) the rise of the brand new proprietary software industry, and return to the glory days of the 1970's.

Whether or not a return to the past was a _good_ goal is a separate issue; there was nothing visionary about attempting to roll back the clock, so the people who see RMS as a "visionary" today are _badly_ misinformed. He is not a visionary leader, he was taking us back where we'd already _been_. If anything this makes him _less_ qualified to predict the future, because his most conspicuous trait is that he doesn't like change.

Unfortunately, this doesn't stop him from trying to predict the future, proclaiming dire warnings about it, and attacking others who have different ideas about what's ahead. Nor does it prevent hordes of impressed followers (who don't know anything about computer history and think this future/past he's describing is great) from hanging on his every word.

Before there was a proprietary software industry, 1939-1979

So why is "free software" older than proprietary software? The proprietary software industry didn't exist back in the 60's and 70's for two reasons:

First, until the early 80's there wasn't a critical mass of customers to sell proprietary software to.

The MIT AI lab where Richard Stallman learned to program was started with a computer called the TX-0, a unique system built as a prototype by Ken Olsen, the guy who went on to found Digital Equipment Corporation. (The book "Hackers" by Steven Levy covers this in a lot of detail, as does this marvelous interview with Ken Olsen.) That was followed by a DEC PDP-6 mainframe, which shipped a total of 23 machines in its entire production run. The lab was later upgraded to a PDP-10, of which only about 700 were ever made. The fact that MIT modified its computers to add extra instructions and wrote its own operating system from scratch really wasn't that unusual under the circumstances; programs written for early mainframes and minicomputers often wouldn't run on any other machine in the world even if they hadn't been modified by their users, simply because their component selection and configuration was unique.

For most of the 1970's, the best-selling computer in the world was the PDP-8, which was introduced in 1965 and sold a grand total of 50,000 units during its entire production run. The PDP-8 was displaced as bestseller by the Apple II, which was introduced in 1977 and over the next two years sold a combined total of 43,000 units. But that was enough to put it on top: in 1980, when Apple had its IPO, it had 50% market share of the computer industry (according to the PBS series "Triumph of the Nerds").

So if you tried to sell "shrinkwrapped" binary software back in the 1970's, the largest possible market you could hope for on any platform was _less_ than 50,000 copies, and even that's only if you targeted the most popular platform in the world and somehow managed to sell a copy for every single computer of that type ever produced.

The second reason the proprietary software industry didn't exist in the 60's and 70's was that copyright law didn't cover compiled binaries. Human readable source code was copyrightable (as an original work of human authorship), but binaries produced by a compiler or assembler were "just numbers", and the law didn't explicitly allow them to be copyrighted. So if you wanted to claim any copyright on your program, you were _required_ to ship source code. Shipping source code gave someone _more_ control over the result than just shipping a binary.

Thus publishing source code (with copyright notices) became official policy throughout the computer industry (DEC, AT&T, IBM)... They actually had fewer rights if they _didn't_ publish their source code. (When Unix came out in 1974, what was special about it wasn't that it came with source code; that was normal. What was special was that the source code was in a high level language that minimized the effort of porting it from machine to machine, instead of the assembly language all the other operating systems were written in.)

Software in the 1960s and 1970s was either produced by hardware manufacturers to help sell their hardware (who happily gave out the source code, generally written in assembly language), or else it was produced by the people who bought the computers for their own use (who had no incentive to withhold source code from _themselves_). Users sometimes hired consultants to write custom software for them (in which case the source code was one of the deliverables), but back then the only way to use a computer was to learn how to program it, so users wrote most software themselves or had someone on staff to do it for them. The vast majority of software in the 1970's and earlier was written by the users, who naturally shared the results through early user groups like DECUS and the CP/M Users Group Northwest.

So how did this _change_? Where did proprietary software come from, and when, and how? How did Richard Stallman's little utopia at the MIT AI lab crumble and force him out into the wilderness to try to rebuild it?

I should probably hold off until next time for that. This has gotten a bit long. :)

May 10, 2009

Ok, I know what to do with my tinycc fork now. :)

The obvious way to replace gcc is with qcc, the QEMU C Compiler. Take tinycc, break it up the rest of the way, and slot it into the qemu source code (in a qcc subdirectory) so it can build an "$ARCH-qcc" from each current "qemu-$ARCH", using tcg (the Tiny Code Generator) as a backend.

I'd held off on gluing tcg to tinycc because forking tcg wouldn't be fun (or productive). It's still under active development, any changes I need to make to it must go upstream or they'll become a maintenance nightmare. Pushing the compiler front-end upstream into qemu makes noticeably more sense, if they'll take it. (But they already _do_ all the hard parts...)

Right now qemu application emulation can parse ELF executables and shared libraries, so qcc's linker needs to use that code (and teach it to handle .o and .a files). So in addition to redoing the code generation backend, the linker needs significant work to merge it into qemu.

I split my fork's option parsing logic off into its own file a year ago, and I still need to do the swiss-army-executable thing on it and merge it with the ccwrap.c in FWL.

That's a lot of work. Trying to figure out what order to do stuff in...

Anyway, providing a decent replacement for gcc means the only GPLv3 piece of software that actually matters is Samba, and the kernel has built in smb client support so a simple server could probably be exported from that. (It might not support Windows 95, but I can't really say I care. Current samba isn't exactly embedded code right now, is it? Oh, and the dirty trick to ignore all the case insensitivity issues: export a vfat filesystem. :)

Not quite back to my normal levels of enthusiasm (or at least bloody minded, stubborn, vindictive problem solving), but any realistic chance to rip out one of the foundations of the FSF's power base is always invigorating.

May 9, 2009

Still apathetic about programming in general.

Linux on the desktop is dead. Ubuntu 8.04 doesn't work with Netflix or Hulu or Youtube's television options. 8.10 was a disaster. And now 9.04 doesn't work with Intel 3D graphics, not only the most common hardware but the kind that used to be recommended for use with Linux. Meanwhile, Apple's market share finally hit double digits. (Whether Windows 7 sucking less than Vista counts as damning with faint praise or is actually significant, I reserve judgement on.)

GPLv3 has fatally split the community. The last GPLv2 version of gcc is 4.2.1 which is increasingly obsolete (doesn't do armv7 and up, meaning no thumb 2), and there's still no replacement compiler. Neither PCC nor LLVM/Clang has turned into anything useful yet, and tinycc development is just painful to watch.

Actually, tinycc development is _hilarious_ to watch in a black comedy way. Grischka admitted that he didn't take most of my code because he didn't understand it, and now _three_ years later he's slowly reproducing some of the obvious refactoring I did literally back in 2006. At this rate it might only take him 3 more years to get to the point my fork was at when I abandoned it.

It's painful to go back to work on BusyBox both because of lingering Bruce and because I like the toybox infrastructure so much better. And there's a lot of cleanup work to do, starting with removing the forest of #ifdefs littering the code now.

But convincing people that conceptual bloat and what you can measure with bloat-o-meter are not the same thing is exhausting. (A wrapper script around mdev is a better design decision than trying to shoehorn firmware loading into something that wasn't meant to do that. It's not mdev's job to do every possible hotplug activity, and complicating the hell out of the config file syntax is not the answer. Also, removing the mmap config file parser in favor of much slower and more complicated code that allocates memory for each line of text in the config file (and then frees it again), and does this again and again for every single device during "mdev -s"... That was wrong. I don't care if it's shared with other applets, it's still wrong.)
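For concreteness, the wrapper-script approach looks something like this: a standalone hotplug helper that does the sysfs firmware-loading handshake, which the kernel (via /proc/sys/kernel/hotplug) or mdev can dispatch to without any new config file syntax. This is a minimal sketch, not busybox code: the kernel really does export $FIRMWARE and $DEVPATH to hotplug helpers, but the function name and the SYSFS/FW_DIR overrides are mine (the overrides exist so it can be exercised against a fake /sys tree).

```shell
#!/bin/sh
# load_firmware: minimal sysfs firmware loader, meant to live in a wrapper
# script rather than be built into mdev. The kernel's hotplug mechanism
# exports $FIRMWARE (the blob's filename) and $DEVPATH (the sysfs node).
# SYSFS and FW_DIR default to the real locations but can be overridden.
load_firmware() {
  SYSFS="${SYSFS:-/sys}"
  FW_DIR="${FW_DIR:-/lib/firmware}"
  DEV="$SYSFS$DEVPATH"

  if [ -f "$FW_DIR/$FIRMWARE" ]; then
    echo 1 > "$DEV/loading"                # tell the kernel a load is starting
    cat "$FW_DIR/$FIRMWARE" > "$DEV/data"  # feed it the blob
    echo 0 > "$DEV/loading"                # load complete
  else
    echo -1 > "$DEV/loading"               # abort: no such firmware file
  fi
}
```

The point being that this is the whole job: three writes to sysfs, no new mdev.conf syntax required.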

The reason for mdev in the first place was that I disagreed with the design of udev, and thought there should be an obvious and simple way to do it. I had to fight the kernel guys (Greg KH and Kay Sievers) who thought that sysfs was a private export and not a stable userspace API, so anybody trying to write their own code to do this rather than using udev was obviously crazy. More recently I had to fight the kernel guys (pretty much Peter Anvin) who are rewriting the kernel build system in perl. Leaving aside that the kernel build shouldn't need another large prerequisite package (an objection that would also rule out sane languages like Python or Lua), perl is a specific program, not a language. It has no standard, merely a single implementation (just like Microsoft Word or Microsoft Excel). Whatever that implementation does is Perl, and it's never successfully been reimplemented (despite the Perl 6 guys trying for over a decade now). That's not a language, that's a legacy hairball.

Sigh. I feel like I'm fighting the entire Linux community, and I'm tired of it. Yeah, you can duck into the embedded world and avoid most of that, but in a proper embedded system you don't care what it's running. Might as well be DOS. The current most interesting embedded device (the iPhone) has MacOS X.

When this laptop dies, maybe I'll buy a mac. (About half the kernel developers are already doing their kernel development on macs, from what I can tell...)

May 6, 2009

In the same way that toll booths tend to cause traffic jams at rush hour, the login screen at Chick-fil-A has been going "The service is initializing, please try again later" for fifteen minutes now. (And for years phone company engineers quietly said that tracking long distance usage for billing purposes was the single most expensive part of a phone call. There's a theme here...)

Been totally uninspired to work on computer stuff for a few days now. Taking a break.

Wandered through UT yesterday to see what would be involved in going back to classes. Tuition's now $4500/semester. Before books. It was something like $1200 back in the late 90's. That's the "edumacation" president for you.

Turns out Disney continues to be evil. (Who could have predicted that?) They were an early pioneer of unskippable commercials on DVDs, and now they're removing special features from rental DVDs. (Perhaps that explains why the Lilo and Stitch disk Fade and I got through Netflix last night didn't have any commentary?) As always, the correct response to this sort of thing is to just grab the full version off of BitTorrent.

If Disney hadn't been gradually turning evil and senile for years, this would probably be ignored. After all, another disk we watched last night (Master and Commander) has a special edition which Netflix doesn't carry. But this reads as part of a pattern; Disney is an evil money-grubbing entity trying to infinitely extend copyright terms and so on, but they drove creative people like Katzenberg away (the guy behind the run of movies from The Little Mermaid through The Lion King, and now the K in Dreamworks SKG), and wound up having to buy Pixar to have any new content anybody wants to see.

The Mouse has turned into The Godfather in his old age, but when his "offer you can't refuse" is (according to the interminable previews on the Lilo and Stitch disk) "The Jungle Book 2", "Return to Atlantis 2", live action "Inspector Gadget 2", a live action "Berenstain Bears" movie, "101 Dalmatians 2"... I think "out of ideas" happened back around the "Peter Pan 2" movie (which wasn't direct to video, but should have been). When I hear the word "disneyland" these days, the first thing that comes to mind for me is that Blagojevich and his hair went there last month to celebrate his indictment. Six Flags is way closer.

May 3, 2009

Much cleaning of the condo before Fade gets home.

Poked at building strace a bit in the evening. This is a weird program. Building natively under armv5l, it went:

In file included from syscall.c:129:
linux/arm/syscallent.h:435:3: error: #error fix me
linux/arm/syscallent.h:457:3: error: #error fix me

I'm guessing that strace 4.5.18 doesn't support the 2.6.29 kernel, at least not on an arm host. (Instead of failing by saying unknown syscall at runtime, it fails by refusing to build. That's nice. I note that 4.5.18 is the most recent version on and that page has no link to a mailing list...)

The source code's README says:

You can get the latest version of strace from its homepage at .

Which is 404 and according to has been since 2006.

I'm not sure "maintained" is a word that applies to this package.

Mark pointed out that the Gentoo guys have patches for this, and apparently have for some time. (Apparently there's been nobody upstream to send them _to_? Dunno.) Oh well, yay it's fixed, moving on...

May 2, 2009

The adjective to describe today has probably been "virtuous", largely by accident. Cleaned portions of the condo, picked Mark up from the airport, swam, biked, ate healthy (at Jimmy John's right now)...

Poking at a dropbear build script under the native root filesystem for a nightly cron job. Alas, dropbear went to autoconf for some inexplicable reason, so of course it sits there making stupid tests (very slowly):

checking for gcc... gcc

Yes, gcc is called gcc. If you were checking for "cc" (the standard name required by SUSv3; SUSv4 seems to require "c99" now), that might make some of the later tests less silly.

checking whether we are using the GNU C compiler... yes

Um, why do you care, and what was the first test for, then?

checking for C compiler default output file name... a.out

You can specify -o all the time so this never matters. Same with "suffix of executables" and "suffix of object files", but you can also just accept that there are exactly three interesting build platforms these days (Linux, MacOS X, Windows) and you're under no obligation to support all of 'em anyway.

checking whether the C compiler works... yes

During the build, you'll notice pretty quickly if it doesn't. You'd also get a better error message saying _why_.

checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes

A) This is required by the Single Unix Specification, B) there's a gcc that doesn't accept -g?

checking for gcc option to accept ISO C89... none needed

Again, your very first test was for gcc, then an explicit "are you really gcc and not another compiler pretending" (no idea why), and now it's doing this kind of crazy stuff. It just goes on and on like this. But that's not the funny part. Here's the funny part:

checking for deflate in -lz... no
configure: error: *** zlib missing - install first or check config.log ***

You have to explicitly say --disable-zlib. What exactly is the POINT of running ./configure if you have to explicitly tell it on the command line what your environment contains?

Of course when you do that it then goes on for a long time checking "is the strip command called strip" and so on, explicitly checking whether or not our C compiler has "memcpy"... and then at the end it tells us to edit options.h to select the options we want.


May 1, 2009

Saw Fade off at the airport for Penguicon. (I'm not going this year; the con chair and I don't get along, and it's probably better for the event if I stay out of his hair while he's wearing the con chair hat.) I feel bad I won't get to hang out with Garrett and Christian and Tracy and Heather and so on, but oh well. Maybe next year.

Found my credit union debit card (jacket coat pocket, of course; things always get lost there when it gets warm), which saved me a trip this morning.

Lunch with Stu and Beth at the Lakeline Mall. Lasted until around 4, much shooting of the breeze was perpetrated. Went to Fry's to pick up a TV input switch so we can hook up the Praystation, the Whee, and the Ex-box to the same TV. Looked at new TVs while there. Did not buy one.

Read "No more dead dogs", which was good. (I have a weakness for young adult books; they spend less time trying to be "original" and more trying to be _good_.) Looked around the shelves for something else to read, decided I wanted to read "A bad spell in Yurt", found book 5 in the series, book 4, and then another copy of book 5, but no 1-3. Collated bookshelves a lot. (Most of our Terry Pratchett books are now in one place; they fill 2 shelves and involve not just paperbacks but 4 different sizes of hardcover).

Trying the new "Big Bite" place on 24th street for dinner. Not really noteworthy except that it's theoretically open really late and has less intrusive music than Jimmy John's. (Half the time Jimmy Johns' music is _good_, but it's too loud to hold a conversation and too loud for my noise cancelling headphones to make a dent in. Not a good work environment. I suppose that's the point.)

Alas, Big Bite only has diet Pepsi, which is like a hangover remedy designed by Franz Kafka. (It has not yet been featured in a "Steve, Don't Eat It!" but I feel it should.) Alas, they don't have diet mountain dew (the only real excuse for the Pepsi product line). Jimmy Johns is open until 3am and has diet coke. The Wendy's in Jester is open until 4am and has Coke Zero. The McDonald's is open until either 10 or 11 (it varies based on ambient humidity) and has Diet Dr. Pepper and 2 apple pies for a dollar, so I wind up going there rather a lot.

This is a truly uninspired burger. I'm not sure how they managed to get something this bland out of what seem like perfectly reasonable ingredients, the texture of which suggests they were competently prepared. Hmmm... Did they put _any_ salt on it? Nope, doesn't seem like they did. The mayonnaise undermines the result a bit (maybe I made a mistake asking 'em to hold the pickle, the mayo has nothing sharp to mute). And I like big fluffy soft bread but this bun doesn't go with this burger. Too much squishy and tasteless tomato (obviously picked rock hard green and allowed to ripen in a warehouse). The cheese is either mozzarella or one of the blandest, whitest, most tasteless american cheeses I've encountered, which is saying something...

I'm not entirely surprised. A place that advertises 30 different types of hamburger, pizzas, gyros, grilled chicken sandwiches, cheesesteaks, wings, wraps, subs including eggplant or shrimp parmesan... It _might_ just be overextending itself and not do any of that particularly well. But it was worth a shot. They'll probably pare down the menu in a few months.

And I must admit, the free internet makes up for a lot.

Ok, I know I'm weird, but I'd just _assumed_ the Penguicon guys had a video link set up so Wil could address opening ceremonies (and maybe even voip-phone in a panel or two) from home if he couldn't physically make it to the con again. USB web cameras are what, $15 these days? After a Red-sox-like streak had established itself, it just seemed like common sense to me. (Although I admit my money was on "meteor strike" this time. And no, the third time isn't "hindsight".)

Rather than turn this into armchair quarterbacking, here's a Plan For Getting Wil To A Future Penguicon. (I don't know who future Penguicon runners might want to invite, but in case they someday decide to try again with Wil, and he's up for the abuse, here's how I'd go about it.)

None of this would have actually prevented him from getting sick, of course. But it would allow the whole process to fail in the most entertaining possible manner. :)

(Actually, it occurs to me that the logical way to get Wil to a future Penguicon is don't announce it. Do it in secret. Fly him in quietly, sneak him into the hotel, and then have ninjas carry him into opening ceremonies (bound and gagged, if possible, and untie him on stage). Schedule either "Robert Wesley" or "Stephen E. Whitfield" (various pen names used in Star Trek) on Wil's panels and swap him in for those... Somebody on twitter suggested that we should declare he's been retconned into previous Penguicons and that this is actually his Xth appearance at the con...)

April 30, 2009

Eventually decided that building under emulation was the way to go after all, even for three small packages. For one thing, I'm not trying to get strace to cross compile. For another, it's nice to have a smoke test of the build environment to catch things like the fact that the current is leaving prefixes on the _native_ toolchain. (Sigh.)

April 29, 2009

Nothing to say today. Not dead. Fed Mark's cats. Did I do anything of interest? (Thinks...)

Oh yeah, Dealt with the mortgage guy. Had to create a handwritten letter confirming I'd transferred money from savings to checking within the same account, which is at the same bank I applied to for a mortgage, then had to sign and fax said letter. It was an emergency, which will probably delay the closing date, which would cause the sellers to walk since it's already been delayed twice and the end of the month was a hard deadline for them. I told the mortgage guy, who then went on a rant about how he was doing me a favor wasting his time on this small loan that they'll make no money on.

I think when I move out of this place, wherever we go next I'm going to try to arrange to pay cash. May take a while, but this whole mortgage thing just isn't worth it.

April 28, 2009

Grumpy today. Didn't get enough sleep.

Ah Subway, home of the $5 footlong and the $4.86 6 inch sub. The message here isn't that their footlongs are a good deal, it's that everything else on their menu is very inefficiently priced.

I miss the days when menus actually listed everything they sold and how much it cost. Apparently this is too "texty" for a modern audience. I thought McDonalds was bad ("Does that ice cream machine behind you still work? So where's the ice cream on the menu?" They _rotated_ one of the signs until ice cream showed up. Oh yeah, I should have spotted that back inside the wall somewhere, obviously.) But the McDonalds menu is totally coherent compared to Subway, which doesn't mention the existence of roast beef sandwiches, nor the 6 inch option. You have to remember them from years ago when the menu did include them. (Don't get me started on the "Fresh Fit" sub the kids meal contains but which isn't mentioned anywhere else on the menu, nor the menu's implication that Jared is the founder of Subway.) And you wonder why I haven't been here in most of a year?

Here now because it's next to the H&R block location that's still open after tax day, although that doesn't open for another 45 minutes. More paperwork for my sister's house, now they want a K1 form for Impact Linux. (Why? It's a shell company with no assets, it passes consulting money to Mark and myself and lets us buy health insurance in a different way. According to the K1 I finally got printed out, it had 81 dollars in it at the end of last year. Half of which was mine, apparently.)

And they want everything handwritten and faxed. (Because obviously you can't fake a fax in photoshop, save it as a bmp, and send it out via a fax modem. Nobody has a fax modem anymore, and the software you'd need to drive it runs under DOS which is a pain to set up the emulator for.) I keep having to fax things from Kinko's. I've spent at least $200 on faxes so far during this process.

So I want to build static busybox, dropbear, and strace binaries for each target. This is easy to do; I can think of at least three different ways off the top of my head. It's so easy I can sit down and write a script to do this with no references or research. Which means it's one of those class of problems where figuring out the _right_ way to do it makes my brain hurt and I get writer's block for days.

I can just fire up the native build environment, wget the source, untar it, cd into the new directory, make defconfig, make, and copy the binary back out. (Building dropbear first makes copying the binary back out noticeably easier.)

In practice, the config I want to build isn't _quite_ an actual defconfig, I switch a lot of stuff off that doesn't build on various targets (m68k hasn't got taskset) or without strange host support (selinux is evil). So I need to grab that trimconfig file out of FWL, but I can wget it straight from that URL if I like. (Put "tip" in for the version number and it auto-updates.) So I don't quite need to grab the full FWL infrastructure, but it's not entirely independent either.
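Spelled out, the simple version is maybe ten lines of shell. (Illustrative sketch: the version number, download URL, and the sed that flips on CONFIG_STATIC are my guesses at the details, and the trimconfig step is left out for brevity.)

```shell
#!/bin/sh
# build_static_busybox VERSION: the "wget, untar, defconfig, make, copy the
# binary back out" sequence, meant to run inside the native (emulated) build
# environment. The subshell keeps the cd from leaking into the caller.
build_static_busybox() {
  VERSION="$1"
  (
    wget "http://busybox.net/downloads/busybox-$VERSION.tar.bz2" &&
    tar xjf "busybox-$VERSION.tar.bz2" &&
    cd "busybox-$VERSION" &&
    make defconfig &&
    # Flip on static linking in the generated config (guessed mechanism):
    sed -i 's/# CONFIG_STATIC is not set/CONFIG_STATIC=y/' .config &&
    make &&
    cp busybox ../busybox-static
  )
}
```

Which is exactly why this is one of those problems: the dumb version is trivially easy, so every complication has to justify itself against that.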

Next, grabbing the dropbear source again every night is a bit impolite to the download servers, because it's not going to change that regularly. Despite my desire for it to grow stunnel support, the project is mostly complete and in maintenance mode. The FWL infrastructure has caching already; the download and setupfor functions handle this transparently. So if I _do_ grab the FWL infrastructure, I get this pretty much for free.

I can also cross compile both of these packages, which is quick and easy and ignores the whole point of the project (that building natively under emulation is the way to go, and cross compiling is a transitional stage you get through as quickly as possible). But in each case, I've got a package that's already set up to cross compile and which produces a single static binary as its output, and the rest of the nightly cron job is already cross compiling, so I've already _got_ a cross compiling build environment set up. It's even got the FWL stuff sitting there ready to use, whereas if I jump into the emulator I have to re-download FWL and set it back up if I want to use it.

So there's a simple way to do this which has obvious flaws, and a more complicated way to do this that probably isn't worth the complication, and this is the kind of thing that makes my brain hurt.

There are some actual fiddly bits. The script deletes everything out of packages that it doesn't know about. (This is to automatically zap old versions of the packages after you upgrade. It's actually a bit of a pain, because it means that things like qemu (which are speculatively downloaded) or alt-packages (which are sometimes downloaded and sometimes not) get deleted and then have to be re-downloaded, or else are downloaded and kept around when they don't need to be.)

I don't want to add dropbear to to unconditionally download it every time, and I'd like to move strace out too. Have some kind of call stuff. But then I have to use a different download directory (not "packages") or the extra files will get wiped every time the original script runs. And the problem with _that_ is that the original is already downloading busybox, and having the second directory download a second copy of the busybox tarball is kind of icky.

Did I mention that already builds busybox, and could easily be taught that BUILD_STATIC=1 applies to busybox? Except that it would build a lot of _other_ stuff (like uClibc), and would thus be an inefficient way of building that one binary (unless I wanted to have all the root filesystems it built be static), which I don't. So this is either going to be duplicated or is going to have too much fiddly configuration shoehorned into it, and both suck. Sigh...

Again, it's the _little_ things that screw me up. The ones that don't really _matter_ in the larger scheme of things, but are nevertheless _ugly_, that I go round and round on...

Don't mind me, I'm having an aesthetic reaction to a simple problem. FETCH THE CHAINSAW!

April 27, 2009

In a lot of ways programming is an art instead of a science. Debugging, especially. "I don't know what it's doing wrong" is pretty much a requirement for a bug: very few programmers intentionally write the code wrong, so it's misbehaving either because what you wrote didn't match what you had in your head or because what you had in your head didn't take something into account.

An awful lot of reworking is because "no, this isn't right" without being tied to a specific bug. You redesign some bit to make it better (often simpler and relying on fewer assumptions now that you understand which parts are actually important), and as a result bugs you haven't actually hit yet go away. And then when somebody hits a bug in an older version and asks you which patch fixed it, you can't really answer because you never saw that bug. And sometimes "which patch fixed it" was a redesign, where the memory leak went away when you switched to mmap(), or the bug was in the translation module and now you've changed the other parts of the program to all talk the same language so there isn't a translation module anymore...

I bring this up because software suspend in Ubuntu 8.04 semi-regularly corrupts the vfs cache if the filesystem is active when it suspends (such as running a build that'll take another 15 minutes to complete and I don't want to wait that long just now, but I don't want to abort the build either). And I know the kernel in 8.10 didn't do that once in all the months I was using it (which is why I forgot about the problem), but doing the Ubuntu upgrades when I installed this laptop (at which point the release was almost a year old) didn't fix the bug; the old kernel is still susceptible to it. Which patch along the way to the new kernel fixed it? Probably not that simple...

I wonder what's involved in getting a Gentoo desktop with kmail on an xfce base and a working Flash plugin. Highly _unlikely_ to be particularly simple, but it's something I could much more easily tweak myself once I got it that far... (This is probably a bit like saying "It's hard to get a tank platoon 200 miles behind enemy lines, but once you do things get much easier".)

Wait until Mark gets back to town. Definitely.

So I'm fiddling with upgrading the system images to gigabit ethernet (e1000). I just did i686 and it's slightly faster, but not much. Maybe 3.5 megabytes per second. When I do a "dd if=/dev/zero of=blah bs=1M count=100" and time it, it seems to be getting maybe 11 megabytes/second. Hmmm, when I do a wget of a 120 megabyte file and add "-O /dev/null" it also gets about 11 megabytes/second. Right, there's some sort of emulated PCI bus contention going on. Whee. Still, sending stuff to/from distcc might go faster now, which would be good.

April 26, 2009

Finally wrestled canadian cross building of static toolchains into submission. Lots of darn fiddly little things to get right. (One of the fiddlier, strangely enough, was making sure that the prefixes of everything wound up in the right places. For example, you can't get uClibc to produce an "i686-ld" or "i686-readelf", you have to rename 'em after the fact. And a "cc -> gcc" symlink actually has four possible states, and "i686-cc -> gcc" isn't the one you want.) Some darn _weird_ errors result if you just try to run and debug this; visually inspecting the bin directory of the cross compiler is much easier. As opposed to debugging gcc, where visual inspection tells you nothing and the useful thing to do is find a failing command line and run it with -v and under strace to see what it's actually _doing_.
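The rename-after-the-fact step looks roughly like this. (Sketch only: the function name, the directory argument, and the two-tool list are illustrative; a real fixup has to cover every unprefixed tool the build leaves behind.)

```shell
#!/bin/sh
# fixup_prefix DIR ARCH: rename tools the build left unprefixed (e.g. "ld")
# to "$ARCH-ld", then create the one cc symlink state that actually works:
# "$ARCH-cc -> $ARCH-gcc". Of the four combinations of {cc, $ARCH-cc}
# pointing at {gcc, $ARCH-gcc}, the fully prefixed pair is the only one a
# caller outside this bin directory can rely on.
fixup_prefix() {
  DIR="$1"
  ARCH="$2"
  for tool in ld readelf; do
    if [ -e "$DIR/$tool" ]; then
      mv "$DIR/$tool" "$DIR/$ARCH-$tool"
    fi
  done
  ln -sf "$ARCH-gcc" "$DIR/$ARCH-cc"
}
```

Hence "visually inspect the bin directory": the failure mode is a missing prefix or a symlink pointing at the wrong one of those four states, and that's visible with plain ls -l.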

So now Mark can set up the cron job and build nightly binaries, except that I have to get uClibc's nightly snapshots working again. (Every time I ignore them for a month, when I come back they don't build anymore until I've tweaked the config of a half-dozen platforms and come up with fresh patches. Might be different this time, but it seems unlikely.)

Oh I'd forgotten about that bug. Occasionally software suspend on 8.04 will catch some weird race condition having to do with a dentry that's active during the suspend, and will wind up turning every access to that dentry into a segfault. (Sometimes it does one of those non-fatal panics and tells you it's a null pointer dereference in the vfs, sometimes it just kills your process with segfault and no explanation.) I remember the first time I tried to explain source control to Fade, and show her mercurial, Python spontaneously developed this affliction to screw up my demo. Now kaffeine's got it, so I can't catch up on my podcasts until I reboot. Sigh.

Hundreds of open browser tabs, unlikely to get 'em closed this week. Rebooting with the network off means they'll all be 404 and I won't be able to read cached data offline until I reload it. Rebooting with the net on will reload them all, meaning Konqueror will eat 100% of the CPU because tabs that aren't selected still run their javascript (which is a bad thing, but seems to be intentional).

And the dying thing is /usr/lib/ which is taking out amarok too, so I can't listen to mp3s either. Let's see, if I use Konqueror to navigate to an mp3 file and click on it... It tries to launch amarok. Idiot. It has built-in mp3 playing capability, and the flash plugin can also play mp3, but no, it wants to launch an external player. Brilliant.

Oh well, at least the stupid "Are we human, or are we dancers" song I was trying to drown out has gone off the radio now. (Because no human can dance you moronic whiny git.) Now it's gone to some other song by the same guy who did the original song Weird Al's "you're pitiful" is based on. It's a whiny falsetto theme on this station. The noise cancelling headphones, they do nothing.

The radio has now moved on to "Every time you go away, you take a piece of me with you". The frightening part is that this is an enormous step _up_. (At the very least, I'm no longer contemplating surreptitious ways to damage the PA system here. And no, it's not the theme from the "Saw" movies. Don't think it is, anyway; despite the title I never saw 'em.)

I should look at Xubuntu now that 9.04 is out. Not looking forward to that. Should get my todo list as tamped down as possible first.

Still procrastinating about updating my resume.

April 25, 2009

Ooh, a Let's Play that Yahtzee did back before Zero Punctuation.

Watched the first disk of "Avatar, the Last Airbender" today. I previously thought it was an anime, but it's apparently a Nickelodeon production in a vaguely Japanese style (up to and including Japanese characters in the intro credits; I have no idea why). The practical effect of this is I can't turn the horrible English voices off and go to subtitles with Japanese voices that at least convey reasonable emotions whether or not I can understand a word. Nope, the English voice acting _is_ the original. I can only hope they improve as they get used to the characters...

I spent the first episode hoping the brother would die. The second episode _almost_ involved whatserface and the avatar guy leaving the rest of the cast behind (including the idiot brother) to go off on their own, but no. End of the disk, he's still there, which is sad. (I liked the voice actor when he was voicing Ron Stoppable, who was not a misogynistic faux authority figure constantly blaming other people for his own screw-ups.) Still, Babylon 5, Star Trek TNG, and Star Trek the Motionless Picture (where "Nomad" has gone before) all started horribly and got better. The water bender girl's and Aang's are both nice. And the bad guy's mentor is great.

Also watched the rest of the first Torchwood disk (episode 2, Death By Sex), and the other commentary for Gosford Park (which is still way too complex for me to follow even half of it, being based on cultural details of a society farther from mine than the Yanomamo). And now I'm watching Bleach disk 3 (still episodes I've already seen back before Hulu stamped watermarks all over the screen), which remains inexplicably Japanese in places, stopping for speeches about nature imagery in the middle of fight scenes.

April 24, 2009

A year or so back, while unpacking the boxes we'd kept in storage back in Pittsburgh, I found an old wristwatch that was still running. The plastic on one of the front buttons was missing, and when I held down the button to light up the face, or when it tried to beep, the display went blank because the battery just wasn't up to either activity. But it kept good time, and it wasn't nearly as uncomfortable to wear as the last couple of cheap Chinese watches I'd picked up at Wal-Mart (one of which gave me a more or less sprained wrist after a week, and the other gave me a rash). My last good watch before that had gotten smashed against a table.

Last night, I noticed that my watch had fallen 15 minutes behind, when it usually kept good time. Today, there are no numbers on the face at all, it's totally dead. I feel I should have some kind of memorial service, for a watch that came out of retirement to give me one more year.

It was a good watch. I shall miss it.

Broke down and implemented a proper little setsid program in C, and made the build compile and call it. In theory I could have added this to toybox, but the build doesn't _depend_ on toybox being there. I could also have upgraded the busybox version to do this, and still might, but again the build shouldn't depend on the host setsid being busybox. And the host setsid on Ubuntu forks a background process and returns immediately, which is useless for my purposes.
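For reference, a quick shell experiment shows what setsid actually buys you: the child ends up in a brand new session, with a session ID different from the caller's. (This assumes procps ps and a util-linux-style setsid on the host, both standard on Ubuntu.)

```shell
# Compare our session id against a child's session id under setsid.
# (Assumes procps "ps -o sid=" and util-linux "setsid" are available.)
parent_sid=$(ps -o sid= -p $$ | tr -d ' ')
child_sid=$(setsid sh -c 'ps -o sid= -p "$$"' | tr -d ' ')
echo "parent session: $parent_sid, child session: $child_sid"
```

The new session implies a new process group, which is why a "kill 0" inside the child can't reach back out to the caller.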

So yeah, spent a couple hours yesterday fighting with the shell in an attempt to avoid writing 9 lines of C code. (Because I shouldn't have NEEDED to, darn it.)

*blink* *blink* My watch appears to have had an Inigo Montoya moment. It's back from the dead and keeping time again. It was going all disco for a bit there flashing and being generally nuts, but it settled down after about 5 minutes and now it's working just fine. I have no idea what's up. Perhaps it got water in it? Perhaps the battery is doing some strange chemical thing I'm unaware of? It's entirely possible it's gone undead on me, and/or is possessed, but as long as it's functioning properly I have no immediate reason to complain about this...

Weird. Welcome back, wristwatch. (Had to reset it. It was off by about 2 hours, but other than that...)

So the work our big client asked us to do this week does not, in fact, have a budget attached, and thus is not billable. (I guess that means it's not work product then?) Nothing billable for us to do next week, either. Meaning Mark and I probably need to update our resumes so we can look for work that _is_ billable. (I haven't objected to going down to half time due to a lack of enough assigned work, and I'd personally happily take a few months off except for the whole "buying my sister a house" thing, which makes this bad timing. Oh well. Closing should be at the end of the month, this doesn't affect that.)

Alas, the original problem with running a business comes back to haunt me: I have no trouble doing the technical work, but the sales work of finding clients isn't something either Mark or I are good at.

April 23, 2009

Ok, I haven't seriously poked at Firmware Linux in a week. Time to pick up the shovel again.

So where I left off is I rewrote to take advantage of the canadian cross stuff, and everything broke. The actual canadian cross part broke, but more fundamentally the old buildall functionality broke. One reason is that when dies, it tends to take down the script that ran it because it does a "kill 0" which takes out the current process group. Hence the first to fail stops from trying any of the later ones, and right now hw-uml is failing because user mode linux is still horked, as usual.

The way works around this is by running it via setsid (the "set session id" command explicitly creates a new process group), but needing to do that is icky. It would be nice if could do that internally somehow, I just need to figure out how.

I can't put's logic in a shell function and call that via setsid because setsid isn't a shell internal command, and a regular command can't do anything useful with the name of a shell function on its command line because it can't call _back_ into the shell that just forked and execed it as a child process. (I hit that one _all_the_time_.) I could do something incredibly ugly with "here" documents but the cure's worse than the disease there, especially with what it does to syntax highlighting, which admittedly has never worked in Ubuntu because Ubuntu hates vi for some reason and goes out of its way to break it; another reason to switch to Gentoo or something someday.
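For what it's worth, bash (though not POSIX sh) does have one escape hatch here: "export -f" marks a function so a re-exec'd child bash inherits it, which lets an external wrapper like setsid indirectly run a shell function. A minimal sketch, with a made-up function name:

```shell
# bash-only trick: export a shell function, then have setsid exec a new
# bash that inherits and runs it. "chirp" is just an example name.
out=$(bash -c '
  chirp() { echo "hello from session $(ps -o sid= -p $$)"; }
  export -f chirp
  setsid bash -c chirp
')
echo "$out"
```

It's a bash-ism, and it means the function runs in a fresh shell with none of the caller's unexported state, so it doesn't really solve the "call _back_ into the shell that forked you" problem; it just smuggles a copy of the function across the exec.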

So my next attempt is to add this to the start of

  exec setsid "$0" "$@"

The reason that's unacceptable is that now ctrl-c does nothing when is running. I took the exec out and added "exit $?" on the next line, but that made no difference.

The _odd_ thing is that when I call the sucker from via setsid, _that_ handles ctrl-c just fine. So what's the difference here? I tried 'setsid /bin/bash -i -- "$0" "$@"' and that confused the _heck_ out of the tty. (Ctrl-c gave me a shell prompt back, but didn't kill the running script.)

Tried calling setsid through the built-in "time" function, since that was the only difference I could see from what was doing vs this thing calling itself. No difference...

Ok, so I commented out the additions to and put the setsid back into and now _that_ isn't letting ctrl-c work anymore either. Which is strange because I'm pretty sure I tested that when it went in and it was providing the correct behavior. (I.E. kill and its children, but not the calling shell.)

I want "kill this process and all child processes, recursively". There should be a way to do that. (The gnu implementation of ps has --ppid, but the busybox ps is kind of a joke.)

Grrr. I need a way to call tcsetpgrp from the command line. Why isn't bash doing this when I call it from setsid? Because its process already is a session leader? Except that 'echo | setsid xargs -- /bin/bash blah' isn't interruptible either, and there bash is forced to be a child process of what setsid calls.

I'm using 'kill 0' because neither 'kill %1' nor 'kill `jobs -p`' work from signal context. It can't access environment variables either, so I can't save the PID and have it call that. I suppose I can reset the signal handler with an explicit list of PIDs to kill after every job I launch (make a function for it)... Except that _that_ doesn't work because killing the child process isn't recursive, it doesn't kill the sub-processes like "make" it was running (even though it wasn't a _background_ process for that shell, so you'd think the shell would do it, but no it depends on the terminal control group to propagate the signal).

Tried killing the pid of the background process as a negative number (which I believe should kill its process group), but it claims it isn't a process group. Tried running the background process through setsid to force it to _be_ a process group, but that didn't work either. (Hang on, does busybox know it's ok to do this? Does the bash builtin kill know? Trying $(which kill) and... no help. It's parsing the negative number as a signal number, so try "kill -11 -$(jobs -p)" and... it says no such pid, but with the negative number this time. Huh, I thought they'd added that support to the kernel years ago. (Try without the setsid on the child process... Nope.))
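For the record, the usual way to stop kill from parsing the negative number as a signal name is a "--" separator (or putting an explicit signal first, like -TERM). A sketch, assuming a util-linux-style setsid so the child really is a group leader:

```shell
# Kill an entire process group: setsid makes the child a session (and
# therefore process group) leader, so its pid doubles as the pgid, and
# "--" keeps kill from reading "-pgid" as a signal name.
setsid sh -c 'sleep 30 & sleep 30 & wait' &
pgid=$!
sleep 1               # give the children a moment to spawn
kill -- -"$pgid"      # signals the leader and both sleeps at once
```

Whether this works from a given shell also depends on the kill being used (bash builtin vs busybox vs /bin/kill), which is presumably part of what was going wrong above.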

This _really_ should not be this hard. I really, really, really don't want to have to write C code for this.

Ok, spent another ten minutes installing an atexit handler in the outer instance of the script, sending a signal 11 to the child session running under setsid, catching that particular signal in there, and calling 'kill 0' if it gets it. Set up a little test where the child just waited 5 seconds, like so:

  echo Need to recurse
  setsid "$0" "$@" &
  trap "echo dying now, kill $(jobs -p); kill -11 $(jobs -p)" EXIT
  wait4background 0
  exit 0
  # If we exit before removing this handler, kill everything in the current
  # process group, which should take out backgrounded kernel make no matter
  # how many child processes it's spawned.
  trap "echo die lots now; kill 0 -QUIT; exit" 11

  echo do not recurse
  sleep 5
  echo done
  exit 0

The above _WAITS_FOR_THE_SLEEP_TO_END_ before delivering the signal 11.

*BOGGLE* It sends the signal. And the shell doesn't process the signal until its child process exits. WHAT THE...?

Sigh. The reason this doesn't work:

  echo recurse now
  setsid /bin/bash -- "$0" "$@" &
  trap "kill $(jobs -p) 2>/dev/null" EXIT
  exit $?

Is that it says "no job control" when it hits fg. Presumably because the shell isn't interactive. I can replace the last two lines of the if block with "wait4background 0" and "exit 0", but then I lose the error return code of the background process.


(Well, actually as development sessions go for me, this is pretty tame. Still frustrating, though.)

April 22, 2009

Sigh. Domestic disputes, gotta love 'em. (You can stop reading now, nobody but me will care about this entry.)

My condo complex has designated parking spaces, and my neighbor's been parking over the line of their space for a while now. Annoying, but not a big deal. Today my neighbor came to yell at me because my car, parked entirely inside the lines of my space, was preventing them from pulling out.

I say "yell" because they did, and because they brought a friend with them, and claimed I'd had them towed twice (which was news to me; I asked Mark and he said he didn't tow anybody when he was here either).

I have towed cars that parked completely _in_ my space, but not for being over the line and not since last September. (It's an autumn ritual around here, as each new year starts at UT and the renters figure out which spaces are reserved and which aren't.) If somebody tows you without authorization, what you do is contact the towing company (only one services this garage, number's on the sign over the door), then you get the paperwork related to the tow which identifies who did it and has their signature. Then you sue them.

For once, the cell phone camera came in handy! (Ok, I had to borrow Fade's phone because my battery was low, and then I had to spend half an hour fiddling with bluetooth to figure out how to get the pictures off. But that's par for the course.) Here's the space they leave on the left, and the space they leave on my side. (This is after I moved my car to the right edge of my space, otherwise you couldn't see between 'em very well. I tried a close up, but the lighting makes it really hard to see. If you're looking for the line it's under the tires of the car on the left.) This was actually fairly mild "over the lines", he's been over by lots more, it's just this time he decided to demand that I move my car to give him more space. (Yeah, I know...)

The hilarious part was that they say they've been complaining about me to "Mrs. Thompson" for a while now, who I assume is the person they rent from. (I own my condo, they rent.) I took the pictures so I could send 'em to the condo association if it comes up again. (Which is McNeil property management, still no "Mrs. Thompson").

Fairly unmemorable day, otherwise. Baked a cake last night to use up the rest of the whipped cream. (Making whipped cream from actual cream is a good thing. Strawberries and cream turn out to be _much_ nicer than plain strawberries, which is why we ran out of strawberries.)

The internet continues to be entertaining. I hadn't previously seen someone moonwalk en pointe, for example.


[In the last 12 months] the combined earnings of the 500 dropped 85 percent, from more than $645.2 billion to $98.9 billion. Think about that profit number again: One company, AIG, lost more than the combined profits of the other 499 companies in the Fortune 500.

April 21, 2009

Spent the weekend highly uninspired to do anything. Spent Monday too tired to do much of anything; had at least 3 multi-hour naps, which means I slept through most of the day.

Back to vaguely uninspired today, which is an improvement. This is probably some kind of cold.

April 18, 2009

I'm submitting patches to busybox again. Why am I submitting patches to busybox again?

I suppose once somebody did a dance mix cover of "The Time Warp" it was inevitable that a music video made entirely out of clips from Dr. Who would be set to it and put on Youtube. This is the internet after all.

That's not even a particularly remarkable one. The weird science one makes better use of the Dr. Who material and matching it to the music, as do this one, and this one. There's often significant variation on the same songs, with straight takes and ironic takes. And some were just extended versions of what wound up in the show, although you sometimes realize how carefully cut the original song was for broadcast TV. (Yes, the part of the song that wound up on TV was explicitly advocating murder, but didn't mention _sex_.) Bumped into some songs I'd totally forgotten, and I've even found some new music doing this.

Of course Dr. Who isn't the only show with music videos. (Although Harley/Ivy is almost canon, as is Bonnie Tyler/Hiro Nakamura, really. Although innocently looking for that brings up unwanted Luke Ski stuff, a bit like Penguicon really. Yes, I've had more than one panel scheduled next door to his "concerts", and after six years his material gets old. Nice guy in person.)

Saw "Monsters vs Aliens" today, which was entertaining but made me appreciate Coraline more. Coraline was _elegant_ in its use of 3D, when the tunnel expanded into the distance it was _supposed_ to be otherworldly, when the hands of the "piano that plays you" stepped out they _were_ intruding on coraline's space and being meanacing in a foreshadowing way. The needle coming out at the start and poking the audience in the eyes was supposed to be unsettling, as was the rest of that scene. But it would all work in 2D as well.

Today's movie wielded its 3D like a blunt instrument, and a lot of its tricks would just be silly in 2D. Making a paddle ball fill the screen near the start of the movie was just pointless showboating. Monsters spent a lot of time using 3D very intentionally. Coraline simply _was_ 3D, and made it look effortless and natural.

Some of it's technical stuff. Several times Monsters had a bright 3D object on a dark background, and a halo filtered through the polarized lenses. I never noticed that _once_ with Coraline. (I'd also heard that one of the failure modes of 3D is that rapidly shifting depth of focus for an hour gives you eyestrain. Never came up in Coraline, happened a lot in Monsters.) A movie shouldn't make me _notice_ the technical tricks it's doing, even when it's doing them reasonably well, because that defeats the purpose. Use the tricks to tell a story, not to show off the tricks.

I remember one of Walt Disney's famous quotes was "Change the trick". This came up in an old making-of documentary about Who Framed Roger Rabbit, where they showed around 10 seconds of footage of a human actor riding in a cartoon taxi (chase scene): in one shot the human was on a crane with the buggy painted over it, in the next he was on a miniature vehicle with the buggy painted over it, and in the third both the buggy and the human were animated (but they were going down a dark alley so it was harder to notice the human wasn't the actor; and they still used his voice). All in the space of ten seconds. The point was so that you _couldn't_ figure out what they were doing, because they kept doing different things. (See a man in a rubber suit, look again and it's now a puppet, look again and it's audioanimatronic...)

CGI eliminates a lot of this from a technical perspective, but not the need for it from a storytelling perspective.

By the way, either Ginormica's grandfathers are The Doctor and Death, or the name "Susan" attracts superpowers. Somebody should ask The Invisible Woman about this.

April 17, 2009

Cable modem's out again. Time Warner. Yeah.

Wandered over to Mark's place, fed his cats, rebooted the server. Apparently there was a power failure at Mark's place long enough for the UPS to drain, and when it came back it hung trying to get ipv6 information from a dhcp server that wasn't giving it. It hung for at least 12 hours waiting for this, but continued immediately when I unplugged the darn router (toggling the cat 5 connection status). (Note: we're not using ipv6, this is just Gentoo being weird.)

Then Fade and I headed to Epoch to use their internet, and spent the rest of the evening there.

Poked at the canadian cross toolchain (translation: hit it with a rock) and made it work. Now I'm ripping a new one to add static cross compiler building to that.

April 16, 2009

So the problem with the canadian cross toolchains is that the pathing is horked. AGAIN. Yet another thing for the wrapper to EXPLICITLY WHACK WITH A HAMMER. Except that I probably need to patch the source to _not_ let it look at the wrong areas _first_ before falling back to $PATH. (And when it's _not_ canadian crossed, the path logic works DIFFERENTLY. Or maybe it's some targets.) Other than that, the sucker seems to work. Or at least it builds a hello world that runs under qemu.

(Ok, technically it doesn't run under qemu-mips, but that's because the current mips application emulation hangs attempting to run a hello world statically linked against uClibc. That's not the toolchain's fault, and it runs just fine under qemu system emulation.)

April 15, 2009

I'd hoped to drop off Mark at the airport _before_ rush hour started, but it starts at 6am and his plane didn't leave until almost 8, so it was either deal with rush hour traffic or get him to the airport 3 hours early. Got home and slept for about two hours before Fade got up, then the phone rang with some guy who was unclear about whether or not he wanted to offer me a job, who said he'd sent me an email (and was calling to tell me he'd sent an email), but no actual email ever arrived. Email about existing work did though, so I dealt with that before going back to bed. It was noonish by that point so I didn't really sleep well, and wound up getting up earlier than I really wanted and being vaguely unproductive until after dinner. (I can work on a day schedule, or on a night schedule, but some variant of consistency does help. My internal clock flashing 12:00 leaves me a bit zombie-ish to focus on technical work. You can bury the problem with enough caffeine, but that's not a long-term solution.)

I'm so glad I got my taxes done last month. Today would really suck otherwise.

I need to re-rip Billy Joel's greatest hits, track down my Genesis CD with "Land of Confusion" on it, and a bunch of other stuff.

"If and when the president does make his Afghanistan announcement, you can be assured that we here at The Rachel Maddow Show will cover it with an almost ridiculous, over the top, obsessive interest in the details. I'm sorry, I can't stop myself." - Rachel Maddow, 3/26/09.

I want to have Rachel Maddow's baby. I'm aware that there are a number of reasons this isn't likely: I'm male, I've never actually met her, I'm married to somebody else, Rachel's gay and in a stable relationship with a woman in the northeast somewhere... Little things like that...

The canadian cross stuff seems to be mostly building now. Dunno if the result works yet, and there's still a bunch of polishing stuff to do. (The strip pass at the end is going to be fun to untangle...)

April 14, 2009

Went in to the mortgage guy's office down off of Bee Caves to sign lots of paperwork for my sister's house. Hopefully, that's the last of it until closing. (Between paying taxes, funding IRAs, and the down payment and closing costs for my sister's house, I seem to have used up most of my ready cash, _and_ acquired a new $615/month bill. Oh well.)

Fade sounded distinctly forlorn when I called home around dinnertime, so I felt compelled to go make her less forlorn. (This involved sandwich, fruit, three different types of caffeine, and a cookie. She seems less forlorn now.)

More fiddling with transcoding software, banging on qemu summaries, and back to working on the canadian cross stuff too.

Gotta take Mark to the airport at 4am, so he can go visit Eric and Cathy for two weeks.

April 13, 2009

Went to visit the investment guy this morning to actually do something with the Roth IRAs Fade and I set up last week. The S&P 500 fund Wells Fargo has eats 5.75% of your investment up front, right off the top, as a management fee. (Ouch.) I wanted to buy a share of Berkshire Hathaway Class B stock (currently around $3k) and put the rest in the S&P 500 fund, but I wound up putting it all in the fund because to do that they'd have to transfer the money into a self-managed account which would take at least a day meaning I'd have to come back _again_. (Yeah, I know this is how they make their money.)

Fade went with the managed fund the guy recommended.

Being diplomatic seems to primarily be a question of what you _don't_ do. For example, I let two different people have the last word over on the crossgcc mailing list. (The guy who took the position that the only truly "embedded" systems are the ones with 64k of ram or less, which by extension means there's no such thing as embedded Linux. He stopped emailing me when I stopped replying to him, unlike the guy who claimed my response to him was flame bait, and then continued to reply to it for a couple more pages of text, and post follow-up emails. Has yet to notice he's arguing with himself. About automobiles. Yes, really.)

This comes to mind because I'm trying to be diplomatic about the qemu mailing list summaries, and there's a guy named Ian Jackson who is so consistently wrong that I feel compelled to point it out in my summaries as one of the things you'd notice if you read every message. I mean he's good natured about it. He seems to mean well. A lot of "don't ask questions, post errors" progress happens because of him. (And it's entirely possible I'm being influenced by the fact he's the main guy trying to push Xen code upstream into the qemu repository, which is the first thing that made me question his judgement.) He certainly knows more about the guts of qemu than _I_ do...

But I started my summary in the middle of this thread because I couldn't think of any way to gloss over the start of it that didn't sound like "The thread started with Ian being wrong again, and while various people were correcting him..."

Sigh. I reserve the right to be honestly wrong myself. I _need_ it in domains like the linux kernel mailing list, where I know I'm outclassed in terms of domain expertise. I really don't want to be hard on this guy, but I also don't want to quote anything he says without some kind of disclaimer, which has caused me more than one bout of writer's block summarizing.

Also researching video conversion software, primarily ffmpeg. My first largeish conversion with it resulted in badly out of sync audio (bad enough to be distracting after a couple minutes, accumulating to something like 7 seconds difference at the end of an hour long video). Wondering if that was a version-specific bug, or if I invoked it wrong? Dunno.

April 12, 2009

Ok, my working style's gotten some pretty deep grooves in it. I just got to a good stopping point with the and files for Qemu Weekly News (hey, I'm up to the end of January, 2008! Woot!). And I went, "ok, time to check these into source control"... and I haven't got source control for 'em. (Well they don't develop, except for those two pieces of infrastructure, one of which is from the old kernel docs I spent half of 2007 poking at; each data file gets finished and should never change again after that.)

Seems silly to set up a mercurial instance for two files, and/or to check in lots of static files.

Fade's been busy with online appointments for 6 hours so far today. (This is fairly normal.)

My attempt to come up with a paper for the Linux Plumber's Conference hit the snag that the scope of everything I tried to write was just too big. Even my _outlines_ spanned multiple pages, and just kept getting bigger. (I suppose I could try again tomorrow, but mostly I want to sit down and spend a couple _weeks_ writing documentation, not a couple hours. So many tangents...)

Spent rather a lot of the day doing qemu mailing list summaries instead. Lots of funky editorial decisions there, but they're mostly "should this thread be included or not, and how much". The scope of each week is "threads starting between this date and this date", and from there it's a question of _removing_ stuff, not adding it...

For example, currently I'm trying to figure out if this thread is worth covering. Four posts between two people, riffing off a LWN thread that already implemented something, wondering if qemu could do a better job and deciding "no". Long term result for qemu: none really. Is it likely to come up again? Dunno, probably not. Is it good background material explaining how qemu wound up the way it did? Well, sort of, but not very.

Or how about these two, where somebody asks a very good question... and never gets an answer.

Yeah, all this is totally subjective, but most editorial jobs are.

April 11, 2009

Darn it. Making a canadian cross work in is harder than it looks. (Right now the same compiler builds libgcc.a and crtbegin.o as builds the gcc executable. I need to specify _different_ compilers for that. Ideally, those should be a separate build target, except re-running ./configure between build targets would be hilariously ugly. Once again, the autoconf step makes simple things hard...)

Good to know Moore's Law continues apace. Also nice to see idiots punished.

April 10, 2009

Saw The Last Unicorn with author Peter S. Beagle at the Drafthouse, with Mark. (Who has shaved his head. Mark has, I mean.)

The drafthouse was deeply disorganized about this (he's introducing the movie! No, he's doing a Q&A afterwards! No, he's just signing stuff in the lobby! Wait, no, the Q&A is starting, after 1/3 of the audience left because they were explicitly told it wasn't. Right.) (In their defense, this was an encore presentation. The wednesday showing sold out, and the author was still in town, so... Still, the drafthouse guy who announced to the audience that we should all go out to the lobby because the Q&A had been yesterday? Bad form.)

It was a good Q&A. Lots of fun writing things which he said he'd answer differently in an academic environment (where he didn't have beer; he introduced himself as a "beer snob", hence his enjoyment of Austin and The Drafthouse which serves more different types of beer than many pubs). Very honest, lots of "that symbolism was entirely subconscious, it just felt right. I was writing it to see how it turned out, sometimes you get away with that and sometimes you don't." Somebody pointed out how much he'd packed into two sentences they quoted and how other authors would take a half-dozen paragraphs to say as much, and he immediately pointed out "I wrote those six paragraphs first". Marvelous bit about how hard writing the screenplay for the Rankin/Bass adaptation of Lord of the Rings was ("I told him I've been writing for days and I haven't reached the Riders of Rohan yet, and Arthur groaned and collapsed on the desk... He'd forgotten about the Riders of Rohan"), and how some things just don't translate well to animation. He thought a live action version of The Last Unicorn with modern CGI could follow the book more closely. And that it hadn't occurred to him when he was writing it that Prince Lyr would grow up to be King Lyr, just that the name sounded right. And that making the cat a pirate wasn't actually his idea (although he loved it), he just put a cat in the book because there was a cat curled up on his desk when he was writing the book; more or less the same cat.

His business manager (I've forgotten his name) was also there, and answered my question about "Do all unicorns turn into anime chicks?" by pointing out that the Japanese animation team Rankin/Bass hired to do The Last Unicorn went on to do "Nausicaä of the Valley of the Wind" immediately afterwards, and formed the core of Studio Ghibli. (So yeah, there's a *reason* the unicorn transforms into a girl with big eyes and a small mouth, naked transformation sequences that never quite show any actual anatomy, and so on.) The Japanese guys apparently submitted redesigns for most of the characters, and the Rankin/Bass guys agreed to some changes and kept the original for others. Apparently the witch Angela Lansbury voiced was one of the Japanese designs. (As far as I can tell, Amalthea pretty much looks like her voice actress with more hair, but the style's a bit of a giveaway.)

I hadn't realized how much of an all-star cast it was. Rene Auberjonois voiced the skeleton before he was even on _Benson_, let alone played Odo on Deep Space 9. Mia Farrow went on to appear in Supergirl two years later (she still can't sing, though). Prince Lyr was voiced by Jeff Bridges (Flynn from Tron, most recently the bad guy in Iron Man... he could _almost_ sing). Christopher Lee voiced King Hagard (more recently he was Count Dooku and Saruman).

Bought two books and had Mr. Beagle sign both.

Ah, I'd forgotten how kmail treats mbox files with more than 32767 messages in them. (At least in Kubuntu 8.04 LTS.) Decide the sucker is corrupted, pop up a window announcing it's going to rebuild the indexes (losing the read/unread/flagged status of all existing messages) with only one button that doesn't let you tell it _not_ to do this thing, and then loop endlessly popping up the window again with no way to cancel it because the thing still has more than 32767 messages in it. Bravo, kmail. Bravo.

The way out is to kill it (except you have to kill kontact, not kmail, because they sucked it into a Microsoft Office style bundle), copy the mbox file to a backup, open the original with vi, delete the first 100,000 lines or so and trim it to the start of a message, save and exit, delete all the hidden files starting with the same name as the mbox file (the indexes), restart kmail, and let it regenerate them then.
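The manual surgery above can be scripted. Here's a rough sketch of the same trim against a generated toy mbox (a real one would cut at line ~100,000 rather than 100); the important part is landing the cut on an actual "From " separator so the result is still a valid mbox:

```shell
# Trim an mbox to its most recent messages, cutting at a real "From "
# message separator. Generates a 50-message toy mbox to operate on.
MBOX=test.mbox
for i in $(seq 1 50); do
  printf 'From sender@example.com Thu Apr  9 12:00:00 2009\nSubject: msg %s\n\nbody\n\n' "$i"
done > "$MBOX"

cp "$MBOX" "$MBOX.bak"      # always keep a backup before surgery
# Find the first message boundary at or after line 100:
start=$(grep -n '^From ' "$MBOX" | awk -F: '$1 >= 100 {print $1; exit}')
tail -n +"$start" "$MBOX" > "$MBOX.new" && mv "$MBOX.new" "$MBOX"
head -1 "$MBOX"             # should be a "From " line
```

After that you'd still delete kmail's hidden index files by hand, as described; this only automates the "open it in vi and guess where a message starts" step.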

Simplicity, elegance, scalability... Yeah.

April 9, 2009

Of course this wouldn't have any application to brainwashing at all. (SO glad the shrub administration's worked its way through the guts of the nation.)

April 8, 2009

At home, awaiting the cable guy.

Poking at Sparc, which I last seriously fiddled with back in November. Last night I re-fixed the sparc ELF type glitch, but as to why the dynamic linker isn't working, I dunno. The kernel seems to have built fine with that toolchain...

Implemented BUILD_NICE and BUILD_VERBOSE in FWL. The first automatically nices down the rest of the builds, the second feeds V=1 to the Linux, uClibc, and BusyBox builds. (Yeah, you can do the first from the command line and the second's a trivial patch to the scripts when you're debugging something. But it's nice to be able to make the first persistent in .config, and the second transient from the command line. A convenience, nothing more.)
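A hedged guess at how knobs like these get wired through (illustrative shell only, not the actual FWL scripts; the variable plumbing here is made up):

```shell
BUILD_NICE=1 BUILD_VERBOSE=1       # as if set in .config or on the command line
# Translate the config knobs into a nice prefix and a make flag:
[ "$BUILD_NICE" = 1 ] && NICE="nice -n 19"
[ "$BUILD_VERBOSE" = 1 ] && VERBOSE="V=1"
# Each package build then runs something like "$NICE make $VERBOSE";
# captured to a file here just to show the resulting command line:
echo "$NICE make $VERBOSE" > cmdline.txt
```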

Ooh, I accidentally hit "i" in Kaffeine (I thought the vi window had focus), and it turned "deinterlace" on, which is actually something I'd been trying to figure out how to do. (Otherwise the Bleach DVD it's showing looks terrible whenever things move around on screen too fast.)

But why would deinterlace ever _not_ be on? (Isn't interlacing an artifact of CRT scanning, which even modern LCD and plasma televisions need to compensate for?)

Weird Al's Don't download this song remains a work of genius. (I admit I feel slightly guilty about listening to an mp3 I ripped from a legally purchased album instead of listening to the Youtube version or something, but I'm way too lazy to actually _investigate_ the file sharing networks...)

April 7, 2009

Distinctly under the weather today. Fade said she felt blah a couple days ago, now I've got it. (Very mild upset stomach, zero energy.) Not really feeling like doing anything specific, so I cleaned the kitchen. (Always a good option when nothing particularly seems worth doing at the moment.)

The cable modem's out. (There's that marvelous service Time Warner wants to charge extra for.) Somewhat inconvenient to look up time warner's number without internet access. I'm spoiled. Have to go dig up the phone book... And it's a "signal issue". A guy has to come look at it, some time tomorrow between 8 am and 9 pm.

Checked the other local wireless access points. I could associate with a couple, but they don't route packets either. Dunno what that means. Heading down to Thundercloud, I haven't had breakfast yet anyway.

Removed mlocate and updated my "next time I reinstall" list. (Having a cron job start up and hog the disk when I resume my laptop is silly, especially for a service I never use.) I should probably poke at Xubuntu 9.04 when it comes out.

Hanging out at Epoch with Fade. Yay coffee shop internet...

April 6, 2009

Long design discussion going on with Yann Morin, maintainer of crosstool-ng (and the guy who wrote the insmod/modprobe code for busybox long ago, which is where I know him from). I belatedly remembered to cc: it to the FWL mailing list.

The fact that off the top of my head I can find a dozen objections to the design of crosstool-ng isn't really a big deal, because finding objections to software designs is just about a superpower with me. I object to my own designs; toybox is stalled because the design needs work (possibly a complete rewrite in lua), and Firmware Linux fundamentally should just be four packages: kernel, C library, compiler, command line tools. I was trying for linux, uClibc, tinycc, and toybox for a while there, but the tinycc development community and I had an unhealthy relationship and I broke it off. The linux kernel needs a "hello world" variant we can build up from with the boot wrapper code, early_printk, and _nothing_else_. But I haven't got time to investigate and restructure it.

A few other things currently wrong with FWL, off the top of my head and by no means even CLOSE to exhaustive:

Did I mention that years ago I learned not to be bothered by the fact that everything I write sucks?

No really. I _expect_ it to suck, from my point of view, because I know what's wrong with it and how it could be better. It could always be better. All you can do is find the point of _least_ suck, which is a local peak at best. But you have to just botch something together, make do with it as best you can, and move on to the next crisis. Fake it with style. If you're lucky, you can come back and make it suck incrementally less later, but as with optimizing there will ALWAYS be a bottleneck. All you really do is move it around, in an eternal game of whackamole.

This is the "better to wield a flamethrower than to curse the darkness" approach to creative endeavor. When wielding a chainsaw against infinite quantities of suck, you have to get your satisfaction out of the wielding part, not any resulting lack of suck. A cat does not play with a string because they expect to finish off the string. (Ok, that one depends on the cat.)

Rereading The Velveteen Stories. Already re-read Velveteen vs the Isley Crawfish Festival, Velveteen vs Andy's Coffee, and Velveteen vs the Flashback Sequence (parts one and two). Now reading Velveteen vs the Old Flame (while listening to Weird Al's "do I creep you out", seemed appropriate somehow).

Hey, I forgot the kde system monitor thingy, which graphically shows me how much of the system's going to non-nice tasks (ala qemu) and how much is going to nice tasks (ala distcc). And yup, distcc tasks aren't consuming any noticeable amount of CPU even though I've got a build running. Now what's gotten screwed up...

Brittle. Evil. Do not seamlessly fall back to local compiling, FAIL NOISILY SO I CAN SEE IT'S NOT WORKING. Grr. When the distcc trick is enabled, the _only_ things gcc should be doing locally are preprocessing and linking, no compiling.

I should just break down and teach ccwrap to do this. At least I know exactly how that sucker works...

Ah. Doh! My fault, it's building mini-native, so it's using the cross compiler it just built, which means the distcc config on the host compiler isn't being used.

Grrr. Ok, time to fix this properly. I need to teach ccwrap to canadian cross.

April 5, 2009

This is what our new couch looks like, except for the actual couch part, and there are more cats, and the word balloon is merely implied.

The gcc source code is one of the most horrible codebases I've ever encountered. The layers of unnecessary indirection are painful to _describe_, let alone wade through.

I'm trying to make arm big endian work. I've been trying for two days. I figured out what needs to change hours ago (the line #define SUBTARGET_EXTRA_LINK_SPEC " -m armelf_linux -p" in gcc/config/arm/linux-elf.h needs to change armelf_linux into armelfb_linux for big endian versions), but figuring out how to come up with a patch that works for both big and little endian is a FLAMING PAIN.

I refuse to conditionally apply patches. The same code should work for all targets, anything else is stupid. But gcc is just so horrible. This is creating a built-in spec file, which gets parsed at runtime. The information available at compile time and the information available at runtime aren't on _speaking_ terms with each other.

First, they implement their own totally unnecessary language (spec files), and resolve things at runtime using that language. There's no need for it to _exist_ except that they couldn't make anything simpler work and their reaction to failure is to add more complexity and layers.

So I'm trying to determine the default endianness at compile time. The default endianness is determined at _runtime_ by the global target_flags, which is initialized from the global structure targetm.default_target_flags in gcc/opts.c. The targetm structure turns out to be "struct gcc_target" which is defined in gcc/target.h. That's a big ginormous struct definition, going on for pages and pages. The global instance targetm is initialized by target specific code, in this case residing in gcc/config/arm/arm.c, but every target's initialization boils down to the same line: struct gcc_target targetm = TARGET_INITIALIZER; and TARGET_INITIALIZER is a giant macro defined in gcc/target_def.h.

The _ugly_ part is that TARGET_INITIALIZER is this mess:

/* The whole shebang.  */
#define TARGET_INITIALIZER                      \
{                                               \
  TARGET_ASM_OUT,                               \
  TARGET_SCHED,                                 \
  TARGET_VECTORIZE,                             \

Going on for, I kid you not, _80_lines_. So at first glance, the only way to figure out which of the 8 gazillion sub-macros results in a given field is to count the members of struct gcc_target in gcc/target.h and then count in to the corresponding line in TARGET_INITIALIZER. Then go track down what that macro does. All the while hoping that none of the initialization macros results in more than one structure (and there are a LOT of sub-structures in targetm).

Except that the first three of those macros each initialize large structures, and the fourth turns out to be the one I'm looking for. (Go figure, it's not part of a structure but at the top level.) So TARGET_SECTION_TYPE_FLAGS is #defined earlier in the file to default_section_type_flags... which is a function? It's in gcc/varasm.c as a single line wrapper around another function, default_section_type_flags_1 (I don't know why), and that sets it to either SECTION_CODE or SECTION_WRITE... Darn it, looks like I went down a strange blind alley again? Those are defined in gcc/output.h and don't mention endianness... That's for elf sections, I want the elf header.

Ok, I got on this trail because I added a #warning RUTABEGA right before the #define I need to add an #ifdef around, cut and pasted the compiler invocation make spat out that produced that warning, removed its -o stanza and added -dM -E instead, and one of the lines it spit out was:

#define TARGET_BIG_END ((target_flags & MASK_BIG_END) != 0)

Which looked promising. MASK_BIG_END is (1<<5) so that's not very useful, and I just traced target_flags to a dead end. Wheee. Either I missed a curve while tracing it through, or there's more than one target_flags (even though it's a global).

I hate the free software foundation. All of their code is like this. They never clean up and remove anything, they just make it bigger and more unnecessarily convoluted, with each release. I have wasted DAYS tracing through this crap. As usual, I already know the change I need to make and have no _clue_ how to get their code to do it sanely (or at least consistently).

This would be easier if it wasn't the compile time "string1" "string2" combining thing. You can't put runtime logic around any of that, because adjacent strings only concatenate automatically at compile time; after that you have to allocate memory and do strcat().

Oh it _can't_ be that easy, can it?

Sigh. Should have known. This is FSF code. The correct response isn't to add stuff to it, the correct response is to rip stuff out:

--- gcc-core/gcc/config/arm/linux-elf.h 2009-04-04 23:41:52.000000000 -0500
+++ gcc-core2/gcc/config/arm/linux-elf.h        2009-04-05 04:36:41.000000000 -0500
@@ -36,7 +36,7 @@


-#define SUBTARGET_EXTRA_LINK_SPEC " -m armelf_linux -p"


The linker knows what machine types it supports, and has a default which is correct. As long as we don't try to override that default, life is less screwed up.

(Did I mention that binutils was only nominally maintained by the FSF for many years, and _actually_ maintained by Cygnus, which was bought by Red Hat? They had their own fork and everything. This is probably why binutils is generally less stupid than code more actively interfered with by FSF zealots who couldn't code their way out of a paper bag yet are very firm about religious ideology. Alas, the fork went to GPLv3 when the base did, and is thus useless.)

So yeah, arm big endian builds now. I should try to come up with a test environment for it...

Oh, and squashfs 4.0 came out. Need to make squashfs packaging work properly now...

April 4, 2009

So, a little _practical_ advice about the time warner thing. According to The Statesbeing, Austin does have another cable modem service provider, Grande Communications, which has explicitly stated they have no plans to cap bandwidth. The obvious thing to do is switch away from the idiots and start buying service from non-stupid people.

So it looks like the question becomes "how can we get out of our TWC contracts", and _that_ is a question best handled by small claims court or a class action suit if they're going to be stroppy.

The sad part is if they were _actually_ threatening to cap our bandwidth, I.E. slow down the connection once we'd passed a certain activity threshold, I don't think anybody would really mind. People are describing this as a cap, when it's actually "extra random charges for using the service you sold us for a flat fee". So if they let us download the first gigabyte each day at 3 megabits/second, then packet shaped us down to 128k/second after that (or perhaps some more gradual, less noticeable tail-off)... Well, it would be _annoying_ but not a reason for widescale organized protests. But no, they want the opportunity to present us with unexpected bills for $200 or more. They don't want to provide less service for what we pay them, they want to charge us more money for what they're giving us now. Potentially, 10 times as much. This is because they are GREEDY BASTARDS.

"You ate too much from the buffet, pay us three times what the sign says" is not going to win you a lot of repeat customers.

Phone TWC, turns out our contract's expired so we can switch at any time. Bad news: Grande doesn't service this location yet. (But they offer service two blocks north, and have another six months to install new lines, so here's hoping.)

April 3, 2009

Good to see grassroots mobilization against Time Warner's stupidity. It's also becoming a political issue. Lots more links here and here.

I repeat: Time Warner needs to pull back a bloody stump from this attempt. They won't learn if it doesn't hurt. Regulate the hell out of 'em, _and_ I'm looking around for other ISPs.

I already emailed the address in that last link to ask what their procedure is for getting out of our contract early. If they don't have one, or if they try to charge extra for it, that's grounds for a class action lawsuit right there...

Fade bought an Xbox 360 today. I see it as a Netflix viewer with HD capability for when we upgrade the TV. She actually plans to play games on it (although the guy at the store warned us that the Law Offices of Small and Limp, Esquire designed the thing with a millimeter or so of clearance between the laser and the surface of the platter, so if a cat walks by while it's got a disk in use it'll destroy the disk. Brilliant. I repeat: netflix viewer).

Considering that watching five movies a month in HD via netflix puts us over the bandwidth cap if we do _nothing_ else with the connection, leaving time warner becomes pretty much essential from a purely pragmatic standpoint.

April 2, 2009

It shipped.

Falling over now.

April 1, 2009

When I first heard that Time Warner was going to charge people extra for using too much bandwidth, I thought "april fool's joke". Alas, it isn't. They really _are_ being evil and stupid. Time to find a new ISP... (There are a number of dsl options.)

I'm sorry, but Time Warner needs to pull back a bloody stump from this, for educational purposes. (If it was just me, I'd cancel and be happy with my cell phone internet, but Fade needs a stable connection.)

Got email from Yann Morin today, who read my March 7 blog entry and wondered if I wanted to discuss design issues with him. (I'm game, waiting to hear back if he wants to do it in email or on the crossgcc list.)

March 31, 2009

We have a couch. It is orange. I should take a picture of it.

Mark and I rented a u-haul, got the couch installed (a small phrase which contains a lot of incident), and then moved the rest of his stuff out of the old apartment into his new garage. (Most notably the washer and dryer.)

Now at Mark's pulling what's turning into another all-nighter, switching between working on the crosstool-ng patches Cisco wanted and getting FWL 0.9.6 ready for release. (I was hoping to have a FWL release out _before_ the end of the month, but as midnight approaches it's not looking likely.)

No I didn't pull two all-nighters in a row, I did an all-nighter sunday night, and now I'm pulling an all-nighter tuesday night. (Monday night was at home, Fade and I watched the rest of the Schoolhouse Rock DVD and )

It's a bit odd to be copying things I got working in FWL months ago into a completely different open source toolchain project (which is also mostly a single person endeavor). The fact it took me about an hour to get FWL to produce such a toolchain and over a week to get crosstool-ng to do it doesn't enter into it, apparently. They want what they want, and are willing to pay to get what they want. I don't understand _why_ they want it, but it's their money...

I suspect I've been drinking too many energy drinks.

March 30, 2009

Very long day. Up overnight at Mark's last night, finishing up many things, then came home and slept a lot.

There's a release candidate set of binaries up at the new snapshots directory. We plan to start doing nightly builds, but it hasn't quite started yet.

March 29, 2009

Busy week. Mark and I got together yesterday and banged on FWL and GFS, and we're in pretty good shape for a release. Made a read only root filesystem work. Squashfs isn't working (needs updated userspace tools that aren't out yet), but in theory one tweak to and that's in.

Just wrote up the past 4 weeks of reports for FWL. Posting them to the list to give Mark something to put in the bloggy thing on

Lots and lots of little FWL tweaks preparing for a release. The static toolchains now build under qemu, in parallel even.

March 28, 2009

The 2.6.29 kernel's network flakes out under load. Nothing in dmesg, no obvious error. It just... stops, suddenly. Luckily, there's a fix. Here's hoping ships by monday...

Ok, there are times when I wonder what other programmers are smoking.

An "archiver" is a type of program that combines a bunch of individual files into a single file. Tar and zip are examples of archivers. This is not a new concept.

The "genext2fs" and "mksquashfs" programs are essentially archivers, because they do what archivers do. But they're not _implemented_ as archivers, and thus their command line argument parsing is missing obvious functionality.

For example, with tar I can list a bunch of files and directories to put in the archive. If I want to rebase my root directory, there's the tar "C" option or I can "(cd dirname; tar blah blah blah) > filename.tar". (It's even smart enough to spit the progress indicator to stderr so it doesn't wind up redirected into the tarball.) This functionality is many years old, and quite standardized.
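That rebasing behavior in miniature (a throwaway sketch with made-up directory names):

```shell
# Build a toy directory tree to archive:
mkdir -p rootfs/bin rootfs/etc
touch rootfs/bin/sh rootfs/etc/fstab
# -C rebases: archive the *contents* of rootfs/ without a rootfs/ prefix:
tar -C rootfs -cf image.tar bin etc
tar -tf image.tar > contents.txt   # lists bin/, bin/sh, etc/, etc/fstab
```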

But genext2fs takes exactly one input source: a directory. That's it. You can't feed it a list of files, you can't specify more than one directory, and it does the "cd into that directory so the directory itself doesn't show up in the output" thing without giving you the option for it _not_ to do that.

The mksquashfs program is less stupid. You can specify multiple input files and directories, but everything you specify on the command line gets squashed to the top level directory. I.E. if you specify "build/sources" it tries to put it as "sources" at the top level. That's not how archivers work.

Of course this isn't nearly as weird as when I'm trying to figure out what _I_ was thinking when I wrote code. Toybox patch broke again, so I have to untangle it and fix it. (Patch shouldn't be this fiddly.) I was most of the way through rewriting it to handle a patch it turned out I didn't need to apply. I shouldn't discard that work (because it addressed a real problem, just one that turned out not to be pressing), but I hadn't finished debugging it yet... Do I even still have the old test case? Wheee...

Much frowning at what TT.state means in various contexts. Managed to make the state in patch_main() a local variable, which helped a lot. That leaves TT.state mostly being used by do_line()...

I have to cut a new toybox patch with the bugfixes to the "patch" command, and fix the resize2fs stuff so it doesn't screw up GFS. But tomorrow, I think...

March 27, 2009

Fade and I went to Lax today and bought a couch. It's orange, because the green ones just weren't as comfortable. (We can put a throw over it or something.)

I'd forgotten Konqueror's "Crash and take all tabs with you" behavior. It's sad new Ubuntu releases break more stuff than they fix, because it _does_ fix stuff. Still, Gnome. *shudder*

More testing, fixed a number of small bugs. Squashfs does not work, because the most recent release version of mksquashfs produces a squashfs 3.1 file and the kernel only supports squashfs 4.0. Sigh. (Also, it needs the misc filesystem config symbol to be enabled; I need to fix kconfig so that when KCONFIG_ALLCONFIG switches on a symbol, the menu containing that symbol also gets switched on. No point HAVING the symbol in ALLCONFIG, eh?)

Finished the README and checked it in. Now banging on the static cross compiler auto-build script (which is 99% of the cron job). Lots of little corner cases...

March 26, 2009

It's 2:10 in the morning and IHOP is _full_. Biked here, but it's so loud my noise cancelling headphones can't keep up. (Yay exercise, anyway...)

Still poking at the FWL manifest file. When I first wrote do_readme() it was called from a file that didn't source (just, so the PATH=build/host bit didn't come up. Calling it with the trimmed $PATH does not combine well with UNSTABLE support or fetching the mercurial revision of the build scripts.

Oh, hang on, this needs to go in, because that's where package versions are defined. It needs to go at the end, which means it won't get zapped by because's job will be to regenerate it.

Worried about Mark.

Spoke to the new mortgage guy today. Need to fill out a web form this evening. He's traveling tomorrow, but he might be able to get me a preapproval letter from the road. Aiming for a closing date of around April 20th.

Heard back about the ppc 440 board, some kind of glitch was found in the prototypes and they're fixing it before sending us one.

Ok, todo list for the release:

  • add config option for the newly repaired cross compiler smoke test (which doesn't work on arm without root access because a change to the linux kernel broke qemu)
  • get sparc building again
  • get the manifest file added for GFS
  • finish the README file and check it in
  • build static cross compilers
  • get the cron job working
  • test everything
  • Fix the resize2fs thing, it breaks GFS

Ok, the new config entry you have to enable to get qemu application emulation to test run a hello world binary at the end of is CROSS_SMOKE_TEST.

Ok, the sparc thing is because the linux 2.6.29 kernel unified the sparc 32 bit and 64 bit architectures. This broke uClibc the same way the x86/x86_64 unification did, because kernel_types.h in uClibc is too brittle for words (and I still don't understand why it reproduces chunks of asm/posix_types.h instead of just #including it).

Writing a MANIFEST file to the top level of mini-native turns out to be darn fiddly. I've already got a do_readme() in sources/ but the problem is that to figure out what version of the actual FWL scripts we're building from, you need the "hg" command. Two problems with this: 1) hg might not be installed on the host, and there's no guarantee you're building in a mercurial repository instead of a downloaded tarball anyway, 2) even if it is there, edits hg out of the $PATH.

Note that I can generate most of the _rest_ of the package versions even with a trimmed $PATH, because it's based on the packages/* filenames. (Except for the alt versions, which need either mercurial or subversion.) But knowing which FWL scripts built this system image seems kind of important.

I actually _can't_ solve the general case, because there are times when the info isn't available, so leaving this field blank sometimes is inevitable. For example, when you download a tarball snapshot from the mercurial web viewer, it includes a .hg_archival.txt file which contains the big long 160 bit hash code for the commit, but not the actual revision number. (And without mercurial and the repository you can't translate one to the other.) You're unlikely to have a mercurial repository without mercurial installed, so trying to parse data out of .hg is silly (and nontrivial anyway).

For release tarballs I can include a pregenerated MANIFEST with values adjusted by hand. (Where to put it's a bit iffy, can't be packages because it'd get deleted by cleanup_oldfiles, sources is now all stuff checked into source control, can't put it in build because "rm -rf build" is how you "make clean" on this project. Putting it at the top level is clutter but I'm not seeing a better alternative right now...)

As for building out of a repository, if you run ./ when there isn't a build/host directory, then we know that $PATH hasn't been adjusted to exclude hg so we can try to create a MANIFEST then. But that's kind of disgusting, because running is supposed to be entirely optional. This would be a non-obvious side effect, plus it wouldn't know when to rebuild MANIFEST because package upgrades (or at least switching USE_UNSTABLE around) aren't tied to rebuilding build/host. The lifetime rules are all wrong. This is ugly.

I suppose that when there's a .hg directory in $TOP I can symlink hg into build/host. That's ugly too, but possibly _less_ ugly than the alternatives. Hmmm....
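That workaround would be something like the sketch below ($TOP, the directory layout, and the hardcoded hg path are all illustrative; the symlink may dangle if hg isn't actually installed):

```shell
# Stand-ins for the real repository layout:
TOP="$PWD"
mkdir -p "$TOP/.hg" build/host
# Only expose hg to the trimmed $PATH when building from a repository:
[ -d "$TOP/.hg" ] && ln -sf /usr/bin/hg build/host/hg
```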

Sigh. I always get hung up on the small stuff, because that's usually where it's least clear what the best approach is.

March 25, 2009

Faxed an updated purchase agreement to my sister's realtor, upping the purchase price by $2k because we missed the closing date and the place is costing them money to keep. (So now it's $77k.)

I'm annoyed at my credit union, which strung me along for 3 weeks before noticing they're not certified to originate loans in the state of Minnesota.

My laptop is bluetooth associating with my cell phone again! Yay! (So the problem was that Ubuntu 8.10 is incompetent, and can't do things that Kubuntu 8.04 could just automatically do from the gui. Gee, what a surprise.) I need to dig up my cell modem config and set up wvdial again, but that shouldn't be too hard. Need to edit the list from the 23rd to include installing the bluetooth package and setting this stuff up...

So current qemu-arm doesn't work unless you "echo 0 > /proc/sys/vm/mmap_min_addr", as root. The kernel broke it. Wheee. Meaning I need to add a config option for it, because it's a policy issue that FWL does not require root access on the host, ever.

And sparc was broken by the uClibc upgrade. Fun.

The ppc440 board has yet to arrive.

March 24, 2009

I don't understand why "fiscal conservatives" vote Republican. The federal deficit got run up by Reagan (among other things merging social security into the main budget and "investing" it all in treasury bonds, turning it from a retirement fund into our largest entitlement program), then Bush I spent more money in 4 years than Reagan did in 8, then duh-bya borrowed unimaginable amounts of money from China, never vetoed a single spending measure, kept things like the wars in Afghanistan and Iraq off the main budget so it _looked_ smaller than it actually _was_.

In contrast, Clinton balanced the budget and actually paid it down. Obama's made reducing the deficit a priority and is currently making sure we get a better bang for the buck even with crisis management. (The Democrats care what they spend money _on_. This isn't new.)

Which of these two groups are accomplishing what fiscal conservatives claim to believe in? Why on _earth_ would they vote for the people who ran up the national debt? (And oversaw a series of entirely avoidable financial disasters from the whole enron/worldcom/arthur andersen collapse through the mortgage industry collapse, and no-bid contracts to Halliburton for everything from Iraq through Katrina.)

So I'm adding an option to build qemu to, and it wants to use "strip" during install. Here's why that's problematic, starting with some backstory:

What does is populate a directory (build/host) with host versions of the same build tools the target system will contain, and adjusts $PATH to point only to that directory. It also builds a few other things (like distcc and genext2fs) that may not be installed on the host. The reasons to do this are A) it provides evidence that the target system can probably successfully rebuild itself (since these tools are what it built with in the first place; if nothing else you find most problems early), B) it isolates the build from variations in host system, so you don't get different system images when building on ubuntu/fedora/gentoo, C) we're building everything from source, even the tools we build with. :)

The exception to this is the host toolchain. In order to build from source, you need something to compile _with_, and it's the host's job to provide a toolchain you can compile host binaries with. (We build a known cross compiler that produces _target_ binaries, but not for building host binaries. If the one supplied to us by the environment doesn't work, something is wrong.)

So instead of trying to build a host compiler from source, it creates symlinks to the existing one in build/host:

for i in ar as nm cc gcc make ld
do
  [ ! -f "${HOSTTOOLS}/$i" ] &&
    (ln -s `PATH="$OLDPATH" which $i` "${HOSTTOOLS}/$i" || dienow)
done

I.E. the seven binaries it's creating symlinks to are ar, as, nm, cc, gcc, make, and ld.

Notice what it's _not_ creating a symlink to: the host's "strip". It doesn't build one from source, either. (It's in binutils.) This lack is intentional: a common mistake when cross compiling is to use the host's "strip" to strip target binaries. This can sort of work sometimes, because usually your target's using ELF too (although it could be using another format like binflat), and on some targets the sections will match up. And on others you get subtle bugs that are a pain to track down (like the bugs I got when stripping the *.a and *.o files).

So if something we're cross compiling accidentally calls the host strip instead of the target strip, I _want_ the build to break so we notice the problem. Easy way to do that, don't have the host strip in the $PATH.

So I can add "strip" to build/host. It's easy. I just don't _want_ to, it's not there for a _reason_.

Except that qemu refuses to install without "strip". It's possible to build pre-stripped binaries (-s option to gcc), but qemu isn't doing that. It's possible to just install the binaries verbatim (with debug symbols), but qemu won't do that.

What qemu is doing is using the busybox "install" command with the -s option, which shells out to strip. I could patch busybox to not do that, but the way the patch infrastructure works is that setupfor patches the package every time, both the version we build for the host and the one for the target. The target's doing a native build, it has a native "strip", so using install -s is ok in that context and breaking it is bad.

I could temporarily add a symlink to "strip" right before qemu, and then remove it afterwards. That's too ugly for words. Not going there.

So I'm down to modifying how qemu installs. I can patch the qemu source to remove the -s, or I can just do the install by hand. Doing the install by hand has the benefit that I can skip installing the docs and stuff (which I really don't want in build/host anyway, that _should_ just be the executables except that qemu can't run without the pc-bios directory).

That's why I reimplemented qemu's install as a "find" and two instances of cp, and now you know.
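A hedged reconstruction of what that might look like (the tree layout and file names below are made up for illustration, not qemu's actual build tree or the real FWL code):

```shell
# Fake build tree standing in for qemu's:
mkdir -p qemu-build/pc-bios build/host/qemu-pc-bios
printf '#!/bin/sh\n' > qemu-build/qemu-arm
chmod +x qemu-build/qemu-arm
touch qemu-build/pc-bios/bios.bin
# Install only the qemu-* executables (no docs, no strip), plus the
# pc-bios directory qemu needs at runtime:
find qemu-build -maxdepth 1 -name 'qemu-*' -type f -exec cp {} build/host \;
cp qemu-build/pc-bios/* build/host/qemu-pc-bios/
```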

The reasons I'm building qemu from source now is:

  • Because I can. (I.E. It's now possible to build qemu from source without requiring gcc 3.x on the host.)

  • Older version of qemu don't support lots of the targets we now make system images for.

  • The current release (0.10.1) needs patches to support powerpc and sh4.

  • Having it in build/host makes life easier for

  • It turns out that the smoke test at the end of (using qemu application emulation to run a target "hello world") hasn't worked for a _long_ time. (Oops. It skips it if the appropriate qemu target isn't installed, and if it's not in build/host the cross compiler won't ever find it...)

I also note that only builds qemu from source when HOST_BUILD_EXTRA=1. (It takes longer than building a complete system image, this sucker's expensive to compile.) If that variable isn't set, it symlinks it from the host the way the host toolchain is.

March 23, 2009

Ok, reinstalled my laptop with Kubuntu 8.04 LTS. (Because Ubuntu 8.10 refused to unhork itself, and it now has so many redundant and conflicting layers of unnecessary infrastructure that I don't have a _clue_ what actually triggered its descent into madness.)

The checklist for turning a fresh install into something useful:

Sigh. Done... And kmail has been hung trying to launch for 10 minutes. (Eating 100% cpu, albeit only on one processor.) That's probably not a good sign... Nope, it worked its way through. Regenerating the indexes, perhaps?

The 2.6.29 kernel and qemu 0.10.1 both dropped while I was out. Of course.

March 22, 2009

I'm trying to figure out if I hate SELinux or Ubuntu more. Last night, in the middle of some light web surfing and programming (as my normal user, not root), my laptop decided to go nuts, with the video card showing snow. A quick hard boot later, and everything seemed well... except that I couldn't talk to the network. Suspecting a heat problem, I gave it the night to cool off, and now it's morning and the network problem hasn't changed.

The _weird_ part is that the wireless card is associating with the access point, and can see the ~2 dozen other access points in the area, so it's both sending and receiving fine at the hardware level. Dropping down to root, I try to ping the access point:

root@driftwood:~# ping
PING ( 56(84) bytes of data.
ping: sendmsg: Operation not permitted
ping: sendmsg: Operation not permitted

That's as root. I can't sendmsg _as_root_. What the? When I'm logged in as root, I can format the hard drive. I'd BETTER be able to talk to the network.

Possibly it's the network card giving bad errors. Intel's wireless driver still has a horrible binary blob, last I checked. And it's crapping all over dmesg:

[   16.469066] iwl3945: Intel(R) PRO/Wireless 3945ABG/BG Network Connection driver for Linux, 1.2.26ks
[   16.469074] iwl3945: Copyright(c) 2003-2008 Intel Corporation
[   16.469224] iwl3945 0000:0b:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[   16.469244] iwl3945 0000:0b:00.0: setting latency timer to 64
[   16.469277] iwl3945: Detected Intel Wireless WiFi Link 3945ABG
[   16.677223] iwl3945: Tunable channels: 11 802.11bg, 13 802.11a channels
[   16.696606] phy0: Selected rate control algorithm 'iwl-3945-rs'
[   17.072806] iwl3945 0000:0b:00.0: PCI INT A disabled
[   71.102667] wlan0: associate with AP 00:18:f8:ef:8f:44
[   71.109791] wlan0: authenticate with AP 00:1a:70:80:4a:53
[   71.112691] wlan0: authenticate with AP 00:1a:70:80:4a:53
[   71.112713] wlan0: authenticated
[   71.112719] wlan0: associate with AP 00:1a:70:80:4a:53
[   71.312096] wlan0: associate with AP 00:1a:70:80:4a:53
[   71.315455] wlan0: RX AssocResp from 00:1a:70:80:4a:53 (capab=0x401 status=0 aid=1)
[   71.315464] wlan0: associated
[   72.445119] IN= OUT=wlan0 SRC= DST= LEN=40 TOS=0x00 PREC=0xC0 TTL=1 ID=0 DF PROTO=2 
[   72.504279] IN= OUT=wlan0 SRC= DST= LEN=144 TOS=0x00 PREC=0x00 TTL=255 ID=0 DF PROTO=UDP SPT=5353 DPT=5353 LEN=124 
[   72.524281] IN= OUT=wlan0 SRC= DST= LEN=351 TOS=0x00 PREC=0x00 TTL=255 ID=0 DF PROTO=UDP SPT=5353 DPT=5353 LEN=331 
[   72.780310] IN= OUT=wlan0 SRC= DST= LEN=351 TOS=0x00 PREC=0x00 TTL=255 ID=0 DF PROTO=UDP SPT=5353 DPT=5353 LEN=331 
[   73.036289] IN= OUT=wlan0 SRC= DST= LEN=351 TOS=0x00 PREC=0x00 TTL=255 ID=0 DF PROTO=UDP SPT=5353 DPT=5353 LEN=331 
[   73.236524] IN= OUT=wlan0 SRC= DST= LEN=327 TOS=0x00 PREC=0x00 TTL=255 ID=0 DF PROTO=UDP SPT=5353 DPT=5353 LEN=307 

And so on for quite some time. As far as I can tell, that's a log of all the packets userspace tried to send and got permission denied for. (I can only assume this is the stupid selinux permission stuff. I haven't figured out how to switch it _off_ yet. I haven't even got the selinux userspace packages _installed_. I may need to reinstall at this point.)

Off to Mark's to grab a boot CD and rule out a hardware problem, although the fact dhclient can get a dhcp address would seem to do at least some of that...

March 20, 2009

So I want to run multiple qemu instances in parallel. It should be possible to make hda read only, provide a separate hdb for each instance, and even run genext2fs on build/sources to create a read-only hdc.
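Concretely, the invocation I have in mind looks something like this (an untested command sketch, not working code: the image names and sizes are made up, the kernel/append flags are elided, and it assumes genext2fs and a 2009-era qemu are installed):

```sh
# Shared read-only sources image, mounted as hdc by every instance.
genext2fs -d build/sources -b 65536 sources.ext2

for i in 1 2 3
do
  genext2fs -b 131072 hdb-$i.ext2                # per-instance scratch disk
  qemu-system-ppc -nographic -hda image-powerpc.ext2 \
    -hdb hdb-$i.ext2 -hdc sources.ext2 &         # (kernel/append flags elided)
done
wait
```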

There's a couple infrastructure problems here. The first is that FWL has a bad relationship with mke2fs, tune2fs, and fsck.ext2. Busybox no longer provides them (it did circa 1.2.2, but the implementations were horrible and got removed). It's currently only used by, and I really don't want to suck e2fsprogs in for that. But gentoo doesn't put /usr/sbin in a normal user's $PATH, and that's where they live, so I have to jump through hoops to access them. Right now said ugly hoop jumping (adding /sbin:/usr/sbin to $PATH if mke2fs isn't found) is limited to, but the need for multiple hdb instances is a second use for this.
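The hoop jumping itself is only a line; a sketch of the idiom (not FWL's literal code):

```shell
# If mke2fs isn't findable, speculatively append the sbin directories
# where distros like gentoo hide it from normal users.
command -v mke2fs >/dev/null 2>&1 || PATH="$PATH:/sbin:/usr/sbin"
```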

So I can't rely on them being easily accessible on the host, and I don't want to build e2fsprogs in. The long term fix is to add them to the lua toybox, but that's not immediately useful.

I could move the $PATH mangling to sources/ (grab those three binaries the way I grab the host toolchain), but the adjusted path in build/host doesn't contain qemu. (Now that 0.10.0 builds with gcc 4 I'm pondering adding qemu back to, but that's a separate issue.) Adding things to $OLDPATH is ugly because then it's not the old path anymore.

The OLDPATH thing is actually kind of annoying because I need to use genext2fs and qemu from the same context in, and right now no $PATH is guaranteed to have both of those. Sigh. I could get away with this before, but it was wrong and it's coming back to bite me. And to be honest, right now I need to apply two patches to qemu 0.10.0 (to get ppc and sh4 working)... Ok, time to clean up

It turns out Grischka is using a git tree for tinycc. (I _think_ this is an active repository. The last commit to it was three and a half months ago, but then again that could just be the rate tcc development's been going.) He's still syncing the cvs repository from it, but if he agrees to _stop_ doing that I'd be willing to contribute to this repository. (Yeah, I need to learn more git stuff, what little I knew has gone rusty. But I have no fundamental objections to git the way I do cvs.)

Hmmm... Building qemu in build/host is nontrivial, because qemu wants to install a pc-bios directory and needs to know where to find it. (Which means that you provide a $prefix and it throws stuff under there. I want the binaries to go in "build/host" and it wants to put them in "$prefix/bin".) There isn't currently really a place to _put_ the pc-bios directory. I could compensate by supplying -L to qemu, but then it has to know if it's running from or from the host system. (I suppose I could throw logic to detect this case into, but what I'm trying to do here is simplify that...)

Maybe having a build/host/pc-bios directory isn't _too_ disgusting. (It means the qemu we build would have a hardwired absolute path, but build/host never really claimed to be relocatable, did it?) I'm not sure what happens if you try to run "pc-bios" out of $PATH if it's a directory. Shouldn't come up much, but still...

Also, the qemu build is _long_, and hugely memory intensive. (Mostly due to translate.c, which causes cc1 to peak at over 20% memory usage on my laptop according to top, which is about 400 megabytes per instance. When two of 'em happen at once with -j 2, the sucker can go totally swap happy.) If all you want to do is build a cross compiler or a system image, and not run the result locally, building that is painful...

Ok, just built it with -j 2, with 728 megs free according to /proc/meminfo so the sucker has no excuse to swap. Totem is eating 12% of one CPU (catching up on my podcasts, currently countdown), and the system otherwise idle. (Killed npviewer.bin just to be sure.) How long does qemu 0.10.0 take to build...

real    16m42.256s
user    26m51.669s
sys     1m55.315s

Yeah, that's about how long it takes to run the entire rest of the build. If this goes into host tools, it needs a config option. (Darn it, symlinking all the qemu-system-* into build/host is being evil. Tab completion can give me a list of everything in $PATH starting with a known prefix, ala "qemu-system-[tab][tab]". But doing this from the command line... How? I can't do it with "which", can't do it with "whereis". I suppose I can run the $PATH through sed and iterate, but that's kind of disgusting...)
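The iterate-over-$PATH version I was grimacing about would look something like this (a sketch; the function name is mine, and it punts on $PATH entries containing spaces):

```shell
# Poor man's tab completion: list every executable name in $PATH
# starting with a given prefix, one per line, deduplicated.
list_path_prefix()
{
  echo "$PATH" | tr ':' '\n' | while read dir
  do
    [ -d "$dir" ] && ls "$dir" 2>/dev/null
  done | grep "^$1" | sort -u
}

list_path_prefix qemu-system-
```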

Sigh. Actually implementing this is easy. Figuring what the right thing to do actually is: hard.

Went outside and painted over the fresh graffiti in the alley. Got a mild case of heatstroke and sunburn (remembered that Fade's bike had some spray sunblock on it after about an hour of painting in direct Texas sunlight), managed to whack the front of my right shin quite impressively in the process (they can swell up; good to know), got paint all over one of the dining chairs, destroyed two paintbrushes, and destroyed our broom. But I got it done, and I'm pretty happy with the result. (I'm getting better at this, hopefully next time I have to do it it won't take 2 hours and will cause less collateral damage. 1pm was not the optimal time to start this, even in March. Diet Rockstar energy drink was not the optimal beverage to stay hydrated, switching to tea for the second half worked much better. Oh, and we still don't have duct tape but we _do_ have blue painter's tape. My leg hurts.)

I want to up Netflix to at least 4 disks. (Mark has 8, it apparently costs him $45/month which is what he was spending on cable.) There are tons of Doctor Who DVDs out now, including some really good episodes I want to show Fade. (Invasion of Time is out! Woo!). Especially since Hulu started being stupid with blatant watermarks over the video you're trying to watch. (Ok, Fade can watch a lot of these videos online on her mac. I expect that'll be really nice in about 3-5 years. But for right now having more disks to choose from means we'd watch them more often.)

Looking at the non-credit courses at UT. Turns out the second wave of them starts right after spring break, i.e. next week. This means I didn't miss the remedial scribbling class I wanted to take (ahem, "Drawing I"), it starts tuesday. I'm also interested in "Aerobics - Lunchtime" (1010.601, 18 meetings, MWF Mar 30-May 8, 12-1 pm). I'm a bit confused about "Xtreme Abs", which claims to run from 7:30-7:45 twice a week. 15 minutes? Um... ok? I wonder if Fade would be interested in the Ballroom Dance class (two sections, one meets mondays and one meets wednesdays, both starting next week). I might be interested in "Japanese for beginners" if it wasn't the same time as the Drawing I course.

March 19, 2009

Watching various Ted talks. Adam Savage is geekier than me, and I am deeply impressed. It's interesting that Don Norman has wandered from usability to aesthetics, but the talk was so-so. Dan Ariely is a behavioral economist, which is a specialization I didn't even know existed, and gave a _marvelous_ talk. I also really liked Dan Gilbert's (long) talk about how everybody sucks at math in the real world (enumerating common mistakes).

Sigh. A large "Viz Media" watermark has shown up over all the bleach episodes on Hulu. I don't know when this started (it wasn't there last time I was watching them), but it's intrusive and distracting, and after an episode and a half I realized it's taken all the fun out of watching them. And since I refuse to reward the importer by giving them money for making something more annoying, I guess that means either give up on the series or track down fan subs on bittorrent. (I actually prefer to watch this legally... up until the point where they get abusive.)

Darn it, Hulu's doing it for everything now, including The Colbert Report. Well that sucks. So much for watching things on hulu. (They have a whole page to put stuff on, and they feel the need to obscure the video itself. That's too stupid to live. Yes, the watermark is far more annoying than the commercials are. I choose to watch the commercials to support the people legally providing me with the video, but standing between me and the video when I'm trying to watch something is just RUDE.) I can watch ted talks and msnbc's podcasts and all sorts of other stuff WITHOUT prominent annoying watermarks.

Ok, so it's not viz being stupid, it's hulu being stupid. So check netflix and see if bleach is there... Yup, 16 disks. Ok, add that to the queue (I've seen several of them already, but Fade hasn't). And worry about it later.

March 18, 2009

SxSW is over and I didn't see a bit of it. Andrea headed out around midnight. I was having bad sinus trouble so she gave me a sudafed, warning me it might keep me awake, which instead put me to sleep.

The qemu 0.10.0 release doesn't work out of the box for powerpc or sh4. The patch for sh4 -append hasn't been merged yet, but is easy to apply. The powerpc issue I tracked down yesterday is something in openbios. A workaround is to skip -nographic, but that makes the emulator much less useful because you can't script it through stdin/stdout.

Greg Ungerer's done a marvelous job hand-holding a git newbie like me. (Ok, I didn't know how to fetch a branch. It hadn't come up before.) Confirmed his m68k fix works, and hopefully it'll make it into 2.6.29.

The sh4 native building issue was fixed by narrowing --strip-unneeded. That was a weird symptom, but it explains why the cross compiler worked but the native compiler didn't.

Finally got Stephen Dawkins' 610 kernel tested (he hasn't got a debug adapter so he sent me a kernel to test with mine... back on the 10th. Oops.) It didn't work. In theory you can tftp new firmware into a router that's in the process of booting, it stops and waits about 3 seconds for you to do this, then continues booting with the firmware it's got if you don't. In practice, I've never managed to hit the window (not that I'm sure I'm even attempting it correctly).

Catching up on my podcasts, watching Rachel Maddow's February 16th show (yeah, a touch behind) where she's covering Kim Jong Il's birthday. Am I the only one amused that the miracles the government of North Korea is attributing to Kim Jong Il (his birth caused pear and apricot trees to sprout in winter, and this year his birthday created a halo around the moon) are pretty much the same as the Roman government attributed to Jesus (he smote a fig tree and his birth was marked by a star)? Then again I categorize "the mother of all battles" and El Shrubbo's "mission accomplished" banner together. Obviously, all this propaganda is the result of Intelligent Design, by His Noodly Appendage.

It occurs to me that around half the eating I do is for social reasons. I went out to dinner with Andrea at Whataburger last night (having just cooked and eaten chicken), I'm meeting Mark for dinner at Texadelphia (having just eaten leftover chicken and an apple), I regularly enjoy going out to restaurants with Fade, and biking out after midnight the only places that are open are restaurants...

So, according to qemu svn 6658, the OpenBios image got updated from OpenBios svn 450 to OpenBios svn 463. So how do you build openbios? Hmmm, let's see, no ./configure, the README says it's config/scripts/switch-arch, which supports "cross-ppc". Ok, try that and run make. (I haven't supplied a cross compiler yet but presumably it'll complain when it gets to that point...) And it builds Forth stuff for a while, and then dies with:

Building OpenBIOS on amd64 for ppc
tail: cannot open `/build.log' for reading: No such file or directory
make: *** [build] Error 1

Well of course it can't access a file out of the root directory, I'm running the build as a normal user. And V=1 does nothing. No obvious way to make it verbose, actually, so time to grep for occurrences of "build.log" in the source... Ah, the bug is actually that the error reporting tries to "tail $$$dir/build.log", and that dir should only have two $ on it. Queue up an email to blue swirl about it, fix my local copy, and... It's looking for powerpc-linux-gnu-gcc, and it's getting that from the TARGET=powerpc-linux-gnu- prefix in config/examples/cross-ppc_rules.xml... and there isn't an obvious way to override it at configure time. It's generating obj-ppc/Makefile but calling it via recursive make so overriding it on the make command line won't even work. Great. Ok, edit my copy to be just powerpc-, make clean, re-configure, set the $PATH to point to the powerpc cross compiler, re-run the build, and...

Wow, I think it worked. (I usually expect packages to put up WAY more of a fight...) obj-ppc/openbios-qemu.elf is 271368 bytes, the 1.0 release version in qemu is 271336 bytes and "file" says it's a 32-bit ppc ELF...
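For the record, that $$$dir makefile quoting bug is easy to reproduce. In a makefile recipe, $$ becomes one literal $ handed to the shell, so $$dir is the shell variable; with three, make eats $$ as a literal $, then expands the (empty) make variable $d, and the shell ends up expanding a mangled name. A quick demo (this recreates the bug in miniature, it's not openbios's actual makefile):

```shell
# "two" uses correct escaping; "three" is the openbios bug.
cat > /tmp/dollar-demo.mk <<'EOF'
two: ; @dir=/tmp; echo "$$dir/build.log"
three: ; @dir=/tmp; echo "$$$dir/build.log"
EOF
make -f /tmp/dollar-demo.mk two three
# prints:
#   /tmp/build.log
#   /build.log
```

Which is exactly why the error message complained about "/build.log" with the directory part missing.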

March 17, 2009

Biked to the north Chick-fil-a location last night. Adrienne called (locked out) while I was there. It's at least an hour and a half bike ride each way (closer to 2) so she drove up to get the key. I only got about 45 minutes of work done because they closed at 10, but yay exercise.

Thinking of biking to the south chick-fil-a location today. Made it as far as the Starbucks on 15th street.

Mark fingered the --strip-unneeded line as screwing up all sorts of stuff. In his case, breaking the ncurses build. In my case, it's the reason static linking wasn't working. (It might also be the reason that sh4 wasn't able to compile anything, but I can't be sure because qemu 0.10.0 won't run sh4 either. I vaguely recall I'd patched something in qemu to make it work, need to track down and redo that...)

Ok, installed svn on securitybreach, and now binary searching to see where qemu broke powerpc. Let's see, 6591 works, 6776 is broken, 6650 works... It's svn 6658 that broke it. Poked the #qemu channel on freenode and the qemu mailing list. Apparently it only breaks -nographic. Go figure.

Huh. So Dell responded to the Macbook Air by coming out with the Dell Adamo, a system which costs $200 more than the Macbook, has a slower processor, a less powerful graphics chip, weighs a pound more, and is slightly wider...

I've been dropping "span" tags in my blog for something like a year now, meaning to write a cgi filter that'll actually use 'em, and for some reason today I keep doing "spam" tags instead. (Blame my email...)

Interesting. Time magazine estimates that if California legalized pot, it could balance their state budget by $2.3 billion/year. That's $1.3 billion in tax revenue and $1 billion saved in enforcement. (It's california's most lucrative agricultural crop, earning $14 billion/year.) Legalization would be personally inconvenient since I continue to be allergic to the stuff, but other than that it seems kind of obvious, possibly inevitable.

March 16, 2009

Dropped Adrienne off at SxSW this morning (she overslept; I simply hadn't set the alarm), then returned yet another fax to my sister's realtor, and now hanging out at the coffee shop next to Kinko's on 26th street. The barista is _stunningly_ cute. (Somewhat distracting, actually.)

Wendy at the credit union poked me about hazard insurance for Kris's new house, and I phoned and left a message with the Nationwide office on Far West.

Tested a new m68k tree from Greg Ungerer. Didn't fix the problem (uClibc build dies trying to generate the syscall list due to headers exported by make headers_install #including files it didn't install), but I got him the error message and hope for a reply.

Replied to an email about my tinycc fork (unrelated to resubscribing to the mailing list, this is a college professor at Tufts who says he and his postdoc wanted to hook my old fork up to a different backend). Could be interesting. (I still think qemu's tcg is the logical backend for the thing, but _any_ work to genericize it so the frontend and backend talk through some kind of defined interface can only help matters.)

Critiqued Pavel Machek's filesystem requirements documentation some more.

Wrote up a quick summary of last week's 440 work and sent it to Mark so he can do Impact Linux invoicing.

Heard back from the insurance guy, need to talk to an insurance guy in Minnesota. Right, called him, gave him the contact info for the mortgage lady so she can tell him what kind of insurance I need.

Tracking down how I broke the powerpc target, checking the revision history and building old versions until I find the one that doesn't work. (This would go faster if I didn't keep running out of disk space. Yeah, I managed to fill up the new bigger drive.)

I'd hoped to go to some of the free SxSW interactive parties tonight, but after being beaten to death by interrupts over the weekend I just haven't got the energy. Might go bike to someplace to huddle with my laptop...

March 15, 2009

Didn't get to update my blog yesterday. It's 8pm and I'm keeping Mark waiting while I stop at a restaurant to eat for the first time today.

Too overscheduled. Really in danger of burning out. I am now officially swap thrashing. I spend so much time being interrupted and context switching that I'm getting NOTHING DONE.

Adrienne's visiting and I have no time to spend with her. She's free for dinner tonight as soon as she finishes her nap; she came back from the conference exhausted. She's skipping the SxSW parties tonight, many of which I could go to if I had time because they're free. I really wanted to see the vintage coin-op video arcade they had (free to everybody, no badge required) at the convention center, but I never had time and it was only there for three days ending today.

Fade made it off to the airport early this morning. I didn't get to spend a lot of time with her this week, but she's only gone for a week.

Mark's wanted to meet me every evening for four days now (both because he needed help moving slightly _farther_ out of town, and because he needs help with a big documentation task Cisco assigned him; Mark doesn't really do documentation). He's my friend and I want to spend time with him, and the technical stuff is even billable time, but with traffic it's an hour drive each way to his apartment. (Between errands, visiting Mark, driving Adrienne around, and driving Fade around, I've gone through an entire tank of gas this week. That's kind of unusual.)

I'd hoped Adrienne's visit would inspire me to clean the condo (which would have helped Fade's pre-trip stress too), and I did manage a couple hours, but I just didn't have time to finish it.

The mortgage on my sister's new house was approved, although the closing had to be rescheduled for sometime later this month (I forget when, but it's not this week and I think it's in my email). I need to go get some kind of house insurance ASAP (I forget what kind, but it's in my email). Still haven't managed to do so, maybe I'll find time tomorrow. (I need to call Kris and see how she's doing. Maybe I'll find time tomorrow.)

I haven't had a chance to open the Lua book since the last time I wrote about it. Yes, I know Gentoo From Scratch is blocked on a new useradd (because Mark doesn't want to install an old version of shadow utils for some reason) and useradd is a toybox thing and toybox is blocked on me learning lua to rewrite it. I'd _love_ to have time to learn lua this week. It's very hard to do in 15 minute increments. Maybe I'll find time tomorrow.

Spamalot was in town all this past week and I missed it. I missed David Schwimmer mocking Showgirls at the drafthouse. I missed the start of the drawing class at UT I wanted to take this semester. Lawrence Lessig is in town for SxSW, maybe I'll find time to meet him tomorrow.

I've made trivial progress tackling my todo list for the FWL 0.9.6 release I wanted to do tomorrow. Maybe I'll find time tomorrow.

Went out to Whataburger last night and carved out a couple hours to poke at the 440 toolchain, because I'll have to do a progress report about that tomorrow. (I'm calling this week half-time because even with poking at two different crosstool variants I can't say I managed more than 20 hours.) I duct taped something into a FWL target that _might_ look like a 440 toolchain in the right light, and fired up "qemu-system-ppc -M bamboo", and found out that the 440 processor isn't _implemented_. It's just a stub, the bamboo board just emulates the system-on-chip peripherals and otherwise uses a 405 processor emulation. (I'm embarrassed this took me until yesterday to discover, but I didn't have anything to _test_ before then.)

I'd love to go visit the doctor to get antibiotics for the sinus infection that's preventing me from sleeping for more than a couple hours at a time. I've meant to go ever since I got back from Pennsylvania. (Of course the paperwork for better health insurance never got filled out anyway.) Maybe I'll find time for both tomorrow.

There are so many other things that have fallen through the cracks. Solar asked me to cut new FWL binaries on friday, and Vladmir Dronnikov's wanted those for months. Haven't found time. I need to set up the nightly build cron job, which means I need to fix static linking, see "FWL todo list" above. It's just a simple binary search regression test, I could sit down and figure it out in an hour or so. (Might be when I fiddled with the wrapper, might be when I started stripping the libraries.) And the m68k guys sent me a link to an atari emulator that can apparently run an m68k kernel, but I haven't looked at it yet, nor have I followed up on the m68k headers_install issue in 2.6.29. Maybe I'll find time tomorrow.

I need to poke Eric about the C++ paper (he says he emailed me a draft earlier this week, but I didn't get it, and he's not getting my email either; sounds like a local delivery routing issue on grelber we need to track down). I haven't been to Starbucks in weeks. I need to get some exercise, and a haircut. My current car insurance card expired on the 5th (I think they sent us a new one, but dunno where it is). I need to hire a lawyer to show those Stafford idiots that when they gave me a "failure to show proof of insurance" ticket I did, in fact, have insurance (the card fell under the passenger seat). I need to deliver those shirts to Reese. I haven't spoken to Stu in weeks, I need to resubmit the perl removal patches, I may _never_ get QEMU weekly news updated...

Sigh. I don't even REMEMBER most of the things I have to do, they're just piling up into a big wave. But Mark called, and I guess I've kept him waiting long enough. Off to do documentation...

Ok, got some documentation edited over at Mark's, took Adrienne to dinner around 1 AM at the 24 hour IHOP (after visiting a 24 hour wal-mart and walgreens trying to find a supplemental battery pack for her smart phone), going to bed in hopes of getting up early in the morning and tackling backlog. May just turn my phone _off_ all day to avoid interruptions.

March 13, 2009

Lua books (2): check. Roxette "Look Sharp" album (to which I learned C): ripped, converted to mp3, and shoehorned into rockbox (which has the most horrible user interface, but oh well). Soda refills (Einstein's this time): check.

Not a bad language, really. It's firmly in the whitespace agnostic camp, not even requiring semicolons (which, to be honest, aren't usually needed in C either; the compiler tells you when you've dropped one). It uses "end" in place of closing curly brackets, and makes it mandatory instead of having magic single line contexts after if statements and such. 21 reserved keywords. -- starts a single line comment, --[[ and --]] are multi-line comments. Variables are created by assigning to them, and reading a nonexistent variable returns "nil" (instead of throwing an exception, ala Python).

I love this Album. Yeah it's old, but it's great to program to. (Joyride's ok too.)

Ok, I'm to section 1.3 in "Programming in Lua" and I've hit the first place I want to smack the author: "Global variables do not need declarations. You simply assign a value to a global variable to create it... if your variable is going to have a short life, you should use a local variable." Ok, _HOW_? YOU'VE NEVER MENTIONED LOCAL VARIABLES BEFORE NOW. This section offers no way to distinguish between local and global variables. I'm GUESSING it's a scope issue, but that's just a guess. Throwing me a forward reference here would be very reassuring; the author did not do this.

Wow Rockbox really is a horrible music player. (It works fine, it's just a pain to pause and unpause when I get a soda refill. Plus putting music _into_ it requires an act of congress.) I miss the KDE one. After several months of daily use, I think I can confidently say that I hate everything about Gnome. (I still have my little orange "iPod Klepto" around here somewhere, but apparently it's not in my backpack. Presumably in the car. Bit fiddly to use my giant noise cancelling headphones with a device that's maybe 1/20th their size, but it's got the "big prominent pause button" thing down. Hopefully there's a way to get it to play things in order rather than random shuffle. I'll have to look into it.)

Dear Gnome console developers: putting the close button for the last tab right above the up arrow for the scroll bar is DEEPLY STUPID. The Konsole people figured out not to do that _years_ ago, and I just closed 3 tabs I wanted to keep in a slightly misaligned attempt to scroll up. (This is unrelated to the above statement that I hate everything about Gnome, which predated me doing this.)

The original crosstool only _claims_ to have support for the 440. If you actually try to build it, the thing breaks trying to build the first package (gcc 3.3.6, which is _really_ strange because I _told_ it to build gcc 4.1.1, which is the latest version it has). Looked at its config and it _seems_ the magic is --with-cpu=440 for gcc. Dunno if I need to tell binutils something (it didn't try to build binutils before trying to build gcc 3.3.6... I don't get it either).

Broke down and resubscribed to the tcc mailing list a couple days back to ask if CVS was dead yet. No official response so far, but I replied to the one guy who commented with a big long rant about what tcc SHOULD be doing. (Do the busybox swiss-army-executable thing and replace binutils, grab TCG from QEMU and use it as your code generator, clean up the path mess, try to build real packages... The usual. The kind of things I was working on when I left. Progress made so far: zero. Oh well.) They'll probably say CVS isn't dead, and then I'll go away for 6 months again. It's a bit like Groundhog's Day: I pop my head up, see CVS, and go away for 6 more months.

March 12, 2009

Once again searching for a 68k emulator to run the m68k target on, since qemu's 68k support is just coldfire. I found the "uae" and "xmess-x" packages in Ubuntu, which in theory might both be able to load and run a 68k Linux kernel, but both projects are too insular to actually have obvious contact info for anybody I can ask about them. The m68k linux site is all about using real hardware, and doesn't mention a single emulator. Sad, really...

Punt that as a tangent I haven't got time for right now, focus back on clearing the deck for the new ppc440 target.

Downloaded the original crosstool, which already has 440 support, albeit only for glibc and last updated ~3 years ago.

March 11, 2009

The last time I tried to get gcc 4.2 working was over a year ago, but I seem to be making the same mistakes in the same order. (Up to and including setting ac_cv_path_RANLIB_FOR_TARGET="$ARCH-ar" instead of $ARCH-ranlib.) Oh well, I vaguely recall the debugging steps I did last time, so it's not taking nearly so long to fix. (Checking my old blog doesn't help, the granularity of what I blathered about wasn't high enough. I thought I had the old patch saved somewhere, but if so I can't find it.)
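For reference, the kind of mistake described above can be sketched like this (the ARCH value here is a made-up placeholder; the cache variable name is the real one from the session):

```shell
# Preseeding an autoconf cache variable for a cross build. The easy
# mistake: both lines below look plausible, but the ranlib cache
# variable has to point at ranlib, not at ar.
ARCH=armv5l   # placeholder target prefix, not from the original post
ac_cv_path_RANLIB_FOR_TARGET="${ARCH}-ranlib"   # right
# ac_cv_path_RANLIB_FOR_TARGET="${ARCH}-ar"     # wrong: ar is not ranlib
export ac_cv_path_RANLIB_FOR_TARGET
echo "$ac_cv_path_RANLIB_FOR_TARGET"
```

The annoying part is that the wrong version still configures and builds most of the way before anything notices.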

I'm pretty happy staying with gcc 4.2.1 until a gcc alternative comes around because the FSF seems to have gone completely insane:

We are also concerned about recent attempts to implement the C front-end of gcc in C++. We believe that is a bad decision in general (due to demanding C++ as bootstrapping environment) and would like to get rid of the gcc dependency for these reasons.

You'd think even the FSF wouldn't be crazy enough to rewrite their C compiler in C++, but you'd be wrong...

Sigh. Once again, moving to gcc 4.2.1 fails to fix _anything_. The "duplicate .personality directive" thing with arm eabi is still there, now m68k is broken... No, that was broken by the 2.6.29-rc7 kernel.

Note to self: updating multiple things at once is a bad way to debug anything. Fix stuff and commit it, THEN upgrade the next thing. (This is why I'm holding off on moving sources/packages up a level so "sources" is all stuff stored in mercurial without other files mixed in there.)

Trying to close down for a FWL release, but I still need to install 4.2.1, get the 2.6.29 kernel working smoothly, fix arm eabi, fix sh4 native building, fix native static linking (which is blocking the nightly cron job), fix the "new" pthreads implementation in uClibc, add a PPC440 (bamboo) board, fix sparc, redo the USE_UNSTABLE stuff, move sources/packages up a level, make sure uClibc didn't break anything...

March 10, 2009

My tax appointment turns out to be at 2pm, not 10am. Oh well, it means I got Fade up in time for class even though the power apparently went out last night and the alarm clock was all flashy this morning.

I do not understand Firefox. When I type into the URL bar it has a pulldown guessing things I might want to do based on browser history. (This is fine, Konqueror's been doing this for me for 5 years, I'm glad they caught up.) But when I hit "d" (and nothing else), its first suggestion is "" which hasn't got a d in it. (It's one of the ~2 dozen webcomics I follow.) The description for that is "Freefall 01702 March 9, 2009" which hasn't got a d in it either. This entry is by no means the most frequently visited site I go to (it only updates monday/wednesday/friday, Schlock Mercenary updates every day including sundays), and it's neither the first nor the last entry in my daily webcomic trawl (it's usually after Misfile and Questionable Content but before Girl Genius and Irregular Webcomic... none of those are on the list it shows.)

The second entry is, which hasn't got a d in it either but the title string it remembers is from Friday which has a d (and that d is underlined by the dropdown thing)...

Sad, really. That's got a d in it too.

Yay, managed to panic the kernel and reboot my laptop. (Once again, the ipw3945 driver went catatonic, and an rmmod/insmod cycle brought the whole thing down. I realize it's now called "iwl3945", but am ignoring that because the rename was pointless and stupid. It's the Intel Pro Wireless 3945. That's the product's name. I don't care if Intel calls it the Purple Gorilla, I just want CONSISTENCY.)

March 9, 2009

I wonder if anybody's written a sequel to Fear of Forking incorporating the effect of distributed source control. With new tools, the friction of moving patches between trees is greatly reduced, allowing properly managed forks to diverge farther before resyncing between them becomes problematic. (This has sped up development considerably.)

I also wonder if anybody's written a proper text essay covering the points in the time based releases video. It's a great talk, but it takes an hour to watch; a good text version could convey the information much more concisely.

For both of the above, read "I wonder if anybody's written" as "I hope I'm not going to have to write this up myself, because I haven't got time for this."

The Donald's is offering free cinnamon rolls with any "McCafe" purchase on Wednesday. Still trying to strap rockets to that pig and get it airborne...

Sigh. I have an adversarial relationship with cash. Last night (this morning?) when I biked to Ihop at 3am, I had a $20 bill left over when I left. I have no idea where it's gone. Using my debit card to buy the 2 for $1 apple pies seems _silly_, but...

March 8, 2009

I keep forgetting about various things I've done until something reminds me of them. (No, I'm not talking about finally having picked up my new glasses from the mall, although they are nice.) I emailed Mark a link to my 3 waves articles (linked from the end of this page) due to a conversation we had on Friday. It's an ancient thing that I just assume my friends already know about, but apparently not. I haven't written for The Motley Fool in almost a decade, don't have anything to do with Penguicon anymore, haven't done much with Liquid Nitrogen in a couple years, am not currently involved in any intellectual property lawsuits that I'm aware of, haven't taught college courses in a long time, and so on. (Oh, and my chain mail making equipment keeps getting taken out of my backpack every time I go flying somewhere, just remembered and put it back.) I don't sing or play the piano anymore.

Of course I start new things all the time too. Busybox is receding into ancient history, but toybox is likely to get relaunched in Lua once the books Mark ordered come in.

I'm also pondering writing a webcomic. I can't draw, but that doesn't seem to be a requirement (at least not initially, and even the ones who couldn't draw tend to improve. And sometimes they can draw some things better than others, and there are some obvious ways to cheat...)

Still, huge time commitment. Goes on the todo heap. (Been on the todo heap for years, but Penguicon was on the todo heap for 3 years before it actually happened.) I should catch up on this first, though. And find somebody to hand this old thing off to.

Speaking of which, if I was still involved with Penguicon I'd try to get Lawrence Lessig and John Conyers on a panel to talk about copyright. The reason that's particularly relevant is that Conyers is a congressman from Michigan's 14th district (I.E. Detroit), so if the Penguicon guys wanted to lobby against him in the primary it would _mean_ something, so he'd want to listen to them. And Lessig would want to bend his ear something fierce, and that would be worth seeing.

But I doubt the current lot would think of it, or bother to do anything about it if they did. Oh well.

Today while emailing the Armadillocon panels guy, I wound up looking at some of the comments about the video of me throwing liquid nitrogen into a swimming pool years ago. Back when the Penguicon concom were doing the "Oh no, nitrogen, it'll kill us all" thing (hint: this was before it was deployed at Penguicon, which was after we'd done it at Linucon _twice_) I wrote up an analysis of how dangerous it really is, at least from a suffocation standpoint. (Summary: not as much as you'd think, although you can always drown in your bathtub if you try.)

It's amazing to watch people deeply, deeply clueless about something, with no firsthand experience, try to warn others about the dangers of it. Liquid nitrogen is about as dangerous as boiling water, which isn't to say it can't hurt you, but fleeing in terror is probably not a measured response to the situation. No, the pool wouldn't freeze solid from one bowl of LN2. No, it wouldn't cause your arm to shatter. (Flash-freezing _anything_ bigger than a grape is Hollywood theatrics, even if you discount the Leidenfrost effect. The specific heat of the material isn't that high, the quantities involved are fairly small, and there are properties like "thermal conductivity" (read "insulation") that come into play to prevent anything more than surface burns.) I've been stung by LN2, but it never even left a red mark. My friend Reese had a whole bowl dumped in her lap (by somebody who was looking at the pretty girl and not where he was pouring during an ice cream making session) and she wound up with a nasty looking burn down her thigh (somewhere between first and second degree, some blistering but not much); she said she'd gotten worse spinning fire. And Mark grabbed a runaway vapor hose whipping around when the phase separator shattered at the first Linucon (it's sort of a metal sponge on the end of the hose that keeps the pressure contained so liquid comes out instead of a 3000 psi stream of gas; somebody let go of the hose after filling a bowl and it whacked against the side of the metal tank while still frozen from use, and it shattered). Mark had to hold on to the hose (no glove) with one hand while turning the tank's nozzle off with the other, and the last joint of his middle finger on that hand swelled up and essentially turned into a giant blister, but it healed with no permanent damage.

Yes, I was wearing flip-flops in the video. I always wear flip-flops, but in this case they turn out to be recommended because shoes can trap the liquid against your skin, or absorb it. The _TOWEL_ was the most dangerous part about what I was doing because it could absorb LN2 and hold it against my skin (rather than letting droplets skitter off like water on a hot griddle). But I didn't have anything better (the metal was painfully cold, think Minnesota in February), small splashes would evaporate (room temperature is over 200 degrees Celsius hotter than liquid nitrogen, it boils instantly on contact with _anything_), and if it started to sting my hand badly enough that it felt like I'd get an actual burn I could always dump the bowl on the ground and step back. (That would give a similar cloud of fog, but not as much as the pool. The fog's water vapor, don't you know. Nitrogen's as transparent as you'd expect something making up 78% of the atmosphere to be.)

People who have little or no firsthand experience with fire think if you touch a match to a log or a person it'll go up in a blaze instantly, when in reality you can pinch out a candle flame with your fingers and getting a campfire lit can be serious work in damp places like Michigan. Liquid nitrogen's the same: yes it's cold, no it's not magic, no matter what Hollywood likes to show.

Several of the commenters confused it with liquid helium (according to the guy at the UT cryolab that's far more dangerous, no Leidenfrost effect to speak of so little droplets stick to your skin and burn it; also several times more _expensive_ than LN2), and with liquid oxygen (that stuff's a fire hazard and again significantly more expensive, but I have heard of people making ice cream with it and they said it tasted fine. Gasoline's a fire hazard and we pump our own gas, so...).

Off to watch more bleach, then maybe actually get out and bike today. Tomorrow morning I have to collect documentation to get to the mortgage lady for the house I'm buying my sister, and then tackle crosstool-ng some more.

I need to write a pod2man replacement in lua, so I can remove the use of perl from the documentation build of various packages. (Man pages are legacy, but still hanging around...) Yes, qemu svn 6785 concerned me a bit when I first saw it. Even BusyBox uses pod2man. Camel's nose under the tent flap. Just a _little_ deficit spending...
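Just to make the scale of the problem concrete, a toy stand-in for the two commonest POD directives fits in a few lines of sed. This is an illustrative sketch only, not the planned Lua implementation, and it ignores most of the POD format:

```shell
# Toy pod2man stand-in: "=head1 FOO" becomes the ".SH FOO" man section
# macro, "=cut" lines are dropped, everything else passes through.
# (Real POD has many more directives: =item, =over, B<>, C<>, etc.)
pod2man_lite() {
  sed -e 's/^=head1 \(.*\)/.SH \1/' -e '/^=cut$/d' "$1"
}
```

Usage would be something like "pod2man_lite foo.pod > foo.1" (hypothetical filenames). The point is that the legacy format is simple enough that replacing the perl dependency is tedium, not rocket science.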

March 7, 2009

Fade and I went to see three museums today, each of which was very very small and proudly displayed the kind of things you tend to see hanging on restaurant walls.

Watched Fade play a lot of Neverwinter Nights 2 today, now watching more bleach and pondering a bike ride. (Yay exercise, now that the pollen count's down to something measurable.) But then I couldn't watch bleach episodes...

Downloaded crosstool-ng. It insists I install libtool and makeinfo (well, texinfo) before it'll finish its ./configure. That's just sad. An obsolete documentation format nobody uses anymore, and a tool that makes non-ELF systems pretend to be ELF systems (and is thus a NOP on ELF systems).

Ok, installed the vestigial organs it complained about, but now I don't see any way to tell it what target I want to build a cross compiler for. The ./configure's --help option doesn't mention it. There's no "make help" target. Typing "make" by itself makes a file called "ct-ng" and some docs files... It's another makefile. What an odd roundabout way of building.

So the documentation files are in man page format. (Why?) And start with:

crosstool-NG makes it easy to build cross-toolchains, and allows you to take all the juice out of your target by configuring the different components of the toolchain accordingly to the targeted processor.

What does it mean to "take the juice out of your target", I wonder?

It mentions a menuconfig, but there's no "make menuconfig" target. Tried "make -f ct-ng menuconfig" but that complained about being unable to find the current directory's files under "/usr/local/lib". So I have to install this source at an absolute path in order to use it? That's kind of insane...

Ah, it expects me to "make install" the _source_code_ so it can then build from this absolute path. Sigh. (I am now repeating "I'm reinstalling this laptop when xubuntu 9.04 comes out" like a mantra. As mantras go, it's a bit long.)

It doesn't autodetect $CPUS, it has you set it by hand in menuconfig. Lots of other things I use --options for, it sets in menuconfig (including things like "exit after downloading source code" where menuconfig seems an outright INSANE way to do it). It has a "Use obsolete features" selector but even when I don't select that it defaults to binutils 2.16.1 (and lets me select) and kernel 2.6.26 (and lets me select up through 2.6.27 something). Oh wow, it's got a config option for the gcc "-pipe" flag. Is this an implementation detail the user should be bothered with? And what's up with "Kernel verbosity" here?

Heh, yeah I heard that gcc 4.3 now needs additional random external libraries (gmp and mpfr). Yet more reason to look into other compilers that aren't maintained by crazy people. I don't think gcc 4.2.1 (the last GPLv2 release) was afflicted with that, though.

Why does this thing have config options with only one entry? "C compiler (gcc)"... is a menu with one thing you can select. "Toolchain type (cross)" is another one.

Ok, it's configured for generic powerpc. I'll have to dig up 440 options later (and make a uClibc config for it), but let's see if the build can download the source tarballs first...

Huh, "Directory '' does not exist. Will not save downloaded tarballs to local storage". That's kind of crazy. Ok, tell it "walrus"... and it complains that doesn't exist either, I have to mkdir walrus for it. It's also complaining I "did not specify the build system". What does that even MEAN? Isn't crosstool-ng a build system for cross compiler toolchains?

[ERROR]    Build failed in step 'Extracting and patching toolchain components'
[ERROR]    Error happened in '/usr/local/lib/ct-ng-1.3.2/scripts/functions' in function 'CT_DoExecLog' (line unknown, sorry)
[ERROR]          called from '/usr/local/lib/ct-ng-1.3.2/scripts/functions' at line # 564 in function 'CT_ExtractAndPatch'
[ERROR]          called from '/usr/local/lib/ct-ng-1.3.2/scripts/build/kernel/' at line # 25 in function 'do_kernel_extract'
[ERROR]          called from '/usr/local/lib/ct-ng-1.3.2/scripts/' at line # 430 in function 'main'
[ERROR]    Look at '/home/landley/x-tools/powerpc-unknown-linux-uclibc/build.log' for more info on this error.

Where the heck did /home/landley/x-tools come from? I never asked for that. It's making random directories in my home directory... (Ah, it's part of the default .config, CT_PREFIX_DIR. That's _INSANE_.)

They really expect this sucker will only ever be run on a dedicated build system.

Here's a message Linus Torvalds wrote back in 2000 railing against something similar. Hardwiring absolute paths into your build system is a BAD IDEA, and this is not a new observation. (The phrase "locality of reference" comes to mind...)

But it's worse than that. It's not just that I am _in_ a directory, so if you need to put the output somewhere, here (or a subdirectory of here) is the logical place. No, it's that the build is already reading .config relative to where I am, so I have to build in this directory. The dependency on the current directory already exists, and they add an absolute hardwired output path on TOP of that. MADNESS!

Ok, so that directory contains a build.log (why not just spit it out to stdout???) that says:

[ALL  ]    bzip2: Compressed file ends unexpectedly;
[ALL  ]     perhaps it is corrupted?  *Possible* reason follows.
[ALL  ]    bzip2: Inappropriate ioctl for device
[ALL  ]     Input file = (stdin), output file = (stdout)

That was while attempting to decompress the linux- kernel tarball it just downloaded. Left to its own devices, this build script couldn't competently download and extract a tarball.

Where's the corrupted tarball? The "walrus" directory it complained about is empty, "find" says it's in targets/tarballs. So I hit ctrl-c to kill the build because it complained it wouldn't save the tarballs locally until I created a directory, and it kept the partially downloaded tarball, and then didn't notice it was incomplete when it tried to extract it.

This is really horrible. How do people use this thing? (I guess there's a reason somebody's willing to pay _me_ to fiddle with it rather than do it themselves...)

And the frightening part? This package is apparently considered a huge improvement over the original crosstool. THE MIND BOGGLES!

By the way, all the problems I'm complaining about here? My build system handled them properly before I ever _looked_ at this thing. But _this_ is the one everybody uses as the standard, and I've been asked to beat a GPLv2 ppc-440-uClibc toolchain out of it. Right...

I deleted everything out of targets/tarballs and reran the build. So far I've gotten "4:43" (a minute and second counter) of a twirling bar "progress indicator" that doesn't actually indicate progress, just that the sucker hasn't hung yet. No indication of what file it's downloading, how long left to go... The default output of "wget", by the way, is a progress bar. This thing has to go out of its way to screw it up. And does.

And then it died because I hadn't provided a uClibc .config. Ok, sort of valid gripe, but the amount of expertise required to _use_ this tool approaches the amount of expertise required to _create_ the tool...

Random bleach comments: I vaguely recall from the first time I watched this thing that around episode 17 they started having real trouble keeping their contexts straight. (He got stabbed in astral form, and then his physical body's being patched up. Except I thought it was established that they don't have "the matrix" style manifestation of physical wounds from astral combat, what with Kon possessing him half the time he's away and all... Oh well, they'll have fewer contexts to keep track of a few episodes from now.)

March 6, 2009

So UML works if you enable CONFIG_EXPERIMENTAL. It became a requirement in 2.6.28, and apparently I'm the only person anywhere who noticed. Fun.
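For anyone hitting the same wall, the .config line in question (UML builds with ARCH=um; without this symbol the UML options silently disappear from the config instead of producing a useful error):

```
# 2.6.28 made UML depend on this.
CONFIG_EXPERIMENTAL=y
```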

Met with the mortgage lady this morning. She needs four different types of additional documentation. Track it down tomorrow...

Conference call with Vito and Stefan, got a concrete assignment (PowerPC 440 toolchain with current uClibc) for the next two weeks. Yay. Meeting with Mark this evening to go over documentation.

And the cable modem went out. Sync light flashing. Called them and they're sending someone today on an "emergency basis", but they have no idea when. Whoever it is will call an hour ahead of time, though. Three hours later, Fade and I are out at Schlotzky's because there is internet here and we have laptops, and my phone is a cell phone.

Hey, QEMU finally had its 0.10.0 release. (The first one that builds with something other than gcc 3.x, thanks to TCG.) Of course to find out about it, you have to either subscribe to their mailing list or magically know where their new website is, which strangely enough isn't this, isn't this, and weirdly enough isn't even this. Um, yeah.

Sigh. Spamalot is coming to Austin next week, but the website's "get tickets" link is broken. Fade's having midterms and then spending the following week (spring break) in Tokyo, so the entire week's pretty much shot for her. Oh well, maybe when Avenue Q gets here in April...

Eric has a Google Phone, which I saw when I was up visiting him in Pennsylvania, and I was vaguely pondering getting one to play with. It's a Linux programming environment with a keyboard, Arm based, even runs Lua according to Christian, and I wouldn't have to switch off of T-mobile to use it. But it turns out that they're too stupid to live, and I have now permanently lost interest in ever developing for that platform, ever. (Maybe if I were directly paid to work on something specific, but for fun? Let it die. I'd rather get an iPhone and jailbreak it, less hypocrisy all around.)

March 5, 2009

At Starbuck's, sans internet for the moment, pondering my todo list.

No idea what's up with Cisco anymore, but I've got email in to ask. (I'd hoped being away for a couple of weeks meant they'd have a clear idea what they wanted me to do when I got back, with work queued up for me, but apparently not. Out of sight, out of mind, I guess.)

Now at Thundercloud, meeting Fade for lunch.

On the FWL front, armv4l, i586, mips, mipsel, and x86_64 currently boot and build a threaded "hello world" with the native toolchain, using uClibc and linux-2.6.29-rc7. This means armv5l, i686, m68k, powerpc, sh4, and sparc aren't, and need to be fixed.

Two of those failures (m68k and sparc) have been completely broken all along, so there's nothing new there. The armv5l problem is that I'm halfway through converting it to eabi, and the toolchain isn't liking me, and the problem with sh4 is that the native toolchain still isn't finding _init and _fini out of crti.o (but the cross toolchain is; go figure). The powerpc problem seems to be a kernel thing; it's just not booting. (Or possibly it's booting fine but not attaching the serial device to /dev/console.)

The i686 problem is that I switched it to the "new" pthreads instead of the "old" pthreads. Yes, uClibc has two complete pthreads implementations, and is about to gain a _third_ threading implementation (NPTL), and Bernhard doesn't see this as a problem. I tried to eject the "new" one (which never worked) before 0.9.30 came out, but Bernhard wouldn't go for it. Now Denys and I are trying to eject the "old" one, which involves debugging the "new" one. So debugging there...

Looking at why hw-uml doesn't work. User Mode Linux 2.6.28 doesn't work when built on an Ubuntu 8.10 host. I've tracked it down to one of the changes that went in between 2.6.27 and 2.6.28-rc1, so now it's time to binary search between hg 110878 (works) and 118520 (broken).

Binary searching with Mercurial is fairly straightforward. I wrote about it last year: it's mostly just binary searching the sequence numbers of a range, and the tricky bit is that if you track it down to a commit that's a merge, you look at the two parent commits and figure out which one of them exhibits the problem. 114000 worked, 116000 worked, 117500 worked, 118000... doesn't build, it has an error in the block layer. Well, that can be configured out if we're using hostfs (and we enable CONFIG_EMBEDDED)... And it doesn't boot init if I switch it off. Did it _used_ to work with the block layer switched off? (Check 114000 again... Nope, hostfs doesn't work without the block layer. I wonder why that is?) Ok, 117961 is the start of a branch... and that works. 117980 fails. 117970 works. 117975 worked, 117978 failed, 117977 worked. It's 117978. Which is WEIRD because that only affects kconfig files. Ok, so what impact does it have on the .config? The symbol "CONFIG_3_LEVEL_PGTABLES=y" goes away. That sounds kind of important... Yup, there's the bug. The kconfig "lack of visibility means it doesn't get written out" insanity strikes again. Still stuck with that design flaw, after all these years...
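The narrowing loop above is mechanical enough to script. A sketch, with the hg-specific part left as a comment because it depends on having the kernel clone (and "build_and_boot_uml" is a placeholder, not a real command); mercurial also has an "hg bisect" command that automates the same loop:

```shell
# Narrow a [good, bad] revision range down to the first bad revision.
# The check argument is any command that succeeds for working revisions;
# in the real session it would be something like:
#   check_rev() { hg update -C -r "$1" && build_and_boot_uml; }
bisect() {
  good=$1; bad=$2; check=$3
  while [ $((bad - good)) -gt 1 ]; do
    mid=$(( (good + bad) / 2 ))
    if "$check" "$mid"; then good=$mid; else bad=$mid; fi
  done
  echo "$bad"   # first revision exhibiting the problem
}
```

The merge-commit wrinkle still needs a human: when the answer is a merge, rerun the check against each parent by hand.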

Poked the kernel list. Might be fixed before 2.6.29 ships, but really, has NOBODY used UML on x86-64 since 2.6.27? This broke in 2.6.28-rc1, was broken in the 2.6.28 release, and is still broken in 2.6.29-rc7. As far as I can tell, it doesn't work at ALL on x86-64. I know KVM and lguest and the other dozen virtualization schemes have taken a lot of the wind out of UML's sails, but UML only ever ran on a few platforms (ppc, x86, x86-64), ppc was never fully merged that I know of, and then x86-64 being dead for almost two releases... That's not a sign of health.

March 4, 2009

Plane back to Austin, offline all day. Read "Graceling" instead of trying to pull my laptop out of my backpack. (Decent book, awaiting sequel.)

March 3, 2009

So one easy way to bypass the VFAT patents on long filenames being asserted in the TomTom case is to use UDF instead. That's the filesystem format used by DVDs, which turns out to be perfectly writable when put on writable media such as a USB key, and should be supported by Windows and the Mac out of the box.

The problem is, the people who wrote the Linux mkudffs command didn't zap the FAT file allocation table at the start of the device, so if you plug in a device that used to be fat formatted but which you've reformatted UDF using that broken command, it gets autodetected as FAT and you see the old root directory. (If you then try to do anything with those files, bad things happen.)

Yeah, somebody thought that through.

So I now have a USB key that gets automounted as FAT by the stupid HAL when I plug it in, and then I have to umount it and "mount -t udf /dev/sdb /root/disk" to see the actual _files_ on it. I need to look up the fat file format and lobotomize it with a hex editor. (Also, there's supposed to be a mkfs.udf link so mkfs -t udf can find stuff, but there isn't.)
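The lobotomy doesn't strictly need a hex editor: zeroing the first few sectors kills the stale FAT signature. A sketch against an image file rather than the real device (substitute /dev/sdX at your own risk; the sector count is a guess, not a measured value):

```shell
IMG=usbkey.img
# Stand-in for the USB key: an 8 megabyte image file.
dd if=/dev/zero of="$IMG" bs=1M count=8 2>/dev/null
# Fake the leftover FAT state: the 0x55 0xAA boot signature at offset
# 510 (octal escapes because printf \x isn't portable).
printf '\125\252' | dd of="$IMG" bs=1 seek=510 conv=notrunc 2>/dev/null
# The actual fix: zap the first 64 sectors so nothing autodetects FAT.
dd if=/dev/zero of="$IMG" bs=512 count=64 conv=notrunc 2>/dev/null
# Then reformat: mkudffs "$IMG"  (needs udftools, so left commented out)
```

After that, HAL has nothing to misdetect and mkudffs gets a clean slate.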

I suppose this gives my backups accidental not-quite-cryptographic protection. (It's in a hidden partition! Oooh! That would be... steganography? No. How about "horrible, broken indexing"? Yeah, that one.)

Had two energy drinks today. Got a drop of the paper back to Eric. (Yes, once again I'm heading home with the thing unfinished. We made lots of progress, there's just so much to do...) I have a third energy drink in the fridge, but I think I'll hold off on drinking that just now lest the Cheeseburgers of Icann start demanding vengeance. (Caffeine plus sleep deprivation is _fun_! Wheeee... Grape mimes drink alike!)

So uClibc is out, and built for all targets in FWL. (I'm told C99 math is broken on PPC, but I have no idea if it worked before.)

LLVM 2.5 shipped. I should really go poke at that, but the todo list runneth over. Still need to track down what's up with sh4 and arm eabi, get gcc 4.2.1 working. Need to check in with Mark and Cisco when I get back to Austin. Appointment for taxes on tuesday, financing a house for my sister on thursday...

March 2, 2009

Sigh. It's not that I intend to violate GPLv3, it's that it's a complicated fiddly language which I haven't got the interest in studying as deeply as I have GPLv2. I don't want to _deal_ with it. It's incompatible with GPLv2, there was no pressing reason to _replace_ GPLv2, most of the packages I deal with (starting with the kernel) still use GPLv2, and I'd really like the whole mess to go away, but this means completely weaning myself off of FSF software. Luckily, they don't really _do_ software development anymore, so it's not as hard as it sounds. 95% of the problem is "the standard toolchain", which is only interesting because it's _standard_. (Yes, the FSF is doing what Microsoft does: "We suck at this whole 'software' thing, but everybody uses us, so deal with it. Now open wide while we leverage our dominance to cram the rest of our agenda down your throat.")

March 1, 2009

Got the new toybox mailing list set up. Christian expressed interest in poking at Lua. (So did Eric, we're all thinking it would be a good learning experience.) But when I cc'd the old list at, it bounced. (Ah. I moved the Firmware Linux list, but forgot to move the toybox one. Right.)

The ball's in Eric's court on the C++ paper. I got the updated outline to him... day before yesterday, I think. And yesterday I went through the first 4 sections or so to align them with the new outline, and delivered diffs to him. Alas, Wesnoth put out a release today, which ate his brain. Plus he spent half of yesterday blogging a patent analysis of the Tom-Tom thing, and didn't have the focus left to work on the paper afterwards.

So of course today Slashdot links to a much shallower analysis done by Bruce Fscking Perens, which made Eric sigh and shake his head a bit on the way out to dinner. Oddly enough I sympathize: I walked away from Penguicon and thus let Matt take it over, Eric walked away from OSI and thus let Bruce take it over, but we still move in the same circles. Unavoidably, we run into whoever's playing dress-up in our old clothes and roll our eyes at how badly they do it. But we left and life goes on. (Still, it's nice when you get a _competent_ successor, the way Denys is doing a great job on BusyBox, and Eric seems happy with the current Fetchmail guys...)

Both the sh4 and arm eabi issues have a 50% chance or so of magically fixing themselves if I upgrade from gcc 4.1.2 to gcc 4.2 (which I think is the last GPLv2 release). Given a choice between armv4l soft float and armv5l EABI, I think the second's more interesting anyway. Hopefully none of the other platforms will regress...

Darn it, it looks like gcc 4.2.1 was the last GPLv2 release. I'd hoped that the bugfix-only dot releases wouldn't be gratuitously relicensed, but this is the FSF we're talking about. They don't really _care_ about software, software is just a means to an end for them. What they really care about is cramming their political agenda down people's throats, and these days they've become so extremist that GPL version 2 isn't restrictive enough for them anymore. (I'm aware of the license exception that says building with gcc shouldn't contaminate the output, but I'm distributing build tool binaries and I refuse to get any GPLv3 on me the same way I refused to deal with Sun's CDDL. Of course it would be HILARIOUS if the SFLC sued me personally over my pet open source project, I could make that a real PR gold mine for them given our history. But I'd really rather not go there.)

February 28, 2009

So Red Hat's finally moving from Xen to KVM. That's only about a year and a half overdue. (Xen is massively overcomplicated, intrusive, and only does paravirtualization. KVM is a QEMU accelerator, essentially a successor to KQEMU that relies on the new VT extensions in newer x86_64 processors (and powerpc has an equivalent). So you can't run it on older hardware, but how much older hardware has the speed and memory to spare to run instances of virtualized operating systems? And in theory KQEMU is more or less a drop-in replacement on the older systems, but I've never gotten that sucker to work...)

According to IDC, demand for computers was down in Q4 of 2008, and is expected to drop sequentially in Q1 and Q2 of this year too. The only up news is Intel's Atom, showing that the shift to netbooks is seriously picking up steam.

I need to make a genext4fs command. In lua, as part of the new toybox. The filesystem is out and stable as of 2.6.28...

February 27, 2009

So Livejournal's broken again. If you log into livejournal, your login expires after 24 hours. When your login expires, the next livejournal post that gets opened responds to the dead cookie by redirecting you to the login page.

The problem with this is if you leave your laptop running overnight (or suspend/resume), the first livejournal tab you open in the morning gets eaten by the login page. There's no trace of the URL you were trying to go to. If your browser was restoring state from the last time firefox crashed, or you did a "right click open in background tab" and then closed the original window before looking at the tab, you've lost that page. (Not a big deal, it _is_ just a livejournal page after all. But ordinarily this is safe because even if it didn't _load_ you still have the URL in the bar. The page has to go out of its way to wipe it, and now Livejournal does.)

Yet another way the crazy Russians manage to screw up anything computer related. I'm definitely noticing a theme here. (I know there _are_ some good Russian programmers, but most of them seem to move to another country as quickly as possible.)

February 26, 2009

Got about half the outline done yesterday, working on finishing it today.

Pushed some patches upstream into uClibc. Getting some blowback, which is mildly amusing considering the continuing lack of a release and that current uClibc svn does not _compile_. Oh well.

Current qemu has an "-hdc fat:/path/to/directory" option that's actually quite nice. The emulator reads the files and makes a read-only FAT filesystem out of 'em. You then "mount -o ro -t vfat /dev/hdc1 /blah" and you can read all the files in that directory. Alas, you can't write to it. (And if you forget to mount it read only, then _try_ to write to it, the virtual drive goes catatonic and the linux kernel running inside the emulator starts spamming the console with disk error messages. But that's pilot error.)

So in theory I only need the netcat trick to get data back _out_ of the kernel. Although another trick I could do is call genext2fs and try to create an hdb image that already contains all the build files, and then run the emulator with that one.

The problem is, genext2fs hasn't got an "extract" mode. I tend to view this sort of thing as an archiver: take a directory of files and turn it into a file, and vice versa. That's an archiver. Alas, the genext2fs people do not, and I've been too distracted with other things to get back to my own gene2fs implementation. (Which now needs to understand ext4, and 64 bit support, and so on. I never even got around to teaching it about the ext3 journal (which is actually just a file dangling from a reserved inode) and htree support.)

The _other_ reason I haven't been banging on toybox much lately is I'm considering porting the whole mess to Lua. The project that introduced me to busybox all those years ago, tomsrtbt, implemented the rest of its command line utilities in Lua. This from a project that was attempting to cram as much functionality as possible into a 1.7 megabyte floppy image; it used a high level language analogous to Python for _size_ reasons. It had the smallest interpreter I'd ever _seen_ and the compiled binaries were crazy tiny as well. Ever since, it's been on my to-learn list, but when I first poked at it most of the documentation was in Portuguese, because it's from Brazil. Plus the way to learn it was to buy a book, and I never bought the book.

Last week I asked Fade if she could buy the book. (Ok, now there's two of them, but that's what I get for waiting.)

February 25, 2009

Made the mistake of mentioning politics in front of Eric again. His reply went on for 15 minutes, was Godwined almost immediately, and involved (accidentally) spitting on me at least twice. *shrug* That's why I try not to do that.

Good progress on the C++ paper. We have plenty of material but it's not properly organized (too many back references and forward references and arguments we want to make with no obvious place to put 'em), so we're doing an outline.

I have a fax to return to my sister's realtor. (Still buying Kris a house.) Got delayed a bit because they tried to put the closing on the 4th. I fly back to Austin on the 4th, I'll be incommunicado rather a lot of the day, and won't get to talk to my credit union until the 5th. Not useful.

February 24, 2009

Huh, seem to have missed a day somewhere. I suspect half of yesterday's entries were actually about Monday. Oh well.

Eric and I are grinding away on the C++ paper. I'm also grinding away on a cron job to build cross compiler toolchains and system images from the nightly uClibc snapshots. Plus I'm sticking printfs into the binutils code to try to figure out why arm EABI isn't working, trying to puzzle out what's up with sh4 native builds, and if I get time I still need to track down and beat sparc into submission...

The usual.

I miss working on tinycc sometimes. When it was linking things wrong, I'd added a -v option (and -vvv) that made it print out what it was DOING. The binutils ld does not have such an option. The closest I can get is running it under strace, except that strace is annoying to cross compile, so I need the native compiler working in order to build it...

The binutils build is spitting out a dozen pages of errors the first time it tries to link anything. It's quite impressive. It's not finding libc, which is odd because when I build a hello world command from the command line, it works fine. So _sometimes_ gcc is finding libc, and other times it isn't. This is kind of annoying.

The working linker invocation is:

/usr/bin/../libexec/gcc/i686-unknown-linux/4.1.2/collect2 --eh-frame-hdr -m elf_i386 -dynamic-linker /lib/ -L/usr/bin/../lib -L/usr/bin/../gcc/lib --dynamic-linker /lib/ -rpath-link /usr/bin/../lib /usr/bin/../lib/crti.o /usr/bin/../gcc/lib/crtbegin.o /usr/bin/../lib/crt1.o /tmp/ccYO0sf5.o -lgcc --as-needed -lgcc_s --no-as-needed -lpthread -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/bin/../gcc/lib/crtend.o /usr/bin/../lib/crtn.o

The failing linker invocation is:

/usr/bin/../libexec/gcc/i686-unknown-linux/4.1.2/collect2 -v -m elf_i386 -static -o ld-new -L/usr/bin/../lib -L/usr/bin/../gcc/lib -rpath-link /usr/bin/../lib /usr/bin/../lib/crti.o /usr/bin/../gcc/lib/crtbeginT.o /usr/bin/../lib/crt1.o ldgram.o ldlex.o lexsup.o ldlang.o mri.o ldctor.o ldmain.o ldwrite.o ldexp.o ldemul.o ldver.o ldmisc.o ldfile.o ldcref.o eelf_i386.o ei386linux.o ../bfd/.libs/libbfd.a ../libiberty/libiberty.a -lgcc -lgcc_eh -lc -lgcc -lgcc_eh /usr/bin/../gcc/lib/crtend.o /usr/bin/../lib/crtn.o

And this worked a few months ago.

Oh, hang on. It's -static. Uh-huh, dynamic linking is working, but somehow I've screwed up libc.a. Right...

February 22, 2009

Eric posted his GPS rant. He _does_ do embedded development. :)

Eric's been downloading a season's worth of an old black and white science fiction series via bittorrent. It's in its third day. This is because Verizon's FIOS detects the use of bittorrent and throttles upstream bandwidth to about 4k/second when it's in use. Since the FIOS connection is shared by everybody in the house, all downloads become insanely slow because so many of the reply packets are being dropped.

This is why I didn't consider getting a Verizon cell phone when I left sprint. I refuse to give a dime to a company that aggressively stupid.

Made uClibc-svn build under FWL again, at least for i686. (Needed one patch, which I committed to the repository.) Now building all targets to see what else breaks...

Pushed a second longstanding uClibc patch upstream while I was at it. And the need for four of the others seems to have gone away (I know I complained about the underlying issues on the list for most of them at one time or another)... I think that means that now the svn uClibc should build with no patches at all. That's kind of impressive. (Have to wait for another nightly snapshot to cycle round to make sure.)

February 21, 2009

Slept poorly last night. Woke up around 8am with my old "something wrong in my head somewhere" symptoms (pinkies going all pins and needles until I slept on the other side for a bit, plus the chest pains I once went to the ER for). Couldn't get back to sleep for a couple hours. Considering I went to bed at 5am, that was bad. Cathy woke me up to go to breakfast at 11am, and I've been a bit distracted and grumpy all day because this is the third day I haven't gotten to bed until 5am.

I feel I'm being antisocial by not keeping a sufficiently daytime schedule. Need to get to bed earlyish tonight, but at night after everybody goes to bed I spend five more minutes with my laptop and it's quiet and I get stuff done and I look up and it's 5am...

Got Eric another draft of the C++ paper today. He still hasn't quite merged all of the one I sent him last time, but I wrote more material anyway. (He's also writing a rant about how GPS standards are screwed up.)

Poking at sh4 some more. I'm not sure if running the sh4 system image under qemu is insanely slow because the kernel .config is wrong (I know it's not finding the realtime clock, perhaps there are related delay sources) or if it's just that the qemu target is not remotely optimized yet. I need to figure out how to profile it. Build strace inside the thing, perhaps, but when building "hello world" takes 15 seconds, that's kind of iffy...

Nobody's responded to my post to the uClibc list about the linking issue. I tried hexediting the ".init" and ".fini" in crti.o into "_init" and "_fini", but it didn't affect the linker funkiness. I'm comparing the crti.S source against the assembly of other platforms, and against crt1.S in sh to see if anything jumps out at me. Other than "@function" and "%function" (synonyms?), and not being in an .init or .fini section, I really don't see a difference. I haven't figured out why the cross compiler works but the native compiler doesn't, either. I may wind up sticking printf() calls into gcc to try to track down what the heck's going on. Probably not today, though.

Mark's installing Ubuntu and Fedora images on securitybreach (the 8-way server). I asked him to poke at kvm, but apparently you have to enable whatever cpu extensions it needs in the BIOS. I have no IDEA how that is supposed to make sense.

Why does Ubuntu keep spinning up my cd-rom drive? Yes, I have Weird Al's "Straight Outta Lynwood" in there. I've had it in there for weeks. But I'm not playing it, I ripped the tracks I want off of it ("Do I Creep You Out" is inspired), and the case is in Austin so I can't exactly put it away at the moment.

There's probably some random hal/dbus thing that's going "you have multiple scsi drives, I can't tell them apart, let me scan all of them to see where your swap partition is" or something crazy like that. It spins it up about hourly, it hums away to itself for a minute or so, and then it spins back down. This is annoying, but it's about 15th down the list of annoying things in Ubuntu.

I removed Pulseaudio from Cathy's machine for her yesterday, because she was having the same random sound failure problem I was. Earlier today Eric hit "underscores in the text screwing up the diff", which turned out to be Ubuntu once again converting the second of two consecutive spaces into some kind of unicode escape (the bytes 0xc2 0xa0). I think it happens when I cut and paste text between windows (yes, even into vi). I wrote a little program to remove them from things, I just didn't run it this time because I hadn't noticed. Yes, I'm aware it shouldn't do that. It shouldn't default to an insane $LANG value (en_US.UTF-8) that makes the "sort" command case insensitive either. (The workaround is "export LANG=C". I told Eric to type "set" to see that value and he was surprised at the 4500+ lines of crud that scrolled by. It's Ubuntu; it's insanely bloated with stuff that serves no obvious purpose except to complicate the system and randomly break.)
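(For the record, 0xc2 0xa0 is the UTF-8 encoding of a non-breaking space. My little cleanup program isn't doing anything a GNU sed one-liner couldn't; a sketch:)

```shell
# 0xc2 0xa0 is the UTF-8 encoding of a non-breaking space (U+00A0).
# Replace each one with a plain ASCII space, using GNU sed's \xHH escapes.
printf 'two\xc2\xa0spaces\n' | sed 's/\xc2\xa0/ /g'   # → two spaces
```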

When I first told him most of the developers I knew responded to Ubuntu 8.10 by looking for another distro, he didn't understand why. It's not a very good development environment, and it's getting _worse_ with each new release. Pointing /bin/sh at dash and vimrc.tiny were just the tip of the iceberg.

Nontechnical end users switching off of Windows do not get Linux machines. They get Macs. The main objection to Apple taking over the PC space is you have to buy their hardware, but when full-time senior Linux kernel hackers buy Mac hardware as their primary systems, that's a hard argument to really get behind. (Mark's current system is a mac laptop, he runs Linux under vmware.)

Fiddled with my script to publish my qemu, busybox, and uClibc patchlist directories on my web page. (Because fiddling with svn is too much like work, and attaching patch files to messages when discussing them gets old. Besides, if I restart qemu weekly news I need to be able to link to the patches under discussion...)

February 20, 2009

Got a draft of the C++ paper to Eric, relaxing a bit now. (Still not fully recovered from a month of Pollen.) Playing lots of gemcraft, poking at various things.

One of those things is the sh4 target for FWL, which doesn't natively compile "hello world" because it can't find _init and _fini. These symbols live in crti.o, which is there and being linked in according to gcc -v, but when I objdump -d the file it has ".init" and ".fini", where the i686 equivalent has "_init" and "_fini". Hmmm...

Poking at uClibc/libc/sysdeps/linux/sh/crti.S, it has ".hidden _init" and ".hidden _fini" lines that look suspicious. I don't know sh4 assembly (and to be honest I'm still a bit fuzzy on the funky gnu assembly syntax that's vaguely AT&T based instead of what Intel uses; that's been on my todo list longer than learning Lua), but if I remove those two lines... no difference. Sigh. Ok, ask the list.

Speaking of Lua, Eric bumped into this over on the wesnoth project, after the attempt to port chunks of it to Python died because its proponent managed to piss even Eric off. (Are all Russian programmers crazy? I note that Denys Vlasenko was very insistent that he's Ukrainian, not Russian.) Anyway, somebody else showed up proposing to port it to Lua instead, which is cool.

Lua is an interesting little language. It's sort of like an anorexic version of Python: it's a similar sort of scripting language but incredibly minimalistic. The interpreter is tiny, the programs compile down to tiny bytecode, and the whole thing's basically a fairly thin wrapper around chunks of C code. I first encountered it in tomsrtbt (the floppy distro that first introduced me to BusyBox almost a decade ago now), because half of the commands in later versions of tomsrtbt were implemented in Lua, and the whole distro was a 1.7 meg floppy image. Seemed cool, learning it went on my todo list. Unfortunately, the documentation didn't have a quickstart last I checked. (You could read the reference manual, which I vaguely recall was a PDF of a book. I see they now have more than one book you can read, but nothing like a quickstart or introduction to the language. On the bright side, more of the website's in English than when last I checked. The language is from Brazil.)

Looked for a Lua book at the bookstore at lunch, but they didn't have one. Bought the Waiter Rant book instead.

Eric was unaware of the LP64 standard (or the rationale, or the insane legacy reasons why this is broken on Windows).

February 19, 2009

Yay, Christian Michon and Jason Azze both sent me copies of my missing December blog entries, which I've finally spliced together into something that probably bears a passing resemblance to the original blog file.

I slept until 3 today. (Eric had dental surgery this morning, and collapsed to sleep off the anesthetic when he got back. I'm still recovering from pollen.) Cathy's been sick the whole time I've been here, and today got diagnosed with an ear infection, so she has antibiotics. Fade tells me she's sick too. I am currently the _healthy_ one. It's a strange feeling.

Cathy got me another box of the Traitorous Joe's triple ginger cookies dipped in chocolate (milk chocolate this time). I have a box (I've eaten about 1/3 of it already), and Fade has a box which Cathy plans to mail to her as soon as she's feeling better. (It's a small box, but makes up for it in sheer caloric content, and of course deliciousness.)

I spent most of yesterday and a lot of today dealing with a backlog of "things I had no brain for this past month". I'm now mostly caught up, I think. I need to give Gentoo From Scratch more of a workout, but running it paralyzes my laptop.

Linux is great at scheduling multiple CPU hogs, but sucks at dealing with even one I/O bandwidth hog. I've tried the CFQ scheduler and I've tried deadline and they're just different _kinds_ of suck, the problem is the queues are too deep. The disk has to do 3-5 seconds of work for my I/O to make it to the top, and if it's blocked on page faults (because the disk cache evicted pages my executable wasn't using because I hadn't moused over that Firefox tab and triggered its cursor-freezing decision recently) then it generally faults in ONE page, waits three seconds, faults in the next page, waits three seconds... This can easily turn into 30 seconds of pain on a really unhappy system.

Now imagine the disk cache putting enough memory pressure on processes that they start seriously swapping. (Because, let's say, genext2fs was told to create a 1 gigabyte file and it's too stupid to make it sparse even with -z.) Once the system goes INTO swap crazy, with firefox and kmail open, it's not coming back out. I eventually gave up and hit ctrl-alt-backspace to kill the whole desktop, and it took about 3 minutes for everything to go away. It was that bad.
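(A sparse gigabyte, for reference, is essentially free; coreutils can make one in a single command. A sketch:)

```shell
# Create a 1 gigabyte sparse file: the reported size is 1G, but
# almost no blocks are actually allocated until something writes
# to them, so it costs nearly nothing in disk or cache.
truncate -s 1G sparse.img
ls -ls sparse.img   # first column (allocated blocks) stays tiny
```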

I don't think I believe in Linux on the Desktop anymore. Let Apple have it, if we can't figure this out after 19 years of trying, we deserve to lose here. (I didn't feel this way before I was forced to switch to gnome, if I switch to an XFCE system I might start to have hope again. I really miss Konqueror. Even on a system that ISN'T loaded, Firefox regularly goes out to lunch for 30 seconds at a time while it runs its garbage collector. This is triggered by crazy things like scrolling the screen or mousing over hotlinks.)

Anyway, in the long run I can redo the bottlenecks in GFS to make it _not_ beat my laptop to death. Right now we're just focusing on getting it to work, though. Polish comes later. The real problem is that my laptop maxes out at 2 gigs of ram (apparently I can't install more than that without buying a new one), and my working set of open windows is rather close to 2 gigs anyway. Add in a 256 meg emulator instance and 4 distcc nodes running gcc and it goes all pear shaped. (It works fine if I just let it run, but if I try to do _other_ things with the laptop during this, the responsiveness sucketh mightily.)

Banging on the C++ paper now. (Ordinarily I don't read Eric's blog and Eric doesn't read my blog. We don't get along when we discuss politics, and his blog is mostly about politics. But there's a reference to the paper last time we were working on it. Most of what I've been doing is re-reading the old drafts and trying to figure out where I was going next when we left off.)

February 18, 2009


On the 16th I copied this year's notes.html file over notes-2008.html, and rsynced it up to the website before noticing. I _think_ I have a backup of it somewhere. (Let's see, that one goes up through August 28th, that one goes up through October 20th...)

If you wondered how out of it the pollen made me: there's a clue.

Ironically, the google cache has notes-2006.html, notes-2007.html, and the current one, but not 2008. last updated _any_ link to my blog in February 2008.

Hanging out at Eric and Cathy's, sort of working on "Why C++ is not my favorite programming language" but so far mostly just recovering from The Great Pollen Apocalypse of 2009. (Yay recovering. I'm in favor of it. Slept most of yesterday after my plane got in, and then slept until almost noon today. Kind of needed it.)

Found a backup that goes to the end of November, that's something...

I need to write my own top command. The normal one does NOT do what I normally want to do. The ability to sort processes by memory usage is nice (although more control over the different _types_ of memory usage would be nice). But really, in addition to CPU usage I want to sort by I/O wait. If I have two processes bogging the hard drive (usually because one is doing a "find . -name blah | xargs grep blah" or something similar, and the other is a compile or some such that turns into a disk bandwidth hog when disk cache grows to evict its working set and it goes all swap crazy), then the rest of the system GRINDS TO A HALT. And it's not grinding to a halt due to CPU hogs, it's grinding to a halt because Mozilla swaps just mousing over its tabs. (My laptop can only hold 2 gigabytes of ram, not _nearly_ enough for Firefox to successfully wallow in the mud or whatever it is pigs do when failing to display web pages.)

So yeah, top should sort by iowait, it doesn't, need to write one.
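The kernel does track this per process; it just isn't surfaced by top. Field 42 of /proc/PID/stat (delayacct_blkio_ticks) accumulates time the process has spent blocked on block I/O, so a quick-and-dirty version is just:

```shell
# delayacct_blkio_ticks is field 42 of /proc/PID/stat: cumulative
# time each process has spent waiting on disk I/O. List the worst
# offenders. (Quick and dirty: miscounts fields if a process name
# contains spaces, since awk splits the whole line on whitespace.)
for f in /proc/[0-9]*/stat; do
  awk '{print $42, $1, $2}' "$f" 2>/dev/null
done | sort -rn | head -5
```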

My sister is getting a house, it seems. (Well, I'm buying it and renting it to her, but the principle's the same.) The offer she asked me to make got accepted, and there's much faxing expected in the near future. It's odd that an amount that wouldn't buy a one bedroom condo in Austin gets a 4 bedroom, 2 bath house in New Ulm, Minnesota. I keep thinking of Austin as a cheap place to live. It's not, really.

February 17, 2009

Plane leaves at 8:10 am, so I'm staying up.

The Linux Symposium guys just emailed out their "Mid-Feb Reminder!" It's kind of sad that I'm not going to OLS, CELF, or Penguicon this year. I feel I should go to _something_, maybe in the second half of the year.

For CELF I simply missed the paper submission deadline (I was busy), and I've only gone to that when I'm presenting. Possibly I should look into submitting a paper to the CELF Europe thing in November. (This year's page isn't up yet but here's last year's.) Mark and Solar wanted to do something on Gentoo From Scratch for this year's CELF, maybe one or more of them would be up for the European one?

OLS has been slowly deteriorating for years, ever since the Kernel Summit peeled off and half the kernel guys stopped coming. Not exactly their fault, nor is the fact that they're moving out of Ottawa. (The facility they've held it in is being renovated this year, and there apparently isn't another one in town big enough.) I'm reminded of when Atlanta Linux Showcase moved out of Atlanta. (Anybody remember ALS? Anybody? Bueller?) Oh well, good luck to 'em. But I'm not traveling to Canada for it this year.

And of course I'm not going to Penguicon this year, because Matt Arnold finally became con chair, and he's the reason I stopped trying to help organize the thing in the first place.

(Imagine somebody who invented the "director of communications" position so he could be the face of Penguicon, taking over the website (which was our main interface to the world our first year) and getting interviewed instead of the con chair. Imagine this person organized opening and closing ceremonies last year and gave himself more screen time in each than the con chair got in both combined. Imagine that this person has lost more than one job due to his devotion to Penguicon, which was the first convention he ever attended. Now imagine how he reacts to somebody who can legitimately say "actually, I co-founded Penguicon". He doesn't just react that way to me, of course; at last year's con he had a circle of his friends standing around mocking Tracy. She went off and got a nursing degree, and isn't currently involved in the concom anymore either.)

I'm sure Penguicon will do fine this year, out of sheer momentum if nothing else. Elizabeth Bear's going back, Wil Wheaton's finally promised to show up, and I'm not trying to stop anybody _else_ from going. Fade's flying up to attend it. (I was pondering hanging out at Haven during that weekend, and need to arrange for somebody else to throw a dead dog party there if I don't, but if I set foot in the con hotel during Matt's year he'll get all territorial and start behaving... strangely. There has been rather a lot of unnecessary drama with Matt and Matt's friends over the years, mostly unremarked because airing dirty laundry doesn't help. But I know from experience he's not always entirely rational where I'm concerned (just because I haven't made things public doesn't mean I haven't got years of email archives). Deep down I suspect he feels that anybody capable of drawing attention away from Mr. Penguicon "Must Be Stopped", and that's not healthy in the long run, but it's not my problem anymore either, and I could easily be over-analyzing something that's merely a personality conflict. I'm just sad it screws up my wedding anniversary...)

So that's three events I attended last year (four if you count the HP thing), and for unrelated reasons I'm not going to any of 'em this year. (I'm still racking up more frequent flyer miles than ever, but that's for work and visiting Eric and such.)

On the bright side, I finally heard back from the Armadillocon con chair (I know her because she ran Linucon's art show), and she's putting me in touch with the head of programming. So that could be fun. (Stu and I decided not to try to launch a new con during the big Economic Meltdown thingy. Mostly because Mark wanted to launch Impact Linux instead, and I wouldn't want to do a fresh con in Austin without Mark.)

The new version of distcc builds with -Werror. You'd think they'd know better; you can never predict what useless new errors gcc will decide to randomly generate with each new version.

And distcc's ./configure requires -disable-Werror (mixed case) where binutils' ./configure required --disable-werror (all lower case). That I don't expect better from: autoconf is pure evil.

At the airport. Associating with the "free public wifi" access point here causes NetworkManager to hang the whole gnome desktop. The mouse still moves, but nothing I click on matters and the keyboard is totally locked. Telling the system to suspend starts to do so, but hangs waiting for the desktop processes to quiesce. And of course, hard powering off the system and rebooting, it re-associates again immediately on login. So ctrl-alt-F1, login as root, killall NetworkManager (yup, mixed case) before it has a chance to lock up the system. (And then experiment to confirm that's what was causing the problem, requiring another reboot or two...)

Year of the Linux desktop. Nontechnical end users. Oh yeah. I think we can just give up on that one at the 20 year anniversary of Linux, don't you? (Is this "the ipw3945 driver sucks", "NetworkManager sucks", or "The Gnome Desktop Sucks"? Answer: Yes, all of the above.)

Now I have no preloaded web pages to read on the flight. Darn. Not quite awake enough to program (they have caffeine on the flight, but not in the waiting area). I had to leave my chain mail equipment at home because the Security Theatre people are convinced I'm going to take over the plane with a pair of flat nose jeweler's pliers (I've lost 3 pairs that way, and my current set were a birthday present from Fade), but of course the laptop, router, and dozens of feet of cable aren't a threat. Right. (Can't garrote anybody with Cat 5, after all...)

Yay podcasts.

And gnome's volume control insists there are no sound cards. Totem is currently playing sound, just quietly. When I right click on it and "open volume control" I get a window with a slider that does indeed turn up the volume, but the icon still insists there's no sound card.

Dear Gnome: give up already, will you?

The new distcc version is detecting that the build $PATH has python in it, and then failing because Python.h isn't in /usr/include. Because python-dev is packaged up separately, and may not be installed. Wheee.

So trim the $PATH. If it can't find Python, it can't make stupid assumptions about related packages. (Well, it would be more of a stretch, anyway.)
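A sketch of the whitelist-$PATH idea (the directory and tool list here are made up for illustration):

```shell
# Build a whitelist $PATH: a scratch directory containing symlinks to
# only the tools the build is allowed to see. Anything else (python,
# for instance) becomes invisible to ./configure's probing.
PATHDIR=$(mktemp -d)
for tool in sh sed awk grep; do
  ln -s "$(command -v $tool)" "$PATHDIR/$tool"
done
PATH="$PATHDIR" /bin/sh -c 'command -v python || echo "python hidden"'
```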

I'm testing Mark's Gentoo From Scratch project.

February 16, 2009

Actually implemented the SNAPSHOT_SYMLINK option I mentioned on the 10th. A bit more work needed to make HDA read only for the static toolchain build script.

February 15, 2009

One more day until I can get away from all the pollen. (I note that when I booked that flight, I hadn't identified the pollen as the reason I was feeling horrible, but getting away from it is a marvelous side effect.)

I note that there are people out there who feel more strongly about Cedar pollen than I do. It leads to silly titles.

So the uClibc rollup patch didn't apply with toybox patch because svn isn't signaling that it wants to create a file (by having the old file be /dev/null or dated the epoch), it's just trying to add hunks to a currently nonexistent file. Ordinarily you don't want to allow that: if you apply a patch at the wrong -p level, you don't want it creating a whole parallel hierarchy of files containing random junk.

Deferring file opening to the first hunk requires shuffling around far more code than I'm comfortable with, but we just haven't got enough information to tell whether it's one of these broken svn patches until then. (Ideally we'd read _all_ the hunks and make sure there only _is_ one, but that queues up more pending state in memory than I want to keep.)

I've done the reshuffling, but I screwed up the state machine and now it's writing duplicate lines into the output. Possibly I should have #defines for all these TT.state values.

February 14, 2009

Hanging out at Wendy's, hopped up on cedar pollen and antihistamines, but getting work done.

Antihistamine vs air filters isn't a good trade. Hoping a pill would compensate for leaving said air filters behind may not have been the greatest move, but I'm getting pretty tired of the inside of my condo. I did some work from the bedroom with the door closed today, with Aubrey meowing pitifully right outside the whole time. (Her turn to prevent me from working, apparently.)

Reviewed the pending patches for uClibc- Two of them are required to build with 2.6.28 kernel headers, two of them are long and complicated patches that may not qualify as part of a "bugfix only" release, and the rest are harmless. (One of them is a comment typo fix. I'm not sure that's really dot release material either, but at least it's harmless.) Unfortunately, gluing all those patches together into an uberpatch gives me something toybox patch won't apply but the gnu one will. (Sigh. TANGENTS!) This is because svn sucks and produces bad patches that don't signal they're creating a file. Quoting the patch man page:

You can create a file by sending out a diff that compares /dev/null or an empty file dated the Epoch (1970-01-01 00:00:00 UTC) to the file you want to create. This only works if the file you want to create doesn’t exist already in the target directory.

Unfortunately, the gnu version accepts broken patches that _don't_ do this, such as those produced by svn, so I have to as well. It seems like I have to hold off on opening the file until I've parsed the first hunk, so I can figure out that I need to create a file if the first hunk has no context lines, and the minus range is 0,0 (meaning the starting line in the old file is 0 and the starting length of the old hunk is also 0). Hmmm...
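For comparison, here's a well-formed file-creation patch done the way the man page describes, diffing against /dev/null so the intent is unambiguous (scratch filenames made up for illustration):

```shell
# Generate a patch that properly signals "create this file" by
# diffing against /dev/null, then verify patch(1) recreates it.
cd "$(mktemp -d)"
printf 'hello\n' > newfile
diff -u /dev/null newfile > create.patch || true  # diff exits 1 on differences
rm newfile
patch -p0 < create.patch  # the "--- /dev/null" header tells patch to create newfile
cat newfile               # → hello
```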

Mark poked me about arm-eabi, so I'm converting armv5l over. Apparently the python guys didn't want to take his arm-oabi fix patch because they don't support oabi anymore. (Meaning Python no longer cares about armv4, which is still shipping new hardware. Oh well, maybe we can support it.)

Argh. I really hate C++. Guess how arm eabi decided to break? During the mini-native build of gcc-core/libstdc++-v3/libsupc++/ of course:

/tmp/cccqYVZh.s: Assembler messages:
/tmp/cccqYVZh.s:262: Error: duplicate .personality directive

Sigh. Made it spit out the assembly code in question, assembled it and reproduced the error, glanced through it but was not enlightened. All Google found about this were some unanswered questions, and that Paul Brook would probably know. Have to poke him during daylight hours.

February 13, 2009

So, I need two patches to qemu svn. One makes sh4 -append work, and the other makes cursor keys work in emulated systems.

The other thing I needed to do was remove the built-in command line from the sh4 kernel; apparently it only looks for the one supplied by the bootloader if it hasn't got one built in. The sh4 kernel still isn't finding its realtime clock (so it thinks it's Jan 1, 2000), but I'll worry about that later.

The sh4 emulation is _really_slow_, I think there's large delays getting inserted somewhere. (Something I/O related, perhaps? It boots up to the "freeing init memory" bit pretty quickly, but then pauses there for 10 seconds before giving me a command prompt.) Also, the sh4 native compiler isn't linking anything correctly because it can't find _init or _fini symbols. Still: progress!

Saw Coraline again today. Very well done movie, with lots of little subtle things I hadn't noticed the first time, such as how The Other Bobinsky sounds obviously drunk as he's falling apart (which is what he was accused of being in the other world by Coraline's real mother), or how when Wybie, the cat, and Coraline all turn their heads to one side near the end, I think the 3D stereoscopy turns about 45 degrees to the side too, in expectation that at least some members of the audience will do the same thing. (3D with polarized glasses only works if your head is straight up and down. Turn it sideways and it goes all 2D again, because the offsets are now vertical and your eyes just compensate for that rather than perceiving it as depth.)

Overwhelmed by pollen after going out to movie/mall/Fry's. We aired the place out with the window fan while we were away (after a few days closed up it gets a bit stuffy in here), but it's taken the filters a few hours to catch up after we closed everything up and turned the central air back on.

Food gets a bit boring after a few days of being totally unable to taste anything. Other than "salty", "sweet", and "bitter", the sense of taste is mostly smell at point blank range, and my sense of smell has not had a chance to recover...

February 12, 2009

Sigh. I had a longish entry for both today and yesterday typed up, but then Ubuntu 8.10 decided that my network card was going to spontaneously lose connection to the wireless hub, and when I told it to reenable networking it went "kernel panic". And of course the .swp files vi makes stopped working back around... oh, 7.10 was it? (Back about when you started having to "ln -sf vimrc /etc/vim/vimrc.tiny" each new install to pull vi out of its artificially braindead mode they put it into to punish us for not using Emacs. Dude, it's 2009, vi has supported the cursor keys in insert mode for 20 years now. I'm aware that historical versions PREDATE THE INVENTION OF CURSOR KEYS, but that's really not relevant anymore, is it?)

So yeah, Ubuntu's getting sad enough I may have to switch my laptop to Gentoo or something. Ubuntu is focusing on being an accurate Windows replacement to the exclusion of being a feasible development environment. I can't switch to Fedora, that's just Red Hat Enterprise Rawhide and not a real distribution in its own right. (They should rename it "Pointy Hair Linux" and be done with it.) Not SuSE, that's a Fedora wannabe and a wholly owned subsidiary of Microsoft. What else does that leave? Debian lost me forever back in 1998. Slackware never scaled beyond one guy. Knoppix never grew past a live CD, and stalled. That pretty much leaves Gentoo.

What have I been doing. Went to the mall yesterday and worked from the food court again. Bought some very expensive air filters for the air conditioner on the way home, and they _worked_, and I can BREATHE again, and I actually got SLEEP last night, and I slept until 3 pm because I needed it. Met Mark for dinner at Fuddruckers and we updated our todo lists. There was more. Oh well.

Banging on sh4 a lot, because it _almost_ works in qemu svn. Finishing useradd. Reveling in being able to breathe again as long as I don't go outside. Catching up on work.

The usual.

February 10, 2009

How long is this whole pollen bumper crop thing supposed to last, anyway?

Strangely, Miso Soup seems to have a significant medicinal effect, but it only lasts so long...

Later, out at lunch at the Highland Mall, I've noticed that I have no sense of smell. None at all. I got a little "snack sandwich" which is covered in garlic, and seasoned breaded potato wedges, and I can't taste either one and they smell like nothing. Even my diet Dr. Pepper barely registers. I could taste the sushi and miso soup at breakfast, but now: nada. (I'm afraid to drink the bottle of tea in the side pocket of my backpack. I _think_ it's the fresh one I had with me in the car rather than one left over from yesterday, but right now I really can't tell. Tap water normally has more taste than this.)

I never used to have to worry about the gender of trees. I feel like arranging for one of those "Y: The Last Man" things to occur to Texas' Cedar population, but am not quite sure how to go about it yet. More planning is required, and the acquisition of some serious herbicides, and possibly explosives. (Living a block downwind of Pease Park isn't helping here, but I'm out at the mall now, in air conditioning. Something like recovery should occur eventually, you'd think. I'm no longer coughing and sneezing as much, but I think my nose is in shock.)

Ha. The sandwich was able to make its presence known anyway. (Licking largeish lumps of garlic off one's fingers afterwards is a bit of a giveaway; I AM GARLIC. BOW BEFORE ME, CEDAR! Sandwich wasn't a total waste, then.)

Poking at the sh4 image Shin-ichiro Kawasaki put up. It almost works for me. Trying his way to boot it before fiddling with mine, I had to make a couple minor tweaks to his qemu command line (the kernel name he packaged and the one his README said were different), but I eventually got a login prompt (in what looks like a vga window). Alas it refused to load the USB keyboard his command line specified, so I couldn't actually type anything. Rebuilding qemu with current snapshot to see if that helps. I forgot how long qemu takes to build (even with -j 3)... Nope, still can't add the USB keyboard. Hmmm...

Ah, that's a known problem mentioned in the original post to the list. (Duh.) No, the problem I _should_ be worried about is that the serial console isn't working. (Which would neatly bypass the need for a USB keyboard, but alas, doesn't if it won't work.) And the scripts/extract-ikconfig thing from the linux kernel source isn't finding a config.gz in this zImage, despite the README saying it should, so I can't get the .config out to start playing around with that way, either.

Darn. Tantalizingly close to working. Oh well, poke the list and move on until I get email back.

I note there's a working sparc test image up, except that uses glibc and my problem has been getting uClibc to work. (It would be nice to compare the kernel .config too, but proc has no config.gz in that image.)

February 10, 2009

I am sick. Presumably still pollen. Really looking forward to spending two weeks up in PA starting next Tuesday.

Mark once said "Beware the Tokamak, my son" when captioning photos of things he bumped into around UT. A tokamak is a type of reactor that uses a magnetic bottle to contain plasma hot enough to eat through any physical material we've currently got. That by itself is sufficiently "star trek" to be cool. Upgrading a fission reactor into a fission/fusion hybrid reactor that uses fusion to break down the nuclear waste the fission produces, and basically sends the materials back and forth between the two until they stop being radioactive (getting energy out of each stage) is even cooler, and something I'll believe actually works when I see more widespread deployment.

But naming the invention that allows the hybridization to happen the Super X Divertor is just grandstanding. (They could have at least used an established name, like Interocitor, Oscillation Overthruster, Flux Capacitor, Heisenberg Compensator...) That sounds like a badly translated Japanese name, and is bound to attract Gamera or Godzilla or maybe Team Rocket...

I've gotten used to this whole "Living in the future" thing. Living in a "tinfoil and ray-guns" B movie is still somewhat unexpected.

Ok, there is deep and abiding suckage in attempting to cross compile libstdc++.

So the ./configure stage for libsupc++ builds on the ./configure stage for C, but it needs something _else_ to be cross compile aware. What, exactly, that something is, remains an open question, but animal sacrifice is near the top of the short list.

It tries to grab xgcc (which is a host compiler) as both the C and the C++ compiler to build libsupc++ with. This A) doesn't work at a theoretical level, B) explicitly specifies insane paths, and C) dies later in ./configure, unable to build, well, _anything_.

I can use config.cache to manually feed it the target C compiler, but if I do that configure dies because first it figures out it's not using a native compiler, then disables certain tests, next it throws some highly verbose warnings about it, and finally sits down in the mud and throws a tantrum when it wants to run one of those tests and can't. (Specifically it tries to figure out if libm contains some symbol or other, which is a target libc thing that gcc didn't need and which hasn't been _built_ yet when we build gcc because you need a target gcc to build it.)

If I leave the C compiler alone and instead feed config.cache $ARCH-g++ as its c++ compiler, it happily builds stuff with it (assuming I move everything down after uClibc++ is built and installed, because yes it's trying to use the libc header files, even though I'm _just_ building libsupc++ and not the full libstdc++), and then it links the result with the C compiler (which is still a host compiler and is at _best_ producing the wrong elf signatures).

Ignoring both of these issues for the moment, the build can make it as far as the uClibc++ build... where it dies because it hasn't got libgcc_eh.a. Because the cross compiler was built --disable-shared (and thus that stuff is in libgcc.a), and if we _hadn't_ done that there would be a fun chicken and egg problem with the reference that leaks.

I could play whack-a-mole with this for another week, nailing every stray idiocy in place, but it would be simpler to just build gcc a second time with a completely different configuration, which is essentially what I'm doing during already, which is why building uClibc++ after _that_ stage works. As much as I hate the enormous layering violation of having install files into the cross compiler, the _simple_ alternative is to cut and paste that entire build into the cross compiler, and it's a toss up which is uglier.

So large amounts of ugly fiddly code duplicated in two places, vs important functions being done at the wrong time from the wrong place. (There are times when I REALLY hate C++.) As always, when faced with a choice between two hideously unpalatable options, the answer probably involves beating a third option out of the situation with a bazooka and some duct tape.

Ok, here's the least disgusting alternative I can come up with: make a tarball of the uClibc++ files out of mini-native and copy 'em into the cross-compiler from another script entirely. It's ugly, but it avoids both duplication of the build logic and having modify its cross compiler. (It's ugly however you look at it, but that's the _least_ ugly option I can come up with.) And it means that creating a static cross compiler after the fact doesn't require building mini-native again, so it doesn't impose quite as horrible sequencing issues.

I think all I actually have to do is "cp $TOOLS/lib/lib*c++* $CROSS/lib"... Except that won't work because according to ldd, wants

Nuts to your white mice.

[Goes to play "Mr. Do" under Mame for a bit.]

Ordinarily, when I order a Hot Chocolate at Starbucks, when they ask me the size I just say "egregious" and mime a vertical "the fish was _this_big_" with both hands. Every time I do this, they give me whatever the 20 ounce is called ("Vingt"). This time I just said I wanted a large, and they gave me the smallest one ("Alegretto").

Lesson for the day: feeling too under the weather to properly invoke sarcasm means I don't actually get what I want.

Ok, three obvious choices for uClibc++:

  • Build uClibc++ with -lgcc.a -lgcc_eh.a instead of (possibly modifying the build wrapper).
  • Build the sucker twice, once against the static libgcc and once against the dynamic.
  • Copy the dynamic libgcc into the cross compiler.

Once again, all the choices suck but I've already got my third option if I can just figure out which one it _is_...

I'd look at an option 4 (just copy libuClibc++.a into the cross compiler and force static linking) except that libuClibc++.a is 685k and the shared version is 159k. (Why is that? Ah, if I strip it the result is 255k, which makes more sense. Although it still looks like libgcc.a is getting sucked in, which you'd think the things building _against_ it would already do? Send email to Garrett asking what's up, and move on...)

Ok, if I retroactively copy the -lgcc* files over, and then rebuild the wrapper with -DGIMME_AN_S, is there anything else in the toolchain that's going to care? (I don't think so...) So I think I can retrofit a --disable-shared toolchain to be --enable-shared fairly cheaply, meaning the third option in the above list seems least disgusting.

Thunderstorm tonight, impressive one. Tore off the awning on the pizza place across the street from Starbucks (well, rotated it 90 degrees and punched a hole in it, anyway), and upended lots of chairs and tables on Starbucks' patio. Looked distinctly tornado-y for a bit, but decided it wanted to upend a medium sized lake on the city instead, and had a schedule to keep. On the bright side, the pollen is already noticeably reduced. (Probably pick up again in a few hours, the trees are still at it.)

Ok, what's involved in doing read-only hda mounts? I want to build static cross compilers, for all targets, in parallel. Meaning I want to fire up lots of instances of qemu and run them in parallel to take advantage of SMP. I'd like to have them share the same root filesystem image, meaning it should be a read only disk (or perhaps an initramfs). Ideally I'd like to decompress all the source into that as well, meaning setupfor needs to use symlinks instead of hard links.

So, modification to setupfor to add a SNAPSHOT_SYMLINK config option, modification to to add a read only option (technically --no-rw), and then a new wrapper script to create an image with all the source. Should the new wrapper run qemu read/write, extract the source, and then exit, or should it extract the source outside the emulator and run again to repackage? (It can be made to work either way...)
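
One hedged way to get the shared read-only root behavior without touching the image handling at all is qemu's -snapshot flag, which redirects all writes to a throwaway temp file instead of the shared image. This is a sketch only; the image name, kernel name, and append string are placeholders, not what the build actually produces:

```shell
# Sketch: several emulator instances sharing one root image read-only.
# "root.img" and "zImage" are made-up placeholder names.
for N in 1 2 3; do
  qemu-system-arm -nographic -snapshot \
    -hda root.img -kernel zImage \
    -append "root=/dev/sda ro" &
done
wait   # let all the parallel builds finish
```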

February 9, 2009

Still feeling horrible. Fade points out it's probably the cedar pollen being at record highs. She gave me a decongestant, we'll see if that makes things better or worse.

Had an eye appointment to get new glasses. (A few months back I found out that the coating on the surface of my existing glasses is soluble in bug spray. This was not a good thing.) Apparently, my retinas are still fine, good to know. (Which also means I haven't got any diabetes symptoms, which is also good.)

Bugs in things I thought were done. The uClibc++ build during the cross compiler is dying because the ./configure part of the libstdc++ build (part of the gcc build) breaks. Once again, autoconf is horribly brittle and doing stupid things, and doing them _differently_ in contexts that have no obvious reason to be different. (Butterfly effects.)

It's too bad there's not a more automatic way to set environment variables. I tried "THING_{ONE,TWO,THREE}=42 set" and it told me "THING_ONE=42: command not found". Alas. (Yeah, I find corner cases.)
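
What's going on there: bash decides which leading words are variable assignments before it performs brace expansion, so the expanded words get parsed as a command name. A loop over the expanded names does work (bash-specific sketch; the THING_* names are just the ones from the experiment above):

```shell
# Brace expansion produces THING_ONE THING_TWO THING_THREE as plain words,
# so export each one explicitly inside the loop instead.
for v in THING_{ONE,TWO,THREE}; do
  export "$v=42"
done
echo "$THING_TWO"
```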

Three hours later, still fiddling with libstdc++. I could just cut and paste the relevant chunks of the native gcc build to the end of the stage, but duplicating such a large, complex, and _ugly_ chunk of the script is kind of painful. (The gcc ./configure invocation in is _nine_lines_long_, and is basically the result of me hitting ./configure over the head with a wrench until it stops moving. Not that the six line version in is much of an improvement, but having _two_ of them in would suck.)

I could teach to install uClibc++ in two places (retrofitting it into the cross compiler), but that's an ugly tricksy layering violation that does something non-obvious and is just horrible. I tried moving the entire uClibc++ block (including the libstdc++ build) after the uClibc build (the error is about being unable to build binaries, the compiler ./configure is calling can't find headers and it can't find libraries to link against), but that didn't help.

One problem is that I can't just cd into the subdirectory and run ./configure in there. I have to call it through the makefile and the best granularity I can get is "make configure-target-libstdc++-v3", and the makefile isn't passing on environment variables so I can't override things the way I'm doing for the other ./configure phases.

This means it's inheriting a lot of the config settings from the gcc build, which makes sense on one level because things like sjlj-exceptions have to match up with what the compiler's doing. (This is a flaw in the C++ language: it's full of non-obvious magic and fiddly sharp edges that have to match up in the ABI du jour. This is the fundamental problem that makes it hard to build a separate C++ library the way you can with a C library because the ABI isn't simple or well-defined. Of course the g++ guys have made zero effort to work around this problem, and happily glued it together into even more of a big hairball.)

But this also means that the second ./configure invocation is using the temporary compiler (xgcc) that the gcc build hacked together, which can't link anything because it can't find a C library or headers. Which might almost make sense if that was a _target_ compiler, because when gcc is initially built, uClibc hasn't been built yet because you need a cross compiler to build it. (There's a fundamental sequencing issue here, and I really don't want to build stuff twice.)

But xgcc is a _host_ compiler. (Of course before I worked this out, I tried moving the entire uClibc++ block (including the libstdc++ build) after the uClibc build, which didn't help.)

Once again, the "config.cache" mechanism allows us to manually answer the questions ./configure is asking. In this case, the questions are "what's our c compiler" and "what's our c++ compiler". I'm not quite sure if this is the host compiler or the target compiler it wants. (I think xgcc is a host compiler, but what does it need that _for_?) Feed it "gcc" and "g++", and... it's ignoring the existing values and overwriting them.
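
For reference, the mechanism itself looks roughly like this sketch; the ac_cv_* name is illustrative, not the exact variable this libstdc++ ./configure is checking, and the configure path is a placeholder:

```shell
# Sketch of pre-seeding autoconf's cache: write ac_cv_* answers into a
# file, then hand it to configure so those tests are never run.
cat > config.cache << 'EOF'
ac_cv_func_memcpy=${ac_cv_func_memcpy=yes}
EOF
./configure --cache-file=config.cache
```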

I really, really, really hate autoconf.

February 8, 2009

So this morning, unsuspending my laptop, sound is gone. (I don't know why.) Since I ripped out the horrible pulse audio thing, that can't be the problem, apparently alsa lost its cookies. So the obvious fix is to rmmod and insmod the sound module again.

So what's the name of the sound module?

snd_hda_intel 489264 7 - Live 0xffffffffa0386000
snd_pcm_oss 52608 0 - Live 0xffffffffa0378000
snd_mixer_oss 25088 1 snd_pcm_oss, Live 0xffffffffa0370000
snd_pcm 99208 4 snd_hda_intel,snd_pcm_oss, Live 0xffffffffa0333000
snd_seq_dummy 11524 0 - Live 0xffffffffa02c3000
snd_seq_oss 42368 0 - Live 0xffffffffa0273000
snd_seq_midi 15872 0 - Live 0xffffffffa026e000
snd_rawmidi 34176 1 snd_seq_midi, Live 0xffffffffa0259000
snd_seq_midi_event 16768 2 snd_seq_oss,snd_seq_midi, Live 0xffffffffa024a000
snd_seq 67168 6 snd_seq_dummy,snd_seq_oss,snd_seq_midi,snd_seq_midi_event, Live 0xffffffffa0234000
snd_timer 34320 4 snd_pcm,snd_seq, Live 0xffffffffa022a000
snd_seq_device 16404 5 snd_seq_dummy,snd_seq_oss,snd_seq_midi,snd_rawmidi,snd_seq, Live 0xffffffffa0224000
snd 79432 19 snd_hda_intel,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_seq_oss,snd_rawmidi,snd_seq,snd_timer,snd_seq_device, Live 0xffffffffa020a000
soundcore 16800 1 snd, Live 0xffffffffa01f8000
snd_page_alloc 17680 2 snd_hda_intel,snd_pcm, Live 0xffffffffa01ed000

That's right, those crazy (read "insane") kernel developers shattered the sound module into 15 separate pieces. No idea which one of them is broken, and no obvious way to cycle this subsystem short of rebooting. (And that's separate from the hal/dbus/udev crap layered on top.)

Ubuntu becomes more like Windows every day. It's sad. Eric and I wrote about this years ago. If you have a module that cannot be used on its own, possibly it's not very useful to have it _be_ on its own?
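
For what it's worth, one hedged way to bounce the stack without a reboot is leaning on modprobe -r, which also removes a module's now-unused dependencies; the module names here are the ones from the lsmod output above, but whether this actually recovers sound on this hardware is untested:

```shell
# Remove the top-level ALSA pieces (modprobe -r pulls out unused deps),
# then reload the main driver.
sudo modprobe -r snd_hda_intel snd_pcm_oss snd_mixer_oss snd_seq_dummy
sudo modprobe snd_hda_intel
```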

Specifying WRAPPER_TOPDIR=/blah in mini-native when the native compiler is _also_ using the wrapper _does_not_work_. You wind up with uClibc++ (first build after that) trying to link a host binary against the target crtbegin.o. Yes, this broke FWL's ability to rebuild under itself, and it turns out to be tricky to fix. I have two different instances of the wrapper and I want different behavior out of them. I think I need to add the architecture to the variable names, WRAPPER_TOPDIR_armv4l=/blah.
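
Bash's indirect expansion would let the wrapper's calling scripts pick the per-architecture variable; this is a sketch of the lookup only, and the fallback to the generic WRAPPER_TOPDIR is an assumption about the desired behavior:

```shell
# Look up WRAPPER_TOPDIR_$ARCH, falling back to plain WRAPPER_TOPDIR.
WRAPPER_TOPDIR_armv4l=/blah
ARCH=armv4l
varname="WRAPPER_TOPDIR_${ARCH}"
TOPDIR="${!varname:-$WRAPPER_TOPDIR}"   # bash indirect expansion
echo "$TOPDIR"
```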

February 7, 2009

Ok, 18 miles per gram is kind of impressive, even for elastic carbon nanotubes.

Ok, Mark pointed out that mke2fs isn't in the $PATH when tries to call it on a gentoo host, which I'd bumped into before, but had forgotten about. Not quite sure how to properly fix it. Gentoo doesn't put /sbin:/usr/sbin in the $PATH for non-root users, which is sort of the point of having those be separate from /bin:/usr/bin in the first place, but means that lots of stuff normal users can call (ifconfig is another one I've missed) isn't available unless you call it by absolute path.

I vaguely recall that Red Hat used to do this years and years ago, but Red Hat abandoned the desktop market so long ago the details are a bit fuzzy. (Fedora is Red Hat Enterprise Rawhide and nothing more. Deal with it.)

Adding a symlink to build/host is not a proper fix, nor is calling the tool at an absolute $PATH. I guess I need to put a test in and add /sbin:/usr/sbin to the $PATH if it can't find mke2fs. (That's a fairly specific fix rather than a generic one, but for the moment I'm only encountering a specific problem. I can always genericize it later.)
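
That test could be as small as this sketch (using `type` rather than `which`, since `type` is a shell builtin and doesn't itself depend on $PATH contents):

```shell
# Only widen $PATH when mke2fs isn't already reachable.
type mke2fs >/dev/null 2>&1 || PATH="$PATH:/sbin:/usr/sbin"
```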

Ok, so if "groupmod -g" changes the gid of a group, what happens to users who have that as a default group? Do their user entries get updated? The man page implies no, but testing the command says yes.

Note to self: when debugging new user fiddling utility implementations, accidentally running "rm /etc/passwd" as root: bad thing. However, if you find yourself having just done this, remembering that you have a terminal you just did a "cat /etc/passwd" in, and that you can cut and paste from it: good. (Not panicking, also good.)

Merging Mark's distcc fixes. (I broke it in at least two places since the last time I properly tested it.) Now I'm running a build with distcc enabled, and the binutils build is dying. The reason binutils is dying is that the host-tools step sets $PATH to point to just the build/host directory (filtering everything else out), and distcc looks for the second occurrence of the name it's called as (such as gcc) to find the local version of each utility. And host-tools doesn't _leave_ a second occurrence in the $PATH. (Kind of the point of it.)

Yeah, that's not going to work: what host-tools is trying to do (filter the environment, simplify it, remove anything non-obvious) is incompatible with distcc at the _design_ level. Right. So I can add explicit support for distcc to host-tools, or I can just skip the host-tools stage when building under distcc. (For right now, just skip it.)
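
For context, distcc's masquerade convention (as I understand it) looks roughly like this; the directory name is arbitrary, and the point is that the real gcc has to remain findable *later* in $PATH, which a PATH stripped down to one directory can't provide:

```shell
# Sketch of a distcc masquerade dir: symlinks to distcc sit first in
# $PATH, and distcc locates the real compiler as the second "gcc".
mkdir -p "$HOME/distcc-bin"
ln -sf "$(command -v distcc)" "$HOME/distcc-bin/gcc"
export PATH="$HOME/distcc-bin:$PATH"   # real gcc stays later in $PATH
```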

I ate a 4-pack of Cinnabon "minis" yesterday and felt fine, but a single Starbucks hot chocolate gives me an upset stomach. This is a distinct trend (the salted caramel is worse, but not by much), and I don't know why.

Playing with the pliers Fade gave me for my birthday, working on the "1A" rings from Rosco Inc. (The size 1 rings are 1/4 inch wide and around 20 gauge wire; this is the next size down from that. Darn fiddly.) I have 4 rows done. Making chain mail, of course.

I realize I'm unlikely to be able to hire the Mormon Tabernacle Choir to sing "My dog's better than your dog". I probably won't get Ralph Nader to say "Luke, I am your father" into a microphone. (I did once get Neil Gaiman to say "By Grabthar's hammer, you shall be avenged" into the microphone after his reading of "Crazy Hair" at Penguicon 2004, but the Science Fiction Preservation Society never got us copies of anything they recorded.)

However, it shouldn't be _that_ hard to get a classically trained soprano (one of the ones that sounds like she's inhaled helium and has so much vibrato in her voice you can't understand a word) to sing Yoshimi vs. the Pink Robots. (I mean, Alanis Morissette already did "My Humps" and put the video up on Youtube.)

This is a sign I need to get involved with another convention, to ground out these impulses. Alas, the armadillocon concom still doesn't seem to actually exist.

Fiddling with my cell phone/laptop connection again. (The bluetooth daemon segfaulted when I unplugged my USB bluetooth key, I had to re-run it.) At the store, the guy's blackberry could see my phone, and vice versa. My phone can see two other devices at Starbucks ("jupiter" and "grace"). My laptop can see those same two devices. The only thing each of them can't see is the other one. I've blanked both of their histories so they shouldn't have any residue from previous association: no dice. I've run "bluetoothd -n -d" to see the debug output, but it's not telling me what devices it sees, just "discovery session activated". Great, that's COMPLETELY USELESS. Tried disabling and re-enabling bluetooth in my phone: nope. Unplug/replug the bluetooth key (and relaunch the daemon): nope.

This is simultaneously hideously overcomplicated, totally opaque, and too brittle for words. And worst of all, it won't tell me _why_ it's failing. Not really impressed with the Ubuntu bluetooth infrastructure at the moment.

It is difficult to surreptitiously cook bacon. (Well it is.)

February 6, 2009

Went to see Coraline today, in 3D down at the theatre next to the Highland Mall. Excellent movie. (Nightmare Before Christmas was Tim Burton partnered with Henry Selick; this was Neil Gaiman and Henry Selick.) Fun watching them implement a 3D rendering engine failure using claymation. (The bit where they walk away from the house and there's nothing there.)

Fade got me a hematite ring. The last one I had I got at a gas station driving through Utah, and broke when it fell onto the pavement out in front of Eric and Cathy's place in Pennsylvania. (They're strongish, but very brittle.)

Got the useradd stuff refactored, which got me unblocked on it. (It was turning into a big tangle of complexity doing all those commands at once. Now, much less so.) Hopefully I'll get it done this weekend.

Mark's gotten a bunch of work done on parallelizing the build, starting by figuring out that I broke distcc again, and fixing it. (He says Python builds much much faster now.) Alas, in the evening the server ate itself. :( Dell's "dual power supplies" meant "twice as many chances for one to blow out", and their power distribution board designed to prevent surges from frying the other components apparently didn't. He plans to ask for a new one on Monday.

I am reminded that once one is married and no longer trying to meet interesting and attractive members of the opposite sex, one suddenly finds oneself meeting interesting and attractive members of the opposite sex, specifically at Starbucks. (I'm trying to figure out if this means something is deeply screwed up about humanity as a species, or whether western culture is to blame. I'm guessing both, although I'm not ruling out aliens taunting us either.)

February 5, 2009

Slept till noon, feeling much better.

I'm now working on the useradd, usermod, userdel, groupadd, groupmod, groupdel, chage, passwd, and groups commands all at once. (They're 90% the same infrastructure.) Right now they all depend on "useradd", I should figure out a cleaner way to configure that, but _after_ I get it working.

I'm not doing pwconv/pwunconv/grpconv/grpunconv because I'm only supporting one or the other at a time.

The server arrived! Mark set it up, and boy is it fast. I need to set up cron jobs on it, but lemme finish useradd* first...

February 4, 2009

Upset stomach. Tried biking to the north Kirby Lane location (up near 183 and 620), but turned back less than halfway there and went to the original Kirby Lane location (on Kirby Lane) instead. Having some garlic mashed potatoes (very good, can't finish 'em, feel nauseous).

I really hate Firefox. My laptop has 2 gigs of ram and Mozilla's clogged it all up. (Ok, I'm trying to download 350 megs of podcasts, view a video, and run a build, but according to top the reason my desktop keeps freezing is Mozilla eating all my memory, and using 57% of the CPU for reasons unknown.)

I miss Konqueror. Among other things, it didn't regularly freeze to the point where it wouldn't even redraw its windows for more than 30 consecutive seconds. Tying it to KDE was sad. Webkit is based on Konqueror, so Safari or Chrome would give me a Konqueror derivative and let me get away from Firefox. Maybe when I reinstall with Xubuntu when the next release comes out.

I should also try the deadline I/O scheduler. CFQ can get really unhappy when something in the background uses a lot of disk bandwidth.
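
Switching elevators at runtime is just a sysfs write, no reboot needed; "sda" is an assumption about the device name, and this needs root:

```shell
# Show available schedulers (the current one appears in brackets),
# then switch this disk to deadline.
cat /sys/block/sda/queue/scheduler
echo deadline > /sys/block/sda/queue/scheduler
```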

[Hours later]

Did not sleep at all well. Something's screwed up in my head (maybe sinus infection, maybe a pinched nerve in my neck, maybe a head cold, quite possibly all three). But when I slept on my left side the little finger on each hand (and half of the finger next to it) went all pins and needles, then numb. When I slept on my right side, my left leg did something similar. When I slept on my back, my ears rang really loudly. That's just _weird_. (Also led to switching positions every fifteen minutes all night, which is not restful.)

Met Mark for lunch. He's made good progress on Gentoo From Scratch, by which I mean he's got a kconfig generator working from the portage ebuild data (albeit his first implementation's in perl, and it doesn't include dependency info yet). I thought Gentoo had longer and more elaborate descriptions ala the help in rpm and .deb files (which have multiple paragraphs for most packages), but apparently not. Sadness. Possibly the other repositories can be mined for some of it? I should ping solar.

Met with bank people about mortgageness. The _taxes_ on a new place as big as Fade and I were looking at would be around $675/month, plus another $100/month for insurance, and then the mortgage on top of that. This is not counting repairs and general upkeep to the place (currently covered by the condo association fee). That's... disturbing. I think holding off on getting a new house is called for.

I also got an approval letter so I can buy my sister a place. The downside there is that the _smallest_ mortgage my credit union will do is $50k and they want me to put 20% down, so they can't give me a mortgage for a place as _cheap_ as the ones Kris has been looking at. (There are apparently some very nice places in rural minnesota going for under $60k.) And no, you can't get a mortgage for more than the place is worth and use the rest for repairs. I asked.

Longish nap in the evening to try to get my concentration back, but it didn't help. I suspect I'm writing today off as a sick day. (Well, half day. I got maybe 4 hours of productive work in.) Hopefully I can catch up later this week. Groggily poking at useradd in the meantime.

I'm reminded of the part in The Art of Unix Programming where Eric talked about "Bernstein Chaining", which I pointed out during my editing pass was a horrible name for something that guy didn't invent. The technique Eric was pointing out was running the rest of your command line via the fun little trick of exec(argv[offset], argv+offset), the way "nice", "nohup", "detach", "setsid", and so on have been doing for decades. After I pointed out a few preexisting counterexamples, he decided the name only applied to using it for security purposes... the way "su", "sg", and "chroot" do? And this isn't counting tools like ssh that wash a command line through a shell. (Did Bernstein do sudo, and if not, does it also predate his stuff?)
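
The trick is simple enough to sketch in a few lines of C. This is a hedged illustration, not code from any of the tools named above; run_rest() is a made-up name, and a real wrapper's main() would just do its own setup and then call it:

```c
#include <stdio.h>
#include <unistd.h>

/* Sketch of the technique: a wrapper command does its own setup (nice(),
 * setsid(), chroot()...) and then hands the rest of its command line to
 * exec as a complete new argv. Since argv+1 is already a NULL-terminated
 * argument array, no copying is needed. */
int run_rest(char **argv)
{
    execvp(argv[1], argv + 1);  /* only returns if the exec itself failed */
    perror(argv[1]);
    return 127;                 /* conventional "command not found" status */
}
```

The exec replaces the wrapper process entirely, which is why the chain can be arbitrarily long (nice nohup setsid foo...) at no extra cost.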

Catching up on my podcasts, according to Russell Tice the NSA was recording every domestic communication for the past several years. No warrants, no disclosure, no oversight, no limits. Gee, what a surprise. (We didn't prosecute J. Edgar Hoover for doing that, so what's their downside?)

Mark's been amused that for years now I've said "hello office of domestic surveillance" into my cell phone when I think I've just said something that triggers an automated word search that might flag an actual agent to listen to fifteen seconds of the recording. (Yeah, I throw a little chaff into the radar. Force of habit. As Cory Doctorow pointed out, the best defense against universal surveillance is flooding it with noise so it's useless in practical terms. All those "automated filter and search" things are about as useful as net-nanny software. It's an AI-complete problem to do it right.)

What annoys _me_ is that my new phone has the "it can be remotely switched on to listen in speakerphone mode when it seems like it's off" feature all the new ones have, and when they do that it drains the darn battery. I can't _prove_ that random sampling of the population is the reason this phone usually has a couple _weeks_ of standby time, but sometimes it inexplicably goes from three bars to dead in an afternoon when I'm not using it, and not at the edge of signal range so that it would be searching. (Not only was this capability documented years ago, but there was a BATMAN movie using it as a plot point.) Privacy issues aside, nerfing my battery life should require a warrant.

Again, my biggest disappointment about Obama: voting for FISA during the primaries. My father works for the defense industry and has a reasonably impressive security clearance, my grandfather worked for the NSA for many years because he got into cryptography during World War II and couldn't get back _OUT_. (They threatened to draft him for the Korean war if he didn't voluntarily sign back up, and he was stuck there until retirement age. He still won't tell me what happened to him in Iraq in 1982, just that he's deeply pissed about it because he was NOT a field agent and should not have been used as one.) I stay way the heck away from all that stuff for a _reason_...

Lest I come off as a paranoid tinfoil hat type, I point out that an undirected federal bureaucracy remains largely ineffective no matter how much information it has. The feds already have your tax records, bank records, they _issued_ your social security number, they run the post office, and so on. Yeah, the pack-rats want ever more, but mostly they just shove it in big boxes and never look at it again. Just having it is the important thing.

Most of us just aren't very _interesting_, but boy are we chatty. The current lot of surveillance monkeys have a ways to go to be nearly as proactive as the documented activities of J. Edgar Hoover, let alone some of the cold war excesses. Collecting information and being able to do anything useful with it are two completely different things. Data mining turns out to be subject to Sturgeon's Law in spades (great way to find false correlations and random coincidences) and beyond that it's information overload just sifting through the publicly available stuff. (I pity anybody assigned to sort through the contents of my hard drive. I sometimes have trouble with that, and I'm the one who filled it up in the first place. I'd spring clean, but they keep selling bigger ones... I'm far more worried about the Nigerian spammers and Russian organized crime guys trying to get my credit card information or zombie one of my boxes than I am about intelligence agency du jour. Who's more likely to figure out what they've got and act on it within my lifetime?)

Last I paid real attention to this (years ago) the spy agencies were most interested in tracking other spy agencies (China, Israel, the FBI and CIA spying on each other, etc), and the people most likely to effectively pay attention to civilians were the corporate espionage types (who accomplished more with less than any of the government-backed ones). Yeah, 9/11 made the US spies broaden their horizons (which is another way of saying "lose focus"), but data hoarding is not the same as extracting useful timely intelligence.

Their stockpiles might be of interest to historians if the information survives and gets declassified in 50 years, though. We _know_ Hoover had Einstein's phone tapped, but alas future generations didn't get to hear the tapes...

(Yes, this is what I think about when I have a cold and am too groggy for decent technical work...)

February 3, 2009

Jean Wolter, who insists "I am not that interesting as a user for you" because "I use [Firmware Linux] to get a chroot build environment for x86", has repeatedly reported an outright design flaw until I finally figured out how to fix it. (I'm not sure users _get_ more interesting than that.)

I'm outright proud of myself for avoiding the use of a gender-specific pronoun in the above sentence. The language is against me here. At least we don't assign gender to radios and boots and such. (Ok, we have "Mr. Potato Head".)

It's too bad there isn't a reliable way to find good anthropology classes. I actually enjoy reading this sort of thing.

Looking at useradd default values, and how to override them. (The actual plumbing and security implications aren't bothering me so much. Reproducing the poorly thought out user interface, however: nontrivial.) I guess the reason for -g is that if you specify -g again it overrides what's already there. It seems like -U is the default (create a new group for each user), then specifying -g would switch it off, and it does. And if you specify "-g 42 -U" it fails and spits out a usage message. (Yes, group 42 exists, it's "shadow". No, I don't know why shadow has a group, it's pretty much designed to be accessible to root only.) So why does -N exist? (Specifying it without -g sets the gid to "100", which is a group called "users" that isn't used by any existing user.)

Ok, how to cut back some of this complexity overgrowth... Part of the problem is that /etc/default/useradd is a default command line, and later commands need to be able to override it. So if you specify -g in the default command line, having another -g on the command line should override this, and the way I set it up -g and -G are A) equivalent, B) append to the list of groups this user is a member of. Huh.

I've already got one config variable (USERADD_FIRST_UID) specifying the first UID to auto-allocate. I can make -N use that, and -g override them both. So you don't need -g in /etc/default, so I _could_ still have it append, but I suspect I should only take the last -g (but multiple -G).

I can also make a config entry to enable support for -UN, and make it the default if it's supported. So you don't have to specify -U in the default command line either (which is good if -g needs to override a default behavior of -U but the old useradd bombs if -g and -U appear on the same command line.)

Come to think of it, do you really need /etc/default? If I'm already making the most interesting options selectable at compile time... Parsing a command string into an argv[] isn't free, then appending the existing options to it... Hold off for now and see if it's really needed later.

Ha! "useradd 42" was _not_ rejected. I can create users whose names are numbers, but not the corresponding uid. While I admit I need to be root to be able to do this, I can see a user named "0" confusing various portions of the system. (It won't let me create a negative uid.)

February 2, 2009

Kirby lane for breakfast with Stu, looking at the suspicious grey stains on my bread and wondering if it was moldy. I ordered a portobello mushroom sandwich: yes, it has fungus on it. (Not awake yet.)

Where I left off last night: /etc/default/useradd provides defaults. The logical thing to do is just let it have command line arguments that get parsed before the ones provided on the command line. (No point having the same data in two formats.)

Where this gets tricky is that the command line options are parsed by the toybox infrastructure before it gets to the useradd_main(). It's easy to feed in new options _after_ parsing the command line ones, but overriding defaults means going the other way. My first impulse was to add a flag to tell the common infrastructure not to parse the command line, but I've already got a way to do that: it skips option parsing for the command if the command's option string is NULL. So feed in a NULL option string, then manually call the option parsing logic with the real option string from the useradd_main().

Wondering if I should bench various I/O schedulers for FWL native builds. Under the emulator, processor time is expensive and I/O sucks so both the CFQ and noop schedulers have distinct downsides. (I've been leaning towards noop because the host has an I/O scheduler already, but if it reduces the number of system calls the emulator makes... Hard to tell. Probably have to bench it. Later.)

February 1, 2009

I'm 37 today. This is mildly disturbing.

Stressful weekend. Not sure why. It's not birthday stuff (I actually forgot until last night), I'm just doing a bit of a red queen's race right now with various technical projects.

Vaguely pondering a new laptop. (Mine still works fine, but it doesn't have the hardware extensions kvm wants, and maxes out at 2 gigs ram.) Except I'm reluctant to spend money on hardware just now because Mark spent over $5k of our company's money on hardware this month: $2k reimbursing his own laptop purchase and over $3k on a new server. (I admit an 8-way 2.5 ghz 64 bit system with 32 gigs of ram is kind of cool. Neither Stu nor myself could talk him out of buying Dell, only experience will do that, but at least I talked him out of getting SAS drives instead of SATA. Too bad FIOS isn't a viable option in Austin.)

The reason I'm pondering a new laptop is the existing Petalogix build (the one we need to get FWL to create a new toolchain for) requires Red Hat Enterprise (or its near-clone Centos). Stu installed a copy to play around with that build, and yesterday he gave me the Centos livecd image he'd used, which I just booted under qemu. It took about 10 minutes to get to the desktop. That's mostly because RHEL is a pig, but partly that unaccelerated qemu is slow. (And I've never had any luck with kqemu.)

Of course the Centos livecd hasn't got development tools on it, and it won't actually _install_ without network access. (See "pig". A cdrom just isn't enough space. Right.) Alas, Starbucks still has AT&T brand lack of internet access. (Failing to provide internet since 1995!)

Back working on useradd.

It would be really nice if the usleep() man page mentioned how many microseconds were in a second. (I could look it up if I currently had internet access, but again; Starbucks.) I think it's a million? (Makes a quick a.out... Yup, looks like.)

I'm fairly certain I can get away with "if (--len != '+') len-=5;" because the if() has to act as a sequence point. (I know that "len -= (len++);" won't do what you expect because even though the order of operations is darn clear from the _language_, it turns out the optimizer is free to reorganize things insanely when you assign to a variable twice in the same statement, and len++ counts as assigning to it.)
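
To spell out the distinction (the function name is made up; the '+' test is the same pattern as above):

```c
/* The controlling expression of an if() completes before the body runs
 * (there's a sequence point after it), so modifying len in the test and
 * again in the body is perfectly well-defined: */
int trim(int len)
{
    if (--len != '+') len -= 5;
    return len;
}

/* By contrast, "len -= len++;" modifies len twice with no intervening
 * sequence point, which is undefined behavior: the optimizer is free to
 * order the two assignments however it likes. */
```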

Ooh, Cringely's column is a podcast now. Cool. If I was online I'd see how far back that goes and add it to my queue. (I really need to get my cell phone set back up.)

Evening, back online, catching up with email. Booked a flight to go visit Eric and Cathy up in PA (I leave on Feb 17 and return March 4), so hopefully we can finish the C++ paper and a follow-up to the 64 bit transition. Finally set up my Southwest airlines frequent flyer thing ("Rapid Rewards") and fed it the last few trips I've taken (so much for "Rapid"). Apparently I need 9 more to qualify for a free ticket...

Got email about tinycc (somebody on windows who can't get the sucker to build under Cygwin, wondering if I could help; alas, not really my area). Glanced at the tinycc mailing list archives to confirm that the CVS archive is still twitching. Wandering off to let it decompose another few months. I deal with CVS when paid to do so. Pretty much by definition "something I do for fun" and "fiddling with CVS" are mutually exclusive. Thus I leave tinycc to rot until the CVS goes away, if only because the project itself is going the way of DOS. I tend to check back in every three months for some reason, although these days it's mostly due to external poking. (I could just start replying to the messages about it with "They get rid of CVS, I come back. Until then #%*(&%# off." But it seems impolite somehow...)

Broke down and ordered another $300 worth of "splenda quick pak" things before they go away. (They were discontinued a while ago, but there's still some stock available, and they're less than half the price they used to go for in the store.) It's premeasured 1 cup packs, which is A) perfect for making a gallon of tea, B) not bad for making a pumpkin pie either. I've gone through half the previous order I did, and I dread trying to make tea after I run out (opening ~3 dozen individual splenda packets and dumping them in the tea is not much fun, nor particularly accurate), so I just ordered a bunch. (Let's see, a "pack" is 24 envelopes, a case is packs, and I just ordered 10 cases... I think I just ordered 960 cup-equivalents worth of splenda. Let's say I use 2 per week (between tea and various baking uses), 52 weeks/year... That ought to last us a while... That's just under a decade. I can live with that.)

Dragon got out. (Came back a few minutes later.) Kiggy knows where the food is.

January 31, 2009

Out to IHOP for breakfast with Fade, then headed over to Stu's place to work on the petalogix/Xilinx report. In the evening we all went out to Mr. Sinus (me, Stu, his daughter Beth, and Mark. Fade decided against coming along.) They were mocking The Matrix this time.

No real technical work accomplished today (other than the start of writing a report). Too much running around doing other things...

Vaguely pondering a new laptop. (Mine still works fine, but it doesn't have the hardware extensions kvm wants, and maxes out at 2 gigs ram.) Except I'm reluctant to spend money on hardware just now because Mark spent over $5k of our company's money on hardware this month. ($2k reimbursing his own laptop purchase and over $3k on a Dell server Stu calls "a mistake". I don't really care as long as it works, but $3200 on depreciating hardware makes me wince. At least I talked him out of getting SAS drives, and I admit 8-way 2.5 ghz 64 bit processors with 32 gigs of ram is kind of cool. Too bad FIOS isn't a viable option in Austin.)

January 30, 2009

Dear useradd: you're doing it wrong.

Ok, the whole useradd/adduser split is silly, but what's almost as silly is useradd having both -g and -G. The first takes a single group name, the second takes a comma separated list of groups. Why not just have the first one take a comma separated list of groups, which can have a length of one entry? (Is it because comma might be in a group name? Punctuation other than underscore in a group is iffy at best, and -G already doesn't support it. A colon would violate the file format notation entirely, unless there's an escape sequence? The "man 5 passwd" page doesn't mention one...)

Right, so toybox useradd -g should take a comma separated list of groups, and -G should just be a synonym for it. I once wrote code for the busybox "mount" command to merge different comma separated lists, but it's probably less effort to just write it again than dig it up and port that. (Yes, even though I wrote it.) And while I'm at it, the easy thing to do might be to break down the various comma separated lists into a llist of individual entries. (Yeah, not quite optimal for nommu systems, but it should mostly work even there. It's a short-lived applet, we're fragmenting memory a bit but not pinning it.)
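
A rough sketch of that breakdown, assuming a minimal hand-rolled list node rather than toybox's actual llist type:

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative only: break a comma separated group list like
 * "audio,video,cdrom" into a singly linked list of individual names,
 * preserving order. struct glist is a made-up type for this sketch. */
struct glist {
    struct glist *next;
    char name[];                /* flexible array member holds the name */
};

struct glist *split_groups(const char *csv)
{
    struct glist *head = NULL, **tail = &head;

    while (*csv) {
        size_t len = strcspn(csv, ",");
        struct glist *node = malloc(sizeof(*node) + len + 1);

        memcpy(node->name, csv, len);
        node->name[len] = 0;
        node->next = NULL;
        *tail = node;           /* append at the tail to keep order */
        tail = &node->next;
        csv += len + (csv[len] == ',');
    }
    return head;
}
```

Each node is a separate small allocation, which is the "fragmenting memory a bit but not pinning it" tradeoff mentioned above.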

Hmmm... Given that compilers are now doing all sorts of funky inline-and-optimize tricks for C code across functions and even across files in "build at once" mode, what would be involved in having a source version of libc installed, and building against that to give the compiler more information to work with? (At least for statically linked programs?)

Obviously this would be useless until after "make" went away, because the dependency information about where to find each function (and the functions that one depends on) just isn't there at the moment, not when you break the one big library into hundreds of individual source files. You'd need something more like ctags and a build system with a brain, which make profoundly is not.

Still, it's an interesting thought. Yet another tangent I haven't got time to follow up on...

Chris Steel emailed me about mips:

I see from your development log that you're trying to get qemu to emulate a MIPS board with more than 256MB of memory. Not sure this is ever going to work due to the way that MIPS memory works.

The 32 bit addresses use the top 3 bits to determine if you're in kuseg (user space), kseg0 (cached physical address), kseg1 (uncached physical address), or kseg2 (some kind of MMUed kernel space, but I've never had to use that one). This limits the physical memory space to 512MB. Since most boards want part of this for mapped IO, they usually limit physical memory to 256MB.

So fiddling around with alternatives to the malta board, not likely to accomplish much.

If it is just _physical_ memory being limited, then I could put a 512 meg file in /dev/shm, feed it in as -hdc to qemu, then mkswap /dev/hdc and use it as swap space. (Swapping to ramdisk isn't ideal, but it's something.) On the other hand, if the virtual memory address range is similarly limited... I should ask him.

Made another stab at getting my cell phone internet to work, but I can't even get Ubuntu 8.10's bluetooth to pair with the phone. (The KDE variant worked great. The Gnome walkthrough wizard simply can't find it, although it found some other device at Einstein's named "barbara". (Mine's "Walrus".) Fiddled with the menus and as far as I can tell the thing should be announcing itself, but it isn't. Maybe I should go to the T-mobile store and make sad eyes at them?)

Oh, this is promising. From dmesg:

[188912.179445] usb 1-1: USB disconnect, address 3
[188912.180160] btusb_intr_complete: hci0 urb ffff88003e5a0d80 failed to resubmit (19)
[188912.183540] btusb_send_frame: hci0 urb ffff880029436240 submission failed
[188916.822489] __ratelimit: 1 callbacks suppressed
[188916.822507] bluetoothd[4911]: segfault at 46425f4282 ip 00007f85b89d86c5 sp 00007fffc1322b90 error 4 in[7f85b89c6000+3c000]

Bravo, dbus, you unnecessary layer of infrastructure you. I came up with the idea of power cycling my phone after removing the bluetooth USB dongle, but plugging the bluetooth dongle back in is now being completely ignored. That would appear to be why. Great. A hotplug layer that can't handle device removal. That's just STELLAR.

(Eight hours later, the desktop inexplicably froze for 15 seconds, and when it came back it had switched window manager themes. I have no idea why. This one's even uglier than the default one. Oh well, that's Gnome for you.)

January 29, 2009

Ah, correction. There's one busybox suit still in progress, the one Olivier Hugot is doing in France, where Erik and I teamed up with Harald Welte and some other guys. (Just got a status update on that today. The SFLC isn't directly involved in that one.)

Last night I sent Mark a kconfig tarball so he can give Gentoo From Scratch a pretty user interface, but that won't keep him occupied for long. Poking at useradd to unblock him going forward (lots of packages portage can't build without that). Writing a message to the FWL mailing list about it though, because it's confusing.

Got a bug report that the toybox 0.0.8 release has an x86-64 kconfig/config binary in it. Oops. My release script does a "make defconfig" and doesn't do a clean afterwards before tarring it up. Um, right. Fix release script, cut a 0.0.9 release (which is just a repackaging of 0.0.8 without this thinko.)

Cleaning up all the 2.6.28 miniconfigs. i686, i586, and x86_64 were fairly straightforward, and both arm variants already worked (I really need to switch armv5 over to EABI, and bring v6 and v7 up too). Sparc was also using scsi so it's broken the same way it's always been. I still haven't got a working qemu board for sh4 or m68k, and I need to fix powerpc or find an alternate board.

But mips is being stroppy. Mark sent me a config that works, but it's based on the malta defconfig so it's got lots of unnecessary stuff switched on. The one I tried to put together didn't work, and I have yet to figure out why. Binary search comparing the two now, and... I think it's CONFIG_PCI_QUIRKS. I don't know _WHY_, but I'm going to stop fiddling with it for the moment.

I should poke at qemu. It would be nice if -kernel could always handle an elf image (vmlinux), since the kernel always makes one no matter what your target is. (Other formats are generated from the vmlinux.) It would also be nice if it had an option to feed in a binary device tree (the bamboo board now has the infrastructure, I should hook it up from the command line). Finally, qemu needs infrastructure to _parse_ a device tree and set up a board from it. That one's decidedly nontrivial.

Ooh, Miklos Vajna sent me a powerpc config that qemu can boot as a g3beige! (The svn version of qemu, now using actual openbios instead of open hackware, can -kernel boot a 2.6.28 vmlinux built from that. As a power mac instead of a hacked up prep with custom boot rom.) That is deeply, deeply cool.

Argh. Catching up on podcasts, the Jan 15th Countdown contained Bush's farewell address in full.

Dear El Shrubbo: September 11th wasn't something you did. It was something that happened TO you. (And not even directly to you, the people it happened directly to are _dead_.) At best, it was something you failed to prevent. Talking about receiving mementoes from survivors in your farewell address is just pathetic; apparently the abject failure ever since wasn't your fault, it was society's fault, it was the environment you grew up in... Yeah. Not buying it. Go away now. ("Murdering the innocent to advance an ideology is wrong, every time, everywhere." Uh-huh, two words for ya sparky, "collateral damage". Or perhaps "civilian casualties". Well into six figures of 'em in Iraq, more in Afghanistan, although lots of those were merely crippled for life rather than killed. Still, the point is that your black and white view of the world DOES NOT WORK. Neither side consists of holy infallible warriors, what you have to do is pick the best available options and accept the consequences of your actions. If you don't understand that there _are_ downsides to just about every course of action in the real world (let alone what they are and how to mitigate them) you aren't qualified to handle any real power, period. I know you're a recovering alcoholic who switched your addiction to religion just like the 12 step programs encourage, surrendering yourself to a higher power and all. But dude, you ain't God. Deal with it. You're not Superman ("fighting for truth and justice"). You aren't even Spider-man (he's smarter than you). You might be "the Punisher", if such a job could be delegated from behind a desk. "Unshakeable faith". "Never, never, never..." Drop the absolutes already, the real world is BIGGER than that. AAAAAARGH!)

SO glad that idiot's tenure is over. The cleanup continues... (The Daily Show mocked this speech _marvelously_.)

Whee, gnome's lost its marbles. 30 second delays on all sorts of actions, such as redrawing terminal windows and selecting pulldown menu items. It's not out of memory or anything, it seems to have lost track of a semaphore? Sigh... Killing and restarting NetworkManager from the command line made it less unhappy, whatever it was.

Went to see the local LUG that meets down on tenth street on Thursdays. Remembered why it's been years since I've been there, and why Stu stopped going. There were only four other people there. They started by telling me all the people who'd died since the last time I was there (they're not the youngest bunch). Then one of them wanted to argue, with "about" being an optional extra. Yeah, not going back any time soon. I should look at one of the other three LUGs in town and see if it's less dead...

January 28, 2009

I haven't made any secret that the Cisco thing cost the SFLC my support, and today the disentanglement went through. (I asked them not to file any new lawsuits on my behalf, and they said "Ok, we will follow those instructions.") So that's it for the busybox lawsuits, unless Erik wants to do something on his own.

Overwhelmed with todo items. Trying to clear a few.

Mark says that Gentoo From Scratch is blocked on useradd, so when I asked him about it he sent me an example of it failing to use getent. Ok, looking at getent: it's a GNU extension that's not in SUSv4. Its description in its man page is:

The getent program gathers entries from the specified administrative database using the specified search keys. Where database is one of passwd, group, hosts, services, protocols, or networks.

Um... ew. Ok. Mostly it looks like they're running grep for "$2" against /etc/"$1", and there's only six options for $1, so it's basically a smallish shell script. Test a few corner cases... Done.
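
The core of that "grep $2 against /etc/$1" behavior reduces to matching the first colon-separated field of each line against the key. A hedged sketch (key_matches() is a hypothetical helper, not anything from glibc's actual getent):

```c
#include <string.h>

/* A database line matches when its first colon-separated field equals
 * the key exactly: a plain substring grep for "root" would also match
 * "rootbeer", so check for the terminating colon. */
int key_matches(const char *line, const char *key)
{
    size_t klen = strlen(key);

    return !strncmp(line, key, klen) && line[klen] == ':';
}
```

The rest is just picking which of the six /etc files to scan based on the database name, which is why it's "basically a smallish shell script".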

Poking at Linux network block devices. Downloaded the userspace package; it's kind of sucky. (They added autoconf to build two executables, each of which is basically one source file. It has a ./configure stage that spits out 107 lines of crap. Why?) This could theoretically allow real hardware to run builds without attached storage, and qemu could work the same way. But in practice, real hardware can usually take a USB drive: the real limiting factor for most of 'em is memory, and you can't even necessarily give them a swap partition because beyond a certain point the page tables for it eat more physical ram and make the shortage worse.

At some point if I start to care, it looks like it wouldn't be too hard to implement nbd-server and nbd-client in toybox.

The qemu mips emulation is also being stroppy. The malta board maxes out at 256 megs, so I'm looking around for other mips boards that might be able to hold more memory. The "-M mipssim" option of qemu looked promising, so I created a hw-mipssim target and melted down the kernel's mipssim defconfig into a miniconfig, booted it... and it turns out to be hardwired to 32 megs (in arch/mips/mipssim/sim_mem.c function prom_getmdesc). Which is sad, because qemu lets me feed in a gigabyte and I can specify "mem=1024M" on the kernel command line... it just ignores it. For that matter, specifying a command line at all involves modifying the one hardwired into the kernel, because the one passed in by qemu's "-kernel -append" combo isn't used. Oh, it also hangs calibrating the delay loop (init/calibrate.c function calibrate_delay, the "while (ticks==jiffies);" line never advances). I might be able to work around that by specifying lpj=12345 on the command line, but have no idea what value to feed it...

Before heading to california I was in the middle of switching FWL over to the 2.6.28 kernel as stable, and pieces are all over the floor as I'm coming back to it a week later. The config symbols for generic IDE hard drive support changed (you now need CONFIG_IDE_GD and CONFIG_IDE_GD_ATA), which broke every platform except arm (which is using SCSI, due to the board being emulated).

I thought I'd come up with a new i686 target already, but didn't replace the target's miniconfig. Ok, fixed i686 and i586, I'll fix the others in the morning.

January 27, 2009

Alas, the USB cable I grabbed was the "big usb to little usb" one used by a lot of cameras and such. This is not the cable my cell phone uses to charge from my laptop; T-mobile did a proprietary special instead. Sigh. Cell phone battery just died after a phone call from Vito and Stephen, asking why they can't get ahold of Mark. (Answer: He's asleep. I got to bed at 5am after meeting him at Whataburger to go over a big digression we're trying to get off our plates so we can get back to focusing on FWL and GFS, and he came out of it with at least 4 largeish todo items he planned to tackle before going to sleep.)

Met Fade for lunch between classes (drove her to Zen), and now meeting Stu for coffee at the north Kirby Lane location. I have my laptop, but I'd _meant_ to have my cell phone working too. Grrr...

Tonight I have to work at home because the next stage involves loading test kernels into the 610n hardware until it confesses. Can't exactly drag that out with me, doesn't run on battery and too many cables anyway. (But I found my debug adapter! Yay! I was blocked on that for a while. Now I just need to find the darn power supply for the thing, which I think is upstairs. Yeah, we've been testing on Mark's router a lot...)

Reading my old messages to linux-kernel, trying to find the one where I submitted broken-out patches with my changes against the stock linux Kconfig. They didn't merge 'em, but I'm thinking of grabbing the toybox kconfig to use in GFS, and first I should update it to the current kernel, and to do _that_ I need to figure out what changes I made to the thing almost three years ago. (They're mostly the ones I put into busybox, so I could poke at that source control too, but I've tweaked 'em a bit since. Not much since I broke them up nicely and submitted them upstream to the kernel, though. Just re-applying that to the current kernel kconfig would be the easiest way to migrate. If I can find where I submitted 'em, from a previous laptop...)

January 26, 2009

Am I the only one to think the reason the Republicans are giving Attorney General candidate Eric Holder a hard time might be because the Attorney General would be the guy responsible for prosecuting lawbreaking by the previous administration, and this guy hasn't said he _won't_?

Obama is being the good cop. He wants to work with the Republicans. He's delegating the bad cop duties of prosecuting blatant wrongdoing to his Attorney General. This seems like it might be relevant, somehow? Maybe?

Yeah, catching up on my podcasts. Behind the times, I know...

So qemu svn 6419 explains why 256 megs has been our memory limit on qemu's mips emulations:

mips: limit RAM size to 256MB on malta and qemu boards

This avoid crash when a bigger RAM size is requested (the devices are
mapped at 0x01000000).

That covers Malta and r4k. The other three options are Magnum, Acer Pica 61, and MIPSsim. Never heard of any of 'em, really. Let's see what the kernel's arch/mips/configs directory has... No mention of pica or magnum, but there's a mipssim_defconfig. (Which is somewhat out of date, but I'll try squashing it to a miniconfig and throw it into a hw-mipssim target and see what happens...)

Today on Guadalupe, a big truck with the "Monster energy drink" logo on it was handing out free caffeine. (Yay!) They gave me a... coffee flavored energy drink. (Um, give it to Fade, I guess? Or Mark?)

January 25, 2009

How do the cats get cat hair all over my keyboard when the lid is closed? I am _impressed_.

Walking down Guadalupe to The Donald's was an adventure today. Einstein's arcade closed a year or so back, and the place has been abandoned ever since. After a few months it got graffitied, shortly after which it developed a sign in the front window saying "leased". Last fall a banner went up top saying some yogurt shop was planning to move in, but no actual yogurt shop has yet materialized. Today I noticed that one of the bottom window panes was smashed in, with broken glass behind it. This apparently isn't recent, because a piece of plywood got put over it, and then dislodged again by whichever homeless person broke in to sleep there in the first place, presumably due to the cold.

A couple blocks later, I stopped to chat with the nice old homeless lady (too many adjectives, I know) who sells flowers on Guadalupe next to what used to be the other arcade (Le Fun) before the Cult of Scientology raised their rent too much. (They own the whole building, and offer "free IQ tests". If you take them up on it, you flunked.) I can never remember if her name is Rachel or Rebecca; I should just ask again, but I've done so at least three times already. (The mnemonic for one is the junior phoenix from X-men and the mnemonic for the other is the given name of "Newt" from Aliens, but that doesn't help me remember which one applies here.) For once I was carrying cash, so I bought three dollars' worth of flowers.

A minute later (while probably Rebecca and I were still talking), a man came up trying to panhandle from both of us. His story is that he was relocated to Austin by Katrina and then abandoned by FEMA, and he has a wife and three kids living in an abandoned/condemned apartment building scheduled to be demolished soon. Rebecca more gracefully extracted herself from the situation than I did. I gave him some airline peanuts I had in my backpack, and the flowers (for his wife), pointed out I don't normally carry cash and had given all the bills I had to Rebecca, and continued on to McDonald's.

A block later there's a church which allows homeless people to congregate on its steps. There were eight of them, and a dog. I got hit a third time, and managed to find another fifty cents under my keys, then made it to McDonald's.

I think my mistake was both that I walked instead of biked, and that I'm wearing a shirt with buttons and a collar. (Well I'm out of T-shirts. I was carrying cash because I thought I should really do laundry, but I only found a little over $3 worth around the condo and I have a half dozen loads of laundry to wash and dry. Need to hit an Asynchronous Transfer Mode.)

But it's also that Austin ain't the city it was pre-Bush. My first four years here, I literally didn't see a single homeless person. I remember what a _shock_ it was when I saw someone panhandling at a street light near the start of Bush's first term, around the dot-com/worldcom/enron implosions. (I'd seen it in Baltimore and New Jersey and such, but Austin just hadn't had that.) Now I can't walk to McDonald's without running a gauntlet. (There were always "drag rats" on Guadalupe, but they used to be college students panhandling for spare change between classes. It was literally a hobby.) Ever since Katrina, when FEMA imported and abandoned several thousand additional homeless people into Austin (and far more to Houston and Dallas), this city seems a bit overwhelmed. It just hasn't got the facilities to cope.

I'm currently looking out the McDonald's window to a sign that says "NOTICE This Property Is Protected by 24-HOUR VIDEO SURVEILLANCE All Violators Will Be Prosecuted". I never noticed that before.

I really hope Obama helps the nation in a big way, but local impact is second derivative. We haven't cleaned the republicans out of state or city government, and it'll take a while for national changes to trickle down to Austin.

Apropos of nothing, I note that the hot chocolate from McDonald's built-in cafe thing is astoundingly mediocre. It's a bit weak, the variety of chocolate they used is kind of bland, and they scalded the milk. Serviceable, but not worth coffee shop markup prices.

Right, programming stuff.

Toysh needs infrastructure changes to break the current 1:1 mapping between lines of input and commands to run. There's already a little bit of support for multiple commands per line ("one; two", "one | two", "one && two"...), and fleshing that out is fairly straightforward. But what I glossed over the first time is that commands can stretch over multiple lines (by ending with a backslash, by ending with a pipe or logical operator, if statements, loops...). The command and pipeline structures have some support for this, but the function calling convention is wrong: handle() returns void. It needs to be able to let its caller know it couldn't complete and needs more data.

Ok, rename handle() to parse_and_run(), make it return a struct pipeline * and accept one in its arguments as well. Teach parse_pipeline to return NULL when it ran out of data without finishing its current set of connected commands (which is what a pipeline really is, since && and || set up a similar logical grouping without actually being pipes).
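
A minimal sketch of the continuation check behind that calling convention change. The function name and the exact set of continuation tokens are illustrative assumptions, not toysh's actual code (the real parser works on its own token structures, not raw strings):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns nonzero if a line of input can't be a complete command yet:
 * it ends with a backslash continuation, a pipe, or a trailing &&,
 * so the caller should read more data and call the parser again.
 * (A real shell also has to track unfinished if/while/for blocks.) */
int needs_more_input(const char *line)
{
    size_t len = strlen(line);

    /* ignore trailing spaces and tabs */
    while (len && (line[len-1] == ' ' || line[len-1] == '\t')) len--;
    if (!len) return 0;
    if (line[len-1] == '\\') return 1;
    if (line[len-1] == '|') return 1;   /* covers both "|" and "||" */
    if (len >= 2 && !strncmp(line+len-2, "&&", 2)) return 1;

    return 0;
}
```

So whenever a check like this says yes, parse_and_run() can hand its half-finished struct pipeline * back to the caller, and get it passed in again along with the next line of input.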

Ok, new fun corner case, what should a shell's -c option do if it gets an unfinished command? Trying bash -c "echo hello \\" just gives me "hello" (without even a trailing space, according to wc). It's apparently appending a blank line internally for \ continuations, but erroring out for &&. (Specifically, bash -c "echo hello &&" gives me "bash: -c: line 1: syntax error: unexpected end of file". That's a lot of colons.)

Meanwhile, dash -c "echo hello \\" outputs "hello \", for 8 characters according to wc as opposed to bash's 6. (Trailing newline.) Dash also errors out for a trailing &&, with "dash: Syntax error: end of file unexpected".

Actually, loath as I am to admit it, dash seems to be a lot more sanely implemented. Bash has a different error message for bash -c "echo hello <" (two lines this time, "bash: -c: line 0: syntax error near unexpected token `newline'" and "bash: -c: line 0: `echo hello <'". So why is this line 0 for that, and line 1 for the other one?)

Sigh. This was all clear in my head back in 2006, but now that I come back to work on it again I don't remember why I'm only harvesting exit codes if TOYSH_FLOWCTL or TOYSH_PIPES are enabled. (Goes to read the descriptions of those config options... Ah, flowctl is for if/while/for statements and pipes covers && and || and such. So yeah, without those the shell can't _use_ them. Admittedly $? might care too, but the shell couldn't act on the result...)

Toysh started as an exercise in how far down you could strip a shell, and wound up being noticeably smaller than busybox lash. But in making everything configurable, you get a darn useless shell if you disable all of it.

January 24, 2009

Some time over the past week I got fed up and force uninstalled pulse audio. This turned out to be a bit of a trick because the ubuntu-desktop package depends on it, but that's just a metapackage so I let it uninstall that too and everything worked fine. And suddenly, all my sound stuff started working properly again. No more guaranteed silence after a suspend and resume unless I did cleanup with "kill" and "pavucontrol". No more problem starting a second app trying to use the sound and suddenly everything going silent. No more apps suddenly going silent for no reason in the middle of playing something.

Pulse Audio is a mixer that can't mix, and it's unnecessary since the ALSA layer does its own mixing. (So once again, it's a piece of unnecessary infrastructure that's supposed to do nothing, and fails to do it correctly. Did I mention that Gnome is the official desktop of the FSF?)

What I didn't know was that it couldn't uninstall competently, either.

Today, Fade and I went to Fry's to get her iMac upgraded from 1 to 4 gigs of ram, and I had them look at my laptop too (although they didn't have the memory it needed to bring it above 2 gigs in stock or something). To prepare for this, I actually shut it down instead of suspending.

When I got home, I booted it up, and gnome died. I got a pop-up window saying my session lasted less than 10 seconds and this was probably a bad thing. (No hint of a desktop, it went straight to this screen.) When I clicked on the only button it provided, it dumped me back at a login screen, which auto-logged me in after 10 seconds, to give the pop-up window again. There's no obvious way to STOP it from auto-logging me in (although the timeout goes up to 30 seconds if I type something), so I can't get it to just stay at the login screen. (Thanks Gnome developers!) I can however get it to stay at the "your session lasted less than 10 seconds" warning box, so ok.

Being a Linux geek of longstanding, I ctrl-alt-F1 and log in from a text console. Check dmesg: nothing. Some strange residue in /tmp, is the .gnome directory in ~landley horked, what's up with this? And where the heck does gnome log this sort of thing? Shouldn't that pop-up window have shown me what the actual _problem_ was? Go back to the pop-up window. It doesn't have more than one button, but it _does_ have a small print "view details" checkbox. Click that... and it immediately responds (why is it a checkbox and not a button then?) by opening a sub-window and showing me the problem. A few lines of noise, ending with:

Couldn't exec /usr/bin/pulse-session: No such file or directory.

Come on guys, I uninstalled this through dpkg, not by ripping it out by hand, exactly to AVOID this kind of thing. Sheesh. (And why is it trying to exec something out of the $PATH via an absolute /usr/bin/blah path anyway? And why is the failure of an _optional_ service preventing the whole desktop from launching? Lack of sound is no reason to essentially brick the machine from a nontechnical end-user's perspective. Even Red Hat just did a pop-up warning about an inability to initialize the sound card; the last thing I saw kill the desktop bringup because something trivial was missing was Windows 98!) (Back around 1999 Reese deleted fonts she wasn't using to free up space on her hard drive, but it turned out one of the help pages referenced one of 'em, and I got called in to figure out why her computer was booting into safe mode with no explanation. Ubuntu has now reproduced this experience for me.)

Layers of broken here. Layers. Something that shouldn't have existed in the first place because it did _nothing_ (or more accurately, failed to do nothing correctly) also failed to clean up after itself in a way that caused something way too brittle for words to die on startup rather than just warn and move on, and then the failure was auto-retried to death. And I didn't find out about it until I rebooted the machine days later.

Long story short (too late!) I deleted the /etc/X11/Xsession.d/70pulseaudio that dpkg left behind, and suddenly everything just worked again. (It somehow seems repetitive for X to have its own /etc/init.d directory. Why on earth would the sound mixer service be launched from X rather than from init? I also find it strange that /etc/rc2.d is a directory full of numbered symlinks into init.d but Xsession.d has the actual files _and_ they're numbered, but oh well...)

Dear Gnome/Ubuntu: you're doing it wrong. Linux should not work like an eggbeater. Intricately interlocking complexity may seem like an elegant design to you, but when a single piece of sand in the gears makes the entire thing grind to a sudden halt, and you can't even find where the sand _is_ unless you know how every single piece fits together... How exactly does this qualify as "good design"?

Darn it, and shutting down ubuntu without killing kmail first corrupted kmail's indexes. (The process was killed while it was saving its state.) Meaning I've lost track of which messages I'd marked read and which ones were to-do or important, and all the windows I had open to reply to (which ordinarily pop back open when I re-launch the program) are gone. I had something like 30 of them, pending half-finished emails to write...

I already hated gnome, but I think I'm working my way to "loathe"...

January 23, 2009

Hello from the Phoenix airport, during our hour and a half layover.

Only had one bite of the bread pudding and threw away the leftovers. (When you're not even hungry in the morning, you ate waaaaay too much last night.)

We got paid. Yay money. We can afford to buy a build server now.

Mark and I have decided to start a weekly technical blog over on, where we can summarize what's happened with FWL and GFS and such. (I've actually been doing it anyway for a few weeks now, but that was just something to send in with our invoices. I should clean those up and post them on the website...)

Home. Yay home. (I'm amazed that the gnome network manager auto-associated with the airport wireless in phoenix but not my router at home.)

Ok, it's crazy season again:

From: "Taxproficient@[deleted]" Taxproficient@[deleted]
To: "" 

can you please tell me how to remove busy box from a system denying me access
to it?  this is going on for beginnng six yrs please help [first name and phone
number deleted] thank you with all my heart 

Sent from my iPhone

I didn't edit anything outside the square brackets.

Do people in other fields get this sort of thing? If I'd volunteered for Habitat for Humanity in a lumber yard, and somebody tracked me down two years after I left and claimed that some of the lumber I handled has locked them out of their house for the past six years... This would be _unusual_, right?

I'd suspect that this person's first language isn't English, except that the phone number has a South Dakota area code. (Is it possible to voluntarily immigrate to South Dakota from outside the US? Where from? Siberia? It's negative 33 degrees there right now according to the weather map.)

I'm not upset, and I'm not trying to make fun of the guy, but I get this kind of thing on a semi-regular basis and it's always time consuming to tease out the information I need just to figure out what they're talking about, let alone how to actually help them (if I even can). Eric got enough of this that he wrote a "how to ask smart questions" document, which I suppose is nice if they read it before they email you. (Pointing them at one seems a bit of a brush-off: if they can't communicate clearly in English, asking them to read a lot of it may be a bit of a stretch. Then again, he's at least mildly famous and I'm not, so he gets even more email than I do, and needs coping mechanisms for the volume. I have yet to write up any form letters.)

January 22, 2009

Last day in Orange County, plane back to Austin leaves tomorrow morning.

Tired. Busy. Received many conflicting todo items.

Never did get permission to see the base files that the 185k patch applied to (going back and clearing more stuff with legal, not gonna happen before we head back), so I'm back poking at the OpenWRT patches. Mark's off doing something else at the moment so I'm going through the OpenWRT patches again to reproduce Mark's familiarity with them. Apparently OpenWRT has a fix to this DRAM issue somewhere in the pile (I did not know that), although spotting patches we need to add is easy; eliminating stuff we _don't_ is the hard part. (At least it is when there's no "obvious crap" filter like the past two days of filtering. All the OpenWRT patches do _something_, the question is whether it's something useful.)

I also need to build a stock 2.6.28 kernel and see if I can get it to do anything on this hardware. (I was working against a 2.6.25 base last time, and most of the patches I was playing with were either for hardware we don't need for an initramfs shell prompt via serial port, or it already got merged.)

The California offices have free diet mountain dew, and Kona coffee, which Mark really likes. I should try it some day. The caffeine is keeping me upright.

Evening. Hotel. Brain-fried. Words. Thingy.

Watched about half of "300" on HBO (switching channels a lot). Yeah, it's a comic book all right. I think the humans were the protagonists and the monsters from "Doom II" were the bad guys. Where they got a cave troll is anybody's guess, although I'm guessing Ra brought them. (The bad guy from the "Stargate" movie was calling himself "Xerxes" this time, apparently this covers the time period he was on earth the first time.)

Yeah, we went back to Claim Jumper's. We thought "maybe we can get a small". Didn't happen. Garlic cheese toast again, and this time I got the tri-tip steak thingies (so good). The to-go box has half a giant sweet potato (covered with butter, brown sugar, and cinnamon), some excellent mashed potatoes, and a large pile of my half of the bread pudding. Absolutely zero urge to eat any of it so far.

January 21, 2009

So of the 185k patch, I wound up keeping 7837 lines total, and it turns out it's not complete. You know how OpenWRT has a directory full of files it adds to the kernel source, and then has a bunch of patches applied on top of that? (Yes, the patches modify files out of the directory of ones they added, and yes the patches add entire new files.) Well, it turns out they got that from broadcom. OpenWRT seems to have started out based on the Linksys source release (which upstream came from broadcom), and they never questioned their initial design assumptions, so all these years later they're still doing strange and horrible things.

Oh, and it doesn't apply to a stock 2.6.22 kernel. It applies to the 2.6.22 kernel off of Yeah. (Dear Broadcom: you're doing it wrong.)

So: diff the stock 2.6.22 kernel against their 2.6.22 kernel, turn the directory of files into a second patch, apply the stripped down _third_ patch on top of that, compile and boot the entire mess to make sure that the 177k lines of patch I deleted didn't accidentally contain anything interesting, and THEN start trying to figure out what all this stuff actually does and how to transplant the interesting bits to the 2.6.28 or 2.6.29 kernel.

We went to "in and out burger" for dinner tonight. They do burger and fries, and that's pretty much it, but they do it quite well. (For once, no leftover bag, and I'm just fine with this.)

Back in the hotel, playing Mr. Do under Mame. No more brain today.

January 20, 2009

It's over. Thank goodness.

I'm enjoying the coronation coverage, what little of it I've been able to watch on the TV in Stephen's cubicle. (Cisco's keeping me busy today, and the Linux flash plugin's never reliably been able to watch MSNBC content; if it starts playing it keeps going, but sometimes you reload a half dozen times and it never starts, and this is one of those days.)

I stayed up way too late last night watching the pre-game. My hotel room has cable. No Daily Show, but MSNBC was rerunning Keith and Rachel over and over (in a way, this is a vindication of them too), and the history channel did a thing on the history of the white house, biography did one on Obama followed by one on Gandhi, and even BBC america was running a very interesting series on US history that I should track down and throw in netflix. (It's called "The American Future", they were having a marathon.)

Evening now, and they're rerunning the coronation. (Yes, I'm willing to watch Faux News for this if they're the only ones showing it.) Switching away to watch some of the other biographical specials on Obama on A&E and such (MSNBC is doing a good one; hopefully this means Keith and Rachel get some sleep, they looked a bit punchy earlier today).

Heh. I liked Obama's little unapologetic smirk when he repeated his full name during the oath, despite the justice using just his middle initial. (No, you're not getting away with that here; the president's middle name is "Hussein", deal with it.)

It's nice to actually feel patriotic again.

Cisco gave me a ginormous (6.5 megabyte, 180k+ lines) patch that's the diff between the stock 2.6.22 kernel and the broadcom one they're running on the 610n. Most of this is serious crap: they apparently did a global search and replace removing all comments containing "fixme", "workaround", or "xxx". They removed any #if 0 segments, and trailing whitespace lines at the end of files. They updated every one of those magic CVS version comments (often reverting them; everything everywhere is now the same version), and similarly changed a lot of instances of dates to August 3, 2007. Oh, and they changed some indentation and wordwrapping too.

Spent several hours deleting pages and pages of noise out of this patch during the day, and now up late at the hotel, listening to reruns of the coronation coverage while deleting more unneeded patch hunks.

The trick is that if a hunk _only_ deletes code it's almost certainly zapping comments or whitespace, and if it's only modifying code lines it's almost certainly redoing whitespace or zapping comments at the end of lines. The interesting bits all add at least one line that wasn't there before. (There are occasional exceptions to this, but you can mostly spot the patterns and thus hold down the "d" key and scan the hunks as they're scrolling by for anything you don't recognize. If you overshoot, there's "u" to undo, which alas only has a hundred or so lines of backup you can do, which occasionally isn't enough if a long file has one viable hunk after several screens full of crap. So I save every time I get to a good stopping point and want to look up at the TV.)
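
That heuristic is mechanical enough to sketch in code. This is an assumed toy version (the actual filtering was done by hand in an editor): a unified diff hunk only counts as interesting if it adds at least one line.

```c
#include <assert.h>
#include <string.h>

/* Does a chunk of unified diff text add any lines?  Lines starting
 * with '+' are additions, except "+++" which is a file header.  Hunks
 * that only delete ('-' lines) or only touch context are the comment
 * and whitespace churn described above, and can be thrown away. */
int hunk_adds_lines(const char *hunk)
{
    const char *p = hunk;

    while (p && *p) {
        if (p[0] == '+' && strncmp(p, "+++", 3)) return 1;
        p = strchr(p, '\n');   /* advance to the next line */
        if (p) p++;
    }
    return 0;
}
```

A real tool would also have to keep the `---`/`+++`/`@@` headers for any hunk it retains, but the keep-or-kill decision is just this test.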

I've deleted about 80k lines so far, and kept about 6500 lines of actual code changes that might possibly be relevant. Once I've got a reasonably coherent patch, I need to build it, test it on the hardware, try to port it to something more current than 2.6.22... Might just post what I've got to linux-mips as an "FYI", since Cisco's now got permission to open source this.

Last night Mark and I went to a place called "Claim Jumper's" for dinner. The appetizer was garlic parmesan onion rings, which were marvelous and enormous. I had the new england clam chowder in a bread bowl for the main course, which was good, but the menu showed their beef stew in the bread bowl, and that's really what it was designed for, I suspect. The top of the loaf they chopped off was toasted with cheese and served on the side, and that was really good too. Mark had bread pudding for dessert and I had the red velvet cupcake (which despite the name was enormous, and had a large lump of whipped cream/cream cheese frosting on top and chocolate syrup all over the place). I took half the cupcake home, took it to work in the morning, and finished it for lunch.

Tonight we went back to Claim Jumper's, split the garlic cheese toast as an appetizer (better than the stuff that came with the bread bowl last night) and both had the "whiskey chicken" main course (ooh, that was good). I got another red velvet cupcake thing (it's like 2 pounds, it's huge) and Mark got the bread pudding again, which I tried a bit of and wound up eating rather a lot of. I took a large to-go box back to the hotel with chicken and mashed potatoes and bread pudding, and ate it around midnight. I think I've eaten my entire week's worth of food already, and it's Wednesday. (I still have a bag of Ritz chips left from the plane, which I ordinarily love but just haven't ever been hungry for. Mark and I are sharing a car and for insurance reasons I'm the only one allowed to drive it (he's under 25, it's darn expensive to get him rental insurance) so I have to drive him out for food, meaning even though I was fine with the red velvet cupcake for lunch I wound up going to Panera Bread with him anyway, and so I had half an Asiago roast beef sandwich I couldn't finish at lunch on top of the Claim Jumper's leftovers.) (I ate that too, around midnight. Well, it was good.)

2am now. California time. I need to be up in 6 hours...

January 19, 2009

Trip to california, on our three hour layover in Phoenix, Arizona, at "Jackalope Flats", some kind of bar thing in terminal C of the airport. Having a truly inferior diet coke, which tastes like somebody boiled a dirty sword in it. (David Mandala told me that the water in Phoenix was ridiculously hard, but when the flavor shines through soda and a slice of lime that's kind of intimidating.)

Wireless internet claims to be available... And it is. Cool.

Wondering if I should care about the EFF's campaign to end phone locking...

YES! Squashfs got merged! (Finally!) Happy dance! (Well, virtual happy dance what with toe and airport and all.)

At the hotel. Still no internet here, but Cisco should have it in the morning. Out with Mark to "Claim Jumper's" for dinner (really good, if a bit expensive). Catching up on podcasts some more. (I like the way Rachel Maddow is giving bartending advice.)

Building all targets with current scripts under 2.6.28, and poking at a toybox 0.0.8 release.

The next time I'm responsible for programming at a convention Neil Gaiman is attending, I have to schedule time for him to experiment with liquid nitrogen ice cream recipes. (Cranberry Mango sounds like it would be interesting. Or perhaps work some lime in there. Sea salt was suggested by Kingdom Hearts II, but I don't know anybody who's tried it yet.)

Oh, and next time Eric's in Austin I need to drag him to Firehouse subs. (Or just let Mark do it.) They have a wall of hot sauces.

January 18, 2009

Somehow, kmail has decided that my outbox folder has 2 new message(s) in it despite only having 1 total message(s) in it. And the "send queued messages" thing in the file menu is often greyed out when there's stuff to send, so I have to go to the outbox menu, double click a message to edit it, and then click send in the new window. I really need to find a new email program that doesn't launch half of KDE behind the scenes just to be an email client. It gets _really_ slow at times because of this.

Toe feeling much better. Still probably broken, but no longer in constant pain. Which is good, because I need to get stuff done before heading to California tomorrow, and that was _really_ distracting. (I'd hoped getting a mortgage pre-approval letter thingy might be one of those things, but Monday's a holiday and then I'm in california for a week. Fun.)

Fun comment from one of Rachel Maddow's guests, Ben Nelson, (D) Nebraska, on January 6th: "This administration has equated democracy with having elections. You cannot have a democracy when one of the parties running has a militia." (Referring to the election of Hamas and the resulting fun in the middle east.) The later line "China, where copyright laws go to die" was a better one-liner.

Ok, added mkswap. I should do swapon too, but I have a note that I need to redo readlink, except I don't remember _why_. Rummage, rummage... Ah. It's because the realpath() function out of libc only supports "NULL" as the buffer when using glibc. For uClibc, it won't allocate its own buffer, it provides no feedback about how long the buffer should be, and it silently truncates with no error when given a buffer too short. So yeah, realpath() under uClibc is more or less useless. So, implement my own, or fix uClibc's realpath()? (I note these solutions will probably overlap.)
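
For reference, this is roughly the shape the workaround takes. A sketch (not toybox's actual code) of a self-allocating readlink() wrapper; the same grow-and-retry idiom works for wrapping a realpath() that won't allocate its own buffer:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Self-allocating readlink() wrapper: readlink() neither NUL
 * terminates nor reports the size it needed, so grow the buffer until
 * the result fits with room to spare.  Caller frees the result. */
char *xreadlink(const char *path)
{
    size_t size = 64;
    char *buf = NULL;

    for (;;) {
        char *nbuf = realloc(buf, size);
        ssize_t len;

        if (!nbuf) {
            free(buf);
            return NULL;
        }
        buf = nbuf;
        len = readlink(path, buf, size);
        if (len < 0) {
            free(buf);
            return NULL;
        }
        if ((size_t)len < size) {
            buf[len] = 0;  /* readlink() doesn't NUL terminate */
            return buf;
        }
        size *= 2;  /* result filled the buffer: maybe truncated, retry */
    }
}
```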

Playing with Ubuntu's usb-creator (which claims it can turn a bootable ISO into a bootable USB key). The progress bar claims to be 101% complete. I have my doubts.

January 17, 2009

Ow. Ow ow ow ow ow ow ow ow ow.


I may have broken a toe.

Fade and I spent most of today seeing houses for sale north of UT, about half a mile northeast of where we are now. With mortgage rates this low and the current condo mostly paid off, trading up to something with 3 bedrooms, 2 baths, and a yard for the cats seems like a good idea. Modulo being able to get a mortgage, which has always been interesting with my "work when something interesting comes along, take half a year off to do open source" job history.

When we got home, Fade wanted to go to the alcohol store because she's stressed out about the idea of moving again (especially having just signed up for a semester of classes within easy biking distance of where we currently are). She walks about twice as fast as I do, and she had me carry one of the bottles on the way back, and I also had the energy drink I'd bought, so I slipped on mud on the sidewalk with both hands full while trying to keep up, and came down with my entire weight on my right big toe.

This hurt, a lot.

On the bright side, it bled enough to get most of the mud off. Walking home was not fun. Big bag of ice on it right now, which wasn't fun to apply either. Too swollen to tell if it's actually broken or just really unhappy with me right now.

I'd planned to spend the evening programming (hence the energy drink), but my foot hurts too much to concentrate.

I'm trying to figure out how to cleanly printf() an off_t value. I know about PRIu64 for uint64_t, but there doesn't seem to be an equivalent macro for off_t. What am I supposed to do, typecast it? Just use a uint64_t all the time?
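
For what it's worth, the portable C99 answer is neither: off_t is a signed integer type with no PRI* macro of its own, so cast it to intmax_t and print with %jd. A small demonstration (the helper name is made up):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Format an off_t portably: cast to intmax_t (the widest signed
 * integer type, so it holds any off_t) and use C99's %jd conversion. */
int format_off_t(char *buf, size_t n, off_t val)
{
    return snprintf(buf, n, "%jd", (intmax_t)val);
}
```

Casting to uint64_t and using PRIu64 also works in practice on Linux, but off_t is signed, so intmax_t and %jd match its actual type.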

Nope, can't concentrate.

January 16, 2009

The best comment on Steve Jobs' announcement I've seen so far, from Eric Burns of websnark:

Of course Steve Jobs needs to take time off. The Lich process takes weeks to fully finish, and the Phylactery soulbind takes days to set.

Poking at the uClibc locale data thingy. Figured out how to get a current glibc localedata directory tarball (2.8 and 2.9 didn't ship tarballs, you have to grab them from Red Hat's source control).

I'm still fighting with initramfs. It turns out that if you don't enable CONFIG_BLK_DEV_INITRD in the kernel .config then you don't get initramfs support anymore. (They changed this back in February 2007, now it hardwires in "init/noinitramfs.c" instead.) So it worked fine on i686 where this was enabled in the miniconfig, but not in mipsel which didn't have it. Wheee...
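
In miniconfig terms that's a one-line fix (the symbol is real; depending on kernel version its dependencies can drag in more):

```
# Without this, kernels since February 2007 link init/noinitramfs.c
# and quietly ignore any external initramfs image handed to them.
CONFIG_BLK_DEV_INITRD=y
```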

Got it to work. Now to debug why the toybox df command isn't showing initramfs. Easy way to do this is to build toybox natively within an initramfs, and since next week's demo is mipsel I'm doing this under mipsel emulation. And wow, building under mips emulation is _insanely_ slow. (Much slower than building under arm emulation was.) I don't know if something's weird with qemu, or something's weird with gcc on a mips host, or what. It actually seems to be a gcc issue; the shell and makefile are working reasonably, it's the native gcc that's Unhappy. (Possibly running from initramfs is confusing its memory estimation algorithms...?) It could also have something to do with using qemu-svn instead of the last release version; TCG is new and not necessarily tweaked yet...

Nope, it's got to be gcc. Extracting a 200k bzip2 file seems to run at a semi-reasonable rate...

Ah, the reason df isn't reporting anything for rootfs is it's getting a total size of 0 for the filesystem. (It filters those out by default, so you don't see /proc and /sys and such.)
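
The filter is easy to reproduce. A hedged sketch of the test df is effectively applying (using statvfs(); whatever toybox actually calls may differ):

```c
#include <assert.h>
#include <sys/statvfs.h>

/* Should a df-alike display this mount?  Synthetic filesystems like
 * /proc and /sys report f_blocks == 0 -- the "total size of 0"
 * described above -- so skip anything with no reported size. */
int df_should_show(const char *mountpoint)
{
    struct statvfs st;

    if (statvfs(mountpoint, &st) < 0) return 0;
    return st.f_blocks != 0;
}
```

Which also explains why rootfs vanishes from the listing: it reports zero total blocks even though it's a real, writeable filesystem.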

January 15, 2009

So today I bought a "monster energy drink" (citrus flavored), drank the whole thing, started thinking clearly, and finally fixed initramfs. (I have not made any tea in a week. I've been blocked debugging initramfs for a week. These facts may be related.)

So initramfs is working. Actually supplying the "console=" argument in the version of the script that gets _run_ is useful. (Poking through the code to see why an option you're not passing isn't getting parsed? Balancing the wrong tree.)

The toybox df command is broken. Need to fix that. And finish up the half-finished mkswap and swapon implementations I'm adding. That should be enough to cut a toybox release, clearing the way to upgrade the kernel to 2.6.28...

January 14, 2009

Met with Mark and Stu for lunch. Alas, I was operating on four hours of sleep so don't remember much. Apparently the trip to California's been rescheduled for tuesday-thursday, so I leave monday and come back friday.

Debugging early kernel boot without a working console is evil. (I miss User Mode Linux, where you could stick a printf into the source code and it didn't need a working console for you to see it. Alas, UML doesn't seem to like uClibc, so adding it as a hardware target is more time consuming than just hitting the qemu version of initramfs directly with a large rock until it gives in.)

Another long, frustrating day of debugging something that wouldn't speak to me, but I finally got it to confess.

So the problem I've been having turns out to be that putting /dev/console in initramfs and trying to use serial console is Unhappy. This took many many hours of staring at blank screens to track down. Eventually, after spending yet another day frowning, guessing, and reading source code, I just booted the vga virtual terminal and _that_ worked. I have an initramfs shell prompt; on the wrong console, but it's booting, and it's in initramfs.

The console subsystem is the problem. It's got a /dev/console which is readable and writable and c 5 1, and when I don't feed it a console= argument it happily writes out to the VGA virtual terminal. (Why am I compiling _in_ the VGA virtual terminal? I need to strip down .config a bit, it seems.) And once I've booted up to a prompt, if I mknod a /dev/ttyS0 that's c 4 64 and echo to it, I get serial output just fine. If I say console=/dev/ttyS0, I get nothing. If I add /dev/ttyS0 to the initramfs, it doesn't help either.

Right, NOW I can add printk() calls to the darn kernel to see what the heck it thinks it's doing. In the morning, anyway...

January 13, 2009

Met with Mark for lunch, got the demo of Gentoo From Scratch, got a fresh todo list.

Here's an extract from an email I sent earlier today, apologizing for being overworked:

Right now I need to get initramfs packaging working, finish debugging the new hardware target mechanism for FWL, figure out what the heck's wrong with trying to use User Mode Linux as a fake hardware target for test purposes (I'd hoped to use that as a test platform for initramfs but it's turned into its own bottleneck, segfaulting before boot and refusing to statically link against uClibc), get powerpc working in the 2.6.28 kernel so I can finally leave 2.6.25 behind, upgrade the broadcom patch to work with 2.6.28 (some of it got merged so the current patch doesn't apply), upgrade trx packaging so it's adding the extra 32 byte header that the wrt610n needs, test out the uClibc-svn version, reimplement squashfs packaging and get the appropriate patch from the .git tree and report feedback to the kernel list in hopes of helping get it merged, make travel arrangements to go to Cisco next week with Mark, work out what Mark and I are demoing for the Cisco guys, get the Xilinx toolchain paperwork signed and sent to that guy in australia for the toolchain work Stu is doing through Impact Linux, meet Stu for dinner and get a status report on that project...

That's what I remember off the top of my head.  I know I'm forgetting all sorts of stuff.  That's not counting the todo items for me that Mark posted to the impact list (all those extra flags various commands need), or the two new todo lists he sent me during lunch (where we went over GFS and argued about the design a bit).  Oh yeah, and I promised the uClibc guys I'd get some converted locale data for them.  And the Neuros guys sent me some set top box hardware I haven't had a chance to _look_ at (took it over to Mark's so he could poke at it a bit, but I haven't got an HDTV to hook it up to yet).  The guys seem to have given up on me, I have their hardware sitting under my kitchen table but haven't had time to look at it in two weeks; their bug was almost certainly load-spike driven memory exhaustion, which they didn't want to hear but it was probably the truth.  Couldn't easily _prove_ it under a 2.4 kernel, though.

So yeah, I'll try to work cmake and dvv in there.  Sorry for letting this drop on the floor so long...

Since then, the trip to California has gotten dates attached: monday-wednesday next week, meaning sunday and thursday are taken up with travel. (Travel takes a day. You can work with your laptop at the airport and on the plane, but trying to stick it in at the beginning or end of a day of doing other things just doesn't work. Maybe under Obama it'll start to again; he might dismantle the TSA completely.)

Things I forgot to mention: put out a toybox release. New one: dig through the openwrt (or old linksys build) infrastructure and see what kind of kernel image it's building if it's not a vmlinux and it's not a bzImage. (Done.) I'd like to push my miniconfig stuff upstream into the kernel...

Feeling just a touch overwhelmed with todo items.

January 12, 2009

Catching up on podcasts. I think the December 23 episode of Rachel Maddow's show was better than her end of the year awards ceremony.

Ooh, David Lang pointed me at Firefox's about:config entry "toolkit.networkmanager.disable", which if set to true rips out something firefox should never ever ever have implemented in the first place. Thanks!

Working on a hw-uml target to build User Mode Linux, which is fun to debug things like initramfs with. (A preposition is an excellent word to end a sentence with. And we start sentences with "and", and have been known to boldly split infinitives that everybody has split before.)

I'd like to tell UML that its base arch is i686, but I can't see a way to specify that. Digging in the makefile... I have to specify "SUBARCH" on the command line, which means I should add a LINUX_FLAGS like GCC_FLAGS and BINUTILS_FLAGS. Except this is a side issue, the x86_64 ARCH=um build just went:

fs/hostfs/hostfs_user.c: In function 'set_attr':
fs/hostfs/hostfs_user.c:302: error: implicit declaration of function 'futimes'
make[2]: *** [fs/hostfs/hostfs_user.o] Error 1

I am _so_ glad that I don't depend on this crud anymore. Ok, try 2.6.28... Ah yes, I need USE_UNSTABLE=toybox in order to apply the patches to USE_UNSTABLE=linux. And at least part of the broadcom patch is already there in 2.6.28, so just symlinking it doesn't apply (yank it for now)...

And 2.6.28 has the exact same bug. So in three releases, an obvious build break hasn't been fixed. I guess both the other users of User Mode Linux gave up on it too? Ah, no, it's a uClibc thing. So UML won't build against any release version of uClibc or older versions of glibc. Right.

Ah, updating qemu-svn and rebuilding fixed the powerpc segfault I was seeing. Cool.

Wow, User Mode Linux is crap. I built an x86-64 version and it panicked before giving me a shell prompt. I tried to build an i686 version and it went "no, you're building on a 64 bit host, you must be building a 64 bit version". The top level Makefile unconditionally calls "uname -m" and parses its output; switching it to ?= isn't good enough because it's recursive makefiles that don't pass all the environment variables through. So it cannot be cross compiled without patching the source, or providing a special uname that lies.
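That last option can be sketched in a few lines; the fakebin directory name and the i686 answer are just for illustration:

```shell
# A "uname that lies": a wrapper early in $PATH so recursive makes that
# run "uname -m" see i686. The fakebin directory is made up for the demo.
mkdir -p fakebin
cat > fakebin/uname << 'EOF'
#!/bin/sh
# Lie about the machine type, pass everything else through.
[ "$1" = "-m" ] && { echo i686; exit 0; }
exec /bin/uname "$@"
EOF
chmod +x fakebin/uname
PATH="$PWD/fakebin:$PATH" uname -m    # i686, regardless of the real host
```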

Patched out uname -m and just hardwired in "i386", and that worked but then the link died trying to statically link against uClibc. Which is weird, because it dynamically links against it just fine?

January 11, 2009

Renamed it, which is something I've been meaning to do for a couple of weeks. Now the tarballs and the scripts producing them all have the same name, but it'll be a while before I stop typing "pack[tab]" to try to open the file. (I'd have a similar problem still writing "2006" on my checks, except Fade does the bills requiring mailing things, so I mostly pay for things with a debit card or cash.)

So in theory the "kill" command can specify jobs instead of PIDs. (This is why kill is a shell builtin as well as an external command: only the shell can see its own job table.) Jobs are things you background with & (or with ctrl-z and the bg command), and they're numbered just like pids are. The first one is #1, just like PID #1. So "kill 1" would try (and fail) to kill init.

The bash man page says "jobspec", but grepping for each instance of "jobspec" shows it never once says what a jobspec _is_. The "bg" command accepts just raw numbers. Luckily, SUSv4 does say what a jobspec is: it's got a percent sign in front of it.
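For reference, a quick sketch of jobspecs in action (job control needs `set -m` when run from a script; the sleeps are placeholders):

```shell
# Jobspecs are % plus the job number; bare numbers are PIDs.
set -m             # job control is off by default in non-interactive shells
sleep 100 &        # becomes job %1
sleep 100 &        # becomes job %2
jobs               # lists [1] and [2]
kill %1 %2         # kills both sleeps; "kill 1 2" would target PIDs 1 and 2
```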

And after all that, it doesn't help. Jobs aren't visible to subshells, so attempting to kill jobs from a signal handler doesn't work. Luckily, I solved this problem before, for distccd: 'trap "kill $(jobs -p)" EXIT' does what I want because the $() inside the "" is evaluated when the trap is set, so it resolves to process ID numbers before the subshell gets run. The problem is you have to run that line _after_ you background the process, so there's a one line window where ctrl-c can hit without killing the background process. Still, at this point I don't really care; it's primarily an aesthetic problem that's hard enough to hit that I'll stop fiddling with it and move on.
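The quoting subtlety in that trap line, spelled out (the sleep stands in for the real background job):

```shell
# Double quotes: $(jobs -p) expands NOW, while the trap is being set,
# so the handler body contains literal PIDs and doesn't care that the
# EXIT handler's subshell can't see the job table:
sleep 100 &
trap "kill $(jobs -p)" EXIT   # handler body is e.g. "kill 12345"
trap -p EXIT                  # shows the PIDs already baked in

# Single quotes would defer the expansion to exit time, where
# "jobs -p" comes up empty and nothing gets killed:
# trap 'kill $(jobs -p)' EXIT
```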

Sigh. Except it's not working. Killing the pipeline doesn't kill the make process that's a child of it. (I could trivially "killall make", but that's the wrong solution; other make processes could legitimately be running as the same user on the system.)

And "killall -g $(jobs -p)" isn't working because the busybox killall command doesn't understand -g. I can't do it myself with ps and grep because the busybox ps command doesn't show parent process IDs (or any parent/child relationship, really). Groveling around in /proc myself seems a bit overkill somehow.

The pipeline in question is (make blah-blah-blah || dienow)& and although I could turn that into (exec make blah-blah-blah)& and thus eliminate the subshell, then it wouldn't be able to report an error exit back to the rest of the build.

Ah, according to SUSv4 "kill 0" takes out the current process group, and THAT does what I want. Right.
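A safe way to try that out; the setsid is only there so the demo's kill 0 hits its own throwaway process group instead of your login shell's:

```shell
# "kill 0" signals every process in the sender's process group, which
# catches children of children that an explicit PID list would miss.
# setsid gives the demo its own group so it doesn't take out this shell:
setsid sh -c 'sleep 300 & sleep 300 & kill -TERM 0'
echo "outer shell survives"
```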

Mark told me the name of the kernel file I need to grab for hw-wrt610n, but I don't remember it and the .config I have only seems to be building vmlinux. He also told me the name of the file that adds the extra header to the trx image, but I don't remember which one it was. (I'm guessing addpattern.c, which produces a 32 byte structure, rather than add_header.c, which doesn't.)

This is why I like to have things emailed to me: I forget otherwise...

The gnome xterm highlights URLs when you mouse over them, but when you right click on one it doesn't light up the "copy" option (you have to drag the mouse and highlight it just like any other text to get that); instead it has a "copy link address". What's the point here?

Sigh. The 24 hour places around UT play loud annoying rap after midnight, to keep homeless people from trying to come in and sleep in the booths. I need much better noise cancelling headphones; removing 20 decibels from (c)rap is not nearly enough.

Should I find it ironic that there's a 24 hour donut shop down the street from Kirby Lane that I've never seen a single police officer in, but Kirby Lane has two tables of 'em right now? The police in Austin eat healthy, it seems.

January 10, 2009

Ok, this might be a disruptive technology that breaks the size limitations on the displays for mobile devices. (It needs something to project on, but that's easier to come by, or transport for that matter.) Ok, it probably doesn't work very well in direct sunlight but neither does my laptop's backlit LCD here in Texas.

Elze would prefer I link to her website rather than her Flickr photoset. (Judging by the photos, she has a more interesting life than I do.)

I recall having written a simplified mkswap.c, although the one in busybox 1.2.2 isn't it and it isn't in toybox... Ah, I checked it in as busybox svn 15704. A reminder that "last release before Bruce made working on the project intolerable" is not the same as "where I actually stopped working on it".

This comes up because I have various changes I need to make to busybox and I'd just as soon replace the appropriate applets with toybox instead, but one of those is ash, and reviving toysh is a biggish unsticking. The point I'm stuck at is that argument parsing happens in the wrong order for handling multi-line options (which includes everything from here documents to if statements). So I have to back up and redo plumbing before moving forward. Eh, it happens.

January 9, 2009

Lunch with Mark and Fade at Schlotzsky's, now hanging out at Epoch. Some Dell engineers at the next table said that IE 8 is based on Mozilla 5. (The frightening part is they would know.)

Teaching FWL that a target can have a BASE_ARCH, where only packaging varies (new kernel, same root filesystem). (End result is that $ARCH is set as normal, and $ARCH_NAME is the one you called. Which was, in fact, already the case.) I'm trying to figure out if creating cross-compiler-$ARCH -> cross-compiler-$ARCH_NAME symlinks (and so on) is worth the complexity here. (The problem is that the path in $CROSS is set once by the build and everything downstream just uses it. It got abstracted away; now I need to manipulate it. Maybe I should just link the tarballs...)

On the other hand, the end of the script does a "cd $BUILD; tar -f blah.tbz blah", so those paths aren't _really_ abstracted away, are they? CROSS is the path to the cross compiler you build things with.

Hmmm, doing parallel builds with a BASE_ARCH is kind of tricky. Assuming that everything with a BASE_ARCH will have a hw- prefix (it's hardware) then I need to teach buildall to build all the non-hw targets first, and then do the hw ones. And replicate the trick that if the tarball's already there, don't rebuild the base.

There's always been a bit of tension between buildall reproducing that setup internally vs just calling the script. The problem is that not only do we not need to reproduce that part on each build, but there was a potential race condition where doing things like unconditionally recreating the symlinks could cause a build happening at the same time to not find a command when it tried to use it.

Now that it checks for the existence of target stuff before trying to rebuild it (including symlinks to the host toolchain), I think it should be ok. There's still the whole "$CROSS_BUILD_STATIC" rigmarole (building an i686 target and rebuilding in that under qemu), but I'll worry about that later...

So I've got to rewrite the buildall script for the... fifth time? Sigh.

January 8, 2009

Phone call with David Mandala yesterday about the amazon computing cloud. He's put together some Ubuntu images to feed into there, and got a little bit of budget for Mark and myself to test out our FWL builds under that. I immediately handed this over to Mark (who is much better at this sort of thing), and he reported back that what we want is their High-CPU Extra Large Instance with cheese fries, which gives us an 8-way 2.5 ghz 64 bit processor, 7 gigabytes of ram, and 1.6 terabytes of disk for 80 cents/hour. Between runs they give you persistent storage at 10 cents per gigabyte each month.

Assuming we run two hours worth of nightly builds ($1.60 times 30 days is $48) and grab maybe 20 gigs of persistent local storage in their cloud ($2), that's about $50/month. If we bought our own build server it might _depreciate_ faster than that.

Mark is now more or less drooling with anticipation, and configuring this sort of thing is an area where he's way better than me anyway so I'm happy to stay out of his way for the moment. What _I'm_ looking forward to is making this work out of the box with FWL, and documenting how to use it so _other_ people can easily use FWL to do native builds under emulation on monster hardware at prices available to hobbyists.

Oh wow. It's 2am at Jimmy John's and the music is doing a rock version of "What a wonderful world", apparently by the Ramones. (Due to its position at the end of the BBC miniseries version of The Hitchhiker's Guide to the Galaxy, I've always considered the original Louis Armstrong version of that to be "end of the world" music. Even more so than Carmina Burana, which is merely "somebody set off a self destruct, please either reach minimum safe distance or go switch it off" music. Not quite the same life flashing before your eyes quality.)

Because of this, my brain is more bent than you'd expect by a rock version of what was originally a melancholy song, normally TOTALLY outside any genre I'd be interested in but grandfathered in because The Hitchhiker's Guide to the Galaxy was fairly close to a religion for me growing up. (It teaches that refuge in absurdity can cope with anything, you just have to get absurd ENOUGH.)

At some point over the past week, I broke FWL. The images it's building won't boot; the kernel messages scroll by but they never find their root filesystem. I have no idea WHY, and no idea what I did, but they're working for Mark. So...

Grab a fresh snapshot from hg into a new directory, do the PREFERRED_MIRROR= trick and... I apparently forgot to copy the current versions of uClibc, busybox, and genext2fs into the mirror directory. Right, fix that. NOT using the UNSTABLE versions of anything, let's build and... it doesn't boot. Ok, set PATH to exclude /usr/local/bin so I'm using the Ubuntu version of qemu instead of the svn one... That wasn't it.

Binary search time! Ok, try commit 540... and I forgot to check busybox 1.13.0 into the mirror too. (Once upon a time, my main working sources/packages directory was a link to the build directory, so it automatically updated with every package I actually used. But then I implemented the logic to delete old unused versions, and separated the two. I need to add some copying logic to

And I have no internet access at Jimmy John's... But I can just tweak to have busybox-1.13.1, I know that's not the problem. Zero out the sha1sum so it doesn't have an immune reaction to the wrong tarball and rebuild... And commit 540 worked. (Which is good, it _used_ to work, and that means it's not something funky about my laptop or qemu.)

But 560 is broken. Ok, it's been dead for a while longer than I thought. (You'd think I would have noticed. But I was busy exorcising perl...) 550... worked. 555... failed.

Continue this in the morning.

And it's morning. (Miracles of the internet, eh?)

553 failed, rechecked 550, it worked, so the mercurial "forgetting" messages weren't about leaving sources/patches files behind that might screw up the build. (Good to know.) 552... worked. It's 553. Which was moving the kernel build from mini-native to

Alright, what changed? Try booting the new kernel with the old root filesystem... failed. New root filesystem with old kernel... worked. It's the kernel. Which is 32 bytes larger...

HA! Ok, USE_INITRAMFS needs to start with a $ or it's always going to trigger the if clause. Right. That took _way_ too long to track down. (But of course I didn't see it _looking_ at the code. I knew what I meant.)
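A minimal reconstruction of that class of bug (the actual test in the build script may have looked different):

```shell
# Without the leading $, the test sees the literal string "USE_INITRAMFS",
# which is never empty, so the if clause always triggers:
USE_INITRAMFS=""

if [ ! -z "USE_INITRAMFS" ]; then echo "always fires"; fi     # the bug
if [ ! -z "$USE_INITRAMFS" ]; then echo "only when set"; fi   # the fix
```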

January 7, 2009

So the reason that initramfs isn't working is that the kernel build is using "find -printf", which busybox doesn't support. Specifically, it needs %T %p %m %U and %G. And yes, it's a gnu extension that isn't in the SUSv4 find.
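For reference, here's roughly what those GNU extensions produce; the exact format string the kernel's script uses is an assumption, and %T needs a trailing format letter:

```shell
# GNU find can report mode, owner, group, path, and timestamp per file,
# which is the information a gen_init_cpio file list needs. busybox find
# doesn't implement -printf at all.
mkdir -p demo/bin
find demo -printf '%m %U %G %p %T@\n'
```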

Hmmm, do I patch busybox to add -printf to find, do I try to push a patch up into the kernel to stop using the gnu extension, do I implement find in toybox, do I write my own

Mark is blocked and waiting for this functionality. Right...

Ok, so I wrote a new very small script to create the text file from the directory, but if I then feed it into the kernel infrastructure it goes "boing" because it needs find -print in order to figure out its argument is a text file.


I hate makefiles. I'm creating a usr/initramfs_data.cpio.gz file in the kernel source. It's newer than any other file in the source. But it's rebuilding it. Why does make decide to rebuild it? How does one single-step through a makefile's rebuild decisions? Ok, with make -d, but when it says:

Considering target file `usr/gen_init_cpio'.
  Considering target file `usr/gen_init_cpio.c'.
   Looking for an implicit rule for `usr/gen_init_cpio.c'.
   Trying pattern rule with stem `gen_init_cpio.c'.
   Trying implicit prerequisite `usr/gen_init_cpio.c_shipped'.
   No implicit rule found for `usr/gen_init_cpio.c'.
   Finished prerequisites of target file `usr/gen_init_cpio.c'.
  No need to remake target `usr/gen_init_cpio.c'.
  Pruning file `FORCE'.
 Finished prerequisites of target file `usr/gen_init_cpio'.
 Prerequisite `usr/gen_init_cpio.c' is older than target `usr/gen_init_cpio'.
 Prerequisite `FORCE' of target `usr/gen_init_cpio' does not exist.
Must remake target `usr/gen_init_cpio'.

Where does "FORCE" come from? The only instance in linux/usr/Makefile is:

$(obj)/initramfs_data.o: $(obj)/initramfs_data.cpio.gz FORCE

And that's for a different target which is a downstream consumer of this stuff... (And removing it doesn't help anyway.)

Ok, build the kernel, _then_ replace the initramfs_data.cpio.gz file, then build the kernel _again_, and it works. Right. So now it's a question of getting it generated right.

I can get the whole mess parallelized by using the funky bash argument pipeline syntax (going <(command) looks like a command line argument but works like a pipe), so the file list gets created in parallel with gen_init_cpio turning it into a cpio image, and then piping the output of that into gzip runs in parallel as well. Plus I can background all that so it runs in parallel with building the kernel the first time. (A three way process plus make -j $CPUS... eh, reasonable.)
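A toy version of that process substitution trick, with `wc` standing in for gen_init_cpio and a stub generator function (bash-specific syntax):

```shell
# <(cmd) expands to a /dev/fd path, so the consumer thinks it got a file
# name while cmd feeds it through a pipe in parallel. gen_list is a stub
# for the real file-list generator:
gen_list() { printf 'dir /dev 755 0 0\nnod /dev/console 644 0 0 c 5 1\n'; }

wc -l <(gen_list)   # wc is handed a /dev/fd/NN "filename", counts 2 lines
```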

Now the problem is that the initramfs could have spaces in filenames (or more likely, the absolute path in $NATIVE could have a space in the directory the build is running in). Ooh, which option is least ugly... I guess trim off $PWD/ if $NATIVE starts with it?
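The prefix-trim idea in shell, with a made-up path:

```shell
# If $NATIVE lives under the current directory, trimming the $PWD/ prefix
# yields a relative path that doesn't inherit spaces from the directories
# above it. The path here is invented for the demo:
NATIVE="$PWD/build/mini-native-mipsel"
case "$NATIVE" in
  "$PWD"/*) NATIVE="${NATIVE#"$PWD"/}" ;;   # now a relative path
esac
echo "$NATIVE"   # build/mini-native-mipsel
```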

This is fiddly.

Add an /init symlink to the setup script, which lives in either usr or tools depending on how $NATIVE_TOOLSDIR is set. The script cares what name it's called under, so switch that check to make sure it's _not_ the chroot name (because /init should perform qemu-setup, not chroot-setup)...

No, hang on, that's stupid. If we're pid 1, perform qemu setup. If not, perform chroot-setup. Why didn't I do that in the first place?
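That check is a one-liner; the setup bodies here are placeholders:

```shell
#!/bin/sh
# One script for both entry points: PID 1 means we're /init under qemu,
# anything else means we were invoked inside a chroot.
if [ $$ -eq 1 ]; then
  echo "qemu-setup: mount /proc, point the console at the right device..."
else
  echo "chroot-setup"
fi
```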

January 6, 2009

Too busy to blog for most of today. Caught up with email, then wasted some time trying to figure out why the kernel isn't mounting the root device (it's _finding_ it, according to the init messages, but then insists that root= is incorrect) before just focusing on making initramfs work so I can debug it from a shell prompt. Before I could finish that, I went with Fade to ACC so she could register. We stopped at Starbucks for a bit, I showed her the Curves location at Lavaca and Guadalupe (she enjoyed going there in Pittsburgh, and it turns out we have one near us), and then we headed home. The conference call with Stu and John from Australia at 6pm got delayed to 7pm, I didn't manage to catch up with Sal in between (left a voicemail), and then dinner with Victor Meyerson (Gentoo embedded guy in town for a conference, calculus on freenode), a quick nap, then off to Whataburger for some quiet time with my laptop.

I keep forgetting how broken Ubuntu 7.10 is. Somewhere along the way (starbucks, probably) I queued up some Youtube videos people linked me that I didn't have time to watch, but any programs running when I suspend the system confuse the horrible Gnome sound mixer server, and I have to kill them (losing the cached data I told it to download by hitting play and then pressing pause) and restart them to get sound working again. The way Firefox is designed, this means I have to kill the flash plugin, taking out every flash instance in all windows.

Bravo, Ubuntu, firefox, and gnome. A trifecta of suck. (I miss Konqueror, kde 3, and its not-brain-damaged sound mixer. I don't actually know what this sound mixer is FOR since I'm told ALSA in the kernel now does it, but as with much of Gnome, or HAL, or the network status mangler sets, or libtool: it does nothing when it's working right. It's an unnecessary glue layer that exists to add the possibility of failure to something that would otherwise work reliably in its absence. Simple systems are nice because infrastructure that doesn't exist can't break.)

January 5, 2009

And so begins the first day of Mark working full-time for Woot. (He's likely to accomplish more in a week than I do in a month, but he's a lot more focused than I am. I tend to wander off onto a dozen different tangents at once, he grabs one and FINISHES it, then moves on. Breadth first vs depth first approaches; we work together very well.)

I hung out at Chick-fil-a for half the day, trying to bang on the initramfs packaging but mostly getting distracted by busybox and kernel stuff. (My email overfloweth.)

Mark's banging away at building gentoo embedded under mini-native, and already has a patch for building python on mips submitted upstream.

The embedded world in general tends to lag several releases behind cutting edge for userspace tools, just because so much breaks. (Crazy guys like us are allowed to do the sheep across minefields thing first.) Now try adding cross compiling complexity to that...

January 4, 2009

Lots of bug reports from Milton Miller last night. (He's using FWL to target the PS3, real hardware rather than a qemu instance.) Half of them were in Toybox, which I need to get a new release out of soonish.

Mark got the trx packaging script (for linksys router images) to produce the right crc, but it turns out it needs to be initialized to FFFFFFFF instead of 0, and there's no switch for cksum to make it do that. (Well, none in busybox or the gnu version. The old unix one from the dawn of time apparently had it, but it fell by the wayside over the years.) Added a -F flag to toybox cksum to do that, might push it up into busybox, might just say USE_TOYBOX if you want this functionality...

Met Mark for a late lunch (at firebowl) and we went over todo items for this week. He's finishing up the first stage trx packaging (getting a trx-packaged image to boot a kernel to a shell prompt via initramfs) once I get him the new crc stuff and get initramfs packaging finished and checked in, and in the meantime he's working on the gentoo build stuff.

I have those two, plus following up on the perl removal push, plus trying to cleanup/push the openwrt patches needed for the broadcom stuff (they have patches in their svn to support this for 2.6, they just haven't pushed 'em upstream for some reason; I should ask). Plus working on the Linux From Scratch build script and trying to get other hardware targets working.

The download link for SUSv4 isn't up yet, so I'm using wget --mirror on the URL instead. (Vaguely impolite, but I'm not always on the web when I want to look something up, and I'm still using SUSv3 when SUSv4 is out.)

Ooh, to mount an ext3 partition read only without replaying the journal (and thus writing to the underlying block device):

mount -o ro,noload /dev/XXX /mntpt

Good to know.

There should be a phrase to say "this objection is a stand-in for another objection, you're not really upset about the thing you're objecting about you just think it's a more effective argument than the one motivating you to do this". Proxy objection? I googled a bit and the closest I got was real and unreal, which isn't right. ("That's not your real objection" doesn't make the other objection _unreal_. It makes it a stand-in or a proxy, something you're only motivated to care about by the real objection.)

I called Cathy Raymond (a full-time lawyer) and she didn't have a legal term of art for it. (Apparently it's _too_ common there to have a specific name.) I guess I'll use "proxy objection".

Yeah, this came up in regards to the perl stuff. I'm told that in the past year Peter Anvin has changed his other projects (syslinux and klibc) to require perl to build, when they didn't used to. (I know Perl has fallen from third most popular programming language to eleventh in the past few years, but isn't taking the "Windows everywhere" approach overreacting a touch?)

I strongly suspect that all the stuff about dash vs bash is just a proxy objection, which nobody would care about if it wasn't for perl vs not perl.

January 3, 2009

Fixed the overflow error in the shell script. It was actually pretty simple. Also incorporated the feedback Sam Ravnborg sent me the first day. Now comes the hard part: wading through the email from the mailing list and replying to it all.

I'm installing a more recent 32-bit xubuntu in a qemu image so I can test Matthieu Castet's assertion that current versions of dash on 32 bit hosts provide 64 bit math, but the cursor keys aren't working in the current svn version of qemu. Pinged the list, and installed the release version of qemu in the meantime. Much email to reply to, much testing. (The patches are up in the FWL repository already, by the way. alt-linux-noperl*.patch)

Trying to make it over to Mark's place so we can go over the router stuff. This perl removal stuff has eaten my entire week, work-wise.

Ok, I should shave.

Installed a 32-bit Xubuntu 8.10 because I'd heard that dash supports 64 bit math on 32 bit systems in current ubuntu. (Confirmed.) It took about 4 hours, and was so slow that at several points I thought it had hung. I note that Xubuntu is the lean and mean, stripped down desktop. And I gave it 256 megs instead of the 128 qemu defaults to. (Maybe I've been spoiled by TCG, but the current qemu svn doesn't have working cursor keys for some reason, so I had to use the release version.)

January 2, 2009

Spent a couple hours today removing graffiti from the alley behind my condo. I bought about $50 worth of paint and stripper and tools at Home Despot a couple days before New Year's, and it wasn't enough. (I still have some paint stripper left, evil gloop that stings when you drip some on your foot. How they got gel to work in a spray bottle is a mystery for the ages.) But the big wall on the back of the abandoned strip mall Cuatros bought to use as extra parking was at least a two gallon wall, maybe three, and I only had one. Still, managed to produce a statement of intent, if nothing else. Need more paint. (And another roller/tray pack.)

The trigger for this was some idiot tagging the front of my condo's building with orange spray paint, and the stoneleigh garage I have a parking space in. They also tagged up and down the alley. (It's like when dogs pee on a wall, only with spray paint. Alas, animal control won't mount rooftop snipers to dart 'em, tag 'em, and release them into the wild on some other continent. Budgetary issues, I guess.) The alley's had graffiti in it for years, but it used to be _artistic_ and now it's just tags. And then last week some idiot tagged the same alley in eleven places, plus elsewhere outside the alley. (Can we say "overcompensating"?)

I suspect it's the "Mad Honker", the idiot who honked a brass instrument off his balcony randomly between midnight and 5 am (sounded like an air horn, but wasn't) every night for a couple months until the police tracked him down and actually enforced the noise ordinances at him. I'm just guessing that the tagging's the same guy, but the timing is right, and both activities have the same sort of desperate urge to make people notice whoever it is, without providing anything anyone would _want_ to notice, and while remaining anonymous. (See, this is what the INTERNET is for. Get with the terminal program.)

Anyway, graffiti attracts more graffiti, so I thought I'd do my whole White Boy Posse thing (our tag is solid colors over large areas, and we OWN this town) and see what happens. I suspect I can roller more than they can spray, partly because I don't worry if somebody else sees me do it.

Wound up napping for a few hours after that until the fumes wore off. (Painting is way more aerobic than you'd think, and combines it with things you aren't supposed to breathe.)

Lots of replies to the perl removal patch series I posted last night. Respinning the patches to fix the things Sam Ravnborg commented on.

And it's miscalculating USEC_TO_HZ_ADJ32 for 24hz and 122hz. (All the other HZ values give the right results, and the other constants are right for those two HZes. What is this, an integer overflow? Yup. Ok, that's fixable, but probably not tonight.)

Darn it, Thiemo Seufer died. (MIPS maintainer for QEMU, among other things.) Car crash on December 26th.

January 1, 2009

Hello world!

Back banging on the first perl removal patch, rewriting the time constant generation script in shell.

So dc doesn't support bit shift operators (<< and >>). I can fake it with exponentiation, but although GNU dc supports "^", the busybox version doesn't recognize it. Great. I have to write a dc for toybox. Ok, I can do that _after_ getting the perl removal patch rewritten and submitted. I hope the merge window for 2.6.29 stays open a little longer than usual due to the holidays...
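Since a left shift is just multiplication by a power of two (which is all the dc exponentiation trick amounts to: x << n is x * 2^n), a pure shell sketch can fake it with a multiply loop. This shl helper is hypothetical, just to illustrate the idea, not anything from the actual patch:

```shell
#!/bin/sh
# Hypothetical helper: fake "x << n" without using <<, by
# multiplying by 2 n times (same idea as x * 2^n in dc).
shl()
{
  x=$1
  n=$2
  while [ "$n" -gt 0 ]
  do
    x=$((x * 2))
    n=$((n - 1))
  done
  echo "$x"
}

shl 5 3   # 5 << 3 = 40
```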

And looking up the "dc" spec in susv3, I find out there isn't one. The "bc" command is in susv3, but that describes an entire _language_ and busybox doesn't implement it so it's apparently not that widely used...


You know, it's possible to fake 64 bit math with 32 bit math, presuming the shell's $((blah)) operator can be trusted to handle 32 bit math...

Actually, the fmul() stuff is _never_ going to need more than 64 bits because it's ((TIME<<32)+HZ-1)/HZ and TIME is either 1000 (for milliseconds) or 1000000 (for microseconds). Oh hang on, that's just a fixed position binary fraction with rounding up. (The upper 32 bits are the integer value, the lower 32 bits are a binary fraction of what's left over.)
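As a sketch (assuming a shell with 64 bit $(( )) math; the HZ value and variable names here are just for illustration), the constant and its two halves look like:

```shell
#!/bin/bash
# Illustration only: compute ((TIME<<32)+HZ-1)/HZ for HZ=100.
# The upper 32 bits are the integer part (usec per tick), the
# lower 32 bits are the rounded-up binary fraction.
HZ=100
TIME=1000000   # microseconds per second

CONST=$(( ((TIME << 32) + HZ - 1) / HZ ))
INT=$(( CONST >> 32 ))
FRAC=$(( CONST & 0xffffffff ))

echo "$CONST $INT $FRAC"   # 42949672960000 10000 0
```

At HZ=100 the division comes out exact, so the fraction is zero; at values like 24 or 122 it doesn't, which is what the +HZ-1 round-up is for.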

Ok, some comments really would have helped here. That doesn't need fixed point, that just needs 64 bit math. Can I rely on the shell to give me 64 bit math? Busybox ash in my mini-native images does, and so does even the Defective Annoying Shell. Let's see, I've got an old red hat 9 image lying around that probably _won't_ do so. (A 32 bit system from 2003, using gcc 3.2 and bash 2.05b.) And yes, even that gives me 64 bit math!
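A quick way to probe a given shell (a sketch; the test value is arbitrary, and any shell doing real 64 bit arithmetic should print the first message):

```shell
#!/bin/sh
# Probe whether this shell's $(( )) arithmetic is 64 bit:
# (1 << 32) collapses to garbage on a 32-bit-only shell.
if [ "$(( (1 << 32) >> 16 ))" -eq 65536 ]
then
  echo "64 bit math"
else
  echo "32 bit only"
fi
```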

Ok, that becomes fairly trivial then. I can just use shell math for all this...

Ah, no I can't. The Defective Annoying SHell on a _32_ bit host says that $((1<<31)) is "-2147483648", and that $((1<<33)) is "2". Yup, it continues to screw up any attempts to write shell scripts.

Ok, just say #!/bin/bash at the top.


And posted. Made the merge window, now let's see what happens. In the morning...

Back to 2008