Rob's Blog


July 31, 2015

Taking another swing at the full nommu conversion for toybox. I've had "make change" working for a while (against uClibc at least; musl still provides a broken fork() you can't probe for at compile time, because forcing intrusive nommu workarounds into the common case with-mmu software is totally, totally going to get this support widely accepted). It's as if you had to specify CROSS_COMPILE=x86_64- when building natively.

But I digress. One of the impacted commands is nbd-client, which both uses xfork() (to asynchronously force a partition probe when the new block device comes up) and calls daemonize(), which internally does fork() twice.

It would be really nice if I could save a function pointer and call that function instead of main() after the re-exec, but there isn't an easy way I can see to do that which isn't horribly exploitable; neither binflt nor fdpic is guaranteed to share text segments between programs (it _can_, but it requires extra finagling in the loader we haven't set up yet for either code path), so each command's function pointers are _different_. You could recalculate them as an offset from the base of the text segment and pass that through in an environment variable, but see "horribly exploitable" above: it would mean anybody who set the right environment variable could call any function at program start.

Even adding an extra function pointer to NEWTOY() with the re-exec point doesn't help, because there's just one and it's a constant, I.E. it's just a cosmetic difference from calling command_main() with toys.recursion == -1. If I want to specify multiple reentry points, I need a way to indicate which one to go to, and that basically means defining an environment variable, which is ugly because A) there's no environment variable name that's guaranteed not to be used, B) the environment space could be full. Repurposing the high bit of the first byte of the filename is a clever hack because A) it's guaranteed to be there, B) I can guarantee that no toybox command name sets it.

Hmmm. I guess I can use the high bits of more characters of the name and set recursion equal to negative however many of them were set? It's ugly, but doesn't require new resources...
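Something like this, maybe (a back of the envelope sketch, not committed code; the function names are made up):

  // Parent: set the high bit on the first n bytes of the (writable) name
  // it re-execs itself under. ASCII command names guarantee the bits
  // start out clear.
  void mark_reentry(char *name, int n)
  {
    int i;

    for (i = 0; i < n && name[i]; i++) name[i] |= 0x80;
  }

  // Child: count and strip the marks at startup, giving 0 for normal
  // entry or -n for re-entry point n (suitable for toys.recursion).
  int scan_reentry(char *name)
  {
    int n = 0;

    while (name[n] & 0x80) name[n++] &= 0x7f;

    return -n;
  }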


July 30, 2015

Glanced at the busybox nbd-client command, which I wrote in 2010 because I needed it during my dayjob at Qualcomm, and which I contributed to busybox because toybox was mothballed at the time and I felt I should publish it somewhere people might actually use it.

(Toybox was on-again off-again for many years as "I can do a better job" warred with "why am I undermining my own hand-picked successor on a project with a ten year headstart including several years of my own work, when there just isn't enough headroom between what busybox is doing and perfect to displace such an entrenched incumbent on technical merit alone". By 2010 it had conclusively rolled to a stop and was definitely no longer a thing, remaining that way until talking to Tim reminded me that relicensing it could open up a niche busybox wasn't already in. Wikipedia's continuing failure to understand this, saying the relicensing _predated_ the hiatus instead of being the reason the hiatus _ended_, is deeply annoying.)

Anyway, I looked at the busybox version because I have a longish todo list comment in the toybox one about stuff I didn't get around to in 2010, and I was curious if busybox had filled any of them in yet (which might save me looking up an ioctl name in the kernel source or running ubuntu's nbd-client under strace).

The answer was "it's hardly been touched since I left again", but... Does every single busybox command _really_ have to annotate its NEWTOY() line (or whatever they call it) with SUID_DROP? You have to _affirmatively_ drop permissions rather than that being the _default_ if you don't explicitly retain them?

Really? Was it always like that?

Ouch.


July 29, 2015

Finally got the xilinx toolchain install HOWTO finished. That took longer than expected.

Now working on an index page for the directory, because there is SO much that needs to be documented to let people pick this up without personal handholding.

You have to download the bitstream build from git, which involves converting a nest of mercurial subrepos into a unified git repo. (We uploaded a tarball, but converting the history is gonna be fun.)

You have to install the bitstream compiler.

You have to install the sh2elf compiler, which is used to build the bootloader ROM, and no, you can't use the sh2eb linux compiler for this because the two standards have different symbol prefixes, and the bootloader expects the sh2elf version. (I looked at converting it once. It was more than an afternoon's work, and apparently lots of bare metal code expects this "standard".)

Those three (plus standardish host toolchain stuff like make and flex and bison) are enough to build a bitstream... if you don't modify any files. If you _do_ modify files, you also need to install a package called Leiningen (a lithp dialect called cloture running in a jvm), because that does stuff like read an openoffice spreadsheet (with instruction lists and corresponding bit patterns and microcode) to produce VHDL output files that implement those instructions.

Oh, and I need to document where to get an FPGA board from. And eventually how to adapt the build system to a new FPGA board using the semi-standardized pinout documentation file on the board vendor's website. (At least the Avnet guys and the Numato guys both had the same sort of file if you dug for it, I'm told it's standard.)

So I'm documenting using the Numato board (it's the cheapest and has an SD card slot built in, so development there is easiest), and they have a python 3 tool for flashing the sucker. (Yay! Standalone! I really need to port it to Python 2 so people can actually use it.)

And then there's how to build a vmlinux (kernel patches and config) and put an initramfs root filesystem into it (aboriginal doesn't produce one automatically yet, I've been hand-hacking it), both of which need the _other_ sh2 toolchain (sh2eb; aboriginal builds both, but until I can get a release out there's no download URL for prebuilt binary tarballs, and although I came up with The Wrong Fix for mips, sparc is still broken too, for reasons I have yet to root cause but which were due to a change in toybox).

And don't get me started on teaching people VHDL coding yet. I have yet to find a good place to learn it. As far as I can tell, Jeff derived a technique from the European Space Agency's LEON project and then taught the rest of the team himself. We may have to WRITE a book, which would suck...

So the Avnet board requiring extra hand-hacked hardware was blocking, then the lack of easily reproducible toolchains was blocking, then the xilinx documentation was blocking, now...

Working on it!


July 28, 2015

Drainage guy walked the property, estimated it would take about $15k to fix our stuff and would require an electric pump. I pointed out that the power fails; he said I could go out with an umbrella (assuming I was here) and switch on a generator, plugging something into an electrical outlet in the pouring rain.

My enthusiasm level for this solution is not high. Fade's got outright PTSD about the flooding and needs money thrown at the problem ASAP for the sake of her emotional state, but... that's not a fix.

Meanwhile, the White House replied to a petition about Snowden by basically saying "how dare he not already be in Guantanamo", so no surprise there.

Oh, and after every mass shooting du jour they say we can't regulate guns because hunters? Yeah, about that.


July 27, 2015

Any projects still hosted on sourceforge should probably be aware their robots.txt prevents archive.org from indexing them, so if they have another outage: there are no backups.

Seriously, if you're still on sourceforge, you might want to move.

In other news, our local wendy's is _not_ permanently closing, just being remodeled. (Yay. I like frosties.)


July 26, 2015

Last week I tweaked the status.html generator to point to the "official" man7.org page instead of linux.die.net to display section 1 man pages for commands that haven't got a posix or LSB page. The problem is that on die.net 38 man pages were missing, and switching to the official site reduced that to... 73 pages.

So, not an improvement then. Sigh.

I'm wondering if maybe I could poke Michael Kerrisk to fill out his page list? I'm not sure how section 1 is generated here. (Section 2 comes from the kernel source, that's why there's an official man pages site. Section 3 is libc, section 1 is command line utilities, and so on. Type "man 1 intro", "man 8 intro" and so on to see descriptions of each section. There's probably a better way to get a page list than "ls /usr/share/man/man1", but I dunno what it is. Well, ok, going to the relevant web page, but how did THEY generate it? Probably ls the directory...)


July 25, 2015

People are impressed by internet-stoppable cars but personally I'm waiting for the first self-driving car reprogrammed to kidnap someone. (Come on, people, this stuff is _obvious_.)

Why am I involved in open source and now in open hardware? Audit all the way to the bare metal. Now we're up against state actors adding backdoors which leak out to organized crime even if the state actors themselves don't do more than ensure their own unlimited funding with them.

(The purpose of NSA's omni-surveillance is to blackmail future politicians with the porn they read as teenagers, to ensure they vote to keep the funding going forever. They don't know who today will control the purse strings tomorrow, so they get blackmail material on everybody, just like J. Edgar Hoover did. They'll never use it for any law-enforcement purpose because that could compromise its real reason for existing. The problem is that stored data inevitably leaks: if it's collected, it's a race between destroying the last copy and leaking it to the whole internet, so organized crime will party with it eventually. Remember: the USA trained the Taliban to fight the Russians. This is the country that invented the junk-bond leveraged buyout. We've been our own worst enemy for a while now. Not that China's "great firewall" or Russian surveillance exiling dissidents to Siberia or our own McCarthy-era blacklisting and communist witch hunts are exactly _new_... Nor is right-wing nutballs confusing dissidents with terrorists. I mean, that was sort of Nixon's entire schtick, wasn't it?)

Sunlight remains the best disinfectant, but sometimes you've got to pull out a magnifying glass to focus it. And the sun goes down; the rallies against SOPA and PIPA won't stop them from trying again and again until we get tired and lose interest. (Let's hope they're not smart enough to schedule the next round during a Wikipedia begathon, betting they'll never go black during a fundraiser.)


July 24, 2015

Austin has a cat cafe, doo doah, doo dah.

Biked to see Furiosa's Road at the drafthouse last night. Good movie.

Hey, MacOS X is so secure that you can crack root in one line of shell. Steve Jobs: still dead.


July 23, 2015

Wound up just reverting commit 46490b572544. The first patch broke it, the later fixup patch didn't fix it. Poke the kernel guys and get them to sort it out. (It's trivial to test: "awk '{print $2}' /proc/mounts" hangs using the busybox mips binary that's been on busybox.net for years. Used to work, doesn't now, kernel is what changed.)

Meanwhile, Intel has acknowledged that the Moore's Law S-curve is flattening out. We knew it couldn't last forever (atomic limits ho!), the question is how much longer we've got? And the answer: "not long".

(Hardware advances don't _stop_ after this, but they're no longer driven by die size shrinks. Open source hardware development becomes important to make better use of the transistor budgets available to us, the same way open source software makes better use of available computing resources. It's a bit like solar and wind being driven by the end of cheap fossil fuels: it's not that the new technology _requires_ this to take over, in theory it can compete and win on its own terms. In practice it's nice for the new thing to be ready as the old model falters.)


July 22, 2015

Still debugging mips so I can cut an aboriginal release. Checked out linux git 46490b572544^ (the ^ at the end means "commit before that one", I believe it defaults to lefthand branch if it's a merge commit but I don't use it on merge commits so it doesn't come up much). According to the diff of the commit that breaks stuff, the function mips_set_personality_fp() got its head handed to it, and that's an OABI thing (which is the mips ABI variant we're building for).

On line 155, add "printk(KERN_ERR "state->overall_abi=%x\n", (int)state->overall_abi);" which makes the kernel a bit chatty but tells us that when it works state->overall_abi is 1 for every program run in this context, including awk. The case statement after that is mostly looking for MIPS_ABI_FP_* and a further grep for that says MIPS_ABI_FP_DOUBLE is 1.

The 4649 commit renamed overall_abi to overall_fp_mode, and changed its value from 1 to -1. It didn't remove the MIPS_ABI_FP macros but the logic of mips_set_personality_fp() changed to look at an enum at the top of arch/mips/kernel/elf.c, in which 1 is FP_FR0. The old behavior of mips_set_personality_fp() for overall_abi of 1 was to call set_thread_flag(TIF_32BIT_FPREGS). Now we have a set_thread_fp_mode() shim that calls it for us, but it looks like FR0 would trace through the gratuitous wrapper to the right set_thread_flag() if it could get called, I.E. if overall_fp_mode was still 1.

The weird part is that the printk to see the value of overall_fp_mode only triggers if I put it BEFORE the "if (!config_enabled(CONFIG_MIPS_O32_FP64_SUPPORT)) return;". If I put it after, it never gets called. Because the kernel config says that CPU_MIPS32_R6 has this, but CPU_MIPS32_R2 (which I'm using) doesn't.

This worked fine in the old kernel, but I'm running it under qemu. I know this _does_ have the right floating point; whether or not it _should_ is a question requiring research. (The routers I was working on at Pace were real mips hardware, running the same r4k compatible 32 bit OABI stuff (not N32 or N64) as these binaries, but I don't remember what the actual chip was. Some broadcom thing.)

According to Wikipedia[citation needed] the mips r4000 (I.E. r4k, introduced in 1991 and thus older than linux) included on-die floating point support (nominally the r4010, but it was built-in), which could operate in 32 bit or 64 bit mode. So OABI 64 bit floating point support _predates_ linux, and I'm building for a chip three architectures newer than that (and backwards compatible), so yes, CPU_MIPS32_R2 from 1999 does indeed support bog standard mips floating point instructions. So leaving the floating point version uninitialized is NUTS. Where did this MIPS_O32_FP64_SUPPORT symbol come from in the first place?

It came from commit 597ce1723e0f which looks really broken. It says the 64 bit floating point support is only sometimes available (instead of a baseline part of r4k), and that only new-in-2013 toolchains set a special flag to request it (my toolchain is circa 2007, but the kernel flags said it was using 64 bit float right before it broke).

More to the point it says that saying N to that config option just forces 32 bit floating point, and the 4649 commit is testing that symbol to see if we're in OABI mode, which is _wrong_.

Possibly the "fix that didn't fix anything" commit (git 620b15503457) would work if I yanked that stupid test? Let's see, change both MIPS_O32_FP64_SUPPORT checks in arch/mips/kernel/elf.c to MIPS_O32, rebuild and... nope, still hangs.


July 21, 2015

The Aboriginal Linux release is blocked by a mips failure, attempting to natively build Linux From Scratch hangs right after the chroot setup.

Over the weekend I debugged to "awk '{print $2}' /proc/mounts" hanging, or more specifically endless looping because the 2 in $2 was parsing as NAN, since busybox awk uses floating point.

I haven't updated the busybox version for over a year (because I'm trying to replace it and only using a small part of it anymore). I haven't updated the toolchain since the last GPLv2 releases (binutils moved through source control to the last GPLv2 _commit_, but that was almost 3 years ago now). And uClibc hasn't had a release since 2012. This worked in the last Aboriginal release, and there have only been so many commits since then, so I bisected through the aboriginal commits and hit... the upgrade from linux-3.19 to 4.0. The kernel broke it.

So I bisected through the kernel, doing the normal dance of "what moron is adding -Werror to files in arch/mips and thus breaking bisect, surely you can CFLAGS that locally or just _notice_ when it produces warnings" until I got to commit 46490b572544 "MIPS: kernel: elf: Improve the overall ABI and FPU mode checks" which is what broke it. Doing a git log on the main file that touches, I see commit 620b15503457 "MIPS: fix FP mode selection in lieu of .MIPS.abiflags data" which specifically says in its description that it's fixing the 4649 commit. So I go "great", add that to my patch stack for 4.0, rebuild, and... awk is still hanging trying to parse its command line options.

So it looks like the vanilla kernel, current git pull, still can't run the awk command out of the busybox-mips binary I uploaded _years_ ago now.

And in the evening, I dropped my phone again. This is sad. It fell less than 3 feet onto the hard ceramic tile at In-N-Out and hit flat, so I thought it would be ok. Nope. Screen cracked again, and this time it's ignoring the top 5/6 of the touchscreen area, meaning I can't swipe to unlock. (You can get all the notification icons to overlap by repeatedly swiping the bottom bit, but that's just some gratuitous animation that apparently has no relation to actually letting you into the phone.)

Likely to make the morning call a bit awkward, since Fade gets on an airport shuttle for a week's visit to Maryland at like 4:30 am.

I had this phone for over a year before breaking it. My previous three phones never had a cracked screen. But this one lasted a week since the last repair, and neither the phone repair place nor T-mobile has cases for a Nexus 5. (Fade ordered me one from Amazon which should arrive Friday.)

At least this reminded me to wander into the T-mobile store and remove "jump". They insist that the previous T-mobile representative I spoke to in the same store lied to me, and it's really just a $150 deductible to swap a damaged phone for a new one (when you qualify for an upgrade anyway) as opposed to the $130 repair. I told them the time I cared about that was a week and a half ago: it didn't work when I tried to use it and I'd like to stop paying $10/month for something useless.


July 20, 2015

Toybox 0.6.0 is up!

The release notes start with me railing at Wikipedia[citation needed] again. I trimmed it waaaay down from what I originally wrote and it's still too long, but I've been publicly complaining about this for years and not only is their article still wrong, but the wrongness got copied into the busybox article.

Relicensing ended the hiatus, but they say the project was relicensed _then_ mothballed. So they don't understand why we're doing it, and then go on to make up reasons such as not caring about compatibility with the gnu versions of things. (Is that why CP_MORE exists, or toys/other has 80+ commands in neither posix nor LSB? Also, rather a lot of packages like util-linux and e2fsprogs and infozip and such aren't gnu. Plus the first from-scratch clone of unix was Coherent in 1980, BSD was proven in court not to have any AT&T code left in it, the Linux 0.0.1 announcement was posted to comp.os.minix because Linux forked off of that clone, and I was attracted to busybox in the first place because it _wasn't_ gnu. So if I don't consider gnu stuff special, it's because it's really not.)

What I care about is what people use. If LFS is calling mv -v, then I implement that regardless of its absence from posix. If people email me wanting "ls --color" or "cp --preserve", I add that. (I may wait for them to email, but feedback from users drives the project, that's how open source development works. We need a word for projects where the source is available long after the fact, but users cannot provide significant input into the project. Where the discussions are on private lists, or you need to sign a copyright assignment form before they'll take a patch from you which seems to cut out about 90% of the drive-by developers who would otherwise grow into regular contributors. People call that stuff open source but its development isn't a positive feedback loop collecting community input.)


July 19, 2015

So lemme get this straight. Last time the democrats won the presidency due to unusually high minority voter turnout, so this time they're fielding a 100% old white people slate to prevent a recurrence.

And the expected favorite who is "inevitable" this time around, as she was against Obama in 2008, the Democrats' version of Jeb Bush, is Hillary Clinton. Except really she's the democrats' Bob Dole: running because it's "her turn". Assuring donors you have a lock on the party machinery regardless of what the electorate thinks is not _comforting_ to actual voters. Although my problem with her is she wants to extradite and prosecute Edward Snowden to defend and expand the surveillance state, which is a bit of a hot-button issue for me.

I would be enthusiastic about Elizabeth Warren running. Hillary Clinton? Not so much.


July 17, 2015

48% of github projects have a bus number of 1, and 28% have a bus number of 2. Good to know.

Oh, here's another way the past was more advanced than the future. Beyond obvious stuff like "we used to be able to go to the moon", I mean. New York City's Pneumatic Tube System.


July 15, 2015

Anybody left who still thinks the euro is something other than the deutschmark with a hat, sunglasses, and fake mustache? Germany is furious that, now they finally own Europe, the conquered countries aren't groveling deeply enough to the almighty deutschmark.

This should end well.


July 13, 2015

Watching various people I know apply for jobs, and it's... sort of painful.

Something open source taught me is that in creative work (and these days if it's not creative it's either automated or inherited), you have a resume and a portfolio. Ideally you want to knock both out of the park, but weaknesses in one can be compensated for by strength in the other, and there's a bit of multiplying together, so you want both to be strong.

It would be nice if schools taught this more explicitly. Open source builds your portfolio. If all you've ever done is closed source stuff for companies you can't show to new employers, you haven't got a portfolio. That's why it was so revolutionary from an employment perspective.

College mostly counts towards your resume. A bachelor's demonstrates that you can consistently show up on time and reasonably sober, and complete assignments by deadline. (High school doesn't count because you're legally obligated to be there and would be less tightly supervised in prison.) What your degree is actually _in_ is secondary. (At the undergraduate level, anyway. A dissertation counts towards your portfolio, but it's not a very efficient way of building one.) Not quite sure where associate's degrees from community colleges slot in, they're certainly popular to _get_.

At the other end, Curriculum Vitae (CV) is Latin for "I've been doing this so long there's no way I can fit a resume on one page anymore without leaving most of it out". Sometimes people use it to avoid having to look up how to do the little marks over the e's, but most people don't bother with that anymore. (Yeah "resume" has another meaning, but you can run a race, an engine, your mouth, in the family, your nose, for office, into trouble, in your stocking... A dove is a bird, dove into the pool. Somehow, we cope.) Mostly people still call a CV their resume to avoid sounding pretentious, and only pull out CV if somebody complains about the one page thing. (If you have less than 5 years experience, it's probably not a CV and you can fit it on one page. More like ten years, really.)

When applying to a job, you need a resume, portfolio, and cover letter. The cover letter can be an email, it's just "Hey, I'm addressing you specifically, this isn't spam, I know who you are and what you do and want to work for _you_. I can communicate coherently and reasonably politely! Talk to me." It's not a big deal, but you write a new one from scratch for each employer. (Not using a template, just write a darn email.)


July 12, 2015

The "inevitable march of progress" is a fun american myth. Pity it's not true. There's stuff like losing the cure for scurvy, having to take apart stored saturn v rockets to see how they worked, or being unable to expand our old oil refineries because the plans were lost and the maintenance is magic incantations nobody understands anymore. (Yes, we have to perform archaeology on 40 year old technology because the people left and assumed everyone who came after them would be smarter and better informed, which was not the case. And that's without bringing ludicrous intellectual property expiration times into it.)

But there's lots of resistance to new technology too. Everybody's probably familiar with the fight against the fact bacteria cause ulcers, but it turns out baby incubators had an even harder time.

So of course old men yelling at The Cloud because they're terrified the series-of-tubes will contaminate their precious bodily fluids... That's to be expected.

People treat the US as this great inevitable thing and go "oh sure Rome fell and there was a thousand years of religiously-restricted darkness refusing to allow anyone to fix anything because clearly we were after the end of the world and resisting might delay the rapture, but that could never happen again _today_, Rome lasted almost a thousand years and we're not yet a quarter that old, clearly we're superior." But people remain people, and reactionary herd-driven old fogies tend to grab the wheel and not let go until death. (Younger fogies are called libertarians.)

And now they have a lock on the political process and armed drones. This should end well. (Then again, our government actively working against its citizenry is nothing new. But sometimes it gets fixed.)


July 11, 2015

Virtualbox on MacOS X found a new failure mode! Hide the host cursor and only show the guest cursor, trailing the touchpad moves by multiple seconds. It's almost impossible to steer when it does this.

Luckily I'm figuring out some of the reproduction sequences. When the mac runs out of battery, it saves the vm to disk, and when it loads back from disk it shows both the host and target cursors (so you can move the host cursor and then see what the target cursor is actually doing, trailing it up to 30 seconds later.)


July 10, 2015

What's my todo list... sh2-elf toolchain for bootloader, sh2-linux toolchain for kernel and userspace, kernel (patches for 4.0 including 0pf_defconfig), userspace/initramfs (aboriginal linux with hush instead of ash, toybox fixes)...

What am I working on? Researching various deflate implementations to answer a question the deflate RFC doesn't, specifically: when do you perform dictionary resets? That and how long to follow match chains are the two big tuning parameters. (Modulo a smaller sliding window, which isn't really interesting anymore.) People have written some interesting pages about the implementation details of other attempts, but there doesn't exactly seem to be a consensus. Still, cloudflare's performance oriented fork does resets every 64k (and longer gaps don't save a noticeable amount of space), so I'll probably just go with that.

Another tweak is "resyncable", which allows scanahead. What it does is insert zero length type 0 blocks, which gives a known byte-aligned signature (a new block pads the output bits to the next start of byte), so you can scan ahead, find the next start of block, and speculatively parallelize your decompression. To make that work the compressor has to insert the anchors, though.
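For what it's worth, stock zlib can already emit that kind of anchor: Z_FULL_FLUSH writes an empty stored block (ending in the byte-aligned 00 00 ff ff signature) _and_ resets the dictionary, so a sketch like this covers both tuning knobs at once (assuming zlib, obviously, not my implementation):

  #include <stdio.h>
  #include <string.h>
  #include <zlib.h>

  int main(void)
  {
    z_stream strm = {0};
    unsigned char out[256];
    char *data = "hello deflate";

    deflateInit(&strm, 9);
    strm.next_in = (unsigned char *)data;
    strm.avail_in = strlen(data);
    strm.next_out = out;
    strm.avail_out = sizeof(out);

    // Empty type 0 block: pads to a byte boundary (tail bytes 00 00 ff ff)
    // so a scanner can find it, and resets the dictionary so speculative
    // decompression can restart from here.
    deflate(&strm, Z_FULL_FLUSH);
    printf("%lu bytes, tail: %02x %02x %02x %02x\n", strm.total_out,
      out[strm.total_out-4], out[strm.total_out-3],
      out[strm.total_out-2], out[strm.total_out-1]);
    deflateEnd(&strm);

    return 0;
  }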


July 9, 2015

Um, guys? The reason letting billionaires run things is a bad idea is that we've tried it before and it SUCKED. The Rockefellers did The Ludlow Massacre and then 6 years later the Battle of Blair Mountain, and then gave enough money to charity (Rockefeller center, etc) to make everybody remember them fondly. You wonder why Bill the Gates is giving away his fortune (and making it tax deductible, so the 30% that would have gone to schools and healthcare and such anyway, he gets personal credit for)? It's to make you forget about bundling a ham sandwich circa the antitrust trial. Dude was evil for years, and now uses the ill-gotten gains to buy the country's love.

He is by no means alone. They all do that. A century ago William Randolph Hearst ran Fox Newspapers back when it was called Yellow Journalism, and is the guy who personally made marijuana illegal (back when hemp had been around for literally centuries of industrial use without incident), and was PISSED that Orson Welles screwed up his attempt to rehabilitate himself with Citizen Kane.

If you're living in a new gilded age, you should really pay basic attention to the previous gilded age to see what we're in for. That one ended with two world wars bracketing the great depression, after which the top income tax bracket was 92 percent and remained there for a couple decades. How we handle that kind of societal reset in the post-nuclear age, I couldn't begin to guess. Unlikely to be pretty. (Then again, the "enemy" this time around is probably gonna be climate change. I wonder when we start losing cities? Other than Detroit and New Orleans I mean...)


July 8, 2015

The phone repair place was open today, and ordered glass to fix my phone. Ok, backing up:

Last week (July 2) I dropped my phone, which spun onto concrete and hit on a corner, cracking the entire screen. It still works, but I'm reluctant to "swipe to unlock" or do the various pulldown menu stuff over broken glass.

The phone repair place across the parking lot from HEB was not just closed for the holiday, their sign said they were out through the following wednesday.

I went to the T-mobile store and was told that my "Jump" plan only lets me swap an intact, working phone for a new phone. Otherwise I have to pay off the balance on my current phone. I.E. it's a useless extra monthly charge to pad T-mobile's profit margin.

So I limped along with a semifunctional phone until the repair place opened, so they could order new glass, so I could have them fix the thing. Still cheaper than getting a whole new phone.


July 7, 2015

My big project at $DAYJOB is still putting information online that lets you actually do the things we talked about in our linuxcon japan talk. The initial instructions I rushed up have large gaps in them, such as "how do you build the bitstreams from source anyway?"

There's... a lot. I'm scheduled to speak about this at the west coast Linuxcon, Linux Plumber's and then either Texas LinuxFest or Cool Chips. (All four are scheduled on overlapping days, I suspect doing all four is a bit much. Three is a bit much.)

It'd be really nice to get this to not just a "tech demo" level but "here, play on your own" where people can actually dig into it and rebuild it and modify it and all that. Toolchains! Repositories! Build instructions! Hardware ordering and configuration info!

Once again, slowly and painfully learning stuff so I can write the documentation I want to _read_...


July 6, 2015

On github, a "bug report" was a random link, the first words of which were "stubborn and ignorant use of"... at which point I closed the tab and added github to the "don't read the comments" site list.

Meanwhile Chromium "upgraded" itself, and now the new tab page has a little throbber animation that triggers the virtualbox "mouse cursor takes 30 seconds to trail across the screen" bug. Isn't that lovely. And no obvious way to stop it from doing that, or downgrade back to the contextually non-broken one.

A while back Jeff registered lastgplv2.org and I'm trying to come up with some content for it, which means finding the last gplv2 version of gcc, not just binutils. Except gcc isn't one package at that point, it's 3, and they have three different repository _types_. There's a git mirror of gcc, mpfr is in svn, and gmp is in mercurial.

Meanwhile, on the linux-kernel front, a DECADE ago I asked linux-kernel for a way a program can re-exec itself without knowing where its binary lives in the filesystem (because /proc/self/exe relies on proc being mounted _and_ you not having chrooted to a context the binary isn't visible from since that's a symlink). The kernel devs added a new system call (which exec(NULL) wouldn't need), which didn't fix the problem (you still can't re-exec self without referring to something in the filesystem), and which quoted a reply to my message but didn't address the issues raised in my message.

The aristocrats! linux-kernel development mailing list!
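(For reference, the workaround everybody actually uses, with exactly the failure modes described above, is a one liner:)

  #include <stdio.h>
  #include <unistd.h>

  // Re-exec the current binary via the /proc/self/exe symlink. Fails
  // when /proc isn't mounted, or after chrooting somewhere the original
  // binary isn't visible from; exec(NULL) would have neither problem.
  void reexec_self(char **argv)
  {
    execv("/proc/self/exe", argv);
    perror("re-exec");
  }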


July 5, 2015

But the technical complaints in the message went unread, as Kroah-Hartman seized on an incorrect, but offhand, comment early in Lutomirski's message to stop reading at that point.

Gee, and you wonder why I refer to Greg as "my nemesis"? Yes, he pulled that on me way back when, and if I seem a little bitter about it... There's a reason I haven't found kernel development fun in a long time.

Backstory: After I wrote mdev for busybox, kernel changes broke mdev a bunch of times. After something like the third, I started researching the sysfs api, which pissed off Greg KH and his partner Kay Sievers, who insisted that nobody but them got to say what's stable in sysfs (and that it was a private export only to be consumed by udev). When I started publicly researching the topic to write documentation, they wrote their own document and checked it in to head me off. (Greg did that a lot: I'd post something to the list and he'd check something in to the repository in reply, hoping to end the debate because he had commit access and I didn't.)

To say this new document didn't explain what I was asking about (and what mdev needed to be a functional udev replacement; did I mention Greg and Kay wrote udev and couldn't _comprehend_ why anybody else would want to write, much less use, an alternative) was an understatement. The new document didn't actually SAY anything: it started with a statement about how "the kernel developers have agreed sysfs doesn't provide a stable API" (the kernel developers in question being Greg and Kay; other kernel developers strongly disagreed), followed by a blanket prohibition that "you can't use anything in sysfs that this document doesn't allow you to use". It then followed this with a list of prohibitions: don't do this, don't do this, don't do this, without a single concrete thing you WERE allowed to rely on. So "blanket deny", followed by specific denials, and this was the document everybody other than them was supposed to use to read sysfs in a way that wouldn't break every other kernel release.

So I asked about it, and spent weeks asking the same question over and over and over. (I still honestly don't know if Kay didn't understand "how do I distinguish char from block devices to get the third argument mknod needs" or if he was trolling me.)
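(For the record, the answer I needed looks something like this sketch; the paths and the strstr() heuristic are my own assumptions, blessed by no document of theirs:)

  #include <stdio.h>
  #include <string.h>
  #include <sys/stat.h>
  #include <sys/sysmacros.h>

  // Given a device's sysfs directory, mknod the corresponding /dev node:
  // the "dev" file has major:minor, and block devices live under the
  // block subtree while everything else is char.
  int make_node(char *syspath, char *node)
  {
    char buf[4096];
    int major, minor = -1;
    FILE *fp;

    snprintf(buf, sizeof(buf), "%s/dev", syspath);
    if (!(fp = fopen(buf, "r"))) return -1;
    if (fscanf(fp, "%d:%d", &major, &minor) != 2) minor = -1;
    fclose(fp);
    if (minor == -1) return -1;

    return mknod(node, 0660|(strstr(syspath, "/block/") ? S_IFBLK : S_IFCHR),
      makedev(major, minor));
  }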

When I explicitly pointed out that the document did not in fact contain useful information, Greg pulled exactly the trick in the lwn article up top, throwing a fit at an unrelated comment and refusing to address any of the technical content, and ignoring apologies and technical follow-ups alike. His delicate feelings had been hurt (I.E. how dare I not bow down to his inherent superiority) and he had his excuse to drop the subject permanently. (To this day, I do not believe their document was written in good faith, but hey, he's the #2 guy in kernel development so what do I know?)

After a month or so of silence I wrote and posted some hotplug docs to try to get clarity on what I was and wasn't allowed to do with mdev, and as expected the existing docs (not covering the same material, and _explicitly_ not documenting a stable API) were used as the excuse to ignore them. So I just made mdev work and reported breakage to the list as any other bug, as usual.

So yeah, that gets back to me being a little bitter about kernel development. It hasn't been something I do for fun in many years.


July 4, 2015

So Rich has made sure that there's no compile-time way to tell if you're building for nommu on musl. He has a fork() that always returns -ENOSYS for example.

He's basically taking the position that every nommu system is just a subset of a system with an mmu, so you don't program for them specially and only ever add nommu support code if you want it to always be there on systems with an mmu too, I.E. nommu code is perpetual bloat in all packages that add it. (I disagreed and we argued on IRC.)

Anyway, this is why the toybox stuff that I made work on uClibc isn't working on musl, my compile time probe for whether to enable the nommu support never triggers because there's always a broken fork() linked in, and I _can't_ do a runtime probe to set a config symbol when cross compiling.
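The probe itself is trivial, which is what makes this so annoying. Roughly (my paraphrase of the mechanism, not the actual build plumbing):

  // probe.c: if this compiles *and links*, the build assumes fork() works
  // and skips the vfork-and-re-exec plumbing. A uClibc nommu toolchain
  // fails at link time (no fork() at all), which is detectable without
  // running anything. musl links fine and returns -ENOSYS at runtime,
  // which isn't.
  #include <unistd.h>

  int main(void)
  {
    return fork() == -1;
  }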

I'm seriously contemplating maintaining my own musl patch for toybox.


July 3, 2015

I've been trying to make the mac side of my new work mac work, and I've come to the conclusion that although the hardware may be nice (I still hate the keyboard), I hate macos. (Or at least 10.10.3 Yosemite, which is what it came with.)

Yesterday I went "maybe I can repartition and reboot between linux and macos and stop using vmware that way". (This may be another way of saying "can I just use this thing as a linux box and sacrifice half the space to pretend I might care about macos someday").

So I read about it online and found out that "boot camp" is completely useless. There's another mac bootloader that used to work, but it's no longer maintained, and there's a fork that's still maintained but nobody's quite sure if it works. But ok, it might work, so step one is to repartition.

I pulled up "Disk Utility" following the eldrich search instructions (it's not in any of the menus that I can find), and ran it, and the instructions said to shrink the existing partition and leave the space unused... but this version of disk utility wouldn't let me do that, it automatically created a new partition to use the rest of the space. So I went "ok, delete that and merge them back". And there is NO WAY TO DO THIS.

I click "Macintosh HD 2", click the minus sign under it, get a "this partition will be erased" popup, click "remove", and get an error message saying "You may only merge this partition with the one before it. To do this click -." except that CLICKING - IS WHAT I JUST DID...

So now half my mac's terabyte drive is tied up in unused space on an unmounted partition, which I can't install a new OS on but the system tools can't seem to remove. Great.

That was yesterday. Today, I tried to run a video. There's no sound! This is a video my old wind-up netbook could play just fine, but apparently mac hasn't got quite the right drivers for it? Or something?

This entire experience reminds me of a Douglas Adams quote.


July 2, 2015

Last night the mac spontaneously rebooted. I came back from two minutes in the kitchen to find it booting up, which lost all my tabs and blanked the VM state so I lost all the tabs in there. The machine hadn't been DOING anything, the best I can guess is the constant nagging about "installing updates" which doesn't have a "don't call us, we'll call you" option (the closest is "nag me again in 24 hours") decided in an unguarded moment of being switched on and plugged into wall current to install updates and reboot afterwards. My coworkers insist it doesn't do this, and instead it must have kernel paniced while idle.

My linux box had a chronic problem where I'd go more than 6 months without rebooting it because I had open tabs I didn't want to lose, so my kernel would get a bit out of date. Apparently, this isn't a problem on macs.


July 1, 2015

I got a cp --preserve patch from the Smack guys a while back, and it's yet another instance of comma separated list arguments, specifically another one that wants to set a bitfield.

The dd infrastructure wants conv= and oflag=, but all it does is record which ones it's seen. And mount wants -o stuff, but it cares about "no" versions of most of it to switch things back _off_. Then there's ps -o infrastructure that wants to parse "blah=" but also "blah:", so to genericize that I'd need an array of "arguments seen". (And then is it one argument or a linked list of arguments? Mount -o accumulates one big comma separated list, but I've got infrastructure for that already. In fact the reason mount -o is careful not to reorder arguments is you could theoretically get a "-o thing=/path/to/file,with/comma/in/it" or similar, which it breaks up, parses, and reassembles to pass through. Yes, you could have "-o thing=/path/to,remount,blah" but I'm calling that pilot error.)

Except for ps, none of the above use cases actually do more than match and set a bit. They record "we saw this argument", and that's it. So some generic "check list and set a bit" infrastructure seems useful, but... what do you do with unrecognized arguments? And what format does the list of known arguments need to be in, ps has it as {"one", "two", "three"} instead of "one,two,three" because it wants to refer back to the strings later without needing a length...
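A strawman for the generic plumbing (hypothetical helper, nothing like this is checked in): walk the comma separated argument, set a bit per recognized name, and hand the first unrecognized token back so the caller decides whether that's an error:

  #include <string.h>

  // names is a NULL terminated array; returns a bitfield of matches and
  // points *unknown at the first unrecognized token (if any).
  unsigned long comma_bits(char *arg, char **names, char **unknown)
  {
    unsigned long bits = 0;
    char *s = arg;
    size_t len;
    int i;

    if (unknown) *unknown = 0;
    while (*s) {
      len = strcspn(s, ",");
      for (i = 0; names[i]; i++)
        if (len == strlen(names[i]) && !strncmp(s, names[i], len)) break;
      if (names[i]) bits |= 1UL<<i;
      else if (unknown && !*unknown) *unknown = s;
      s += len;
      if (*s) s++;
    }

    return bits;
  }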

This isn't quite as bad as the human_readable() mess, but it's up there. This is too similar and occurs too often to want to hand-code it in each command, but not quite similar enough to collapse together in an obvious way...

The real problem with wanting to parse an array of arguments into a bitfield is you need an enum to go with your array. Meaning the data has to exist twice, which sucks. If I have char *blah[] = {"one", "two", "three"}; and we set "two" it sets bitfield |= 2 and later if I check if (bitfield & 2) I need a hardwired constant in there, and this doesn't scale. I can make it suck slightly less by doing:

char *blah[] = {"one", "two", "three"};

enum {blah_one, blah_two, blah_three};

But they can still get out of sync: if I edit one I have to edit the other. Also, an enum is basically an array of #defines, in order to use it I need to stick it up at the top of the file before code that uses it, but I don't like global variables outside the GLOBALS() block and any parts of that initialized to nonzero get initialized in main(), so in toybox code the declarations wouldn't go next to each other.

For FLAG_x macros I solved this by generating headers, and I think I need a way to do something similar here. COMMA_ARRAY(cp_fred, "one", "two", "three") maybe?
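Or maybe the X-macro trick, where a single list definition expands into both the array and the enum so they can't drift apart. A sketch of how a COMMA_ARRAY() style macro could expand (strawman, not committed code):

  // Single definition of the list:
  #define CP_FRED(X) X(one) X(two) X(three)

  // Expand once into the string array...
  #define STR(name) #name,
  char *cp_fred[] = { CP_FRED(STR) 0 };
  #undef STR

  // ...and once into a matching enum.
  #define ENUM(name) CP_FRED_##name,
  enum { CP_FRED(ENUM) };
  #undef ENUM

  // Then: if (bits & (1<<CP_FRED_two)) ...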


June 30, 2015

The pinched nerve seems mostly recovered. Still a little bit off, but not really impairing my typing anymore. How long was that, a month and a half?


June 29, 2015

Back to the Linux Security Blanket stuff.

If LSM means anything, then we don't want a race window between creating a file and labeling the file. The optimal thing to do is create the file with permissions 000, apply the label to the filehandle, then chmod the file to the normal permissions. Except can the label prevent us from doing the chmod?
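The shape I keep wanting is something like this sketch (assuming the Linux xattr calls and a Smack-style attribute name; whether the label can veto the fchmod() is exactly the open question):

  #include <fcntl.h>
  #include <string.h>
  #include <sys/stat.h>
  #include <sys/xattr.h>
  #include <unistd.h>

  int create_labeled(char *path, char *label, mode_t mode)
  {
    // Mode 000: nothing can open the file during the labeling window.
    int fd = open(path, O_CREAT|O_EXCL|O_WRONLY, 0);

    if (fd == -1) return -1;
    // Label through the filehandle (no path to race against), and only
    // then grant the real permissions.
    if (fsetxattr(fd, "security.SMACK64", label, strlen(label), 0)
        || fchmod(fd, mode))
    {
      close(fd);
      unlink(path);

      return -1;
    }

    return fd;
  }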

People keep sending me "security" code with obvious race conditions in it. I don't like feeling like I'm putting more thought into this stuff than the people who created it. I'm aware this is "security" the way the S in TSA stands for security at airports, and Linux's "security blanket" from Peanuts has security in it. But _dude_. Fake it better.


June 25, 2015

So work bought me a macbook pro as tricked out as it gets (terabyte ssd and 16 gigs ram; the processor's only 2-way but at least it's a 3ghz i7), and during my most recent trip to Japan helped me set it up with a CD of install software (xcode and such), and told me to use Oratroll's virtualbox for my linux VM.

I hate the keyboard, but can live with that. It's got a "fn" key (as opposed to the "function keys", F1-F12, because that's not confusing), which is the only way to get page up, page down, home, and end: hold down function and guess the appropriate cursor key. It has the normal control and alt. And then it has Apple's magic option key, so instead of ctrl-W to close a tab in chromium for apple, it's option-w (which I NEVER REMEMBER, and always just wind up clicking the close X on the tab after ctrl-w doesn't work). Add in shift and that's five different chording keys. The old song "double bucky", to the tune of sesame street's "rubber ducky", comes to mind. (And of course they didn't get rid of caps lock. Page up and page down, sure, but they had to keep caps lock because reasons!)

Before this, I had a Linux box. It was an underpowered netbook which I carried around 24/7 (three consecutive "Acer Aspires" which I was not afraid of damaging because it was like $200 to replace it, and wasn't a theft target for the same reasons). But yeah it got to be a bit of a bottleneck waiting for long compiles, and if work wanted to buy me a mac, I was curious and they were all mac experts so sure. (And hey, I could try to get the low-hanging fruit of toybox working on macos, why not?)

But I always expected to do most of my work in a Linux VM. I was going to use Parallels because Mark used to use it on macs and when I did a gig for them in 2010 the mac vm stuff was their bread and butter, but I went with the recommendation of the mac people and installed Oratroll's Virtualbox.

I've spent a month fighting Oratroll's Virtualbox, which is kind of evil and broken. Specifically, its video driver (without which it forces 640x480 mode, because cthulhu forbid it understand vesa bios calls from the 1990's) requires a proprietary linux kernel module that does VERY SLOW screen updates at a higher priority than any other task, meaning user interface requests such as cursor moves queue up if you have a single animated GIF on the screen (such as the tiny little swirling balls animation chromium gratuitously added to its "new tab" page because bling), and all mouse clicks and keyboard keys are processed in strict sequence along with this flood of mouse moves (which are NEVER COLLATED), and the queue is THOUSANDS of entries deep, meaning you can easily get 30 seconds behind on this nonsense and sit there waiting for your "close tab" click to register, watching the little x eventually highlight a full 10 seconds before the tab closes.

Yes, the virtualbox I/O drivers have bufferbloat. I finally figured out what was going on because when I exit and reenter virtualbox, it doesn't always enable the same features each time (some sort of race condition?), and one of the times it forgot to disable the xfce mouse cursor, so I got to watch the xfce mouse trail slowly along the screen long after the apple mouse. Yes, even in fullscreen mode. And yes, a scrub of the mouse over the screen and it slowly retraces that path for the next minute while I can't do anything with the UI, so it's easy to not notice it's gone all slow due to a TINY animation somewhere, and queue up a 30 second delay without realizing it.

And keys like option-F (toggle fullscreen and windowed mode for the emulator) are handled in the same event queue, meaning the problem is on the virtualbox side, feeding events slowly into linux. (Linux will happily collate mouse moves if the queue gets too deep. But if virtualbox is only feeding it a half-dozen per second and has queued up several hundred...)

How do I know that this problem isn't entirely on the linux side? Not just because of the option-F thing, but because the apple menus are ALSO tied into the same queue and will delay your actions by a good 30 seconds until it gets around to processing them, along with all those single pixel mouse moves with a #*%&#^ apple host process scheduler context shift and back in between each one.

Did I mention that even in fullscreen mode, if your mouse cursor hits the top or bottom of the screen it brings up an apple menu overlaying your fullscreen linux session? And that there's no way to make it ever stop doing this, so that if something like xfce has stuff at the top or bottom of the screen you have to be VERY CAREFUL mousing towards the edge, and if you hit it you have to mouse way far away from the edge to dismiss the window, because no matter how long you wait with the cursor too near an apple menu it won't go away ever?

So yes, I regularly hit some autoplay video or flash ad that pins my UI before I notice it's gone unresponsive. The little "progress bar" that isn't a progress bar in thunderbird does this, the little left-to-right-to-left sweep that says "I have no idea how long this will take but we have this slot for a progress bar, so". The fun bit is when I tell it to send email, and it pops up an "enter mail server password" dialog that doesn't block that animation in the bottom right corner of the thunderbird window, and the password keys take 30 seconds to start showing up, by which point the network transaction to the server has timed out... (Here's hoping I typed it in right so the retry won't pop up the dialog...)

When I notice the UI has stopped responding, I carefully queue up a close tab action if that's the problem. (Note: click exactly ONCE, because if you click twice chrome will close the window, having had time to complete its "expand tabs into the space, moving the close X of the next window into the space the previous close X was in" animation before the second click registers. And did I mention that if you don't hold the click down long enough virtualbox doesn't always register it, so you may be waiting for a close that will never come, but the wait can be a good 20 seconds when it _does_ happen?)

So while that nonsense is sorting itself out, hit option-F to window the bogged-down UI and use the apple copy of chromium to browse the web a bit, then remember that option-F won't be serviced for a minute or so either, so move the mouse cursor to the bottom of the screen to pop up the unblockable apple taskbar (it's only the _top_ menu that's bogged; it'll pop up, but clicks on it send events to the virtualbox process that get serviced in order with all the other trash in the single giant loop of slowness). If I click on the running apple host chromium instance, it switches away from the semi-hung fullscreen virtualbox to the apple desktop when it gives that window focus, and then I get maybe 30 seconds of web browsing before virtualbox finally gets around to the option-F I can't exactly RESCIND once I've hit it, and grabs focus away from the web browser to switch BACK to the fullscreen virtualbox, so it can windowize it. That's about 3 more seconds of "mac UI not listening to me" due to sideways swoop animations I haven't figured out how to shut off.

I may go to the Apple store and buy a copy of parallels and try to transfer over my vm, because OW this is crappy vm software. But I'm not sure which subset of the problem it would fix or make worse; it's a "devil you know" thing. I can't exactly ask Mark for opinions (he got a new religion, changed name, moved to another state, massive physical transformation starting with full body tattoos and getting ears pointed, cut off all ties with everyone from his old life... hard to get tech support there), and when I worked for the parallels guys it was on OpenVZ (different product entirely, their container offering instead of their virtual machine offering). So I _think_ parallels is better here, but haven't actually used it.


June 21, 2015

I have a pending set of ls fixes, which means I'm back looking at the testfiles problem.

One of the fixes is that "ls -o" is producing an extra space before the file size, and when I fixed _that_, "ls -og" had an extra space. I'm running a lot of tests on the command line to pin down the behavior, and every time I do that, what I _should_ be doing is adding a test to the regression test suite instead of just doing stuff on the command line.

The problem is that my toybox directory has 51 files and directories in a variety of sizes and statuses. I don't want to make the test infrastructure depend on the specifics of the toybox source because if I change the source it would break tests, but it's really convenient to test against stuff I have lying around.

Producing my own sufficiently varied set of test files brings up the "utf-8 filename issue" and so on, which brings back the "should the files be checked into source control or in a tarball?" The first puts us at the mercy of source control and people's local filesystems and has the problem that you can't have device nodes and fifos and such checked into git (and couldn't check them out as a non-root user if you could). Having a tarball means you can have arbitrary complexity you can skip extracting when you're not root (tests can check if they're root up front), but it means you change the whole tarball every time you change a file in it.

Probably I need to have two: one a directory of test files and the other a tarball that gets extracted on top. But which goes in what is one of those judgement calls. (The long japanese filename, is that tarball contents to avoid a git clone breaking horribly on somebody's system who doesn't care about internationalization or the test suite? Does that mean we extract the tarball when we're NOT root? Presumably, yes, because I really don't want to have a directory of files and _two_ tarballs...)


June 20, 2015

Banging on sh2eb toolchain creation. Binutils libbfd is just sad. ELF is an archive format, produce ELF output and then postprocess it into any other output format you need. But no, that's not what they did...

I've decided that if I ever have to write a replacement for libiberty (oh PLEASE no), instead of calling it "libdeath" it should be called "libcake".


June 18, 2015

[EDIT: less evil than I thought, see end.]

Google is officially evil, chrome downloads a binary blob that sends your laptop microphone's audio to google 24/7. Hands up everybody who thinks android phones _aren't_ doing that.

Their excuse is to implement "ok google" functionality and have the _server_ listen for the phrase, because obviously the client can't recognize two words. We know for a fact that the NSA is intercepting all internet traffic, so now not only are all your phones bugged, but your computer is listening and sending it to our KGB wannabe voyeurism-as-a-service out of control secret Keystone Cops.

Don't be evil google. *pat* *pat*

This puts Clay Shirky's tweet in perspective, eh? Ordinarily I'd cheer on Google fighting against the secret police forbidding them to talk about an instance of warrantless search and seizure, but their _objection_ appears to be about exclusivity. That's _our_ surreptitiously collected private data, how dare the government copy our secret snooping. We patented it or something! Foul!

Sigh. Google is not a monolithic entity, I know. Their left hand doesn't know what their right hand is doing. But I'm no longer convinced the left hand would care, either.

[EDIT: Looks like twitter's back and forth eventually settled out into this being less evil than I thought, it isn't constantly phoning home, only when you say a trigger phrase. The "sends data even before you say ok google" is sending buffered data so the server can confirm you actually said "ok google". So it _is_ having the server detect the "ok google" phrase, but only after the client's detected it. The "initial network data" that got them investigating this was just downloading the binary blob, to bypass open source developers' ability to select what to include in the system and audit the code we're running. So still ouch, but not as bad as I thought.]


June 17, 2015

It's the stay-puft marshmallow man... no wait, it's another house flood.

So our totally-not-climate-change weather gave us the third flood of the century in a two year period, this time Tropical Storm Bill. I went outside in it and dug a drainage trench for a couple hours (meaning only _half_ the house flooded this time), but by the time the water gets up over our horseshoe-shaped driveway there's two or three inches of it aboveground on the west side of the house. (In large part because our neighbor's yard drains into ours, and apparently that continues through their neighbors all the way to the street.)

Unfortunately, the drainage trench I dug couldn't get through the narrow strip of not-concrete between our driveway and the neighbor's fence because of two trees plugging that gap, and a shovel is not going to make a dent in several inches of concrete.

So we get the disaster recovery people back for Fanpocalypse III, and then we get landscapers out here. Maybe they can put a pipe under the driveway, maybe they can take out the two trees, maybe we can take the fence down and dig a drainage trench under it.

Either way: yet more large expense, stress, moving all the furniture again, and lack of sleep for several nights while fans and heat do their thing. (After the previous two floods the air conditioner is on its last legs; the dehumidifiers getting the inside of the house up over 90 degrees is not kind to it. Last service guy said it was pulling twice as many watts as it should, and will need to be entirely replaced when we have a chance. But right now, that money is going to large men with loud blowers. Luckily the baseboards still haven't been replaced since the _last_ flood.)

I am _so_ glad we got ceramic tile after the first flood.


June 15, 2015

The lwn.net article goes unembargoed tomorrow! Race race race to get everything up! What's on the todo list:

  • nommu and 0pf mailing lists up and running
  • Kernel patches posted (against current kernel)
  • Bitstream compiler install document (xilinx horror HOWTO)
  • VHDL build posted somewhere (release tarball)
  • VHDL git repo on github with actual history
  • VHDL compiler install HOWTO
  • sh2eb compiler (binary and build)
  • sh2elf compiler (binary and build)
  • Bitstream build HOWTO
  • Where/why to order a numato board
  • How to flash a numato board
  • vmlinux image (binary and build)

I've been working on the toolchain stuff as part of Aboriginal's sh2eb and sh2elf targets. (I need to document why we need both; basically the ROM bootloader in the bitstream uses the bare metal ELF compiler's symbol prefixes, and you can't just override them because libgcc and such will have them wrong).

I also need to be able to build a vmlinux as part of aboriginal, the kernel patches are almost ready to go upstream (for a definition of "ready" that's still fairly ugly, but that's quite a step up from "disgusting" and at least they're against the current kernel now).

But looking at the above todo list... yeah. Do all the things!

(Sigh. A single missing rung means you can't climb the darn ladder. I have to walk people through all the way from ordering the board to getting a shell prompt they can type at, and if you're building the bitstream and vmlinux from source that's a _lot_ of steps to document. Just setting up the build environments is... quite a thing.)


June 14, 2015

Grinding away at toybox release notes (long overdue) and what I _really_ want to write up is a long history of wikipedia[citation needed] getting toybox's history wrong.

Again, toybox wasn't relicensed in 2006 and then mothballed. It was relicensed in 2011 as it was revived. Relicensing it was the _reason_ I started working on it again, that's what distinguishes it from busybox.

I started toybox after Bruce Fscking Perens sucked all the fun out of busybox doing a SCO Disease style "there may not be a line of my code in the project but I own it outright anyway, I take credit for everything at the expense of everyone else you work for _me_ grovel before me etc etc" thing. Since I worked to undermine SCO's lawsuit in a professional capacity for a couple years there, I may have been a touch sensitized to this sort of thing.

I did toybox because I had a lot of unfinished command implementation ideas and I still liked doing them, and I felt I could do a better job than busybox. (Heck, I'd already rewritten the busybox mount command three times, and redone sort and sed and bunzip even though busybox had a version of each before I started... An excuse to do it all again and get the base infrastructure right wasn't a bad thing.)

But the problem was busybox had a ten year headstart, several years of which were my own work. There wasn't enough headroom between what busybox was doing and a _perfect_ implementation to displace it on purely technical grounds. Besides, after Bruce happened, rather than just coast until I felt better (as previous maintainer Erik Andersen recommended), I did what I felt was in the best interest of the project and handed off to the person with the best technical judgement among the most active contributors. Turning around and undermining my own hand-picked successor seemed kinda impolite. And although I felt fine writing new code that was better than busybox's, telling anybody else "use my thing instead of busybox" seemed especially impolite. Why use toybox when they could just use busybox? Why contribute to toybox when they could improve busybox?

So my motivation to work on toybox waned, although the process was so gradual I was still occasionally blogging about it a few months before the relaunch. I tried to push code and design ideas from toybox to busybox with some minor success, but things like the remaining busybox developers forgetting what the existing infrastructure was for (http://lists.busybox.net/pipermail/busybox/2010-October/073674.html), and dismissing the topic as a coding style difference... I never quite went back to busybox development because I was tired of shoveling out the endless mess.

By 2010 things had gotten to the point where I sent new commands I wrote straight to busybox rather than bothering to add them to toybox, but actually trying to get back into busybox development was incredibly frustrating.

Then Tim Bird contacted me about his bentobox project, as part of his effort to bring Android and vanilla Linux back together, and I went Oh, of COURSE. Tim was trying to heal the split from both sides, his Android Mainlining Project took android kernel changes and fed them one at a time into vanilla Linux, and his Bentobox Project proposed writing new userspace code under a license android could use to get a more standard/full featured posix command line. (Remember: busybox predated android, so the fact they weren't using it wasn't something we could wait out.) Tim contacted me as a potential contractor to write new code for bentobox, I went "I did toybox for years, I've got hundreds if not thousands of hours of work in there already, and I only have a half-dozen external contributors so getting clearances or removing their contributions so I can relicense the whole project isn't a big deal"...

I'd been pondering along those lines anyway, but it took me a while to connect up the damage wrought by GPLv3 with Google's response to it, and talking to Tim connected the dots for me.

So Tim went off to update his proposal and I started working on toybox again without waiting for him, which immediately set off a "how dare he" firestorm on the part of the FSF's armchair admirals. They focused on attacking Tim, since attacking me (ex-busybox maintainer, guy who started the busybox lawsuits) kinda undercut their message a bit.

Eventually my periodic whacks at this pinata of stupid (along with a basic understanding of unix history) turned into the Rise and Fall of Copyleft talk (mp3, outline).

Wikipedia[citation needed]'s insistence that toybox was relicensed in 2006 and _then_ mothballed seems to be advancing an FSF zealot storyline that a project cannot _possibly_ be motivated by (let alone become important because of) NOT being GPL. And yet, that's exactly why toybox got restarted, and what gives it a niche busybox wasn't already filling. If toybox does beat busybox in the marketplace, it'll be because busybox is GPL and toybox is not. After the damage Stallman did splitting "the" GPL into warring camps, being GPL is now a net negative for a project.

But you'll never understand that by reading the persistently incorrect wikipedia[citation needed] article.


June 10, 2015

Our talk got covered by lwn.net!

The nommu.org website still isn't up, nor is 0pf.net's download area. I have to turn the xilinx bitstream compiler install instructions into something less horrible and actually followable. We need to clear and post VHDL bitstream source. I need sh2 portable compilers (aboriginal linux cross-compiler.sh, not simple-cross-compiler.sh) and by "compilers" I mean linux and elf. (Because the symbol prefixes are different and the ROM bootloader's written in assembly using the bare metal elf compiler prefixes. And no, I can't override it on the command line because it expects to link with libgcc using the other symbol prefixes, and I've looked at changing the prefixes in the bootloader and it's an extensive rewrite _and_ the bootloader apparently isn't the only thing written that way.)

Working on it...


June 9, 2015

Chrome found a new way to break. The version running in virtualbox on the mac thinks it's running on an android phone. Giant blue drag pins as soon as you click on any text, hilariously large on a PC screen. I'm not sure what triggered this, it's the stock xubuntu apt-get install of chromium-browser. Maybe it thinks the virtualbox funky video driver is a touchscreen? Either way, it's completely unusable like this.

The _hardware_ of the mac is lovely. (Given how high-end it is, this is unsurprising.) The software side... well if I use the native mac apps it's merely gratuitously foreign. (The trick is to think of the keyboard mapping as broken, so it wants the flower key where everrryyyyything else would would want control).

But virtualbox? The above repeated keys in everything weren't any different tthan when I typed normally, I'm just not removing the sad repeats it's always putting into stuff for the moment. (Note, it has dual processors and seems to think that all of one and the majority of the other is unused. But sometimes it schedules away and back between the key press and release events long enough to cause duplicaates, especially when the battery is low and the host reclocks itself but doesn't inform the guest it's done so. Which it currently _isn't_ by the way, this is just its noooooooooooormal level of gratuitous silliness that I'm constantly correcting. The _fun_ part is when it does this while I'm holding down the delete key in a situation I can't undo. I live in fear that it'll do this in thunderbird while I'm deleting spammy email messages between real ones. Yeah, I can kill -9 thunderbird and go into the mbox file and unmark the delete before it compacts, but that's a serious pain.)

And don't get me started on the mouse cursor. There are various tricks to get it to show me both the host and the guest mouse cursors (which technically is me exploiting a bug in the hiding logic when apple overrides virtualbox's full-screen-ness for a pop-up menu; the one at the top seems to be based on the guest cursor position, thus under virtualbox's control, the one at the bottom is based on the host cursor position and thus apple's control). Anyway, when it's showing both mouse cursors I can position the host cursor with the touchpad and wait for the guest cursor to catch up (upwards of 30 seconds later depending on how busy the graphics is; the horrible virtual video driver schedules a separate apple host process, handles one screen update, and yields, meaning there's a round-trip process schedule cycle for every mouse cursor update AND every tick of an animated gif or swirly progress bar, and mac only seems to schedule a dozen or so times a second (so a half-dozen round trips) and it never drops intermediate updates so it can get very, very far behind. As in scrub over the track pad and it can take 20 seconds to catch up.)

When all I can see is the _guest_ cursor, positioning it accurately becomes very difficult. And if I hit the top or bottom of the screen it pops up a macos overlay which only goes away when the cursor goes a certain distance away from it; if you move the cursor just off the edge of one of those and wait it will NEVER time out (I gave it over a minute). So you've got to navigate away and back again without touching the edge. I.E. do the trick to show both cursors or you're going to have a fun afternoon.

Oh, and the keyboard has "fn", "control", "option", and "command" keys (not counting shift and such), each of which do different things in different contexts, and since the keyboard _doesn't_ have page up, page down, home, and end, remembering which one does which thing in a given context? "Oh hey, I switched windows when I meant to advance one word to the right. Again. Use the mouse to fix it. Wait ten seconds for the mouse cursor to catch up..."

Sigh. Much better hardware than my netbook, a whole second OS of apps to play with (some of which may be useful), and I'd feel guilttttttttttttttttttttttttty about wasting work's money if I didn't make the effort to use the nice thing they got me. But sometimes I pull out my netbook for a couple hours of getting stuff done in a context that's less... not fun.


June 8, 2015

When the NSA's entire data hoard leaks en masse (give it a decade), future historians will be mining it for centuries.

Bruce Schneier says that China and Russia and such already have copies, the reason neither's trying hard to get it from Snowden is they had it before he did. If they couldn't keep the diplomatic cables from a low-ranking functionary and let a contractor have root access to all their systems, clearly this stuff is already out there, let alone fresh leaks du jour.

The limiting factor of such a megadump today is storage and bandwidth, "the cloud" in Utah stores a _lot_ of data, more than wikileaks could reasonably disseminate. But the intractable data sets of a decade ago fit on a $100 retail USB drive today, USB3 is orders of magnitude faster than USB1, "broadband" keeps getting redefined (and the US is waaaay behind what countries like Korea and Japan have had for a while)... so today's terabyte storage and gigabit transfer rates should go up a bit more before the S-curve of moore's law flattens the rest of the way out.

As far as I can tell the _purpose_ of all this voyeurism is getting blackmail material on tomorrow's politicians to force them to keep the endless funding going, or so that the next reformer trying to stop them through non-guillotine channels gets jailed for childhood music piracy. (It's Clay Shirky's observation, sometimes called "the shirky principle", that organizational survival becomes the primary goal of any organization that lasts long enough. After all, the organization has to survive in order to pursue its other goals.) Gonna be interesting how those plans match up with the reality of electronic data leaking like a sieve over long storage periods. Once it's been collected, it tends to get out eventually unless all copies are destroyed, and the NSA _can't_ destroy the copies other countries' spies have already copied from them. Russia didn't have to bug every american's phone, the NSA did it for them, they just need to crack the singularity of failure that is the NSA.


June 7, 2015

Flight back to the states all day today. Yesterday was a bit awkward because I thought I needed to be out of the hotel by checkout time (and remembered this around 3pm), but it turns out they had last night as a paid night in the booking (even though my plane left at 1 am).

I couldn't have gone back to the hotel yesterday even if I'd wanted to, because we were meeting with Yoshinori Sato, the guy trying to revive the h8300. He also used to be the sh2 maintainer, before he dropped the platform when Renesas discontinued the original hardware. Really cool guy (if a bit shy in person) who lives in Tokyo, so we thought we should all get together and buy him lunch.

Jeff gave him our turtles all the way down talk and walked him through the basics of our build system. We tried to interest him in what we're doing (successfully, I think), and also talked a bit about possibly making an FPGA h8300, but the multiplication part of our ALU would need to be redone (it's the only part that's full of superh assumptions, the rest of it is generic microcode).


The flight is on United, so of course zero work can be done on this trip. I dowanna fly united again. They are aggressively no fun. (Longish early morning layover in a california airport, so a bit of laptop charging time, but too sleep deprived to concentrate. On the Delta flight to Japan I wrote about half of ps.c, and on the way back I wrote an interactive hex editor. This time I'm trying to do j2 documentation, but am not making appreciable progress.)

Got home early morning Austin time on nominally the same day I left, but it was a full day of travel that crossed the international dateline, and the 1am departure time meant I hadn't gotten any sleep beforehand, so I'd been up for somewhere over 30 hours.

In theory we were doing the "Avengers Assemble" thing at 3M (which is why I flew back when I did: onsite customer meeting in Austin). In practice, I begged off and went home to bed because Zzzzz.

The disaster recovery people have come and gone (remember how we had our second annual "flood of the century" while I was out?). The furniture is mostly back in place. The baseboards are missing, but eh. Yay ceramic tile, replacing all the flooring last time meant this time wasn't quite as bad. (I say that not having been here for it...)

The bed now has a metal stand under it, with space for the cats to hide. This means it's much further up off the floor. Weird.


June 6, 2015

Android photo gallery updated itself, then tried to upload all my photos to the cloud without asking. It announced it had done so _before_ walking me through the setup wizard for the new version (which I couldn't opt out of despite having used the app before it upgraded: "oh no, you can't just ignore this and go on using the app you never asked to change. You have to stop what you're doing and listen to this unskippable commercial for google before we'll deign to let you continue using your device.")

Google is Microsoft now. Can't opt out of cloud rot. (Even if you tell the play store not to install updates, it installs them anyway once they're old enough. With complete user interface rewrites at least once a year, so you have to learn to use the darn thing all over again.)

Meanwhile at work, I was pretty sure the Numato's problem was they only wired up rx and tx in their serial port: no RTS/CTS pins, no DSR/DTR pins. Meaning to make that work in Linux, you have to run stty on the serial port with -crtscts (and possibly -clocal and -cdtrdsr) to switch off hardware flow control. (I've encountered this before, it's a common way to half-ass serial ports at the wiring level. Usually involving bare wires, sticky tape, and either plastic clips or twist ties. The people who actually solder stuff tend to run more wires, dunno why.)
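For reference, those stty flags are just twiddling termios bits; here's a minimal C sketch of switching the handshaking off (device path, function name, and error handling are mine, nothing numato-specific):

  #include <fcntl.h>
  #include <termios.h>
  #include <unistd.h>

  // Stop waiting for handshake lines the board never wired up.
  int open_dumb_serial(char *dev)  // e.g. "/dev/ttyUSB0"
  {
    struct termios t;
    int fd = open(dev, O_RDWR|O_NOCTTY);

    if (fd == -1) return -1;
    if (tcgetattr(fd, &t)) {
      close(fd);
      return -1;
    }
    t.c_cflag &= ~CRTSCTS;  // no RTS/CTS hardware flow control
    t.c_cflag |= CLOCAL;    // ignore carrier/modem control lines
    if (tcsetattr(fd, TCSANOW, &t)) {
      close(fd);
      return -1;
    }

    return fd;  // caller reads/writes the port through this
  }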

I did this on my linux netbook, and it worked! It talked to the numato board using the numato's actual serial interface for serial console! Then we tried it in the linux vm of the mac and it didn't work. Nor did using the host's stty, which has -clocal and -crtscts mentioned in its man page (possibly redundant but I'm doing the "big hammer" stuff right now to Make It Work), but they don't seem to _do_ anything.

Jeff's theory is that the thing's built in serial to USB converter (yes, really; they only wire up 2 serial pins to the FPGA, then export it from the numato board as a USB micro-A plug) is producing bad packets that windows and linux accept but macos filters out due to some unnecessary sanity check. And macos runs this sanity check _even_ when you've told it to pass the raw USB device through to virtualbox: it still vetoes USB packets that smell bad to it.

Anyway, we made it work on _a_ test system here, and filed a bug report with numato's website. Once again the bit that didn't work was the mac. I'm trying not to judge.


June 5, 2015

Attended an excellent git internals talk, which I think gave me enough info to make my own git clone. Except for the network bits. It basically boils down to "git objects are zlib compressed and the headers it displays are pretty much there at the top of the file". He used a perl invocation to un-zlib them, but toybox's deflate has the right code except for the adler32 stuff (different checksum, because of course zlib wouldn't use crc32 like everybody else).
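In case I forget: un-zlibbing a loose object doesn't need perl, it's a screenful of C. A quick sketch (compile with -lz; a real version wants more error checking than this):

  #include <stdio.h>
  #include <zlib.h>

  // Dump an inflated loose git object (.git/objects/xx/xxxx...) to
  // stdout. Output starts with a header like "blob 1234\0", then data.
  int main(int argc, char *argv[])
  {
    unsigned char in[4096], out[4096];
    z_stream z = {0};
    FILE *fp;
    int len;

    if (argc != 2 || !(fp = fopen(argv[1], "rb"))) return 1;
    if (inflateInit(&z) != Z_OK) return 1;
    while ((len = fread(in, 1, sizeof(in), fp)) > 0) {
      z.next_in = in;
      z.avail_in = len;
      do {
        z.next_out = out;
        z.avail_out = sizeof(out);
        if (inflate(&z, Z_NO_FLUSH) < 0) return 1;  // negative = error
        fwrite(out, 1, sizeof(out)-z.avail_out, stdout);
      } while (!z.avail_out);  // output buffer full: maybe more pending
    }

    return inflateEnd(&z) != Z_OK;
  }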

A non-gpl git is one of the big missing pieces in a non-GPL development environment for self-hosting Android. (As opposed to a non-GPL _build_ environment (/talks/celf-2013.txt), which we're most of the way to. Build environment is a headless compile box churning out nightly images, a development environment is something a programmer sits in front of and interacts with. The line's a bit fuzzy at times: your build machine doesn't need "make menuconfig" to work, but your development machine does.)

To start I just need enough git functionality to satisfy repo, which comes from "git clone https://gerrit.googlesource.com/git-repo" (the only interesting file in there is "repo", but they have 27 others for some reason). The "repo" file defines a GIT environment variable and uses it for everything _except_ version checking, for some reason. That calls "git version" directly, by name. And is hardwired to care about version 1.7.2, so this is another sed --lie-to-autoconf situation, looks like...

Anyway, that's waaaay down on my todo list. What is my current top of todo list, anyway? Dig dig... in "recently pushed on stack" terms (not necessarily priority) it's:

http://lkml.iu.edu/hypermail/linux/kernel/1505.3/00454.html
h8300 guy
sh unimplemented syscalls patch
4.x patch cleanup
  device tree conversion
buildroot release - toybox patch?
qemu-m68k - almost working!
  - coldfire aboriginal target, document on nommu.org
three aboriginal packages still on sourceforge!
  - (gene2fs, ext2fs, squashfs)
  - sha1sum should catch malware, but _dude_...
uclibc-ng target list (if musl copies these, they lose their reason to exist):
  http://lists.uclibc.org/pipermail/uclibc/2015-May/048991.html
  NIOS2, ARC, Microblaze and Xtensa. Nios2 and Microblaze are
  supported by GNU libc. ARC and Xtensa are only supported by
  uClibc/uClibc-ng.
  - buildroot qemu xtensa target

And so on, and so forth. And that's not even counting the high priority but not recent stuff, like smp and dma support for j2. (I continue not to be a kernel guy. Oh well.)

Alas, my notes from the git panel (and the systemd stuff) are on my netbook, and this is the shiny new mac that I'm gradually trying to migrate to. I rsynced over all my files, but it was a few days ago and they're getting a bit out of sync. (The mac is much more powerful, more memory, more disk and it's an ssd, bigger screen... but I hate the keyboard and am still having issues with the VM Linux is running under.) But hey, I'm making a concerted effort to use the mac...


June 4, 2015

We gave our talk at LinuxCon, and it was good. Jeff delivered most of it, and it was much better than the Jamboree talk, although it's still evolving.

I was surprised how _shy_ Jeff gets in front of a crowd of technical people. He's quite a decisive and commanding presence in person, or on the phone with investors, but if you give him a microphone to talk about tech to techies he umms and speaks at half volume, goes slowly and a bit rambly... I think he's out of practice, and we worked on the material right up to the deadline so we never got to rehearse the delivery. Still, it went quite well. Pity the room change means they didn't record it.

Yes, our original timeslot turned out to be double booked. We got picked off the waitlist at the last minute by the California event organizers to fill in for a cancellation... and the Japan-side event organizers _also_ provided a replacement. We found out about it on the day, when we looked at the room schedule on the big monitors to find out which room we were in, and weren't there. They were very apologetic, opened a new room for us, and put signs everywhere telling people how to find the new room (on its own floor, through a special elevator). Mostly it worked out fine, but although the room had a projector and A/V equipment, they didn't seem to have the video camera set up in the rooms. (Oh well, we can do it again at Plumber's and ELC Europe. Tell the world!)

Jeff bypassed the numato's broken serial port by repurposing some GPIOs, so his actual demo involved horrible wires stuck into the board in a way that we can't ask anybody else to reproduce, but we did manage to show them two different boards we got this working on. (Technically. We couldn't demo the red board live because all the sdcard extenders are in canada since I mailed mine to Rich, and the method Geoff came up with to load a kernel via the jtag takes 15 minutes and is a horrible "xilinx GUI plus digilent plugin" sequence that we really don't want to force anybody else to deal with.)

Meanwhile, most of _my_ last minute panicking involved cleaning up and posting material on nommu.org (a big nommu programming basics page, my old memory faq, and more to come). I've got about half the bitstream toolchain install howto cleaned up.

I also sent the broken-up kernel patches to Kawasaki-san, to post on 0pf.org. That adds basic support for our board, although it needs more cleanup: our config symbol names are unfortunate, I need to convert it to device tree, several of the broken-out patches are cleanups that we _use_ but aren't directly related to our board...

And one of the things I still need to do is get out an Aboriginal Linux release with the fixed sh2eb-flat toolchain, so I'm banging on aboriginal linux and, while I'm here, adding coldfire support, which has been a todo item forever...

It's... not something other people can reproduce from the website yet. We still have to personally walk them through it. I'm working on it as fast as I can but OW...


June 3, 2015

Attending Linuxcon Japan, at the snootiest high-end hotel/spa/resort I've set foot in this decade. (Wow, the linux foundation has money to throw around and is trying hard to show it.)

Sat down with Lennart Poettering, and talked with him for an hour about what subset of systemd toybox needs to understand to support packages that expect it. He was actually quite reasonable, and it turns out I _can_ ignore 95% of what that thing is doing and force its clients to speak a very simple protocol by setting the right environment variables. So I have a pile of notes I can't work on now because _my_ talk is tomorrow, and there's all these other panels to attend...

(The subset of container stuff his talk was about looked interesting too, and once again "I can chip a tiny bit out of this hairball and ignore the rest of it once I figure out what it's doing, and then be compatible with an independent implementation." (If I can't, I'm not interested.) But I only caught about 5 minutes of that talk (scheduled opposite stuff), and need to wait for the videos to go up.)

Attended a very, very strange "vip dinner" in the evening which speakers were apparently invited to but mostly chose not to attend, and most of the actual attendees were corporate executives. I shouldn't have been surprised, it _was_ organized by the Linux Foundation. (Lennart was there but I'd already talked to him.) Sat next to Karen Sandler of the Software Freedom Conservancy (who I last saw at Texas Linuxfest, where she attended the sadly incoherent talk I gave right after getting heatstroke). May have talked her ear off about licensing, but I have... opinions.

Returned to the office to work on presentation stuff. Jeff made the numato board work! Awkwardly, the serial's sort of brokenish, but it boots from the sd card!

The original premise of the talk was "download this and you can reproduce what we just showed you right now, it's all documented on the website". This is... somewhat ambitious, shall we say.


June 2, 2015

Jeff found a potential new board to use instead of the red boards, the "Numato Mimas v2". This one has an sdcard built in, retails for $49.95, and is from india. Here's hoping we can get it to work, the little red boards are awkward to make work without extra hardware we can't point people at so they can order it retail...

Anyway, he ordered two of them, and they arrived today. (Apparently india to japan isn't a big deal.) The question now is whether we have time to make them work by thursday.

Jeff showed me how to download the pinout file from the numato website, but it has to be adjusted by hand to fit into Geoff's build system. (It gets broken into two input files, and symbols have to be slightly renamed to match the labels it expects.) It's all very foreign language to me, I could probably make this work on a less tight deadline but there is SO MUCH CONTEXT necessary to get a light to blink on and off at the far end...

What we really need is the hardware/FPGA equivalent of "linux from scratch", walking you through the process to get a basic working system. Unfortunately, it looks like we're the ones who'll have to _write_ that...

Working on it.

(The main blocker here is that Jeff himself has a stomach bug, and has mostly been out of the office since the Skype with Rich.)


May 31, 2015

I'm possibly under a bit of stress recently, given that I've ranted on both kernel.org and uclibc.org about how both projects are essentially based on a false premise these days. (Neither rant is new. My "rise and fall of copyleft" talk was in 2013. I already declared uClibc dead in 2007, and went over the reasons its development collapsed back in 2006; search for the word "telethon".)

Linux obviously isn't dead, but between the license becoming toxic, the creeping bureaucracy, and the aging developer community, they're doing a good job of reproducing the descent of 1980's unix into entrenched grognards circling the wagons around their historical niches. Oh well, at least another decade before it becomes a pressing problem, which means everybody can ignore it until it becomes a sudden crisis nobody could have predicted. Nobody will want to hear about it until then.


May 29, 2015

Last weekend I set up a skype call so Rich and Jeff could talk directly to each other about adding sh2 support to musl. (I am a terrible intermediary for this sort of thing.)

Monday I mailed my little red board to Rich, who believes he can get musl working with FDPIC. There are more little red boards lying around, but the green extender thing on top was something they botched up in canada that (among other things) adds an sdcard, which it loads vmlinux from. Jeff was sure there was another of those lying around the Japan office... until he tried to find one. There may be a way to load a kernel through the jtag, but it would probably take about 10 minutes because jtag.

I've still got the big blue lx45 board, but I can't exactly document how anyone outside the company can come up to speed on that, since we had it manufactured special.

Today the board arrived and I walked Rich through setup over IRC. (Yes, the awkward green extender thing plugs in backwards, they wired it up wrong. Sent him a new vmlinux image and source to build his own, plus a cpio snapshot of the root filesystem I've been testing with, which is crappy but gives a shell prompt.)


May 28, 2015

I've wanted a directory of standard test files for the toybox test suite for a longish time. I've sort of been deferring it until I get "tar" promoted, because the logical thing to do is have it be a tarball, but modifying the tarball contents becomes awkward. Would it be better to just have a directory of files in source control?

One of the test filenames is a bunch of utf8 characters (a japanese poem, apparently), and I actually _don't_ want to test that every development filesystem toybox is ever checked out on handles utf8 characters properly.

I sort of want to hijack aboriginal to run toybox test cases in a known environment, specifically to handle the root access tests and the ifconfig tests, things that expect there to be a device called "eth0" or a /dev/loop1 that I know is _not_ in use by anything else. I can create a build control image to boot, run the toybox tests, and exit. This was one of the design goals of aboriginal.

But I haven't got git on aboriginal and am not building it as a prerequisite to running tests. I'd rather use a tarball I can extract a fresh copy of for each test, but that makes updating the contents in source control tricky.

There's some design work to do here. Figure out what to trade off...


May 27, 2015

I'm told that disaster recovery people were contacted, and Fanpocalypse II is go. Sigh.

It looks like the thing we need to do is get landscapers to dig a foot deep 6 inch wide trench along each side of the house (or at least perpendicular to the road) and fill it with gravel, all the way down to the road. Hard to do from here and Fade and Fuzzy have their hands full. (I'm glad Fuzzy _didn't_ come with me this time, Fade really needs the help...)

Meanwhile, I have a low level but persistent cold that's making it hard to sleep more than a few hours a night. Kinda minor in comparison, but I've gotta get stuff done for the talk on the 4th, and there's so much to do...


May 26, 2015

The house back home flooded today. Or possibly yesterday. (It's the 26th in japan, the 25th in Austin. International Dateline.)

Not the quarter inch of last time but more like an inch this time. There's basically nothing I can do to help from here. Kinda distracted from working right now.

At least it's all porcelain tile now, no carpet. But the walls are still drywall that wicks water right up and holds it until giant blowers spend two weeks feeding into refrigerator-sized dehumidifiers.

I need to find landscaping people who can improve the drainage. The house is half a foot above street level, and the street slopes downhill all the way to the river (another hundred or so foot drop over the course of 3 miles). There's no _reason_ for the house to flood except when we get such an intense storm that the water can't drain away fast enough. This used to never happen (google says our house was built in 1950, and it hadn't ever flooded according to the disclosures when we bought it), and now it's happened two summers in a row.

The floodwaters have recharged Lake Travis from 38% to over 65%, and still rising. Did I mention that the super intense thunderstorms come in the middle of massive droughts?

Climate change is _annoying_.


May 25, 2015

Apparently you can finally boot uboot on qemu-i386. Yay. That only took what, eight years? If it had happened a little earlier it might have displaced grub as the standard PC bootloader back when the PC still mattered. (Android's not going to use something licensed under a GPL.)

Hilariously, my bluetooth headphones (same ones!) associate just fine with the mac, and remain associated with the android phone, but still won't associate with Ubuntu.

The two stacks they will associate with are Apache and BSD licensed, the one that doesn't work is GPL. But of course copyleft produces better software, they say...

There was an earthquake today. The room wobbled for like 15-20 seconds. I thought the guy behind me was wiggling his foot WAY too much then went no, that's gotta be a _really_ big truck going past, but it just kept going. And then tailed off, with no actual damage to anything. *shrug*

I really need to migrate the FPGA board kernel patch series to device tree. In theory between the device tree docs and the mips malta device tree conversion series I have the info to do it. Just need to claw that far down in my todo list.


May 24, 2015

Setting up the new fire breathing mac, which has a terabyte SSD and 16 gigs of ram... but only 2 processors. (Apple doesn't seem to offer quad processor laptops. A couple years ago I got an Acer Aspire One that at least hyper-threaded to 4 processors, but Steve Jobs is dead.)

I generally install QEMU from source, and even though I rsynced over the /home directory (and thus the source) it really did not want to install. The whole "git sacrifice-kittens pixman" submodule thing insisted that I install libtool, which ain't happening. Libtool is of zero use on Linux. (It makes non-ELF systems act like ELF; Linux switched from a.out to ELF almost 20 years ago, but the FSF is still pushing it because they want you to depend on as much of their stuff as possible, and they don't REMEMBER why it was there in the first place.)

Unfortunately, the pixman autoconf crap is extensively infected with libtool, and I eventually remembered that the trick is to install the pixman-dev package on the host and NOT use the git submodule. THEN you don't need libtool installed.

Happily, this means I don't need autoconf installed either.


May 19, 2015

The magic to write a character at the bottom right corner of the screen (without scrolling the screen) is:

echo -e '\033[25;79HX\033[25;79H\033[1@\033[0;0H'

I.E. ESC[1@ inserts a blank character at the current cursor location, shifting the rest of the line (including the X we just wrote at column 79) one position to the right, into the corner. Writing to the corner directly would make the terminal wrap and scroll; inserting doesn't. (This is from man 4 console_codes.)
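Same trick as a C function, since I'll want it for the not-curses plumbing eventually; a sketch assuming the caller already knows the screen size:

  #include <stdio.h>

  // Write character c into the bottom right corner without scrolling:
  // write it one column early, move back, and insert a blank (CSI 1 @)
  // to shift it right into the corner.
  void bottom_right(int rows, int cols, char c)
  {
    printf("\033[%d;%dH%c\033[%d;%dH\033[1@", rows, cols-1, c, rows, cols-1);
    fflush(stdout);
  }

On a 25x80 terminal, bottom_right(25, 80, 'X') does the same thing as the echo above.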

Sigh. Wanna go write command history and less and more and vi and so on, but right now I'm up to my neck in sh2 stuff. It's fun, just... more things to do than time/effort to do them.


May 17, 2015

Woke up at 6am due to flash flood warnings on my phone. May be a bit grumpy as a result. (House flooded last year. There is STRESS.)

My three previous international round trips were all on Delta, which is nothing to write home about but got me there. This time, United was cheaper. Now I know why they're cheaper.

Upon arrival in Japan, Jeff confirmed that the Deep Vein Thrombosis that killed two salesbeings at a previous company he worked at (during the dot-com boom) wasn't due to pressure changes, but due to being confined to an uncomfortable chair for 13 hours straight. (Last time he told me to take aspirin before the flight. So that's _two_ painkillers I'm supposed to take for non-pain reasons. Yes, my pinched nerve is still bothering me.)

So yeah, happy to be back in Japan. Part of the happy is not being in anything owned by United. The new hotel has an Emergency Evacuation Fish right outside. Still Japan out there, then. [update: This.]

Internet doesn't seem to work at the new hotel. There's an ethernet cable, but its transceiver toggles every 15 seconds and renegotiates dhcp. It has what looks like at least a hundred feet of coiled cable behind the desk, which is bolted to the wall so I can't easily remove it.


May 15, 2015

New mac arrived! Yay!

I get on a plane back to Japan in 2 days. Not sure I can do any serious setup before then, but hey: I can take it with me.


May 14, 2015

Work finally ordered the macbook yesterday. Got an email that it should arrive on the 18th. My plane back to japan leaves on the 17th (and the flight back is the 7th). This should end well.

Circling back around to ps now that the dirtree thing's fixed, and the behavior remains _odd_:

$ ps -o c time
ERROR: TTY could not be found.
[dump of help text]

What? (That's not mine, that's ubuntu's.)

I'm also sooo behind on the Tizen (and android!) guys' LSM changes, mostly to ls but there's queued up patches to a half-dozen other commands in the smack tree.


May 13, 2015

Left my phone at home and my bluetooth headphones won't associate with my netbook. (Android ditched the GPL Linux bluetooth implementation and wrote a new apache licensed bluetooth stack. Android's new one works just fine with my headphones. The ubuntu one thinks it associates, but no sound happens. Hilariously, the FSF zealots were insisting that the opposite would be the case, because a billion users sending you bug reports is totally outstripped by a "linux on the desktop" userbase so thin it's quite likely nobody else has ever _tried_ to use desktop linux with this brand of bluetooth headphones. Sigh.)

Anyway, it's annoying me because I can't drown out Faux News playing in this food court, where the "story" is apparently some woman insisting she had Obama's love child, but it was deported, and now Kim Khardassian is suing obama on her behalf or something? It's not very coherent, but it's very angry, and very well funded by billionaires mad that the continued existence of the federal government prevents our country from collapsing into feudalism, denying them the opportunity to be kings of tiny little despotic banana republics with slaves and harems and sedan chairs and so on. You just can't get good help these days if you aren't allowed to brand them with hot irons and use leg shackles, let alone if you have to pay them minimum wage with OSHA regulations. (You know, the french revolution seems to have turned out all right 200 years later. If you declare billionaires a game species, presumably the nice ones can give away money down to $999 million... Oh well.)

Now they're talking about Lindsay Lohan's community service. This is the "News" part, I take it? Sigh.


May 12, 2015

The ELC videos finally posted. My two are in there if you scroll down far enough.

Today was the deadline for submitting a talk to Linux Plumber's conference, and since the device tree people emailed me a while back to ask if I was coming (and if so whether my work might sponsor the trip, because the Linux Foundation probably wouldn't), I submitted an updated version of the Jamboree talk (4pm in that page should have a link to youtube, and I cannot overstate how jetlagged I was during that, and yes Kawasaki-san's section was in Japanese, but Jeff's third is nice).

It's possible we'll still get a slot to give it at Linuxcon Japan (we're waitlisted), but either way: different audience, and I should be able to do it myself by then...

Meanwhile, I had to fix dirtree because "ps" was spitting out warnings about /proc files going away while it was trying to stat them. For ps, this isn't an error: the /proc directory is dynamic and if you're running a compile or a shell script lots of short lived processes spawn and go away really fast. The correct thing for ps to do is ignore them, but dirtree doesn't know when not to complain about being told to do something it can't do.

So I taught it, which involved a new DIRTREE_SHUTUP flag, and the _problem_ is the infrastructure previously only had to deal with one input flag (not a return value from the callback function, but something passed _in_ to node initialization). That flag was DIRTREE_SYMFOLLOW, and the functions that took a "symfollow" argument treated it as a true/false value, either zero or nonzero. This let callers go "toys.optflags&(FLAG_L|FLAG_H)" and pass in several different values, which told it to do the thing or not do the thing.

Now I need to specify _two_ possible things, which means I need to feed in the right bit value, which means the callers change to stuff like DIRTREE_SYMFOLLOW*!!(toys.optflags&(FLAG_L|FLAG_H)), and that... is an awkward thing to change a lot of callers to say. So I added a wrapper, changing the old dirtree_add_node() function to take flags instead of symfollow, but adding dirtree_start() to do the SYMFOLLOW*!!(blah) trick itself. (It's the same as blah ? SYMFOLLOW : 0 except there's no test adding a bubble in the pipeline for speculative execution and branch prediction, and we don't need to store a constant for the alternate value. Honestly the compiler's optimizer can probably turn each into the other, but on pipelined architectures avoiding a test and branch is generally polite, so that's what that trick does. Multiplication is one clock cycle these days, it's divide that's still painful.)
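If that's opaque, here's the trick in isolation; a standalone sketch (the flag value is illustrative, not toybox's actual constant):

  #include <stdio.h>

  #define DIRTREE_SYMFOLLOW (1<<0)  // illustrative value

  // !! collapses any nonzero value to exactly 1, and multiplying by
  // the flag constant turns that 1 into the right bit, with no
  // test-and-branch: a branchless "x ? DIRTREE_SYMFOLLOW : 0".
  unsigned symfollow_bit(unsigned optflags, unsigned mask)
  {
    return DIRTREE_SYMFOLLOW*!!(optflags & mask);
  }

  int main(void)
  {
    printf("%u %u\n", symfollow_bit(5, 4), symfollow_bit(5, 2)); // 1 0
    return 0;
  }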

Where was I? Texas. Right.

Anyway, I'd already found all the dirtree_add_node() calls in toys/*/*.c and changed the first few, and it turns out all the ones outside of dirtree.c passed in 0 as the first argument (no existing parent pointer), making a dirtree_start() wrapper with one less argument _and_ treating the symlink argument as true/false instead of requiring the bit value was a cleanup.

And after all that, the actual change to ps to _use_ it was to the return value from the callback, because the top level call should complain if "/proc" isn't there. But it needed the plumbing cleanups for that to be passed down where it was needed, so...


May 11, 2015

Thunderstorm knocked the power out this morning. Managed to leave one light in the bathroom flickering on and off. How? Your guess is as good as mine...

So the reset command is basically write(1, "\033c", 2); and the scripts/single.sh build of that should be a reasonable test of how much of the toybox infrastructure is dropping out:

$ ls -l reset
-rwxrwxr-x 1 landley landley 14536 May 11 14:51 reset

That's wince-worthy, but nm --size-sort toybox_unstripped shows that most of it is probably because the --help infrastructure is enabled, which is sucking in error_exit() and xprintf(). If I rebuild it without --help I can presumably get that down, and we _are_ talking a 64 bit build (bloat city at the best of times). So I should do some work here, but it's more annoying than alarming at the moment.

One interesting note is that "this" is 0x2028 bytes of BSS (which matters on nommu systems, eating 8k of data per process for no reason). The reason is that generated/globals.h doesn't have USE() macros around the union members, so the size is _always_ the largest possible size, regardless of configuration. That's fixable, but I need to teach the build infrastructure to add that. (Also, one of the pending commands is eating 8k of globals... It's ip.c which has char gbuf[8192]. Sigh. modprobe.c having an array of 256 pointers isn't a huge improvement. There's a reason for the pending directory...)
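Here's a toy model of the BSS problem (names made up, not the actual generated/globals.h):

  #include <stdio.h>

  // Every command's GLOBALS() block lands in one union so they share
  // storage. Without USE() macros compiling unconfigured members out,
  // the union is always sized for its largest member:
  union global_model {
    struct { char gbuf[8192]; } ip;  // drags the union up to 8k+
    struct { int fd; } reset;        // all a small command needs
  };

  int main(void)
  {
    printf("%zu\n", sizeof(union global_model));  // 8192 either way
    return 0;
  }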

But the _frightening_ bit is that glibc seems to have managed to get EVEN WORSE about static linking. Compare!

$ ls -l reset.static.glibc
-rwxrwxr-x 1 landley landley 794336 May 11 14:46 reset
$ ls -l reset.static.musl
-rwxrwxr-x 1 landley landley 30368 May 11 14:47 reset

Yes, that's 30k for musl and almost 800k for glibc. For a 15k dynamic binary.


May 10, 2015

Adrienne was visiting for a couple days (crashing on our air mattress in the office, just drove back home) and she said she gets carpal tunnel and treats it with ibuprofen, due to the anti-swelling properties. Trying that, no idea if it's helping. (It's not that it comes and goes, the edge of my left hand is always noticeably screwed up when I poke at it, and that finger hits a vowel, a shift, and a control key, and has to NOT hit the epically useless "caps lock" key they still put on modern keyboards due to hysterical raisins.)

But it doesn't seem to be making things _worse_. I'm really hoping it fixes itself, it's been screwed up for a week now. I have to get on a plane sunday; it's one of those "if I don't get medical whatsis for this soon, there's a 2 week block where I'm out of the country" things. As you do, apparently.

I _think_ it's my shoulder, or at least that's the thing that screws it up when I try to sleep in any of my usual positions. Might have something to do with the elbow. Probably not wrist. Maybe neck? (My neck's been so screwed up for years it's buried the needle diagnostically speaking...)

Sigh. The thing about getting older is all the little stuff you recover from in 5 minutes as a teenager _lingers_, and eventually they start to accumulate and overlap. Last year's heatstroke eventually fixed itself, it just took about 8 months. My "allergic to mosquito bites and they're accumulating to the point I'm getting a systemic histamine reaction bordering on a skin disease" responded to cortisone pills to knock it back down to the point where my immune system trailed off into surly grumbling and lost interest. The lingering cough that may have wound up mildly dislocating a _rib_ it was so bad went away after two months (and the swollen lymph node that lingered afterwards for almost a year but finally went away). The scar on my foot from 2012 is about half as bad as it used to be. (Given that a similar scar on my left arm took like 7 years to go away in my _20's_, that one could take a bit.)

So I'm staring pointedly at the pinched nerve thing and going "is this gonna be a one week thing, a 6 month thing, a permanent disfigurement...? Let's SPIN THE WHEEL OF HEALTH!"

It would be nice if there were competent medical professionals I could go to, but obamacare didn't fix the fundamental problem with the US health care system, I.E. the fact the medical cartel/guild (ahem, "American Medical Association") decided back in the 1970's to limit the number of doctors allowed to enter medical school (via insane entry requirements masking quotas) so the artificially restricted supply of doctors would dramatically raise all their salaries. Unfortunately this has combined _badly_ with the aging baby boom, and the cartel has fought hard against attempts to supplement the artificial doctor shortage with "nurse practitioners" or visas to let doctors immigrate from india and such. The result is doctors tend to be swamped. You're the 30th person they've seen today and their goal is to make you go away, they don't really have time to form any sort of relationship with a patient who isn't actually hospitalized or offering enough money to carve out a disproportionate chunk of their time.

This sucks rather badly, but short of abolishing the american medical association (which would be the first to point out that the return of homeopathic snake oil salesmen with traveling medicine shows would not actually be an improvement), nobody seems to have any idea how to fix it. It's the whole "regulatory capture" thing, even Obama didn't have the political capital to stand up to the drug companies _and_ the medical licensing cartel.


May 6, 2015

Apparently between https as a ranking signal and downvoting any site that doesn't explicitly pander to iphone bugs, Google is about to remove landley.net from its search entirely.

Oh well. I'm sorry that Google is self-destructing, but I waited out AOL, I'm waiting out Facebook, I can wait them out if they're going to be that stupid. I almost always "request desktop site" on my phone because places like wikipedia give such broken mobile sites it's not funny.


May 4, 2015

Went to see the chiropractor about the annoying pins and needles in my left pinky (and half the finger next to it) that started Saturday night and is still there two days later. Most likely a pinched nerve, and that's what those guys _do_, so...

He didn't fix it. Suggested a long course of treatment at $90/session. Since I left Pace's corporate insurance and switched to obamacare, he's out of network now. He's also way the heck away from home now that I don't commute to The Arboretum every day, there's a chiropractor in the Hancock center next to the T-mobile place if I just want some guy to fiddle with my neck. (They don't have my X-rays on file, but presumably if they had to take them again they wouldn't use equipment out of a 1970's dentist's office still using physical film and big lead aprons.)

The other problem is the treatment is the _exact_same_ set of exercises I did for months last year. (So this whole "adjust my spine so it's not pinching a nerve" thing doesn't work when I'm coming IN for a specific pinched nerve?) So I thanked him, paid for the session, and went home.

Going home was by way of the apple store in The Domain: work's been threatening to buy me a mac for ages and today they poked me to figure out what I want. Looks like a 13 inch macbook pro is the smallest one you can get 16 gigs of ram into, for reasons that remain unclear to me given the memory itself is less than two inches by one inch. Definitely _not_ the "only one USB port to the outside world and it powers the machine, try not to break it!" new macbooks. (Somebody wants to be Steve Jobs a little too badly, I think.)

Sigh. I had an anatomy and physiology class that covered this, what is it, ulnar nerve? Could be my perpetually screwed up neck but that's not _new_. I was leaning on the armrest of Fade's chair (playing Sims 3 on her mac for many, many hours) so I could have screwed up my elbow, or shoulder, or wrist. Could be anything really. Here's hoping it goes away on its own.


April 27, 2015

Promoted and checked in hexedit. The advantage of this for the rest of toybox is it's the start of the long awaited not-curses infrastructure. (blessings.c? foiledagain.c? interestingtimes.c? No idea what to call it yet.) It's basically plumbing I need for command line editing in toysh and to implement vi and less and so on.

I've ranted before about how terminal control is obsolete, but let me try to summarize why I'm taking this approach.

Back in the 1970's every different brand of teletype machine (the standard I/O device of the time: a combination keyboard, serial port, and daisy wheel printer writing in ink on paper) spoke a slightly different protocol across the serial port, so the unix "tty" system grew a bunch of status bits (look in the tcgetattr man page for all those ISTRIP and INLCR constants) to humor the different variants.

Then in the late 70's we got "glass tty" dumb terminals (a keyboard and serial port hooked up to a television instead of a printer, saved on paper) that let you move _around_ the screen and change color and such, but all the ASCII values were taken so everybody used multibyte escape sequences to represent new things like "cursor up", and again each vendor used different incompatible escape sequences. So another driver layer showed up to interpret this mess, using the "$TERM" environment variable to specify _which_ set of escape sequences your glass tty understood.

And all this became COMPLETELY useless when minicomputers gave way to microcomputers, so by around 1982 the keyboard and display were built IN to the computer (or at least connected directly), which had complete control over them (the video buffer was memory mapped instead of only accessible through a serial port, you could draw pictures if you wanted to), so now you were talking to a terminal program running on the same machine, which was _emulating_ a terminal device to work with the existing software.

This is how we reached the point where two pieces of software are talking to each other using a dozen different protocol variants (different escape sequences specified by $TERM) even though it DOESN'T MATTER which one they use as long as they agree. Dumb terminals went away before Linux got started in 1991, so all we _ever_ had to do is pick a common subset of these sequences, hardwire in support for that, and bury this termcap/termios/curses nonsense in a sea trench alongside EBCDIC.

It turns out there's even a standard: the American National Standards Institute documented a common subset of escape sequences over 30 years ago, and DOS implemented these "ANSI escape sequences" back in the 80's. They're loosely based on the DEC VT100 escapes, which works out especially well for Linux because Digital Equipment Corporation was not just the biggest minicomputer vendor but also made the hardware Unix was developed and deployed on (prototyped on a DEC PDP-7, developed on the PDP-11, and then BSD unix was mass-deployed in 1980 as the IMP replacement across the arpanet on DEC VAX hardware, which is how Unix became the standard operating system of the internet).

So the standard DOS adopted back in the 80's works fine for Linux, and all the common $TERM types ("linux", "xterm", "vt100", "ansi") should support this fine precisely because it _is_ the common subset. Even the kernel's ctrl-alt-F1 VGA terminal driver supports it. Linux even has a man page on commonly accepted escape codes, it's "man 4 console_codes" describing what the kernel's VGA terminal driver (and thus presumably TERM=linux) implement.

This is why curses needs to _die_, it's a giant pile of complexity serving no modern purpose, dragged along because we've always done it that way and the people who understand why it was that way wandered off and the new guys blindly repeat the patterns they inherited. And thus "let's just do the simple thing" is met with scorn because it MUST somehow be dangerous or we'd already all be doing it that way. That's why it takes _effort_ to make this crap go away. Sometimes via research and sometimes by taking a risk and rediscovering why not doing it was a bad idea (and then either fixing it or documenting it, but generally NOT reproducing exactly the pile of crappy workarounds accumulated in the dark).

The alternative to shoveling out this mess is drowning in superstition. (The /bin vs /usr/bin split was another one of those. There's a reason computer history is a hobby of mine, I want to know _why_ we do things.) And this is why systemd scares me. A sealed black box of ever-increasing complexity with no clear explanation even of what problems it's trying to solve, just "trust me, we'll do it for you forevermore"? That is THE WRONG APPROACH, even without bringing actively dishonest agents (NSA voyeurism, russian kleptocracy, china's great firewall, Red Hat cornering the enterprise market and forcing its technological decisions upon standards bodies ala RPM as the only packaging standard in LSB, Wintel deciding that ARM must have ACPI instead of device tree because reasons) into it.

So the hex editor gives me an excuse to write the escape sequence parsing code that reads cursor up/down/left/right, page up, page down, home, and end. (And presumably more keys if they become interesting.) This involves putting the terminal into raw mode, and writing the signal handler plumbing to restore it atexit. (Although if you ctrl-c or ctrl-z in raw mode it doesn't produce a signal, so I have to do that myself anyway. Speaking of which, the "redefine the break key to something other than ctrl-c" functionality of stty? Screw it, that's part of the historical baggage from the teletype days, it DOES NOT MATTER anymore. I can implement it in tty, _and_ I can have hexedit respond specifically to ctrl-c, hardwired.)
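
A minimal sketch of both halves (the function names are mine, not what wound up in toybox):

#include <stdio.h>
#include <stdlib.h>
#include <termios.h>
#include <unistd.h>

static struct termios saved;

static void restore_tty(void)
{
  tcsetattr(0, TCSANOW, &saved);
}

static void raw_mode(void)
{
  struct termios raw;

  tcgetattr(0, &saved);
  atexit(restore_tty);   // signal handlers also need to call exit()
  raw = saved;
  cfmakeraw(&raw);       // no echo, no line buffering, no ISIG signals
  tcsetattr(0, TCSANOW, &raw);
}

// Read one keystroke, decoding the common CSI input sequences:
// ESC [ A/B/C/D are up/down/left/right, ESC [ 1~ through 6~ cover
// home/insert/delete/end/pgup/pgdn. (A real version would buffer the
// lookahead instead of dropping bytes on unrecognized sequences.)
static int scan_key(void)
{
  char c;

  if (read(0, &c, 1) != 1) return -1;
  if (c != 27) return c;                        // plain key (ctrl-c is 3)
  if (read(0, &c, 1) != 1 || c != '[') return 27;
  if (read(0, &c, 1) != 1) return 27;
  if (c >= 'A' && c <= 'D') return 256+c-'A';   // cursor keys
  if (c >= '1' && c <= '6') {
    char e;

    if (read(0, &e, 1) == 1 && e == '~') return 512+c-'0';
  }
  return 27;
}

Note that with ISIG off, ctrl-c arrives as plain byte 3, which is where the hardwired hexedit response comes in.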

It also lets me write a "put the cursor at this X/Y location" function, dig up the old "scroll the entire screen up one line, scroll the entire screen down one line" sequences, and figure out how I want to write a character in the bottom right corner of the screen (the scroll up/down stuff above could easily do it, scroll up, write the new bottom line, scroll down, rewrite the top line... but that could cause screen jitter. I really want to write the whole line and then scroll just that line one to the right, basically "insert" without redrawing the line. There's probably a sequence for that...)
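
And the output side really is just printf. A sketch using sequences out of console_codes; there does turn out to be an "insert blanks, shift the line right" sequence, CSI @ (ICH):

#include <stdio.h>

// Move the cursor to a 1-based row/column: CSI row;col H.
void cursor_to(int y, int x)
{
  printf("\033[%d;%dH", y, x);
}

// ESC D (index) and ESC M (reverse index) scroll the screen when the
// cursor is sitting on the bottom/top margin.
void scroll_up(void)   { printf("\033D"); }
void scroll_down(void) { printf("\033M"); }

// CSI n @ (ICH): insert n blank characters at the cursor, shifting the
// rest of the line right without redrawing it.
void insert_blanks(int n)
{
  printf("\033[%d@", n);
}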

And then once I start on a second user (probably cleaning up more.c) I can factor this stuff out into lib/interestingtimes.c.


April 24, 2015

I was way too fried on the plane flight back from Japan to work on anything complicated, so I started adding a hex editor to toybox. The first big program I wrote on the commodore 64 circa 1983 was a hex editor. (Which I then used on the main directory of the disk that contained itself as its first test, and it wrote the sector back rotated around by one byte. Important early learning experience. Gimme a break, I was eleven.)

Alas, unlike the commodore 64 we haven't got unambiguous representations of all 256 bytes ala "what they look like if you poke them into screen memory". The 16 bit PC back under DOS did, I know this because I was writing stuff to screen memory back in my chamelyn bbs, and yes it was spelled like that; I wrote a series of like 5 bbs programs in the late 80's and early 90's and that was the one where I reinvented the bytecode interpreter without knowing there was a name for it. But the internationalization people objected to the 128 bytes ascii _didn't_ standardize being used for graphics characters, and of course none of them could agree on what _should_ go there, so we got codepages. Eventually Ken Thompson sorted it all out with unicode, but that doesn't help print a character representing each byte's full range.

What the C64 did was values 0 through 31 were reverse video versions of characters 32 through 63, and I'm totally stealing that and using it here. But characters 128-255 had graphics on the C64, and here they don't. What I did was change the color (actually switch to the "dark" version of the default color, intensity off) so there's a grey version of 0-127 mapped to 128-255. Not perfect, but eh...
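
A sketch of that mapping, using ESC [ 7 m for reverse video and ESC [ 2 m for the dim rendition (the function name is made up):

#include <stdio.h>

// Render one byte in a single screen cell: 128-255 get the dim version
// of 0-127, and control characters 0-31 get reverse video copies of
// 32-63. (127 is still DEL, which needs its own special case.)
void draw_byte(unsigned char c)
{
  if (c > 127) {
    printf("\033[2m");            // intensity off for the high half
    c -= 128;
  }
  if (c < 32) printf("\033[7m%c", c+32);
  else putchar(c);
  printf("\033[0m");              // reset attributes
}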

There are still some more things to do. Right now it works on an mmap(), which means you need to feed it a -r (read only) flag to edit some stuff, and it simply can't take input from a pipe. That's probably ok given what it _is_ (where would you save the result, and if you can't save, why edit?) but another thing is it can't insert. You can't change the size of the file you're editing; I might want to implement that. (We have an insert key...)

Another thing I should implement is an undo buffer. Just use toybuf as a ring buffer of edits, and roll them back one at a time when you hit the "u" key until you run out. Doesn't have to go back to the beginning, just let you undo typos. (The undo buffer is especially important because it _is_ working on an mmap, meaning all changes happen immediately. There's no "save" operation, just "exit". Yes, this means on 32 bit systems you can't edit a file larger than a gigabyte and change because you'll run out of virtual address space. But since even phones are going 64 bit, I might be ok with that. Then again, $DAYJOB's sh2/sh4 chip isn't likely to go 64 bit any time soon. I suppose I could fix it so we redo our mmap() window as you traverse the file... Six of one...)
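
Something like this, although the real thing would live in toybuf and these names are hypothetical:

// One remembered edit: where it happened and what the old byte was.
struct undo { long long pos; unsigned char old; };

#define UNDOS 512
static struct undo ring[UNDOS];
static int head, count;    // next slot to overwrite, how many are valid

static void record_edit(long long pos, unsigned char old)
{
  ring[head].pos = pos;
  ring[head].old = old;
  head = (head+1)%UNDOS;
  if (count < UNDOS) count++;    // oldest edits silently fall off
}

// Returns 0 once we've run out of remembered edits.
static int undo_edit(unsigned char *map)
{
  if (!count) return 0;
  head = (head+UNDOS-1)%UNDOS;
  map[ring[head].pos] = ring[head].old;   // mmap means this writes through
  count--;
  return 1;
}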


April 23, 2015

Back at $CUSTOMER site since I'm the local. (Ok, maybe I flew back from the other side of the planet for this meeting, but I do live here. Well, about 20 miles from here, but the point stands.) Ken is still here (he's a manager, off with $CUSTOMER managers all yesterday and doing that again today, but I got a ride with him). Geoff and Martin are at the airport flying back to canada, so I actually got to do a software thing today (cross compile an updated library version). As Flynn said in (the first) Tron, "hooray for our side". The $CUSTOMER boards continue to be problematic, but they now mostly understand why, and could fix it if the engineers in question were home (in Minnesota, they also flew here; lotsa pieces being integrated).


April 22, 2015

At $CUSTOMER site for $DAYJOB. Flew back for our Big Integration Meeting Week and then was too fried yesterday and just went to bed.

Now I'm mostly helping $CUSTOMER debug their own hardware. They may get to the point where they can test our stuff this week. That would be nice. (Ah, prototype integration bringup. If we knew what we were doing, we'd be done.)

Two other $DAYJOB coworkers I've never actually met in person are here, Geoff (not to be confused with Jeff) and Martin. Both nice. Both normally in Canada, I believe. They're actually doing the bulk of our side of the work, I'm mostly helping the $CUSTOMER engineers work out what's wrong with Linux bringup on their boards. (Three different prototype boards. Three different hardware behaviors. It's that point in the project. Luckily we can test different bits on each one and show that our respective bits work when the wires go through. It'd still be nice to see all of it work together, but you can't have everything. Where would you put it?)


April 21, 2015

Japan! Yay japan.

Sleep. Yay sleep.

I did so much stuff, and posted pictures on twitter. Many, many, many, many, many pictures. Often of food. And the occasional river, or supertoilet.


April 14, 2015

Spent the day personing a booth at Cool Chips, and managed to fish a college professor and several graduate students out of the crowd to give them a presentation. (Using a subset of the slides from Friday's thing.)

All this is sort of practice for a theoretical talk at Linuxcon Japan in 6 weeks. (Note that Cool Chips is _not_ run by the Linux Foundation, and thus has a year in the URL rather than a history that will vanish without trace once it stops being an effective fundraising tool to milk cash out of clueless Fortune 500 companies that want a company to represent Linux the way AOL represented The Internet to grandma. Any time a large enough accumulation of money says "Get me the blogosphere on the phone, NOW!", somebody will pick up the phone with one hand and the money with the other.)

Anyway, said Linuxcon talk remains purely theoretical because their contact address is a role not a person (well of course, it's a bureaucracy: individuals are single points of failure, you can't have any of those working for you), and thus no actual specific person has answered it since Saturday. We didn't know giving a talk about this stuff at a conference was even an option until we did it on Friday and went "we should do this again, bigger and better", at which point the CFP deadline had already passed (results hadn't been posted yet but the website wouldn't let us add more), so we're attempting to "fly standby", as it were.

For a value of "attempting" that involves being completely ignored, but we didn't sponsor the conference, so... *shrug* Oh well, wouldn't be the first "hall party" I've thrown at a con.

Meanwhile, the SMP circuitry should be ready enough to at least talk about it tomorrow. Or possibly the DMA stuff. The hardware engineers have been a variant of "ready" that collapses when you examine its quantum state, but I suppose that's what I'm here for...


April 13, 2015

The future starts in Japan! By which I mean given where the international dateline is and the 11 hour time difference (or is it 10? They don't do daylight savings time here, it varies), anyway my laptop clock says it's 11pm on the 12th, but here we just came back from lunch on monday.

Either way, the 4.0 kernel dropped last night which means I need to finally fix the #*%(&# problem with 3.19 that's blocked aboriginal on the 3.18 kernel. And so we dig.

The problem is that Aboriginal's sources/patches/linux-arm.patch, which forces the kernel's kconfig to allow the "arm versatile" board to select what kind of CPU it has (QEMU can emulate v4, v5, v6, v7 with various -cpu options, but the kernel has assumptions about what's in there), conflicts in 3.19. Bisecting, it was broken by commit dc680b989d51 "ARM: fix multiplatform allmodcompile", which was itself patching the earlier commit 68f3b875f784 "ARM: integrator: make the Integrator multiplatform".

This sounds great: do they now allow you to run the same board on multiple processor variants? Hmmm, looks like they do actually. So I might be able to just yank that patch. And the outoutdamnperl patch is obsoleted by commit d69911a68c86 so that can go too...

And then the FUN part is figuring out what the heck happened to initramfs. The config didn't change but the kernel is panicking unable to mount root=, which it shouldn't be trying to do because there's an initramfs. What?

Bisect, bisect, bisect... Oh thank you so much Andi Kleen for commit ec72c666fb34, so we now have two symbols (CONFIG_INITRAMFS_COMPRESSION_GZIP and now CONFIG_RD_GZIP) meaning the EXACT SAME THING and you have to specify BOTH of them to be able to compress your initramfs with gzip. (Note! We already compress the KERNEL with gzip using KERNEL_GZIP which sets HAVE_KERNEL_GZIP (no idea why), which means we have FOUR symbols that mean the exact same thing.)

The Aristocrats! Linux!

Anyway, 3.19 seems to work now. Doing test builds on the various targets, and then I can cut a release and be once again only one version behind.

Might cut a toybox release just because I can. But I'd like to finish ps first, and expr is fairly low hanging fruit that's the main remaining item in the aboriginal linux command usage table:

   7 busybox gzip
   8 busybox dd
  17 busybox tar
 190 busybox diff
 275 busybox sh
 417 busybox tr
 457 busybox awk
3414 busybox expr

That's it for the i686 build. The LFS build plus command prompt niceties add ash bunzip2 fdisk ftpd ftpget ftpput gunzip less man pgrep ping pkill ps route sha512sum test unxz vi wget xzcat zcat and while I'm at it android is grabbing expr hwclock more netstat pgrep pkill route tar tr out of "pending" (get the value of $ALL_TOOLS out of their Android.mk and then for i in $THAT; do grep -q "TOY($i," toys/pending/*.c && echo $i; done), and a contributor to my patreon wants mdev prioritized, and $DAYJOB could really really use toysh.

So.

I should do that.


April 12, 2015

Video of the Jamboree presentation isn't up yet, but given that ELC video from last month isn't up yet either, I can't be too hard on them. (Also, we and Tim Bird skyping in from the states were the only english presenters, so Kawasaki-san did his third of the talk in Japanese. Everybody else's slides were in english, but not their talks, so pointing a US crowd at the video is... thingy.)

Bought perfume for Fuzzy, and a can of coffee in a can for Fade. Other than that spent about half the weekend holed up in my hotel room programming (caught up with the Android git repository's accumulated commits that weren't to their Android.mk file or their checked-in copy of .config or the generated/ directory, plus their mount ioctl() thing so they can eventually switch that over), and the rest wandering around tokyo with Jeff. Fun town. I _really_ need to learn Japanese.

I also submitted a "can I fly standby" talk proposal request to the Linuxcon Japan guys. We had no idea I was coming here before the call for papers thing closed (the original reason was we're finishing up SMP support for the new chip design and they wanted me here with the hardware guys to do Linux bringup for that, it's been UP up until now). The talk we gave at the Jamboree was actually kind of nice but a bit unpracticed and described websites that aren't live yet, so we'd like to do it again in a more polished and complete form, in front of a larger audience, and all of it in the same language.

Alas the CFP is over so the web form won't let me submit a proposal. (They apparently announce their selections tuesday. Yeah, I know.) Still, I asked and we'll see what they say. (Haven't said anything yet, but it's Sunday, so... if they turn us down maybe we can hijack a BOF or get a table or something...)

I actually learned a lot preparing the slides with the other guys. We've done an SH2-compatible chip (the "J2") because the last SH2 patent expired in October 2014, and the SH4 patents don't expire until 2016. So we can release BSD licensed VHDL (and do our public live development in a github repository) for SH2, and then add SH4 support when those patents expire.

Another reason you want a nommu design is latency: if you're doing signals measurement with nanosecond accuracy you don't want TLB reloads adding random jitter.

Also, we're not just releasing processor VHDL, we're releasing a bunch of components (serial, ethernet, DSP, etc) with a build system that lets you configure and make an entire SOC (selecting the stuff you want in it during config).

Our "sh2 managing a bunch of DSPs" design is approximately what the Cell processor in the PS3 was trying to accomplish ("powerpc managing a bunch of DSPs") and what NeXT boxes before that were doing ("m68k managing a bunch of DSPs"). The problem with PS3 and NeXT is it turns out DSP programming is something not many people know how to do, and each of them were reinventing the wheel each time. What we're trying to do is A) build an pen source community that knows how to do this and can teach even more people to do it, with a reusable library of code under open licenses, B) leverage stuff like opencl that's doing general-purpose GPU programming, since that actually maps right over to the DSP stuff.

This is stuff people working at the company know how to do... but I'm not one of them. We need to put the linux-side programming info for all this on nommu.org and the hardware-side programming for FPGA and OpenCL stuff on Zero P F dot org (because the Orangutan Protection Foundation got .org, Original Print Factory got .jp, and somebody in germany got .net; I think we're claiming the zero stands for "no intellectual property licensing restrictions" or some such, you'd have to ask the marketing guys).

Anyway, really fun stuff. I hope I get to talk about it at Linuxcon.jp.

(I keep typing linucon which died when I moved to Pittsburgh for a year and nobody inherited it. Also the year I chaired Linucon coincided with Penguicon 3 which wobbled badly and I spent the next 2 years focusing on getting Penguicon back up to speed (once again recruiting the guests of honor for 4 and 5, introducing Liquid Nitrogen ice cream in year 4, and so on). I stopped being involved after that because reasons. Been too busy to do another one since. Such is life...)


April 10, 2015

Enjoying tokyo immensely. They have tea the way I used to make it, at least before I switched to splenda instead of sugar. (My tendency to like cold tea with milk in it horrified both sides of the atlantic, but apparently Japan is fine with this.)

Presented at Japan Technical Jamboree #52, in the 4pm slot. They wrote in "Rob Landley and his partners" but the other two were Jeff Dionne the founder of uclinux.org (and founder of the company I work for), and Sumpei Kawasaki the original SuperH processor architect (and guy driving the new J2 processor design). They outrank me, I just got credited because I'm the one who emailed to propose the talk and we only prepared the slides the night before so they didn't have a copy of them yet. (Still don't, I should fix that...)


April 7, 2015

11 hour flight to Tokyo Haneda. Got a bit more of PS written, but netbook battery does not last 11 hours (new one was way closer but doesn't have all the right stuff installed on it yet), and there were no outlets on the plane.

Picked up from airport by Kawasaki-san (the original architect of the SuperH processor, who is working with us on the fresh implementation), and taken to Kanda Grand Central Hotel. That has outlets. Japanese ones.

Ironically, my shiny new netbook and the replacement power supply for the old one require a ground plug, which japan doesn't use. But the OLD power supply (the one with the flaky cord that only passes current in certain positions) works just fine.

(Outlets here are apparently 50hz 110 volts with non-polarized plugs, so some things just work and other things don't fit at all. The really silly part is netbooks MUST work off battery, by definition, so requiring a grounding plug or caring all that much about the plug polarity is kinda strange. And yet both new power supplies do.)

I fall over now.


April 6, 2015

Guess who's getting on a plane tomorrow for a sudden last-minute trip to Japan?

Go on, guess.

But hey, this means I get to present at Japan Technical Jamboree which I've always wanted to, ever since I met Ueda-san at CELF in 2006 (the man who organizes it). It's basically a monthly Tokyo LUG meeting, but this being tokyo they fill a room with people and do a half dozen presentations.

I should learn Japanese.


April 5, 2015

Broke down and switched the toybox repository over to git.

Since android and tizen and openembedded and gentoo and so on have all been using Georgi Chorbadzhiyski's git mirror rather than the mercurial repository, I bit the bullet and switched the project's repo to git. Georgi's mirror is now pulling from that.

Now trying to figure out how to make git do lots of things I've been doing in mercurial for years. I know there's a WAY, I just have to look up each command and keep hitting crap like:

$ git log lib --stat
fatal: bad flag '--stat' used after filename
$ git log --stat lib
[ works fine ]

And there's just no excuse for that.


April 4, 2015

Spot the cheat:

F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY          TIME CMD
0 S  1000   465   464  0  80   0 -  7313 poll_s pts/9    00:00:00 vi

I'm trying to work out appropriate padding for the ps fields, so I thought I'd take a look at what "ps -l" looks like on ubuntu, and what do I find? The ADDR and SZ fields overlap. They didn't implement ADDR (it's - for everybody, even though you could use EIP field of the proc/$$/stat stuff), and they let the SZ field leak over into it, so you have 5, possibly 6 digits worth of 4k pages until you run out of space to display the resident set size.

(Figuring out how much memory a process is "using" is fuzzy anyway when the executable pages and library pages are shared between processes, and even if it isn't doing file backed mmap() a certain amount of dirty page cache may be due to other files it needs... But let's ignore that for now.)

The way you can tell the ADDR SZ combination in ps -l is a @*%^@! _SPECIAL_CASE_ is that in "ps -o addr,f", addr is right aligned, but in ps -l it's "left" aligned. That's just _sad_.

What I'd like to _avoid_ doing is readahead cacheing all the columns to be output, calculating the amount of space they'll eventually use, and then outputting them appropriately padded in a second pass. I'm trying to make this work on low memory (even no memory) systems, which means streaming operation.

Then again, what I did for vmstat was adjustment padding: when a field goes over pad later fields by only one space until we've caught back up. The question then comes up whether you eat leading or just trailing spaces, and that seems to be a question of alignment: right aligned things eat leading spaces (so their right edge still matches up), left aligned things only eat trailing spaces (so their left edge still matches up).

Which fields are right aligned and which are left aligned? Strings are left aligned, numbers are right aligned, and timestamps apparently count as numbers. (You can test this yourself with "ps -o s:3,f" vs "ps -o f:3,f", although how you put that in the test suite I have no idea, because you can't control which PID a launched process gets.) Possibly some sort of backgrounded sleep command, jobs -p, and liberal use of environment variable expansion in the test cases. (Also, I need a file named "abc) def" to test the stat parsing with.)

Creating a symlink to sleep called "1", moving it to /usr/local/bin, and running "1 100 &" then doing "ps -o cmd,f" did _not_ right justify that cmd field, so it's _not_ checking isdigit() on the first character of the field. (-o tty already showed it wasn't checking the _last_ character that way).

Hmmm, "ps -p 2,3,1" does not print output in that order, so it's just a matching filter.

Another problem: truncating fields. The "pad and catch up" thing conflicts with the old ps behavior of truncating fields, which comes up for "cmd" and such in a big way on a regular basis. Hmmm...

Ok, only let a field leak out to the left or right if there's _space_ for it to do so. If a new field needs to start on the left edge or an old field needs to end at the right edge, truncate the adjacent field far enough away to leave one space between them. This means the last field can slightly more naturally expand out to the right edge of the screen (or beyond with l).
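
Roughly, in code (a sketch with hypothetical names, tracking where the cursor actually is versus where each field's slot nominally starts):

#include <stdio.h>
#include <string.h>

// One output column: nominal width, and which edge it aligns to.
struct field { int width; int right; };

static int cursor;   // where the last field actually ended
static int slot;     // where the current field's slot nominally starts

static void put_field(struct field *f, char *data)
{
  int len = strlen(data), start, gap;

  if (f->right) start = slot + f->width - len;   // match up right edges
  else start = slot;                             // match up left edges
  if (start < cursor+1) start = cursor+1;        // catching up: 1 space
  for (gap = start-cursor; gap; gap--) putchar(' ');
  fputs(data, stdout);
  cursor = start+len;
  slot += f->width+1;                            // next column's slot
}

Truncation would then be a check against the next field's slot (or the screen edge) before the fputs().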


April 1, 2015

Poking at ps. Arguing with the "C" field. What does "processor utilization for scheduling" _mean_? It's not one of the /proc/$PID/stat fields. I ran ubuntu's ps under strace and it didn't read anything obvious (or call weird ioctls), it looks like the data comes from stat or status? But where?

The STIME field is easy to fetch (stat field 22 is start_time for the process, in jiffies after system boot), but the spec doesn't say how to represent it. The other ps is doing hour:minute of starting time for the same day (ok, first entry of sysinfo() is uptime in seconds since boot, close enough), but if it's not the same day it prints a three letter month abbreviation followed by two digit day (with no space so it fits into 5 characters). And again, that month is english and I'm trying to avoid gratuitous english. (Yes, the --help text is all english. There are some built-in conflicts in what I'm trying to do here. I'm open to suggestions.)

I dunno what STIME would do for a process more than a year old, haven't got a system booted that long ago lying around. I could fake something up under qemu but that's not the point, the point is the SPEC doesn't SAY what it should be. Grrr.

I guess 04-01 for an April 1 start time? And beyond that uptime in days?
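
A sketch of that, reading field 22 in clock ticks and using sysinfo() to anchor it to wall clock time (the function name and format decisions are just my reading of the above):

#include <stdio.h>
#include <sys/sysinfo.h>
#include <time.h>
#include <unistd.h>

// start_ticks is field 22 of /proc/$PID/stat: process start, measured
// in clock ticks after boot.
void show_stime(long long start_ticks, char *buf, int len)
{
  struct sysinfo si;
  time_t now = time(0), start;
  struct tm then, tnow;

  sysinfo(&si);
  start = now - si.uptime + start_ticks/sysconf(_SC_CLK_TCK);
  localtime_r(&start, &then);
  localtime_r(&now, &tnow);
  if (then.tm_year == tnow.tm_year && then.tm_yday == tnow.tm_yday)
    strftime(buf, len, "%H:%M", &then);            // started today
  else if (then.tm_year == tnow.tm_year)
    strftime(buf, len, "%m-%d", &then);            // this year: 04-01
  else snprintf(buf, len, "%lldd", (long long)(now-start)/86400);
}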


March 31, 2015

Fiddling with the old uClinux toolchain build gives me a much better appreciation for my own build system.

Mine doesn't expect to run as root, doesn't try to build packages with -Werror by default, doesn't require makeinfo as a prerequisite, actually has the names of the files it tries to download and the ones on the website match up (gz vs bz2 confusion in elf2flt), takes advantage of more than one processor while building, has had a release in the past 5 years, doesn't have its install path in the build script AND hardwired into the uClibc config file...

Oh hey, and it doesn't treat gmp, mpfr, mpc (the three additional packages gcc has metastasized into since the last GPLv2 release's gcc/binutils) as PREREQUISITES that the build expects you to have ALREADY INSTALLED and thus doesn't try to compile itself.

That's... yeah. Sigh.

Maybe I can extract config info out of this and apply it to the Cross Linux From Scratch toolchain build?


March 30, 2015

Gave up and started rewriting ps from scratch. It needs to be based on dirtree, not calling readdir() directly. It needs to use bitmasks to set its default modes (including -f and -l). It needs to actually implement at _least_ all the posix flags. (Except -n because seriously: what the?) And it needs to get some really weird behaviors right like the way "ps -A -o pid,tty,cmd" expands cmd to the right edge of the screen but "ps -A -o pid,cmd,tty" doesn't. (Honestly, why -l has an arbitrary limit on the length is beyond me.)

And that's before the whole "posix dash options vs bsd dashless options behave differently" can of worms I have to figure out how to implement.

(P.S. Why did posix have a table of "variable names and default headers in ps" where of the 15 entries, 9 have headers that are upper case versions of the names, in 3 more one is a truncated version of the other, and then 3 just random oddballs (args/COMMAND, etime/ELAPSED, and pcpu/%CPU). Why would you do that? It's SO CLOSE TO MAKING SENSE, and then DOESN'T.)

(P.P.S. Did the posix guys even _notice_ that the XSI default columns at the start of the stdout section and the aforementioned -o field list table at the end of the section DO NOT MATCH? The first has 4 fields (PID, PPID, NI, TIME) that match, 9 fields (F, S, UID, C, PRI, ADDR, SZ, WCHAN, STIME) that don't, and two more (TTY, CMD) that just INSULT the other table because -o tty is called TT but the _default_ name is TTY with the Y!

This is a standards committee? They agreed on this? I'm aware standards bodies should document and not legislate, but _dude_. This is not a coherent result. You can at LEAST just go ahead and add the missing 9 fields to the -o table. And then accept the lowercase versions of the uppercase labels as -o input the thing will recognize to trigger that field. If you want alternate historical spellings, fine, but SERIOUSLY...)

Sigh. Gotta implement the standard we've got rather than the standard we want. But I am filing off some of the stupid and documenting the deviation.


March 26, 2015

That was a fun convention. In theory video should be up eventually.

Two different panels on microcontrollers, I.E. nommu systems running from SRAM so they can boot straight into Linux without needing a bootloader to run DRAM initialization. One had 8 megs of ram, one had 256 _kilobytes_, but both got away with it because they did XIP (running the kernel code straight out of flash without copying it into memory first). The 256k one even did userspace xip from a cramfs or some such.

And then Wednesday I gave my talks. Both of them. I spent all my time preparing the toybox one, working on it right up until it was time to give it (not that unusual, but it worked out because I figured out the day before what to leave _OUT_; start with "here are links to three talks I already gave, which I will not be repeating" and don't try to even do "what's new" in those areas because if I start talking about licensing or history or the self-hosting crusade I'll be there all day).

So I'm reasonably happy with the new toybox talk, but then my shrinking C code talk got short shrift, and once again the problem was the need to edit it down. The point of the talk was that I did elaborate writeups of the 27 commits that took ifconfig from 1500 lines to 520 lines, and I'd like to explain the techniques I used. And I was willing to take it on as a second panel when space opened up because hey, I already did the prep work!

Unfortunately, the writeups were _too_ elaborate, in the 2 hour timeslot I made it through maybe the first third of them, and then had to skip to the end. What I should have done was go through and work out the techniques and skip around showing examples. Maybe I should do it again, but I remember trying again to fit The Rise and Fall of Copyleft into an hour for Texas Linuxfest. (Ok, heatstroke, rehydrating with an energy drink, coming close to needing hospitalization, and giving the talk the next day. Took me 6 months to feel reasonably normal again after that. But still! Talk was not improved by second attempt at it, is my point.)

I should do podcasts.

Meanwhile, there is a _reason_ I don't schedule travel on the same day as the thing I'm traveling for. I am _totally_fried_, even though I went to bed at like 10pm each night and slept for upwards of 10 hours a night. More than one person noted they were flagging on day 3. The greying of Linux affects us all. (There was one teenager in attendance! Because one of the attendees brought his son.)

Today's an extra day in San Jose with a plane leaving at 6pm, and my voice is toast. Ensconed... ensconced... ensconcinated in some sort of "business center" down the hall from my gate, with an electrical outlet, reasonably quiet working environment, and the prospect of a $14 sandwich in the near future from one of the overpriced airport restaurants.

I happily walked to the airport. Exercise! Getting so much exercise here. And I can _smell_ things. The relentless sinus troubles always clear up when I'm here. I keep forgetting that. I grew up breathing pacific ocean air, not middle eastern juniper imported to texas as an ornamental plant hilariously misidentified as "cedar" a century ago that's gone totally invasive species upwind of a major city. It always starts spewing pollen from late december through at least march (in the middle of what SHOULD be winter), and my sense of smell goes away entirely for months at a time.


March 23, 2015

In California at CELF ELC (which stands for the Linux Foundation Embedded Linux Foundation Conference by the Linux Foundation), and I'm... kind of surprised at the restraint. Last time I was here (2013) it was All Yocto All The Time (sponsored by Intel) and the t-shirt looked like something out of nascar. This time the t-shirt is a simple black thing with a small name and conference logo and no sponsors listed anywhere.

I wonder if Tim Bird staged a coup?

INSANELY busy day. Great stuff.

Like three different panels were actually work related. My boss's boss Jeff Dionne (co-founder of uclinux and founder of se-instruments.com) was coincidentally in town, and I dragged him to the evening meet-n-greet dinner thing where I actually got him together with David Anders (prpplague) so they can make the eventual 0pf.org launch actually work right for hobbyists. (Jeff lives in Japan these days, and goes to LinuxTag in germany every few years but apparently hasn't been to a US event in forever. I need to wave the Jamboree things at him.)

Alas, Rich Felker the musl-libc maintainer wasn't there (his panel isn't until tomorrow). The openembedded maintainer said he was going to show up but had a childcare thing happen instead. Oh, and the buildroot maintainer was there; his talk this year was on device tree stuff and I talked to him about _both_ buildroot (he wants me to resubmit toybox _and_ he wants to merge nommu stuff but had to give back the cortex-m test system he used to have) and device tree stuff (apparently a base boot-to-shell prompt device tree needs to describe memory, processor, interrupt controller, and a timer to drive the scheduler).

This conference is making my todo list SO MUCH LONGER...


March 22, 2015

Red-eye flight to San Jose, arriving at 9:30 in the morning because I flew over two timezones, and got to have a long lunch with Elliott Hughes, the Android Core maintainer (covering bionic and toolbox, I.E. the guy who's been sending me all the android patches). Fun guy, very smart, and apparently way more swamped with todo items even than I am.

He's sympathetic with a lot of my goals for toybox, but his time horizon is shorter than mine: right now the Android M feature freeze is looming for him, and his plans for the Android release after that are in terms of what needs to get deferred out of this release to go into that one.

My "what's all this going to look like in ten years" frame of reference seems like a luxury most android guys can't afford, drinking from a firehose of a half-dozen phone vendors sending them constant streams of patches.

(Also, he used to maintain the java side of things and still thinks java and C++ were a good idea, so we're not _entirely_ in agreement on where new system programmers come from. But I expect history will sort that one out eventually.)

Yes, for those of you keeping track at home Google bought me lunch. (At Panera.) Collusion!

Staying at an airbnb. It's quite nice. It's almost two miles from the venue, but the walk is pleasant and I could use the exercise.


March 21, 2015

One of my patreon goals is "update the darn blog" and I'm doing a horrible job at it.

Right now I'm applying David Halls' big toybox patch, which he actually posted to the Aboriginal list because that's where he's using it. He sent me a big patch touching several files, and I'm going through each hunk and trying to figure out what it does, so I can commit the individual fixes preferably with a test suite entry to regression test the bug.

It all looks good except for the last hunk, which is actually a workaround for a uClibc bug. On glibc or musl (and presumably bionic) if you open a directory and getdelim() from it, you get an error and a NULL return value. But on uClibc, your process segfaults.

I came up with a cleanish fix (more or less doing what David's patch was doing but in a different way)... but I don't want to apply it to toybox. It's a bug workaround for a problem in another package. That other package should be fixed... except uClibc is dead.

The eglibc project happened because uClibc couldn't get releases out reliably, and eglibc already got folded back into glibc. The entire lifecycle of the eglibc project happened _since_ uClibc's troubles started. Same with klibc (which was a failure, but it _ignored_ uClibc). These days uClibc is coming up on _three_years_ since their last release; that's the amount of time musl took to go from "git init" to a 1.0 release! Even if uClibc did have a new release at this point it wouldn't matter. With musl and bionic competing for embedded users, both at uClibc's expense, I'm calling it. The project is dead.

At CELF I should poke the musl-libc maintainer about seeing what feature list uClibc has that musl doesn't yet (basically architecture support, uClibc supports things like the DEC Alpha and m68k that musl doesn't yet), and getting musl to the point where people don't blunder into the uClibc quagmire thinking they need it, and then exit embedded linux development in disgust a year later.


March 20, 2015

Listening to The Rachel Maddow Show (apparently on self-driving cars) and I'm amazed. Five minutes in she hasn't mentioned abortion or how nuclear power will kill us all even once.

Oh never mind, around the eight minute mark it turned into "why you should be afraid of self-driving cars". And now it's all segued back into an analogy about politics.


March 18, 2015

Fade and I watched another episode of the hotel version of kitchen nightmares where bald not-gordon-ramsey mentioned a couple websites people look up hotel quality on, so I checked the cheap place I'd made reservations at for ELC in San Jose.

Bedbugs. Review after review, with photos. Right.

So I cancelled that and did "airbnb" instead. (It's silicon valley, booking through a dotcom is what they do there.) Which meant I had to sign up for an airbnb account. Which was like a twelve step process wanting to confirm by email _and_ text and wanting a scan of my passport and so on. When they wanted an online social profile, I picked linkedin from the list because I honestly don't care about that one. And since I had to log in to linkedin anyway (for the first time since 2010 apparently), I added my current position to that so it didn't still think I was at Qualcomm.

I am now getting ALL THE RECRUITER SPAM.


March 9, 2015

The CELF ELC guys approved a second talk for me, on shrinking C code. Yay.

I wonder if I should mention my patreon in either talk? Seems a bit gauche, but I should probably get over that. It _is_ a Linux Foundation event these days...

(Then again, maybe I should update the top page on landley.net, since it hasn't changed in something like a decade now...)


March 7, 2015

I've been using signal() forever because it's a much simpler API than the insanely overengineered sigaction(), but for some reason adding signal handling to PID 1 isn't working, and debugging it has of course involved reading kernel code where I find horrible things, as usual.

Backstory: I'm upgrading oneit as discussed a few times on the list, and one of the things I'm adding is signal handling. The old traditional "somebody authorised to send signals to init can tell it to halt, poweroff, or reboot the system", and I'm using the signal behavior in the system v init the developers at Large Company That Wishes To Remain Anonymous sent me. So SIGUSR1 should halt, SIGUSR2 should power off, and SIGINT or SIGTERM should reboot.

In theory, PID 1 has even the unblockable signals blocked by default (because if PID 1 dies, the kernel panics). But if you set a signal handler for a signal, your handler should get called (overriding the default ignore-everything behavior PID 1 gets for signals that would normally kill the process). Unfortunately, this is only working for SIGINT and SIGTERM, I can't get it to call the handler for SIGUSR1 and SIGUSR2.

So I dig into the kernel code to see what it's actually _doing_, and right at the system call entry point I find:

SYSCALL_DEFINE2(signal, int, sig, __sighandler_t, handler)
{
    struct k_sigaction new_sa, old_sa;
    int ret;

    new_sa.sa.sa_handler = handler;
    new_sa.sa.sa_flags = SA_ONESHOT | SA_NOMASK;
    sigemptyset(&new_sa.sa.sa_mask);

    ret = do_sigaction(sig, &new_sa, &old_sa);

    return ret ? ret : (unsigned long)old_sa.sa.sa_handler;
}

I.E. signal() is implemented as a wrapper around sigaction() with the two "be really stupid" flags set. It intentionally breaks signal handling. (Note: the hobbyist developers at berkeley fixed this in the 1970's. The corporate developers at AT&T maintained the broken behavior through System V and beyond.)

The solution: make my own xsignal() wrapper around sigaction() that sets the darn flags to 0, getting the sane default behavior without having to fill out extra fields.
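
Which is all of about six lines. A sketch of the shape of it, not necessarily what went into lib/:

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// signal() semantics minus SA_RESETHAND and SA_NODEFER: the handler
// stays installed after the first signal, and the signal is blocked
// while its own handler runs.
void xsignal(int sig, void (*handler)(int))
{
  struct sigaction sa;

  memset(&sa, 0, sizeof(sa));
  sa.sa_handler = handler;
  if (sigaction(sig, &sa, 0)) perror("sigaction"), exit(1);
}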


March 6, 2015

Why am I paying $500 in airfare (plus more in lodging) to go give a talk at a Linux Foundation corporate event again? I'm sure I had a reason...

Oh well, tickets booked for the thing.


March 2, 2015

Current status: force resetting the man pages database to see if whatis or apropos can find a semiportable way (works under glibc, uClibc, musl, and bionic is close enough to "portable" for me) to nondestructively reinitialize the heap (leave the old one alone, just leak it and start a _new_ one with a new heap base pointer) so I can write my own vfork() variant (calling clone() directly) for nommu systems which does _not_ require an exec() or exit() to unblock the parent, but which lets me re-enter the existing process's _start(). (I can already get clone to create a fresh stack, but the heap is managed by userspace.)

You know, like you do...
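
The fresh-stack part of that, as a minimal sketch: CLONE_VM shares the address space, and unlike vfork() there's no CLONE_VFORK so the parent keeps running. The shared heap is exactly the unsolved part, so treat this as illustration only:

#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/wait.h>

static int child(void *arg)
{
  // Same address space as the parent, but running on its own stack.
  return 0;
}

int main(void)
{
  size_t size = 64*1024;
  char *stack = mmap(0, size, PROT_READ|PROT_WRITE,
                     MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
  int pid;

  if (stack == MAP_FAILED) return 1;
  // Stacks grow down on the architectures we care about: pass the top.
  pid = clone(child, stack+size, CLONE_VM|SIGCHLD, 0);
  waitpid(pid, 0, 0);
  return 0;
}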


March 1, 2015

Elliott Hughes sent a bunch of patches to fix printf argument skew, and another patch to annotate stuff so gcc can spit out its own warnings about printf argument skew.

Back in C89, arguments were promoted to int which meant that varargs didn't have to care too deeply about argument types on 32 bit systems, because everything got padded to 32 bits. But C99 did _not_ do the same thing for 64 bit values, which means that some arguments are 32 bits and some are 64 bits, and if you're parsing varargs and suck in the wrong type it all goes pear shaped. (If the one you get wrong is the _last_ argument, and you treat a long as an int, and you're on a little-endian system, it works anyway. This is actually fairly common, and disguises 80% of the problem in a way that breaks stuff if you add another argument after it or build on a big endian system like powerpc, mips, or sh2.)
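
A contrived demonstration (the commented-out line is the failure mode):

#include <stdio.h>

int main(void)
{
  char c = 1;
  short s = 2;
  long long ll = 0x100000002LL;

  // char and short still promote to int through varargs, so %d is fine:
  printf("%d %d\n", c, s);
  // but long long does NOT get squeezed into an int for you; this would
  // read half the argument, and WHICH half depends on endianness:
  // printf("%d\n", ll);
  printf("%lld\n", ll);    // the format has to actually match the type
  return 0;
}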

(Oddly enough, every big-endian system I can think of off the top of my head _can_ work little-endian too, they all have some variant of processor flag you set to tell it whether it's working in big-endian or little-endian mode. It's just that some people think big endian is cool and thus break compatibility with 99% of the rest of the world, because requiring humans to reverse digits when reading hex dumps is far worse than making all your type conversions brittle and require extra code to perform. But this is one of those "vi" vs "emacs" things that's long since passed out of the realm of rational argument and become religious dogma.)

So 64 bit registers went mainstream (with x86-64 showing up in laptops) starting in 2005, and these days it's essentially ubiquitous, and that means the difference between 64 bit "long" and 32 bit "int" is something you have to get right in printf arguments because they're not promoted to a common type the way 8 bit char and 16 bit short still are. (The argument that it wastes stack memory implies that doubling the size of "char" wasn't ok on a PDP-11 with 128k of memory, and doubling the size of "short" was wrong on a PC with one megabyte of ram, yet that's what they did. But it's different now, for some reason.)

Anyway, gcc's inability to produce warnings correctly extends to printf argument mismatch warnings too: if you try to print a typedef'd pid_t or something as a long when it's an int, it complains something like "%ld requires a long but argument is pid_t". Note that typedefs are basically a #define substituting one type for another, it ALWAYS boils down to a "primitive" type, I.E. one of the ones built into the compiler (or a structure or union collecting together a group of such), but that's not the error gcc gives. Instead it hides the useful information and makes you trace through the headers manually to find out the actual type.

There are only a half-dozen primitive types in c: 8, 16, 32, and 64 bit integers, short and long floating point, and pointers. (The ints come in "signed" and "unsigned" but that's not relevant here. There's also bitfields almost nobody ever uses because they're inefficient and badly underspecified (honestly better to mask and shift by hand), and a "boolean" type that pretends not to be an integer but actually is. But again both of those are promoted to int when passed to a function, and thus can be ignored as far as printf is concerned.)

The other problem is gcc complains about identical types: on 64 bit systems "long long" and "long" are the same type, but it complains. This is especially hilarious when the headers #define (or typedef) a type as "long long" when building for 32 bits and "long" when building for 64 bits, so "%lld" would always try to print 64 bits and would always be _fed_ 64 bits so it works fine, but gcc warns anyway because reasons. (They're rewriting it in C++, therefore C must be just a dialect of C++ and everything everywhere is strongly typed regardless of what type it really is, right?)

Yes, I could crap typecasts all over the code to shut the broken compiler up. And that's what most people do in situations like that. But forcing type conversions when you don't need to not only hides real bugs as often as not, it sometimes causes new ones. I rip unnecessary typecasts _out_ because simple code is better.

And then I have to deal with gcc. I know everything the FSF maintains is going away, but it's not dying _fast_ enough. (And LLVM, written in C++, isn't hugely exciting. It has momentum because people are moving away from the gcc. LLVM isn't really _attracting_ anybody, the FSF is repelling them and "any port in a storm". Still, at least the people running it aren't the FSF, so that's something.)


February 28, 2015

Sigh. One of the most useful things to be able to build standalone in toybox would be the shell, toysh. (Which is currently near-useless but one of the things I need to put serious effort into this year.)

However, the shell needs the multiplexer. It has a bunch of built-in commands like "cd" and "exit" that need to run in the current process context to do their thing, it _must_ be able to parse and dispatch commands to be a shell. So the main thing scripts/single.sh does, switch off CONFIG_TOYBOX (and thus the "toybox" command), isn't quite appropriate.

Except... the shell doesn't need the "toybox" command. When you run it, the default entry point should be sh_main(). In fact it _needs_ to run sh_main() even if the name is rutabega.sh because the #!/bin/sh method of running a shell feeds the script name into argv[0] which would confuse toybox_main().

However, the scripts/single.sh method of disabling the multiplexer treats the array as length one, and just dereferences the pointer to get all the data it needs. Currently, this means if I do hack up toysh to build standalone, it thinks it's the "cd" command. (Which runs in a new process and then exits immediately, so is essentially a NOP other than its --help entry.)

I note that somebody is arguing with me in email about calling things scripts/make.sh when they say #!/bin/bash at the top and depend on bash extensions, because obviously if they don't run with the original 1971 sh written in PDP-7 assembly then they're not _shell_ scripts. I may be paraphrasing their argument a bit.


February 27, 2015

Cut a toybox release. Need to do an aboriginal linux release now. (It built LFS-6.8 through to the end, if some random thing still needs tweaking in toybox, I can add a patch to sources/patches in aboriginal.)


February 26, 2015

Blah, build dependencies! In lib/xwrap.c function xexec() cares about CFG_TOYBOX and !CFG_TOYBOX_NORECURSE, and if those toggle in your config you need to "make clean" to get it to notice.

Alas, if you rebuild the contents of lib/ because .config is newer then "make change" rebuilds it every time. But there isn't a way to tell make to depend on a specific config symbol unless you do that insane "create a file for every symbol name" thing which is just way too many moving parts.


February 25, 2015

The stat breakage was printing long long as long, which is 32/64 bit type confusion on 32 bit hosts. Of course the real type was hidden by layers of typedefs, which are worse in statfs than in posix's statvfs because the linux structure is trying to hide a historical move from statfs() to statfs64(). But statvfs has the fsid as a 64 bit field, and statfs has fsid as a 128 bit field (struct of two 64 bit fields, and it uses all the bits), so switching from the LSB api to the posix API would truncate a field. Grrr.

Anyway, stat's fixed now and I ran a build of all the targets and half of them broke, with WEIRD breakage. On i486, i586, and i686 the perl build said the random numbers weren't random enough (but /dev/urandom is fine). Sparc and ppc440 segfaulted with illegal instructions.

Four things changed: the kernel version, the qemu version, toybox, and the squashfs vs initramfs packaging. The illegal instructions sound like a qemu problem, the perl build breakage might be kernel version? Sigh. Too much changed at once.

Oddly enough, arm built through to the end. Well of course.


February 23, 2015

Still trying to get an Aboriginal release out. The lfs-bootstrap control image build broke because the root filesystem is writeable now, so the test whether or not we need to create a chroot under /home and run the build in that isn't triggering.

So I need to add another condition to the test... but what? The obvious thing to do is df / and see if there's enough space, but A) how much is "enough", B) df doesn't have an obvious and standardized way to get the size as a numeric value. You have to chop a field out with awk, which is (to coin a phrase) awkward.

Yes, classic unix tool, standardized by posix, not particularly scriptable.

The tool that _is_ scriptable (and in toybox) is "stat", and in theory "stat -fc %a /" should give the available space... but it doesn't. It gives it in blocks, and how big is a block? Well that's %S, so you have to do something like $(($(stat -fc '%a*%S'))) and have the shell multiply them together (and hope you have 64 bit math in your shell, but for the moment we do).
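
For comparison, the C version of the same calculation; statvfs() hands you block counts that still need the multiply:

#include <sys/statvfs.h>

// Bytes available to unprivileged users on the filesystem containing
// path: free block count times fragment size, the same multiplication
// the shell has to do by hand.
long long available_bytes(char *path)
{
  struct statvfs sv;

  if (statvfs(path, &sv)) return -1;
  return (long long)sv.f_bavail * sv.f_frsize;
}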

Next problem: stat is broken on armv5. It works fine on x86, but it's breaking in aboriginal. (Is this an arm thing, a uClibc thing, a 3.18 kernel thing... sigh.)

So now to debug that...


February 21, 2015

Still banging on Aboriginal Linux. You rip out one little major design element and replace it with something wildly different and there's consequences all over the place...

The ccwrap path logic is still drilling past where I want it to stop (and thus not finding the gmp.h file added to /usr/include because it's looking in /usr/overlay/usr/include in the read-only squashfs mount). I pondered using overlayfs to do a union mount for all this, but that's a can of worms I'm uncomfortable with opening just yet. (Stat on a file and stat on a directory containing the file disagree about which filesystem they're in. I suppose the symlink thing is similar, but one problem at a time...)

Since I was rebuilding ccwrap so much, I decided to make a new "more/tweak.sh" wrapper script to generally make it easier to modify a file out of a stage and rerun the packaging. The stage dependencies are encoded in build.sh using the existence of tarballs (if the tarball is there, the stage built successfully), so it can just delete the tarball before the check and the stage gets blanked and rebuilt.

However, I want to manually do surgery on a stage, and then rebuild all the stages _after_ that one without rebuilding that one. (Avoiding a fifteen minute cycle time for rebuilding native-compiler on my netbook is sort of the point of the exercise.) And build.sh didn't know how to do that, so I added an AFTER= variable telling it to blank the dependencies for a stage as if the stage was rebuilt, but not to rebuild the stage. (Sounds simple. Took all day to debug.)

The other fun thing is that system-image.sh is rebuilding the kernel, which is the logical place for it to go (it's not part of the root filesystem, and all the kernels this is building are configured for QEMU so you'd want to replace that when using real hardware anyway), but it's also an expensive operation that produces an identical file each time (when you're not statically linking the initramfs cpio archive into the vmlinux image, anyway).

So I added code to system-image.sh that when you set NO_CLEANUP it checks if the vmlinux is already there and skips the build if so. (The filesystem packages blow away existing output files, the same way tar -f does.) And have tweak.sh set NO_CLEANUP=temp (adding a new mode to delete the build/temp-$ARCH directories but not the output files) to trigger that.

So when I finally finished implementing this extensive new debugging mode, it took me a while to remember what problem I wanted to use it on. It's been that kind of week...

And then, when I got "more/tweak.sh i686 native-compiler build_section ccwrap" to work, it put the new thing in bin/cc instead of usr/bin/cc because native-compiler is weird and appends "/usr" to $STAGE_DIR. So special case that in tweak.sh...

And after all that, it produced an x86-64 (host!) binary for usr/bin/cc, because sources/sections/ccwrap.sh uses $HOST_ARCH, which isn't set. (Sigh: there's TOOLCHAIN_PREFIX, HOST_ARCH, ARCH, CROSS_COMPILER_HOST... I'd try to figure out how to get the number down but they all do slightly different things, and the hairsplitting's fine enough that _I_ have to look it up in places.)

The toybox build is using $ARCH, the toolchain build is using $HOST_ARCH. This seems inadvisable. ($ARCH is target the toolchain produces output for, and $HOST_ARCH is the one the toolchain runs on. They're almost always the same except when doing the canadian cross stuff in the second stage cross compiler. In fact native-compiler.sh will set HOST_ARCH=$ARCH if HOST_ARCH isn't already set, which is the missing bit here.)

Sigh. Reproducing bits of the build infrastructure in a standalone script is darn fiddly. Reminding me how darn fiddly getting it all to work in the _first_ place was...


February 20, 2015

I switched the Aboriginal Linux stage tarballs from bzip to gzip, because bzip2 is semi-obsolete at this point in a way gzip isn't. There's still a place for gzip as a streaming protocol (which you can implement in something like 128k of ram including the buffer data), while kernel.org has stopped providing tar.bz2 and replaced them with tar.xz.

This gets me off the hook for implementing bzip2 compression-side in toybox. (Half of which is a horrible set of cascading string sort algorithms where if each one takes too much time it falls back to the next with no rhyme or reason I can see, it's just magic stuff Julian Seward picked when he came up with it, just like the "let's do 50 iterations instead of 64" magic constants all over the place that scream "mathematician, not programmer" (at the time, anyway). And I can't use the existing code because it's not public domain, but if I can't understand it I can't write a new one.)

Yes, I'm enough of a stickler about licenses that I won't use 2-clause BSD code in an MIT-licensed project, or Apache license, or ISC... They all try to do the same thing but they have slightly different license phrasing with the requirement to copy their chosen phrasing exactly, which is _stupid_ but if you think no troll with a budget will ever sue you over that sort of thing, you weren't paying attention to SCO or the way Oracle sued Google over GPL code. In theory you can concatenate all the license text of the various licenses you used, which is how the "Kindle Paperwhite" wound up with over 300 pages of license text under its "about" tab. If you ever _do_ wind up caring about what the license terms are, that's probably not a situation you want to be in.

The advantage of public-domain equivalent licenses is they collapse together. You're not tied to a specific phrasing, so nobody bikesheds the wording (which is what's given us so many slightly incompatible bsd-alikes in the first place).

But it's also that toybox isn't about combining existing code you can get elsewhere. If I can't write a clean, polished, well-integrated version of the thing, you might as well just install the other package. If I can't do it _better_, why do it _again_? (That's why I didn't merge the bsd compression code I had into busybox in the first place. I had the basics working over a decade ago.)

So back to Aboriginal: switching tarball creation from bzip to gzip actually made things _slower_. Yes, the busybox gzip compression is slower than the bzip compression. That's impressively bad. (Numbers: running busybox gzip on the uncompressed native-compiler tarball takes 2 minutes 3 seconds. Cat the same data through host gzip, it takes 21 seconds. Busybox is _six_times_ slower. The point of gzip is to do the 80/20 thing on compression optimized for speed, simplicity, and low memory consumption. Slower than bzip is... no.)

For a while now I've been considering how to parallelize compressors and decompressors. I don't want to introduce thread infrastructure into toybox, but I could fork with a shared memory region and pipes and probably make it work. (Blocking read/write on a pipe for synchronization and task dispatching, then a shared memory scoreboard for the bulk of the work.)

In the case of gzip, I could chop the input data into maybe 256k chunks (with a dictionary reset between each one), and then have each child process save its compressed output to a local memory buffer until it's ready to write the data to the output filehandle (they can all have the same output filehandle as long as they coordinate and sync() properly).
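
A skeleton of the coordination, using a pipe per chunk instead of the shared memory scoreboard: the parent reads input, forks a child per chunk, and drains the pipes in order so output stays sequential. Everything here is hypothetical, and squash() is a stand-in for where deflate-with-dictionary-reset would go:

#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHUNK (256*1024)
#define INFLIGHT 4

static size_t squash(char *in, size_t len, char *out)
{
  memcpy(out, in, len);    // stand-in: no actual compression here
  return len;
}

static void drain(int fd)  // copy one child's output, then reap it
{
  char buf[4096];
  ssize_t n;

  while ((n = read(fd, buf, sizeof(buf))) > 0) write(1, buf, n);
  close(fd);
  wait(0);
}

int main(void)
{
  char *in = malloc(CHUNK), *out = malloc(CHUNK+64);
  int pipes[INFLIGHT], live = 0, i;
  ssize_t len;

  while ((len = read(0, in, CHUNK)) > 0) {
    int fd[2];

    if (live == INFLIGHT) {      // oldest child first, keeping order
      drain(pipes[0]);
      memmove(pipes, pipes+1, sizeof(int)*--live);
    }
    pipe(fd);
    if (!fork()) {               // child: compress this chunk
      size_t olen = squash(in, len, out);

      close(fd[0]);
      write(fd[1], out, olen);
      _exit(0);
    }
    close(fd[1]);
    pipes[live++] = fd[0];
  }
  for (i = 0; i < live; i++) drain(pipes[i]);
  return 0;
}

(A child that produces more than a pipe buffer of output blocks until the parent gets to it, so the parallelism is imperfect, but it keeps the output strictly ordered without threads.)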

However, first I'd like to see if I can just get the darn thing _faster_ in the single processor version, because the busybox implementation is _sad_. (Aboriginal is only still using it because I haven't written the toybox one yet. I should do that. After the cleanup/promotion of ps and mdev, which is after I cut a release with what's already in the tree, which is after I get all the targets built with it.)


February 15, 2015

Did a fairly extensive pass to try to fix up distcc's reliability, tracing through distcc to see why a simple gcc call on the command line was run locally instead of distributed. And I found the problem, in distcc's arg.c line 255: if (!seen_opt_c && !seen_opt_s) return EXIT_DISTCC_FAILED;

Meaning I have to teach ccwrap to split single compile-and-link gcc command lines into two separate calls, because distcc itself doesn't know how to do it. (At which point I might as well just distribute the work myself...)


February 13, 2015

Grrr.

The new build layout breaks halfway through the linux from scratch build, and the reason is that the wrapper is not compatible with relocating the toolchain via symlinks.

The wrapper does a realpath() on argv[0] to find out where the binary actually lives, and then the lib and include directories are relative to that (basically ../lib and ../include, it assumes it's in a "bin" directory).

I need to do that not just because it's the abstract "right thing", but because I actually use it: aboriginal's host-tools.sh step symlinks the host toolchain binaries it needs into build/host (so I can run with a restricted $PATH that won't find things like python on the host and thus confuse distcc's ./configure stage and so on). The toolchain still needs to figure out where it _actually_ lives so it can find its headers and libraries.

But in the new "spliced initramfs" layout, the toolchain is mounted at /usr/hda and then symlinked into the host system. So /usr/hda/usr/bin/cc is the real compiler, which gets symlinked to /usr/bin. The wrapper is treating /usr/hda/usr/include as the "real" include directory, but the package installs are adding headers to /usr/include... which isn't where the compiler is looking for them. I created an environment variable I can use to relocate the toolchain, but I'd prefer if it could detect it from the filesystem layout. So how to signal that...

I was thinking it could stop at a directory that also contained "rawcc", but there's two problems with that. 1) libc's realpath() doesn't give a partial resolution for intermediate paths, 2) I moved rawcc into the compiler bin directory where the real ld and such live. (You thought the binaries in the $PATH were the actual binaries the compiler runs rather than gratuitous wrappers? This is gcc we're talking about, maintained by the FSF: it's unnecessary complexity all the way down.) So instead of /usr/hda/usr/bin having rawcc symlinked into /usr/bin, my toolchain has it in usr/$ARCH/bin/rawcc (which corresponds to /usr/lib/gcc/x86_64-linux-gnu/4.6/cc1 on xubuntu and yes this INSANE tendency to stick executables in "lib" directories needs to die in a fire. "/opt/local" indeed...)

I guess what I need to do is traverse the symlinks myself, find a directory where ../lib/libc.so exists relative to that basename(), and then do a realpath() on the _directory_.
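
In shell terms the logic would look something like this (just a sketch: the wrapper itself is C, and "cc" stands in for argv[0]):

X=$(which cc)
while [ -L "$X" ]
do
  DIR=$(dirname "$X")
  [ -e "$DIR/../lib/libc.so" ] && break   # plausible toolchain dir, stop here
  LINK=$(readlink "$X")
  case "$LINK" in
    /*) X="$LINK" ;;                      # absolute symlink
    *) X="$DIR/$LINK" ;;                  # relative symlink
  esac
done
realpath "$(dirname "$X")/.."

The realpath() moves to the end, on the directory we stopped at, instead of happening up front on argv[0].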


February 12, 2015

Excellent article describing how recessions work, but it raises a question: Why would anyone save at 0% interest? Where's this "excess of desired savings" coming from, why _now_?

The answer is that paying off debt is a form of saving. (It's often one of the best investments you can make, it gives you guaranteed tax free returns at higher interest rates than you get anywhere else.) Having a zero net worth is a step up for a lot of people, burdened by student loans and credit card debt and an underwater mortgage...

But borrowing creates money, because money is a promise and borrowing makes new promises. When you swipe your credit card, the bank accounts the money was borrowed from still show the same number of dollars available in them, and the people you bought things from keep the money you borrowed and spent. That money now exists twice, because your promise to repay the credit card debt is treated as an asset on your bank's books. Money is _literally_ a promise, new money comes from people making new promises, and credit cards allow banks to turn individual civilian promises into temporary money.

For the same reason, paying off debt destroys money, by canceling the magnifying effect of debt. When the debt is repaid, that money no longer exists in multiple places, so there is now less money in circulation. The extra temporary money created by securitizing your promise expires when the promise is fulfilled and the loan is repaid. Result: there are fewer effective dollars in circulation, which is a tiny contraction of the money supply.

But debt is not just magnifying existing money: this is where _all_ money comes from. It's promises all the way down, and _nothing_else_. This is the part that freaks out libertarians, who use every scrap of political power they can buy to forbid the government from ever printing new money, so they can pretend the existing money is special and perpetual and was immaculately created by god on the seventh day or something.

Here's what really happens.

A lot of government "borrowing" is actually printing money while pretending not to. A "bond" is a promise to pay a specific amount of money at a specific future date (say $100 in 30 years), which is then sold today for a lower value than the future payback (so the "interest rate" is the difference between the future payoff and the current price, expressed as an annual percentage change). The federal Department of the Treasury regularly issues "treasury bonds", which it auctions off to the highest bidder. (Again, the auction sale price determines the interest rate: divide the amount the bond pays at maturity by the amount it auctioned for and annualize it, and that's the interest rate the bond yields to the buyer.)
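
(To make that arithmetic concrete, with invented numbers: a bond that pays $100 one year from now and auctions for $97 yields 100/97 - 1, about 3.1%. Bid the price up to $99 and the yield drops to about 1%.)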

The trick is that the Federal Reserve (the united states' national bank) can buy treasury bonds with brand new money that didn't exist before the purchase. When buying completely safe assets (such as debt issued by people who can, if all else fails, literally print money to pay it back), the federal reserve is allowed to borrow money from itself to do so. The Federal Reserve's computer more or less has a special bank account that can go infinitely negative, and they transfer money out of it to buy treasury bonds, using the bonds as collateral on the "loan".

The Federal Reserve doesn't need congressional authorization to create this new money because "cash" and "treasury bonds" are considered equivalently safe assets (issued by the same people even), so it's just swapping one cash-equivalent asset for another, a mere bookkeeping thing of no concern to anyone but accountants. At the other end the Treasury Department is auctioning bonds all the time (including auctioning new bonds to pay for the old ones maturing), but these bonds are all made available for public auction where investors other than the Federal Reserve can bid for them, so in _theory_ the federal debt could be entirely funded by private investors and thus the libertarians can ignore the fact it doesn't actually work that way.

This is why the federal reserve can control interest rates. When it's buying the vast majority of treasury bonds at each auction, the price it bids for them determines the interest rate earned on them by everybody else. (If you bid less than the fed you don't get any bonds. If you bid much more than the fed you're a chump losing money, and anyway your entire multi-billion dollar fortune is a drop in the bucket in this _one_ auction.)

So a _giant_ portion of the federal debt is money the government owes to itself. (Not just the federal reserve, but the social security trust fund, which is its own fun little fiction: when social security was created people retiring right then got benefits without ever paying into the system. Current taxpayers paid for retirees, and that's still how it works today.)

This debt created money, and the expanding national debt expanded the money supply not just so the US economy could expand but so foreign countries can use the US dollar as their reserve currency (piling up dollars instead of gold and using them as ballast to stabilize the currency exchange rates).

The hilarious part is that the federal reserve makes a "profit" due to the interest paid on the treasury bonds. When the bonds mature and get paid back, the Fed gets more money from Treasury than it paid them to buy the things. What does the Fed do with this profit? Gives it back to the Treasury.

No really, that's literally what happens: the federal reserve's profits are given to the government and entered into the government's balance sheet as a revenue source. People are _proud_ of this, even though it's just money going in a circle. The treasury pays interest to the federal reserve which gives it right back, and it's just as good as taxes!

(Half the point of taxes is to keep inflation down by draining extra money out of the system after the government spends money the federal reserve and treasury have spun up out of promises to pay each other back. It's a bit like the two keys to launch a missile thing: they have to cooperate because printing presses were too obvious. There are still printing presses, but you have to buy cash from them with money in bank accounts. _New_ money is created in the bank accounts by borrowing previously nonexistent dollars from the federal reserve in exchange for previously nonexistent treasury bonds. Welcome to the nineteenth century.)

Clueless nutballs like Ayn Rand Paul who have no idea how the system actually _works_ constantly attack it because they are incensed at the idea that the money they worship is a social construct rather than a real tangible thing, so they attack the machinery that makes it work to prove that everything will still work without the machinery. (Just like if you stop paying the water bill you no longer get thirsty. Well how do you know until you've tried? As was established in Red Dwarf, "Oxygen is for losers".)

But as I said years ago, money is just debt with a good makeup artist.

So when people respond to us being in a recession (because everybody's paying down debt so nobody has any money to buy stuff with) by trying to cut federal spending and balance the federal budget and pay down the _national_ debt at the same time...

They are IDIOTS, who should not be allowed to operate heavy machinery they clearly do not understand _anything_ about. The government _can't_ run out of money. It can cause inflation if it overdoes things, but right now we've been stuck with the _opposite_ problem for almost eight years. We could do with some inflation. (If you take on a 30 year mortgage expecting 3% annual inflation and get 1%, you wind up paying off twice as much money as you thought you would over the life of the loan. Inflation benefits debtors. That's why creditors hate it so much. They always go on about retirees, but it's billionaires doing the lobbying to screw over people with mortgages.)


February 11, 2015

Oh wow, somebody donated to my patreon.

Sometime last year I claimed my name on Patreon. (My last name is nearly unique: during World War I a german immigrant named "Landecker" decided that immigrating to the US with a german sounding name wasn't a good idea, so he made something up, and my family has been the proud bearer of this tradition ever since. All what, three of us? My father's sister Barbara changed her name when she married, as did my sister Kris, so there's my father, my brother, and myself. Oh, and my father remarried. Four people with the name, of which I'm the only programmer.)

Of course nothing's entirely unique on google, it shows up as a typo in a number of places, one of my brother's friends wrote fanfic using the name for a character, and some random bodybuilder decided to use my last name as the first name of his stage name (so if you do a google image search for it you mostly get him), but eh. Unique enough for most purposes, not a lot of competition for it as login handles, but still worth grabbing if you have any plans for someday caring about the service.

Anyway, I filled out a creator profile on Patreon and did some goals called "Do the thing!" where in exchange for the money I promised to appreciate receiving it, and then largely ignored the thing. I didn't even bother to mention it here or on my project websites or mailing lists. (Once upon a time I proposed crowdfunding on the toybox list. Literally nobody replied.) It's not even the patreon account my household sponsors other patreons through (that would be Fade's patreon, through which I send like a dollar a month each to a half-dozen different webcomic artists).

Over the past year or so several companies have contacted me to ask if I had time to do toybox or aboriginal linux work for them, and I mentioned the patreon each time and they went "we're not really set up to do that". I guess it's not surprising somebody eventually took me up on it, but still... Cool.

(They're strongly encouraging me to work on mdev next. Ok then...)


February 2, 2015

I've been wrestling with an Aboriginal Linux design change for a couple months now, and it's fiddly. The problem is the linux-kernel build.

In the old design, the top level wrapper build.sh calls:

  simple-root-filesystem.sh
  native-compiler.sh
  root-filesystem.sh (splicing the previous two together)
  linux-kernel.sh
  root-image.sh
  system-image.sh

(This is slightly simplified, ignoring the optional second stage cross compiler, the ability to disable the native compiler, and so on.)

The idea behind the new design is to move simple-root-filesystem into initramfs. Then the native-compiler is packaged up into a squashfs on /dev/hda and gets symlinked into the initramfs at runtime via "cp -rs /dev/hda/. /".

This means the simple-root-filesystem output gets packaged into a cpio image, the native-compiler.sh output gets packaged into a squashfs, and the old root-filesystem.sh script that used to splice the two together at compile time goes away (they're now combined at runtime).

So the new design is:

  root-filesystem.sh (the old simple-root-filesystem, packaged as a cpio for initramfs)
  native-compiler.sh (output under a "usr" subdirectory, packaged as a squashfs)
  system-image.sh (builds the kernel and the actual images)

Yes, I could have made the splice part cp -s the files into /usr instead of /, and thus not had to modify native-compiler at all, but that's less generic. In theory you can splice arbitrary extra complexity into the initramfs from the hda mount, no reason _not_ to support that. (There's a "directory vs symlink" conflict if the new filesystem has an /etc directory: mkdir tries to mkdir /etc and complains that something that isn't a directory already exists there. Of course it's a symlink _to_ a directory, so if it just continued everything would work. I should check what the host cp does, reread what the standard says, and figure out what I want toybox cp to do here. But for the moment: the /etc symlink points to /usr/etc, so just put the files there for now and it works...)
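
(The splice itself stays trivially small. In init script terms it's roughly the following, using the /usr/hda mount point from the February 13 entry:

mount -t squashfs /dev/hda /usr/hda
cp -rs /usr/hda/. /

and everything else is just symlinks into the read-only squashfs.)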

So root-filesystem.sh, root-image.sh, and linux-kernel.sh got deleted, simple-root-filesystem became root-filesystem, native-compiler.sh got its output moved into a "usr" subdirectory, and system-image once again builds the kernel (which is really slow, but conceptually simple).

The packaging is completely different. The old root-filesystem.sh script goes away, because the simple-root-filesystem and native-compiler output get combined at runtime instead of at compile time. (This means the more/chroot-setup.sh script also has to know how to combine them, but since it's just a cp -a variant that's not a big deal anymore.)

The old root-image.sh and linux-kernel.sh stages used to be broken out to avoid extra work on rebuilds, but I merged them back into system-image because the extra stages made the design harder to describe. It makes iterative debug builds take longer, but I can rebuild individual packages outside the build system if I need to fiddle with something many times. (I'm almost always too lazy to bother...)

A lot of optional configuration tweaks the old build supported go away too: ROOT_NODIRS was a layout based on linux from scratch chapter 5, but lfs-bootstrap didn't use it. NO_NATIVE_COMPILER let the build do just the simple-root-filesystem; now you can select that at runtime.

On the whole, a big conceptual cleanup. But a real MESS to explain (mostly because of what's going away), and a lot of surgery to implement.


February 1, 2015

Happy birthday to me...

Didn't really do anything for it this year. (Last year was 42, that's an important one. 43 means you survived the answer. Didn't ask for what I really wanted either year, because I want other people to be happy more.)


January 30, 2015

Work ate this week dealing with kernel stuff (adding icache flushing support to a nommu system), and now I'm back poking at toybox trying to remember where I left off. According to (hg diff | grep +++ | wc) I've got differences in 20 files, and that's _after_ moving some of the longstanding stuff like grep -ABC or the half-finished dd rewrite (the bits that broke a "defconfig" build for me) to patches...

But I'd like to ship both sed and aboriginal releases this weekend, and now that sed is in, the next aboriginal linux todo item is expr. And expr is weird. Testing the host version:

$ expr +1
+1
$ expr -1
-1
$ expr 1-
1-
$ expr 1+
1+
$ expr 1 +
expr: syntax error
$ expr + 1
1
$ expr - 1
expr: syntax error
$ expr 1 -
expr: syntax error

So now I'm staring at the posix spec to try to figure out what portion of this nonsense is required by the spec, and what portion is implementation weirdness. (I think +1 is being treated as a string and -1 as an integer, but I have no idea why nothing + 1 is allowed while nothing - 1 isn't. Maybe the first becomes "" + "1", but there's no matching minus behavior? Maybe?)


January 27, 2015

One of my four talk proposals got accepted at CELF. Unsurprisingly, they went with "What's new in toybox". (Not the rise and fall of copyleft.)

I'd link specifically to this year's page, but this is the Linux Foundation. They never archive old stuff, it's all about squeezing sponsorship money out of the thing du jour and moving on to the next revenue garnering opportunity while history vanishes. Sigh. Oh well. At least the free electrons people record and archive stuff.


January 26, 2015

Okaaaaay....

The ancient and decrepit version of Bash I've been using in Aboriginal linux, 2.05b, doesn't understand "set -o pipefail". It doesn't have the "jobs" command. And it doesn't build toybox.

I was lulled into a false sense of complacency by the fact that aboriginal uses an "airlock step" where it populates build/host with busybox and so on, and then restricts $PATH to point to just that directory for the rest of the build. So the system should rebuild under itself, since it initially built with the same set of tools.

The exception to this is stuff called at an absolute path, namely any #!/script/interpreter because the kernel doesn't search $PATH when it runs those so they need an absolute path. (The dynamic library loader for shared libraries works the same way.)

I think this old version of bash _should_ have pipefail and jobs, but apparently the way I'm building it they're not switching on in the config. I don't know why.

Of course I tried a quick fix of switching to #!/bin/ash to see if the various rumors I've been hearing about busybox's shell being upgraded actually meant something, and got 'scripts/make.sh: line 79: syntax error: unexpected "("' which is ash not understanding <(command) redirects. Of course.

I may have to do toysh sooner than I expected. This is not cool.

(Yes, I could upgrade to the last GPLv2 release of bash, but that's not the point. I plan to replace it eventually, upgrading it wouldn't be a step forward.)


January 22, 2015

Working on switching Aboriginal so simple-root-filesystem lives in initramfs unconditionally. Have native-compiler.sqf live on /dev/hda and splice them together at runtime instead of in root-filesystem.sh. Use initmpfs for initramfs, and have a config knob for whether the initramfs lives in vmlinux or in a separate cpio (SYSIMAGE_TYPE=rootfs or cpio). This probably means that simple-root-filesystem needs to be unconditionally statically linked, otherwise the "delete old /lib contents and replace with new lib" gets tricky. (No, you don't want to bind mount it because the squashfs is read-only so you can't add more, you want symlinks from writable lib into the squashfs.)

All this means run-emulator.sh just gives you a shell prompt, without native toolchain, so move the qemu -hda argument to dev-environment.sh.

While I'm making stuff unconditional: drop NO_ROOTDIRS, it's fiddly and unnecessary. (The idea was to create an LFS /tools style layout, but lfs-bootstrap.hdc doesn't use it.)

Leftover issue: more/chroot-splice.sh needs a combined filesystem and root-filesystem.sh isn't doing it anymore...


January 16, 2015

By the way, these are the busybox calls left in the basic Aboriginal Linux build:

    2 busybox gzip
    4 busybox dd
   11 busybox bzip2
   28 busybox tar
  121 busybox diff
  215 busybox awk
  275 busybox sh
 1623 busybox tr
 2375 busybox expr
21692 busybox sed

And I'm almost done with toybox sed.

The development category has more commands than that: the Linux From Scratch build and general command line stuff switch on another 20 commands (bunzip2 fdisk ftpd ftpget ftpput gunzip less man pgrep ping pkill ps route sha512sum test unxz vi wget xzcat zcat), but none of that is actually used by aboriginal linux itself. So, approaching a milestone...


January 15, 2015

Linux Weekly News covered Toybox's addition to Android. (Ok, I emailed them a poke and a couple links, but they decided it was newsworthy and tracked down several more links than I sent them.)

Meanwhile it keeps showing up in contexts that surprise me, such as openembedded.

Heh. Grinding away at the todo list...


January 11, 2015

Finished cleaning up printf.c, back to the sed bugs. Felix Janda pointed out that posix allows you to "split a line" by escaping an end of line in the s/// replacement text. So this _is_ probably local to the 's' command the way the other one was local to 'a', and I don't need to redo the whole input path.

Which is good, because redoing the input path to generically handle this ran into the problem that it _is_ context-specific. If "a\" and just "a" could no longer be distinguished because input processing had already removed the escape, that conflicts with how the gnu/dammit extensions for single line a are supposed to behave. Or that "echo hello | sed -e 'w abc\' -e 'p'" is apparently supposed to write a file ending in a backslash, and then print an extra copy of the line.

(This is all fiddly. You wind up going down ratholes wondering how "sed -e "$(printf "a\\\n")"" should behave, and decide the backslash is _not_ the last thing on the line because the input blocking put the newline into the line, and in that context it's as if it was read in by N and becomes significant... I think?)


January 8, 2015

So my big laptop (now dubbed "halfbrick") is reinstalled with 14.04, which has all sorts of breakage (the screensaver disable checkboxes still don't work coming up on a year after release, you have to track down and delete the binary), but I used that at Pace and at least found out how to kill the stupid Windows-style menus with fire and so on.

Last night, I started it downloading the Android Open Source Project.

And downloading.

And downloading.

This morning I found out "repo sync" had given up after downloading 13 gigabytes, and ran it again. It's resumed downloading.

And downloading...


January 7, 2015

Now that toybox is merged into android, I really need to set up an android build environment to test it in that context (built against bionic, etc).

The software on my system76 laptop has been stuck on Ubuntu 13.04 since the upgrade servers went away. (I thought I could upgrade versions after that point, but apparently no.) This has prevented me from poking at the Android Open Source Project, because building that needs packages I haven't got installed (can't install more without the upgrade servers), and my poor little netbook hasn't got the disk space, let alone the CPU or memory, to build it in less than a week.

I've held off upgrading because it's also my email machine, but finally got to the point where I need to deal with this. (Meant to over the holidays, didn't make it that far down my todo list.)

The fiddly bit was clearing enough space off my USB backup disk to store an extra half-terabyte of data. (Yes, I filled up a 1.5 terabyte disk.) Then I left it saving overnight, and now it's installing.

The ubuntu "usb-cd-creator" is incredibly brittle, by the way. I have a 10 gigabyte "hda.img" in the ~/Downloads directory, and it finds that when it launches and lists that as the default image it's going to put on the USB key (but does _not_ find the xubuntu-14.04.iso file), insists that it won't fit, and doesn't clear this "will not fit" status even if I point it at the ISO and select that instead. So, delete the hda.img so it won't find it, and then I hit the fact that the "format this disk" button prompts you for a password and includes the time you spend typing the password in the timeout for the "format failed" pop-up window. I.E. the pop-up will pop up WHILE YOU ARE TYPING THE PASSWORD.

This is not a full list of the bugs I hit in that thing, just the two most memorably stupid.


January 6, 2015

The printf work I've done turns out to have broken all sorts of stuff because the regression tests I was running were invalid, because printf is a bash builtin! Which means the tests/printf.test submitted to toybox last year is not actually testing toybox printf: it doesn't matter what printf is in the $PATH, the builtin gets called first.

(Is there a way to disable unnecessary bash builtins? Other than calling the binary by path each time...)
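
Apparently bash's "enable -n" can switch a builtin off for the current shell, which would let the tests hit the $PATH version. A quick check (binary path obviously varies by system):

$ type printf
printf is a shell builtin
$ enable -n printf
$ type printf
printf is /usr/bin/printf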

I keep hitting funky bugs in gnu commands while testing their behavior to see what toybox should do. The most recent example:

$ printf "abc%bdef\n" "ghi%sjkl\n" "hello\n"
abcghi%sjkl
def
abchello
def

What I was trying to test was "does the %b extension, which isn't in posix, interpret %escapes as well as \escapes?" And the answer seems to be: sort of. Only, badly.

I'm not entirely sure what this bug _is_. But it's not in the printf the man page is about, it's in the printf you get from "man bash" and then forward slash searching for printf. :)

(Well, ok, /usr/bin/printf produces the same output. But it probably shouldn't, and I don't think I'm implementing that strangeness in toybox.)


January 4, 2015

Finally fixed that sed bug I was head scratching over for so long: it was that I need to parse escapes in square brackets, ala [^ \t], because the regex engine isn't doing it for me. (It's treating it as literal \, literal t.)

Now on to the NEXT sed bug, which is that you can have line continuations in the regex part of a s/// command. (Not so much "bug" as "why would anyone do that? That's supported?")

In theory, this means that instead of doing line continuations for specific commands, I should back up and have a generic "if this line ends with \, read the next line in". (Except it's actually if this line ends with an odd number of \ because \\ is a literal \.)
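
The odd/even rule is easy enough to sanity check at a shell prompt (the real test would live in sed's line reader; this just demonstrates the counting):

$ line='foo\'; trail="${line##*[!\\]}"; echo $((${#trail}&1))
1
$ line='foo\\'; trail="${line##*[!\\]}"; echo $((${#trail}&1))
0

A result of 1 means the trailing backslash is live and the line continues; 0 means it was an escaped literal backslash and the line is complete.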

The problem is, the "a" command is crazy. Specifically, here are some behavioral differences to make you go "huh":

$ echo hello | sed -e a -e boom
sed: -e expression #1, char 1: expected \ after `a', `c' or `i'
$ echo hello | sed -e "$(printf 'a\nboom')"
sed: can't find label for jump to `oom'

In the first instance, merely providing a second line doesn't allow the 'a' command to grab it; the lines need to be connected and fed in as a single line separated by a newline. But in the second instance, when we _do_ that, the gnu/dammit implementation decides that the a command is appending a blank line (gnu extension: you can provide data on the same line). (Busybox treats both cases like the second one.)

I suppose the trick is distinguishing 'echo hello | sed -e "a hello" "p"' from 'echo hello | sed -e "a hello\" "p"'. In the first case, the p is a print command. In the second, it's a multiline continuation of a.

And the thing is I can't do it entirely by block aggregation because 'echo hello | sed -e "$(printf "a hello\np")"' is a potential input. (The inside of $() has its own quoting context, so the quotes around the printf don't end the quotes around the $(). Yeah, non-obvious. One more thing to get right for toysh. The _fun_ part is since 'blah "$(printf 'boom')"' isn't actually _evaluating_ the $() during the parse ($STUFF is evaluated in "context" but not in 'context'), the single quotes around boom _would_ end and restart the exterior single quote context, meaning both the single quotes would drop out of the printf argument and the stuff between them wouldn't be quoted at all if you were assigning that string to an environment variable or passing it as an argument or some such. Quoting: tricksy!)

Anyway, what it looks like I have to do is retain the trailing \ at the end of the line. I have to parse it to see if I need to read/append more data, but then I leave it there so later callers can parse it _again_ and distinguish multiline continuations.

Sigh. It's nice when code can get _simpler_ after a round or two of careful analysis. So far, this isn't one of those times, but maybe something will crop up during implementation...


January 1, 2015

So the kernel developers added perl to the x86 build again, and of course I patched it back out again. Amazingly, and despite this being an x86-only problem, it _wasn't_ Peter Anvin this time. It was somebody I'd never heard of on the other side of the planet, and it went in through Thomas Gleixner who should know better.

The fact my patch replaces 39 lines of perl with 4 lines of shell script (and since the whole shell script fits in the makefile in place of the call to the perl script, it only adds _2_ lines to the makefile) is par for the course. I could clearly push this upstream as another build simplification.

But I haven't done so yet, because I really don't want to get linux-kernel on me anymore. It's no fun. I'm fighting the same battles over and over.

There are a bunch of patches I should write extending initmpfs. It should parse the existing "rootflags=" argument (in Documentation/kernel-parameters.txt) because size=80% is interesting to set (the default is 50%, the "of available memory" is implicit). I should push a patch to the docs so the various people going "I specified a root= and didn't get rootfs as tmpfs because that explicitly tells it not to do that" have somewhere I can point them other than the patch that added that, to go "yeah, you don't understand what root= means". I should push a patch so CONFIG_DEVTMPFS_MOUNT works for initmpfs (the patch is trivial: grep -w dev init/noinitramfs.c shows a sys_mkdir("/dev") and sys_mknod("/dev/console"), and do_mounts_initrd.c has a sys_mount() example too, this is like 20 minutes to do and test).
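
For reference, the command line I want to work (hypothetical until the rootflags patch happens) is something like:

console=ttyS0 rootflags=size=80%

with no root= argument so rootfs stays initmpfs, and the tmpfs size cap adjusted up from its 50% default.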

But... It's zero fun to deal with those guys anymore. It's just not. The Linux Foundation has succeeded in driving away hobbyists and making Linux development entirely corporate. The "publish or perish" part of open source is still there, just like in academia. But I'm not hugely interested in navigating academic political bureaucracy for a tenure-track position, either.

Sigh. I wrote a series of articles about this a decade ago. The hobbyists move on, handing off to the 9 to 5 employees, who hand off to the bureaucrats. I'm not _surprised_. I'm just... I need to find a new frontier that doesn't involve filling out certification forms and collecting signatures to navigate a process.


Back to 2014