Sigh, I'm trying to redo the toysh line input processing. The trailing \ logic is different from what bash does (ongoing thread on the list with Chet about that), and I _also_ have a hack I've been meaning to clean up: because parse_line ignores completely blank lines, I'm having EOF feed in single space " " lines to flush pending line continuations, which is wrong for multiple reasons. For one, toybox sh -c 'echo hello\' has a space on the end instead of a backslash, which is TWO bugs (\ gets eaten, space gets added). For another, you can have MULTIPLE pending line continuations, and a single EOF line won't necessarily flush them. Except we return 1 when we need another line, and there are different reasons for needing another line, which behave differently: unterminated if or || flow control errors out, but unterminated HERE documents are terminated by EOF (with a warning in bash, silently in the defective annoying shell). And you can have more than one HERE document pending at the same time: cat << EOF1; cat << EOF2 or even just cat << EOF1 << EOF2 (no they don't append, stdin gets redirected twice so the first one is dropped, but 3<<EOF would let you read from fd 3... Oh, and HERE documents are seekable, I should add a test for that. It writes to a deleted temp file to get a seekable filehandle that frees its contents automatically when closed. Classic unix filesystem semantics...)
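(Transcripts of the multiple-HERE-document cases, from memory of bash's behavior so worth re-checking when turning into tests:

$ cat << EOF1 << EOF2
> one
> EOF1
> two
> EOF2
two
$ cat 3<< EOF <&3
> hello
> EOF
hello

The second redirect wins stdin so only "two" comes out, and the 3<< variant parks the HERE document on fd 3 where <&3 can dup it back to stdin.)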
My line input logic was removing trailing newlines right at the start, but I can't do that because an escape at the end of a line that was NOT ended with a newline gets preserved. (Well, not RELIABLY by bash, but I poked Chet about that. The -c processing is still magic.) So now I've got to propagate that \n through, and it's essentially trailing whitespace which I'm already mostly handling, but something somewhere's likely to break. Plus NULL pointer, empty string, and strings that only contain whitespace being DIFFERENT is why that "send in a line with a space in it" hack happened in the first place...
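(The distinction I'm trying to match, i.e. bash's behavior as I currently understand it:

$ bash -c 'echo hello\'
hello\
$ bash -c $'echo hello\\\n'
hello

A backslash at the very end of input stays literal, but backslash-newline is a line continuation that gets eaten even when the newline is the last byte of the -c string.)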
I'm also patching the HERE document logic to terminate all outstanding HERE documents at EOF. It can still return "nope, I need more" for unterminated flow control, at which point the caller errors out because there is no more, but the caller can't distinguish "need more HERE document lines" from "need more flow control logic" (parse_line just returns 1 to ask for another line), so the EOF termination has to happen within parse_line(). And I've had to add multiple goto statements to get it to work, because the existing logic really isn't set up to turn into a loop. Multiple gotos are not elegant: it means there should be a loop here, which would require major surgery to insert...
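(The two EOF behaviors side by side, with bash's warning text paraphrased from memory:

$ bash -c 'cat << EOF
> hello'
bash: line 2: warning: here-document at line 1 delimited by end-of-file (wanted `EOF')
hello
$ bash -c 'if true; then echo hello'
bash: -c: line 2: syntax error: unexpected end of file

The unterminated HERE document gets flushed and the command still runs, the unterminated flow control is a hard error.)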
Multiple people are now trying to use toysh and sending me bug reports, but what I _really_ need to do is grind through the "ASAN=1 make test_sh" bugs because every time I hit an issue or major todo item I try to throw a test in there which bash passes and thus toysh probably _should_. And there are a whole lot of existing tests toysh doesn't pass yet, which makes adding new ones awkward. (I keep sticking them near the start so they trigger, but there should be some logical order to all this...)
The next test_sh failure is a double free when a command comes after a HERE document, ala ASAN=1 make sh && ./sh -c '<<0;echo hello' which did print the hello! It didn't warn that the HERE document hit EOF, but I can presumably add that.
ASAN says the second free happened on line 2923... which is in the function free_pipeline() so yes it would, wouldn't it? This being gcc, it doesn't say who CALLED that function, because gcc's ASAN is crap. And I can't use the Android NDK's ASAN because it's only available as a dynamic library, and if I dynamically link against bionic it's not available on the host so the binaries won't run. And you can't LD_LIBRARY_PATH your way around the dynamic loader being /system/bin/linker64. Elliott suggested I could symlink /system to somewhere in the NDK, but find android-ndk-r25c -name linker64 produced zero hits.
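(One thing I should try before giving up on it entirely: ASAN's runtime knobs. Assuming gcc's port honors the same options llvm's does, something like

$ ASAN_OPTIONS=fast_unwind_on_malloc=0:malloc_context_size=25 ./sh -c '<<0;echo hello'

switches the malloc/free stack collection to the slow unwinder and records deeper traces, which might cough up the caller. Works best with -fno-omit-frame-pointer and debug info, and no promises gcc's abandonware actually wires those flags up.)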
So backing up, the first free was in do_source() which calls llist_traverse(pl, free_pipeline) after run_lines() returns. Because we've executed all the stuff and it's done with it now... Ah, and then it frees the HERE document, but the last entry in that is the EOF terminator, which was one of the arguments to the earlier command line that already got freed. (Because why copy the string when we could just reuse the pointer... which now gets freed twice.)
I need to enable the leak detector, and whitelist the EXPECTED leaks at exit. I know how to do the first, not sure how to do the second. Other than writing a debug function to laboriously free stuff the OS is about to free for us. I'm worried about accumulating leaks during long runs, not blocks of data with the same lifetime as the process. I want some sort of leak_forget() function that says "anything that's already been allocated is not interesting for leak detection, only show me NEW allocations after this point that don't get freed".
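(LeakSanitizer does have pieces of this: LSAN_OPTIONS=suppressions=FILE takes "leak:pattern" lines matched against functions in the allocation's stack trace, which covers whitelisting expected leaks at exit. Something like:

$ cat > lsan.supp << 'EOF'
leak:expand_arg
EOF
$ ASAN_OPTIONS=detect_leaks=1 LSAN_OPTIONS=suppressions=lsan.supp ./sh -c 'echo hello'

where expand_arg is a placeholder for whatever function turns out to own the intentional allocations. And <sanitizer/lsan_interface.h> has __lsan_disable()/__lsan_enable(), which make allocations between the two calls invisible to the leak check: that's sort of leak_forget() inside out, ignore-these-specific-allocations instead of ignore-everything-before-now.)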
Alas, gcc's ASAN is abandoned crap, and the LLVM toolchain I have wants dynamic bionic installed on the host, and will NOT work static linked (for no obvious reason other than they either didn't think of it or didn't bother). So puppy eyes about adding stuff would add it to a context I can't use anyway.
Hmmm, I should try getting a dynamic bionic chroot working again. In theory the stdin panic fix in the _start code has made it into the release version by now? (I should really learn to build the NDK from source. Too many tangent ratholes...)
Email from Chet: my trailing backslash line parsing is wrong in toybox (or at least doesn't match bash). I knew my line parsing is wrong and I'd have to redo it, but it turns out it's wrong in more ways than I was aware of. Hmmm...
Oh, and no matter how you fiddle with the priority, HERE documents always seem to eat their lines before line continuation logic does (first transcript below), which toysh is already getting right, but I want to make sure I have tests for it. And also (second transcript below): your REASON for requesting line continuation can vary from line to line, based on parsing the new input. And that $(cat) can't be evaluated until the trailing redirect has replaced stdin for the whole block, which I'm already getting right, but the new changes can't break that, hence regression testing... I mean, if you REALLY want to go down the rathole here: that's why each do_source() has its own pseudo-function context, because LINENO is often a local variable even without a function call, which sometimes gets reset and sometimes gets inherited as you enter/exit each new parsing context, and I need tests for all of it (the $LINENO transcripts below)...
$ if cat << EOF; then
> blah
> EOF
> echo hello; fi
blah
hello
$ if [ $(cat) == blah ]; then echo hello
> fi << EOF
> blah
> EOF
hello
$ bash -c 'echo $LINENO'
0
$ bash -c $'\n\n\necho $LINENO'
3
$ echo 'echo $LINENO' > weeb
$ bash -c '. weeb;. weeb;echo $LINENO'
1
1
0
$ bash -c $'. weeb;. weeb;echo $(eval $\'echo $LINENO\\necho $LINENO\');echo $LINENO'
1
1
1 2
0
$ bash -c $'. weeb\n. weeb\necho $(eval $\'echo $LINENO\\necho $LINENO\');echo $LINENO'
1
1
3 4
2
$ bash -c $'. weeb\n. weeb\neval $\'echo $LINENO\\necho $LINENO\';echo $LINENO'
1
1
2
3
2
Flew back to Austin first thing in the morning, early enough in the day I could hang out with new/old laptop at Wendy's and the HEB tables.
Finally got the ls --sort tests in, and fixed more than one bug found by them.
For example --sort can handle csv arguments (in toybox, not in gnu/dammit), but when I fed it more than one it sometimes looped endlessly: once you've already matched, you don't do further sorts, but you DO need to CHECK the remaining arguments to see if there's a "reverse" in there, and the code wasn't advancing past the arguments it wasn't processing, so it looped without advancing... (There's also "unsorted", which stops argument processing despite not having matched, but I can't TEST unsorted because it means the filesystem order leaks through, and I can't control what that IS)...
Yesterday the dentist said the tooth should come out, and responded to my "it doesn't hurt" with a poke with a tool demonstrating that the nerve is alive and well and not protected by much and capable of hurting a VERY LARGE AMOUNT QUITE SUDDENLY.
This would be the SEVENTH tooth I've lost. (Four wisdom teeth and two on top next to the incisors removed presumably for cosmetic reasons as part of the braces years ago? I had a somewhat pronounced overbite as a teenager. My parents were really making those decisions at the time, I was in early high school.)
He didn't quite come out and say it, but when my wisdom teeth were removed this tooth was left without a matching tooth to chew against, and that's apparently bad for teeth. Which means I'm losing this one because they took out too many wisdom teeth back in the day. Those two removed in the front meant the braces shifted the rest forward, which made room to KEEP the top two wisdom teeth (which I pointed out at the time) and the dentists went "no, you don't want to leave a tooth with no matching tooth for it to work against"... but they DID, didn't they?
Add in the TMJ I got because the braces for some INSANE reason involved a rubber band from my upper left to the lower right (across my tongue) for 6 months, and this means basically every major experience I've had with american dentistry has caused future problems. The braces made my jaw click and grind, the previous tooth removals left me with an orphan tooth that's now collapsing, and the 2013 experience in St. Paul left two of my front teeth looking obviously terrible. The cost was probably somewhere around ten grand each time (if not more adjusted for inflation)...
Anyway, had him grind off the pointy bit (which didn't hurt for about five seconds and then essentially repeated the first poke; he is a VERY good dentist to not have done further damage inside my mouth when I lurched like that, but hey: my cheek can heal now). And then I scheduled a follow-up appointment for a month from now. I was thinking of flying back to dogsit while Fade's at 4th street anyway. (Previous years she flew back from Austin and left Adverb with us, but she's staying in Minneapolis this summer to finish her dissertation so she can defend it in August, which is why I'm flying up to see her so many times this year.)
I may have gone a little overboard. Somebody emailed me to track down a citation, and I replied with my usual "THIS IS A SPECIAL INTEREST OF MINE" enthusiasm:
> Is this yours?
> https://landley.net/history/mirror/cpm/history.html
> I'm citing some of it in a book I'm writing and I wanted to make sure it was you.
As the "mirror" states, it's a copy of an old page from geocities. Here's the original pulled out of archive.org.
Back in 1984 Gary Kildall was one of the original co-hosts of the TV show "computer chronicles" (until he became too busy with his company to continue). Here's an episode he co-hosted on "programming languages", and here's an episode on "operating systems". When Kildall died, the show did a retrospective on him.
Here's the "standard" interview with him (I have a copy of this book). And here's another computer industry pioneer reminiscing about him.
> Also, I'm going a little beyond what's on that page, and might you be able to
> confirm it's (more or less!) accurate, please?
> Thanks
> [NAME]
>
> In 1974 Gary Kildall, co-founder (with his wife) of Digital Research, personally
> created CP/M, which became the standard operating system for 1970s personal
> computers.
You should really watch the PBS series Triumph of the Nerds (which is based on the book "Accidental Empires", the presenter is the book's author).
CP/M only became the standard operating system for "S-100" systems. (Here's a song Frank Hayes, columnist for ComputerWorld, wrote/performed about the S-100 bus. Yes, it's a "C shanty". From the album "never set the cat on fire".)
The "PC vs Mac" of the day was Apple II vs S/100 systems (which started as clones of the Dec Altair: MITS manufacturing couldn't keep up with demand but they shipped a full schematic with every system were using off the shelf parts, so other people bought the parts and assembled them according to the schematic, and then started making improvements).
The company "Imsai" (that's the computer the protagonist of the movie "Wargames" had in his bedroom) convinced Kildall to break his OS into two parts (BDOS and BIOS, Basic Disk Operating System and Basic Input Output System), with the BIOS essentially being a driver package provided by the hardware manufacturers so the same BDOS could talk to disk and console hardware. That way, ALL CP/M machines could run from the same floppy disk, rather than having separate disks for each manufacturer.
All that was 8-bit, and since 1979 Kildall had been chasing multiprocessing (MP/M) as the next big thing, so he basically ignored the 16 bit 8086 for the first couple years. (He was about 20 years too early: the cost of memory was a big limiting factor, so at the time running multiple programs in parallel on the same machine wasn't _that_ much cheaper than buying multiple machines and networking them. Although S-100 systems didn't have a "motherboard": memory expansion was on cards, which you could keep adding as long as you had slots, and even the CPU was on a card, so a multi-processor system with two CPU cards wasn't a far-fetched idea. The trick was making it WORK...)
But a guy named Tim Patterson at Seattle Computer Products was working on a new 8086 board which was intended to run CP/M, and since DR hadn't shipped it yet he bought an off the shelf CP/M manual and implemented 16 bit versions of the system calls it listed so he had something to test the hardware with, calling the result "QDOS" (a play on BDOS, Quick and Dirty Operating System).
Tim had previously worked a summer job for Microsoft where he created their first hardware project (the Z-80 SoftCard for the Apple II, which let it run CP/M and Microsoft BASIC instead of the ROM BASIC Steve Wozniak had written), and when Paul Allen realized that IBM's project Acorn was basically a 16 bit CP/M machine he and Gates threw $50k at Tim (split with his employer) to buy QDOS from him, which they renamed "DOS 1.0"...
> But the first versions of CP/M, like the early personal computers,
> had very limited functionality: the first version merely supported
> single-tasking on 8-bit microprocessors and no more than 64 kilobytes of memory.
8-bit machines all had only 64 kilobytes of memory, and hacks like "bank switching" historically never made much difference. CP/M was about the best you could do on that generation of hardware. Paul Allen thought the PC that IBM was developing could do better and wanted to run Unix on it, so he licensed Unix from AT&T and contracted a small 2-man garage outfit called SCO (the Santa Cruz Operation) to port it to the Intel 8086 and Motorola 68000 processors (because IBM hadn't decided which it would go with yet), and called it "xenix" to indicate "we'll port it anywhere IBM needs it to go".
Then when they signed the NDA and got the hardware specs of the original IBM PC (IBM wanted to put Microsoft's BASIC in ROM as the PC's built-in software, like the Commodore 64 and so on. Microsoft was too small to qualify as an IBM vendor normally, but IBM's CEO was on the board of directors of the United Way with the mother of William H. Gates III, "Trey" to his friends, and made an exception for "Mary's Boy". That part is in the book "Big Blues" about the history of IBM, by the way)... Anyway, the IBM PC specs read "16k of ram expandable to 64k if you pay extra, and the ISA bus is just the S-100 bus with unused wires removed" (as in there were literally adapters that plugged the bigger cards into the smaller slots, no electrical or timing fiddling required, just shifting wires over)...
Paul Allen went "oh: you're going to run CP/M on it". But he and Gates had already expanded their ambitions to sell a bigger OS to IBM, and Gates said he knew Kildall and offered to set up the meeting with IBM, whereupon SOMEHOW Kildall got the impression that the meeting was in the afternoon but IBM got the impression that the meeting was in late morning, so Kildall was off at the airport flying his airplane (cessna probably?) to cool his nerves, and when the IBM guys unexpectedly showed up at his house (he worked from home) his wife paniced and called Gates who suggested that the company lawyer look over the NDA while Kildall got back from the airport, and as lawyers do he went "ew" and started negotiating terms, and since they refused to sign it as is the IBM guys went away empty handed before Kildall even got back from the airport, and the whole meeting was set back weeks...
Which gave Paul time to contact Tim Patterson and scrape up $50k to buy QDOS and offer to be IBM's "second source" with "their" 16 bit CP/M clone (filing off the Q and renaming it MS-DOS). IBM did the PC after their salesbeings saw Apple IIs running Visicalc on the secretaries' desks when they went to meet with executives in otherwise pure IBM shops, and after allowing Digital Equipment Corporation and the PDP-1 to live (creating the minicomputer ecosystem) they vowed NEVER AGAIN. They estimated they had a year to flood the market and smother Apple before it got entrenched, but an internal process audit had just measured that it took them 9 months to ship an empty box, so they had NO TIME to make the new product and get it to market. The head of the Boca Raton department offered to make one out of off the shelf third party parts they could order in volume with a phone call, which is NOT how IBM normally did things, but this was an emergency and the CEO personally granted absolution and indulgences to the Boca team. IBM was a monopoly used to squeezing customers, so they carefully made sure none of these new suppliers could ever do monopoly leverage against IBM, by ensuring there was a second source for EVERYTHING, with their one unique contribution being the BIOS ROM (the thing Compaq clean room cloned). They hadn't been second sourcing the software (just the hardware), but hey: good idea! Another CP/M, sure thing.
Meanwhile, Kildall was a Navy instructor before he started Digital Research, which meant he knew about being a vendor to big bureaucratic institutions, and wasn't really keen on going there. It's lots of money, but most of it's pie-in-the-sky someday money after jumping through lots of hoops and years of delay, and to get there you need a dozen full-time staff just to navigate the bureaucracy. His company was a couple people running out of his house: he'd take free money if IBM offered it, but he already had a CP/M ecosystem built around the stuff he was already selling to existing customers, and this new thing was either part of the S-100 family or it wasn't.
So when the PC shipped, CP/M-86 was late, and when it arrived it cost several times what Microsoft priced DOS at. But the real nail in the coffin was that Paul Allen didn't give up on his dream of having this machine run Unix. Each new release (PC, XT, AT) could support more memory, and the 8086 processor could physically address up to a full megabyte. (The DOS 640k barrier was because they'd arbitrarily mapped I/O memory at 10x the original PC's memory capacity: 2/3 for RAM, 1/3 for I/O memory space. You had to move the VGA card's memory window in order to use more contiguous address space in your application, and even then you don't get ALL the space because I/O memory is still needed.)
DOS 1.0 was a bug-for-bug clone of CP/M (well, a 16-bit port of an 8 bit system, but otherwise identical). But for the DOS 2.0 release, Paul Allen added as many Unix features to MS-DOS as he could. You could now use filehandles instead of file control blocks, and stdin/stdout/stderr were filehandles now. He added unix-style subdirectories, although DOS 2.0 had to treat "\" and "/" interchangeably because "dir /s" was how CP/M had indicated command line options: DOS 2.0 let you use both "dir /s" and unix style "dir -s" with the / version deprecated, but he couldn't quite REMOVE it yet, so the syscalls supported both directory separators. And he publicly announced that a future DOS version (hand-wiggle, maybe around 4.0) would just be Xenix with a DOS emulation layer for old programs. You'd need something like 256k of RAM for that to be worth it, and hey: you'd get multiprocessing for free. (Remember how Kildall was doing MP/M? Maybe not THAT crazy. For reference, IBM announced its "Topview" multitasking graphical desktop for DOS in August 1984, and the first version of the Desqview multitasker for DOS shipped in July 1985. If 8 bit systems max out at 64k, a 16 bit system with 128k of RAM running 2 of those 8-bit programs at once sounds pretty feasible...)
The new unix features in DOS 2.0 made it a way better programming environment than CP/M-86, so it wasn't just cheaper now it was BETTER, and CP/M-86 receded from use on the IBM PC. (And clones, Compaq had happened by now. The reason the IBM PC took over the world and the Apple II didn't is that when IBM sued Compaq they lost, but when Apple sued Franklin they won: https://en.wikipedia.org/wiki/Apple_Computer,_Inc._v._Franklin_Computer_Corp. That was the legal decision that extended copyright to cover binaries and thus invented "shrinkwrap" software, see also the 1980 Audio interview with Bill Gates (mp3 and transcript both linked from https://landley.net/history/mirror/#:~:text=1980%20audio ). The GNU project, IBM's "Object Code Only" announcement, and AT&T's post-breakup commercialization of Unix were all responses to Apple vs Franklin...)
IBM's competitive focus on Compaq and the hardware clones distracted it for years from the fact it had lost its second source competition on the operating system side when DOS 2.0 rendered CP/M-86 irrelevant. IBM shipped its own PC-DOS and Digital Research eventually came out with DR-DOS, but by then Microsoft was doing "CPU tax" contracts with motherboard manufacturers (see the 1995 antitrust trial under Judge Sporkin), and used aggressive bundling (buy X get Y for free, and you can't NOT buy X) to promote Windows and Office... But I'm getting ahead of myself.
Two things happened to derail the dos->xenix move:
1) the IBM PC/AT (developed in 1983, shipped August 1984) added a hard drive, so the DOS 3.0 release was mostly about adding hard drive support (the C: drive) rather than furthering the convergence with Xenix.
2) in 1983 Paul Allen came down with Hodgkin's lymphoma. (That's the same cancer Hank Green just got. It's one of the most treatable forms of cancer, but it IS cancer, and can totally kill you.)
Nobody initially knew WHY Paul Allen was so sick (looked like overwork during the DOS 3.0 crunch), but Paul Allen owned 1/3 of Microsoft's stock because Bill Gates was an asshole: they originally wrote BASIC for the MITS Altair, and the owner of MITS offered Paul a job working at MITS. When incorporating Microsoft, Gates insisted he have 2/3 of the stock and Allen only 1/3 because Gates would be working at Microsoft full time and Allen only part time due to his job at MITS, and Allen agreed... and then immediately after that was signed, Gates asked Allen if he could get him a job at MITS. As I said: asshole.
But the ultimate asshole move was that while Paul Allen was working himself to death trying to get DOS 3.0 out fast, and was clearly sick but not yet properly diagnosed, Paul heard Bill Gates and Steve Ballmer (Microsoft employee #30, Gates' old poker buddy from Harvard before they each dropped out of school to work at Microsoft) talking to each other in the next room about how to get Paul Allen's 1/3 ownership of Microsoft back when Paul died. They didn't want it going to his family, they wanted to figure out how to take it back.
When Paul Allen took a leave of absence to get cancer treatment, he never returned to Microsoft. The drive to switch everything to Xenix left with him, and Gates looked around for other people to copy technical agendas from instead. He saw the Apple Lisa (because Apple gave them an early unit to port their application software to), and tried REAL HARD to copy it but Windows 1.0 and Windows 2.0 were just pathetic. DOS 4.0, 5.0, and 6.0 offered nothing that DOS 3.0 hadn't. Gates teamed up with IBM to work on OS/2 which was IBM's attempt to port mainframe technology down to the PC space... alas, targeting the 286 instead of the 386.
IBM had bought the entire first year production run of the Intel 286 processor to keep it out of the hands of competitors (like Compaq), and was then stuck with a warehouse full of the slowest, most expensive, rapidly depreciating 286 processors ever made. That's why they refused to go to the 386, and even the IBM PS/2 was mostly 286 chips: they were trying to unload that backlog of 286 chips! (They eventually landfilled some portion of them, but it took YEARS.) In 1986 the Compaq Deskpro 386 was the first 386 PC, because the 386 had been out since 1985 and IBM still hadn't used it, and Compaq got tired of waiting. (As did IBM's customers.) So yeah, that's why OS/2 was so far behind the times that Windows 3.0 could get out ahead of it and establish a new programming API standard.
When David Weise made Windows work years later, on his own and against orders, the first person he showed it to thought he'd get in trouble for it because Microsoft was focused on OS/2. Microsoft never had a plan, they had a monopoly that let them fail repeatedly until they got lucky. Their "CPU tax" monopoly contracts forced manufacturers to license Microsoft products for entire "product lines", meaning PC manufacturers who wanted to ever sell a Microsoft operating system on ANY machine had to put them on EVERY machine. They couldn't sell even a small number of machines without the preinstalled Microsoft software, and Microsoft fought a marketing campaign for years against "naked machines", because obviously the only thing anyone could do with a machine that DIDN'T have Microsoft software on it was install pirated Microsoft software. Microsoft's monopoly leverage also let them prevent other operating systems from being installed alongside theirs, and when Windows 95 came out they extended this to preventing IBM from installing OS/2 on any of its own PCs if it wanted any access at all to Windows 95. (See the 1998 antitrust trial with Judge Jackson.) But again, getting ahead of myself.
The death blow for Xenix was that after the 1983 AT&T breakup, when AT&T was commercializing unix, it sucked in code (without attribution) from all the third party unix variants and shipped it in Unix System III. (System V was a successor to System III; there was a 4.0 but it never shipped to customers.) This is why the AT&T vs BSDi lawsuit ended favorably for BSDi: they were able to prove in court that AT&T had sucked in THEIR code without attribution, and thus forced a settlement on AT&T. AT&T also did the same thing to Xenix, and when Gates found out Microsoft code was in an AT&T product without permission or payment he went BALLISTIC, but didn't think he had the legal heft to take on AT&T, so instead he purged Xenix from Microsoft (it had been running their internal email system and so on) and unloaded Microsoft's interest in Xenix on SCO (which is how SCO wound up fully owning Xenix: they'd initially just been a subcontractor doing work on somebody else's IP, but they got it cheap), and basically developed a Dave Cutler level of Unix hatred going forward...
I note that back in the day I did a LOT of research on this for my rebuttal to SCO's second amended complaint against IBM, and xenix is all through it. The indented parts in green are mostly stuff I wrote, with a little bit from Eric, but the OSI position paper was his baby and the rebuttal paper was mine. The rebuttal links to a lot of primary sources, many of which have sadly gone away over the years but you can still pull most of them out of archive.org if you try...
(You should TOTALLY get a copy of Peter Salus' book "a quarter century of unix". And a copy of "Where wizards stay up late" which is about the formation of the internet. Soul of a new Machine and A Few Good Men From Univac are more tangential, but loads of fun.)
Oh, and the book "Hackers" by Steven Levy is the other half of this Ken Olsen Smithsonian interview, literally two halves of the same story with the TX-0 and so on.
Oh, and the first four interviews in the Intel section of my mirror are the four parts of the story of the birth of the microprocessor: Ted Hoff (the actual creator), Federico Faggin (who went on to found Zilog and create the Z80 processor), Masatoshi Shima (their actual customer at Busicom, who many people say was the ACTUAL inventor of the 4004), and then their boss Gordon Moore (of Moore's Law fame).
Then read "Crystal Fire" about the invention of the transistor. The second half of that book is about the creation of Silicon Valley (which exists because William Shockley was an utter asshole), and Gordon Moore is a featured player (part of the "traitorous 8" that bounced from Shockley to Fairchild to found Intel)...
Ahem: computer history is a hobby of mine. Here's a 2 part writeup (part 1, part 2) on some interesting plot threads I did a dozen years ago.
(I've been meaning to write my own book for years, but... too busy.)
Sigh, fell out of the habit of blogging during the week when I couldn't. (Nothing for my editing pass to elaborate on when I didn't leave myself a trail of breadcrumbs...)
Git log shows a couple of shell fixes. I should get a release out, then do a deep dive into shell stuff again and try to get that properly finished.
Cut up one of Fade's old disposable mouthguards to get a chunk of plastic I can put over the tooth so my cheek can get some relief from endless stabbing. (It was keeping me awake, and it's not fun to talk either.)
Fade got me an appointment at the dental school attached to the university she gets all her tooth care done at. Of course she gets it free as a grad student, and I don't. We pay like $500/month to get me on her health insurance plan, but it doesn't cover dental for me: luxury bones. Still, these guys are known to be very good at their job, and should not make it WORSE. I'd very much like treatment that didn't cause more problems than it solved...
Back on the horse. (For a definition of "horse" that involves taking my new laptop to the common work area in building 1 of Fade's apartment, which is playing a spanish cover of "Achy Breaky Heart" for some reason.)
The 'repeated hang" failure mode left me with a lot of vi :recover files where it prompts me which of the three .swp files to read, and I'm just zapping all that. There's a lot of pausing to stare at "am I deleting the .blah.c.sw? file or the blah.c file" before each one JUST TO BE SURE. (I have made that mistake. Less of an issue when the file is in git, and I'm just losing recent changes instead of trying to dig it up out of a USB backup drive.)
Sigh, the hard part of fiddling with a command like ulimit/prlimit is A) coming up with the new help text, B) coming up with test suite entries. Once I've got those, the CODE is generally pretty easy. Implementation is seldom the hard part, DESIGN is the hard part. What should it DO?
New laptop arrived. The freeze problem advanced to "happens 30 seconds after a reboot", so I ordered another of the same type I could just swap the hard drive into. (I have 3 such spares at home, but they're in Austin and I'm with Fade in Minneapolis.)
It's so CLEAN. Not covered in scratches and gunk, almost as if I HAVEN'T been dragging it around with me everywhere for a couple years. Same model (Dell E6230) but this one's refurbished and thus in a slightly different case (doesn't say Dell E6230 on it for one), and with this case I can't see the charge/disk LEDs with the lid open. Seems like a tiny thing, but kinda significant now that I'm confronted with its absence. Yeah, there's software versions up in the toolbar (which I have configured to only be visible with the mouse hitting the top of the screen), but I don't TRUST the software ones. I wouldn't have a band-aid over the laptop camera if it had a physical LED that lit up when it was powered independent of any software. The fact they refuse to do that stuff is why one of the first things I do with any new laptop is stick a band-aid over the camera. The pad protects it for when I want to use it, and when I don't it's NOT LOOKING AT ME. Grrr.
I tried borrowing Fade's old macbook during the gap, which was a comedy of errors in and of itself. She dug it out of the closet, confirmed it worked, set it to charge on the counter, and went to work. I opened the lid to be confronted with a login prompt. Ah. Day 2: armed with the password I tried to ssh out to a linux machine to do some work and... none of the ones I can think of are configured to allow password, they're all key-only. (I have backups of everything... in Austin.)
It's pretty late in the day by this point (shipping estimated the new laptop would arrive yesterday, instead it came in after 3pm today), and by the time I'd rustled up an appropriate screwdriver and got the hard drive swapped and network access sorted out (registering the mac address with Fade's apartment's wifi... gave me an intercept screen asking me to log in? Seems redundant somehow. Oh well, phone tethering still works...) it's after 5pm. Old machine still has the bigger memory but I'm making sure this is STABLE before swapping more parts than strictly necessary. To be honest it's possible I could have fixed the old one with a can of compressed air, but I haven't got one here and am not entirely sure where to buy one (target?), and the hang problem going away and then coming back again is how I _got_ here. I want reliability, please.
Via the phone tether I'm downloading SO MUCH EMAIL... (Gmail's pop3 does about 1 message per second in 250-500 message chunks. Between linux-kernel and qemu-devel and so on, I get well over 1000 messages a day. This is likely to take a while...)
On the bright side, the time off probably gave my eyes time to adjust to the new glasses. (The myopia is the same, only the astigmatism has been changed to protect the innocent.)
Wrote up yesterday's broken tooth while email downloads. Not gonna backfill the rest because I didn't do anything of note and don't remember what most of it was anyway...
Still backfilling: this is the day I broke a tooth. Molar all the way in the back, bottom left side, next to where I got a wisdom tooth removed years ago. The tooth itself doesn't hurt, the magic japanese toothpaste is quite effective. Hydroxyapatite deposits more calcium phosphate on top of any exposed dentin and keeps the nerves protected behind bone equivalent... but it does nothing about the enamel, a large chunk of which is what broke off here, leaving a sharp pointy bit that's stabbing my cheek. The cheek hurts a LOT.
Regretting not getting to a dentist while I was in Japan, but while I trust the medical providers over there FAR more than the ones in the USA... there's still a language barrier, and my teeth are in terrible shape due to the extensive dental work I paid thousands for back in 2013. (The 6 month apartment I had for the Cray contract in St. Paul was right down the hall from a dentist, and I used them as a second opinion. They said "yeah, those two front teeth that got chipped before they even came in, because of that car accident when you were 5 years old smashing your baby teeth up into your gums, so they've got grooves on the front? We ALSO want to just drill all that out and turn it into fillings, because it's weird and we're calling that cavities even though you yourself can't detect them in any way." So I went with it, and all the fillings they put in chipped to pieces and fell out entirely over the next 18 months, leaving me with large obvious holes in two front teeth. I paid a lot of money to get those holes, and felt really silly about it, but regular application of japanese toothpaste meant it didn't hurt and did not appear to be getting worse...)
And now I need to wrestle with US dentistry for a _different_ problem. Dowanna.
This gap was due to my laptop being dead and having to mail-order a replacement because all my spares were back in Austin.
The "battery charging while using laptop" problem is getting worse, I just had two reboots (well, freezes forcing me to reboot) in half an hour with nothing plugged into USB, and while basically just typing in a text editor and no cpu-intensive anything pulling power.
I originally had this as "April 31" but then my python RSS feed generator went "boing" parsing the date, because there isn't one. (Midnight according to my laptop is in the middle of the day in tokyo, it's a bit fuzzy which day I'm writing for over there...) So I moved it here because I didn't write a blog entry today, due to ongoing travel recovery:
Somebody in email said "Canada as a whole seems to be determined to be a branch plant operation of US-based multinationals," and I replied:
This too shall pass. Maybe not fast enough to benefit either of us personally, and when both the roman empire and the british empire receded they left behind a lot of scar tissue, but no empire lasts forever, and the USA is already pulling back from the world now we're a net exporter of oil (have been since 2019) and thus don't really care about having a trillion dollars of navy policing everybody's shipping quite so much anymore. We've still got aircraft carriers, but they can't be everywhere, and we've gotten rid of most of the smaller patrol boats that used to _be_ everywhere...
It's hard to take solace in bad things happening, but the currently powerful aren't going to stay powerful forever. The USA is facing the end of the boomers (1946 was 77 years ago, and their refusal to hand off anything gracefully is going to cause a LOT of loss of institutional continuity), climate change (Houston's flooded twice more since hurricane Harvey, but "again" isn't as newsworthy), the exhaustion of the Ogallala and California Central Valley aquifers underpinning the majority of our agriculture (don't get me started on crop monocultures), the collapse of the US health care system coinciding with the rise of antibiotic resistance, a dozen kinds of invasive species (Texas has "crazy raspberry ants" that are attracted to electromagnetism and will thus ball up inside your wifi router), the reshoring of manufacturing (if our trillion dollar annual defense budget stops paying for the navy protecting container ships "for free", then floating everything from the far side of the world instead of Mexico gets a lot more uncertain)...
Canada has its own issues to work out, but the worldwide fascist crazy should recede with the Boomers (they got BADLY poisoned by airborne lead in the gasoline for 50 years, and in 2/3 of who's left it's combining terribly with senility to go past "kids these days get off my lawn" into flat earth territory). Once that's past, then maybe the wretched survivors can start shoveling out. (Step 1: universal basic income, which yes will only happen over the Billionaires' dead bodies. So maybe UBI is step 2.)
Personally, I would like the new victorian prudishness and the FOSTA/SESTA (Comstock Act II) nonsense to stop being imposed on every other country in the world, along with the USA's tendency to treat children as non-persons. In Japan quite small kids buy stuff in the store and go on the train by themselves. In Europe kids can have wine or beer at dinner as soon as they can walk. In the USA we look back in horror on the days of "latchkey kids" because now they're non-persons legally confined to a building every day, where they go through metal detectors and are watched over by police with live ammo who randomly search their bags and lockers, and they can be arrested and permanently removed from any family that allows them to be alone on the street two blocks from home. It's "for their protection" that they can't work or vote or drive, and if a teenager sexts a naked selfie to another teenager they can BOTH wind up on a sex offender watch list for life, which is a change to the law the Supreme Court only made in 1982, by the way. Ronald Reagan hijacked the federal highway funds to force states to raise the drinking age from 18 to 21 (after Vietnam lowered it: since Johnson and Nixon were drafting 18 year olds to die, they-who-were-about-to-die protested their way into being treated like adults in other areas, which involved "the man" gunning down protestors).
So yeah, rooting for the american empire to collapse. When France invented the guillotine they had to work through Robespierre and Napoleon (liberty, equality, oh look a rich white guy has seized dictatorial power again, rinse repeat), but it worked out for them in the long run, and they're currently braving tear gas to push back against late stage capitalists who want cheaper and more obedient servants. "We can't afford this" means you squeeze the rich harder. We went to the moon WHILE fighting the cold war, whether or not we could "afford it" wasn't the big question.
I would love to reach a point where I could take solace in GOOD things happening. Not just looking forward to the end of bad things and hunkering down to minimize the inevitable collateral damage...
Travel recovery day. Headache. I had a row to myself on the Montreal -> Minneapolis flight so I managed to get an hour of sleep, but missed beverage service and got intensely dehydrated. (It's the pressure changes, that's WHY they do constant beverage service on airplanes.)
Then lyft couldn't find where I was in terminal 2 and cancelled the pickup, and then taking the light tactical rail home was... interesting. Minneapolis seems to have cut the rail maintenance budget: the second train (after the half-hour wait between trains, because they don't run frequently enough) was the dirtiest public transport I've ever been on. And that includes Camden New Jersey and the New York subway. (I am sooooo spoiled by tokyo.)
Got home to Fade's, crashed, woke up in the morning with a headache. Still have the headache. I have a lot to catch up on, but am unlikely to be very productive today.
I'm also realizing that part of the headache is probably adjusting to the new glasses I got the night before my flight. (Japan does better glasses than the states, but the ones I've been wearing are from 2017 and finally started scratching last year.) We went in to order them over a week ago, but since I wanted the extra anti-scratch coating they needed a week instead of an hour, and then when I came to pick them up a week later they apologized and wanted to _redo_ them because the left lens was slightly off center, but they did a rush job in 2 days this time (and gave me the other pair of lenses in case I need spares). Which is all fine, but I'm adjusting to new glasses and this is the first quiet unrushed "ok, sit down and try to work" session I've had since... and it's stacking debuffs.
Flying back to Minneapolis.
Sigh, this laptop power supply with the intermittent data connection works fine when I'm not trying to charge the battery _and_ use the laptop at the same time. Suspend it, charge it up, then keep it plugged in and use it: great. But when I don't do that, it has three obvious failure modes: 1) the power supply gets VERY hot (the first time I noticed the _smell_ of volatile plastic compounds becoming airborne), 2) the laptop toggles every 15 seconds or so between charge and discharge (which can't be good for the battery, and has corresponding screen brightening and dimming power management weirdness sometimes), 3) if I forget and try to charge a USB thingy (such as my bluetooth headphones) while it's also charging the battery, the laptop freezes solid. (Probably a kernel panic that doesn't get marshalled through X11 into actually showing me the panic, or even THAT it panicked. The kernel guys have been throwing functionality overboard like hot air balloons dropping sandbags, and one of the things they gave up on a while ago was "go into VGA mode or framebuffer and dump text to the screen when you get a panic". Because who cares about THAT?)
I note that the OLD battery, which was smaller even before it lost 1/3 of its capacity due to age, had far fewer problems, I think because its maximum charging current was lower (fewer cells). The battery charging logic goes "aha, I can feed THIS much power into the battery", and when the controller can't interrupt that to say "I said I could deliver this much power but I'd like to take it down a notch now" because the data line's gone walkabout again, Dell's charging logic goes all pear shaped.
Anyway, I tried to charge my laptop and phone before getting on the plane, laptop went suddenly catatonic and had to have the power key held down until it turned off, so I've lost all my open windows again. Sigh. Right as I was getting into a position to dig out and address the backlog.
Plane is 100% crowded again, and they made the seats smaller (again!). Front to back AND side to side, so I'm trying to use the laptop at an awkward angle on the TV tray but it just plain doesn't FIT. And while I theoretically have a power outlet (I assume that's what the green LED on the seat in front of me near my right ankle is about), I can't see it well enough to actually plug into it. They turned the cabin lights off because it's an overnight flight, and my phone battery is fully dead so I can't use its flashlight (well I didn't really get to CHARGE it in the airport, did I?). The overhead reading light doesn't make it down there...
Not _much_ makes it down there. Air Canada added seats to their planes since pre-pandemic times, and reduced flight frequency so every international flight is 100% full. 13 and 1/2 hours in a space too small to pick up anything that fell onto the floor (unless I can hook it with my foot, I'd have to ask the guy next to me to get out of his seat, which means the person next to HIM would have to get up and stand in the aisle). The accumulated muscle cramps from being unable to move are not pleasant. Add in the usual 6pm tokyo departure time and the sleep deprivation is... getting unpleasant.
Dunno how much of this is Air Canada and how much of this is post-pandemic late stage capitalist profiteering, but the days where I wrote most of the ps.c infrastructure on a flight back from Japan seem long gone.
Jeff wants me to meet Mike today and talk about future plans. I really, really, really don't want to, but there isn't a graceful way to back out of it. I strongly suspect they're going to try to pressure me into making more of a commitment to Jeff's company. (I'm not signing anything before Fade can read it. I also have no idea if Google wants to continue the toybox funding beyond what they've already done, but I'm not moving on from that until I've done all I can there. Jeff's project does not take precedence over MY project.)
Jeff hates when I say I'm working on "his projects", and insists that it's "our project". He has a "vision" that he's upset he hasn't been able to explain all of to me, because I keep getting derailed into practical things we need to do, but I'm not interested in infrastructure in search of a user and a big strategic goal that can't be concretely implemented. We've worked on and then left half-finished a dozen different pieces of technology. I care about what can get completed and put in the hands of users. If Jeff had funding for us to spend 5 years focusing on Basic Research in the vein of bell labs or xerox parc, great. But we don't. And I got enough swap-thrashing on toybox, thanks.
Still, I'm learning interesting stuff I didn't know before. And I'm FINALLY getting to the point where I know a LITTLE japanese. There have been anime dialog scenes where I followed multiple consecutive sentences! Yeah, ok, simple ones, but once there were FIVE sentences, in a row, that I understood almost all of. Alas, in actual interactions with japanese people, knowing "this is the point where the person running the cash register asks me if I need a bag" is still far more useful than my ability to parse the words...
The OpenLane git checkout is about 2 gigabytes. We went on a "deleting stuff we can prove isn't needed" spree... and the result is 2.1 megabytes of actually needed scripts. The .git directory is over a gigabyte, full of old long-deleted churn. They also checked in every project that's successfully built against this thing INTO THE OPENLANE REPO. (Remember when uClibc had a test suite containing every package that had ever successfully built against uClibc, and the invocation necessary to make it work in the new context? That test suite turned into the "buildroot" project. Well, OpenLane has something similar, and it's FAR BIGGER THAN THE ACTUAL PROJECT.) This is pufferfish territory: the project is making itself look big, but once you cut through the cloud of squid ink there's not actually much there.
I need to fix up the toybox shell because people are using it, which means I need to finally add the command line editing and history (without which it seems way less finished than it is, because monkey brains conflate user interface polish with functionality, and yes that includes _me_).
Command line editing is adjacent to the crunch_str() logic in exactly the same way fold() is, namely that "backspace eats how much" is the big missing piece of both. Which is a nonobvious question to answer, because the HARD part is that tabs advance by a variable amount based on where they started. (Also, nonprintable characters are TRAILING, which is the dumbest thing the unicode committee ever did. A printable character does not FLUSH pending nonprintable characters: the printable character comes first and is then modified by following characters, being REDRAWN ON THE SCREEN multiple times in some instances. Which also means you're never sure you've finished a stack of combining characters until you've read PAST it and gotten a character (not byte, utf-8 sequence parsed to unicode point!) that is NOT part of this one, which you then need to unget and process separately in the next go 'round the loop. When you've got a string fragment, you CAN'T know: it could end in an unfinished utf8 sequence, and there could be a combining character following it. Pretty much the only thing that DOES tell you it's done is newline... and then what do combining characters at the start of a line MEAN exactly, when there's NOTHING FOR THEM TO COMBINE WITH? What, do they combine with an implicit NUL? What would that mean? The last newline has an umlaut! It's REALLY STUPID because they did it BACKWARDS.) But it's what's there, so we must cope.
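(You can poke at the trailing-combining-character thing from the shell, assuming a bash new enough for printf to do \u escapes:

$ printf 'e\u0301 vs \u00e9\n'
é vs é

The first one is a plain ascii "e" followed by U+0301 COMBINING ACUTE ACCENT, which arrives AFTER the character it modifies; the second is the precomposed code point. Same glyph on screen, different byte sequences, and the first form is exactly why you can't know a character is finished until you've read past it.)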
I keep wanting to do a "how to bloat code" presentation, starting from the classic K&R "Hello World" and then shifting over to C++ with accessor functions doing exactly the same job but passing the data between all the various "enterprise" style contexts, demonstrating the full range of "it's not code reuse the first time" nonsense, and generally showing that you can have a very large amount of infrastructure that LOOKS like it's doing something but isn't really.
I'm reminded of this because the skywater toolchain turned their README from a text file into one of those "markup that generates HTML" things, yes of course it has stylesheets, and along the way they added a lint variant to check the validity of their markup, then of COURSE they factored it out into a subrepo which does a "git submodule update" at build time. (So you check out the repository, and then when you run the build it checks out more repository within the build.)
Remember: this is a README. Historically, this was ONE SMALL TEXT FILE. There's no build infrastructure for a text file. None of this is NEEDED, but this crap it's metastasized into is pulling in a chunk of another one of Mithro's projects, which is doing dependency checking against the host to make sure various packages and versions are installed... except it's checking RPM and we're running it on a Debian system so it's not finding the magic red hat package names out of the wrong kind of repository. And since they did a "-include" this whole mess should just DROP OUT if the repository isn't installed... but they're installing it WITHIN the build.
Someday I should do a proper writeup on why Google's OpenLane project stalled...
tl;dr: OpenRoad is a DARPA funded initiative, which works. OpenLane is fundamentally a small number of shell scripts that call the OpenRoad tools to do their thing in order, with reference to the Sky130 PDK which is basically "fonts and CSS to make a mask for this fab". (A fab is sort of a really high end printer. We're submitting a job to it. The job is literally a big data file.)
OpenLane is a partnership between Google and SkyWater (which used to be Cypress Semiconductor) to create an open toolchain for Sky130. Google hired the guy who did QFlow to work on the tools part, and then subcontracted much of the fab integration work to a company called "Efabless" which has an existing business taking people's design files (mostly in Verilog) and converting them into something the fab can accept. Which means if Efabless were to succeed at what Google's paying them to do, it would undercut their existing core business. There are two big projects here: OpenLane is a set of control scripts that call the OpenRoad tools in the right order to perform tasks, and the other is "Skywater PDK" which is a data dump from the fab. Tim Edwards is running a giant pile of fixup scripts in the Skywater PDK build because the fab's data dump is horrible (there's like... OCR errors in it or something?) And the resulting PDK is subtly broken half the time, although somebody found that if you run make TWICE, the result is usually good after the second time. (But the only way to determine if the result is good is to build the REST of the toolchain around the resulting PDK, then build your project with the resulting toolchain, then test result. Which is time consuming and labor intensive.)
The guy at Google running this is Tim "Mithro" Ansell, who is writing his own build system to do some portion of all this, except he doesn't seem to have done this before in a nontrivial way, so doesn't really know what success looks like? He's a fan of the concept, but not a veteran. Jeff (who has done this before) keeps telling him "you need to do this" and getting dismissed as silly, and then 6 months later they realize they need to do what Jeff was telling them. Kinda like the RISC-V guys, really...
So Mithro grabbed chunks of his symbiflow project and stuck them to the Skywater PDK builder. Specifically, he's grabbed anaconda, which long ago used to be Red Hat's system installer: the large python program that would run when you booted an install CD (or floppies), that would partition and format your disk, let you select what type of Red Hat system you wanted to install, and then install all the packages. This was back before Fedora and Enterprise happened; they replaced it with something else rather than rewrite the large pile of Python 2 code in Python 3. But the old 1990s Red Hat system installer seems to have spun out into its own project, maintained by yet another proprietary company producing source-under-glass (that you can see, but would be crazy to try to build or install yourself), and he's using it to confirm prerequisites are available in the local RPM repository. On Debian systems that don't use RPM, this doesn't find much.
Specifically, it's complaining that "yosys" isn't installed. It is, and it's in the $PATH, but since Mithro's slurped-up symbiflow plumbing that's installing a proprietopen fork of Anaconda didn't install it, it's not finding it. If it just tries to call "yosys" it's there, and presumably "yosys --version" might say if it's new enough, but instead Mithro/symbiflow/anaconda/openlane runs a large pile of python 3 which returns an incorrect answer.
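(Generously, the whole dependency check could collapse to one line of shell, modulo whatever version-reporting flag yosys actually has:

$ command -v yosys > /dev/null || { echo 'yosys: not found' >&2; exit 1; }

And as the next paragraph says, even that much is arguably unnecessary.)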
Note: it doesn't have to do ANY OF THIS AT ALL, because if yosys isn't there then you should get an obvious build break where the last line of output is an attempt to run yosys getting a file not found error. This is basically the "assert" problem where the bug IS THE EXISTENCE OF THE ASSERT.
It looks like if you remove that whole subdirectory, the enclosing makefile should just work, because it has - before the include to skip the nonexistent file, and then the $(wrapper) variable drops out and it just calls the rest of the command line. Seems worth a try, anyway. So I'm trying to remove the git repository, and I chopped the makefile target that clones it out.
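(The pattern, as a hypothetical boiled-down Makefile with invented names:

# "-include" silently skips a missing file instead of erroring out,
# and an unset make variable expands to nothing...
-include dependency_checker/rules.mk
all:
	$(wrapper) ./run_flow.sh

# ...so if dependency_checker/ never gets cloned, nothing defines
# $(wrapper) and the recipe degrades to just running the command.

Hence chopping the clone target should make the whole mess drop out on its own.)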
Except he didn't just clone it there, he added it as a submodule, which means it's getting cloned already. So I need to remove it from the Makefile AND remove it as a submodule from the parent repository. Except the makefile of the parent repository is checking it out. (Remember yesterday's "how do I remove a submodule"? Because me not checking it out didn't prevent THIS from checking it out.)
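(For reference, the full submodule exorcism seems to be three steps, path hypothetical:

$ git submodule deinit -f path/to/subrepo
$ git rm -f path/to/subrepo
$ rm -rf .git/modules/path/to/subrepo

The deinit empties the working copy, git rm drops it from the index and .gitmodules, and the stashed clone under .git/modules lingers until deleted by hand.)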
One of the subrepos needs to be patched, which means we check out our own copy and tell the build where to find it. And there's even a make variable for this! Except autoconf is marshalling data from variable to variable, and if you track it back the top level configure is setting it with a hardwired path.
The whole project is like this. They keep making layers of infrastructure and then hardwiring it to do specific things. There are obviously multiple teams working at cross purposes here, and the PROPER fix would be to RIP OUT all the stuff we can PROVE is not doing anything. But you don't show progress in a Fortune 500 company by REMOVING code. Code has a dollar value attached, generating more of it is always progress. Code gets depreciated and amortized, not _deleted_. Deleting it costs MONEY. Creating more is profit! IBM's KLOCS and so on...
We, on the other hand, are trying to get something to WORK.
So git grew a "fatal: detected dubious ownership" error whenever you cd into another user's directory and try to "git log" a repository. Not a "warning", but "I stubbornly refuse to perform the requested operation". So far the only fix is to sudo and run git as root, where it doesn't care about permissions.
That's really stupid. Barfing this way when WRITING to a repo is one thing, but I'm cd-ing into another user's directory and trying to "git log" and "git show" individual commits there. (I could tar the repository and extract a copy in my home directory so it all belongs to me, but that's deeply silly. And inconvenient.)
I don't know if google has deteriorated to the point it can't find the answer, or if there's no way to fix git short of building it from source with this test patched out. Luckily, there's part of a git implementation in toybox, and this would be a reason to finish and use it.
In the meantime, it makes debugging a build that runs as a different user extra-annoying... and even more brittle than I thought? Darn it, root isn't the load-bearing part of the sudo workaround: if I do an "env -i PATH=$PATH git log" as root, I get the "fatal: dubious ownership" abort again. It's something about HAVING RUN SUDO that makes git go "oh well, if you really mean it". Actually being root isn't enough for git. (I mean, I could destructively "chown -R root:root .git" but then the original user couldn't use it. Before finding out git was treating sudo as magic, I was thinking the right thing to do here is create an LD_PRELOAD library that wraps stat() to patch ownership to always equal getuid(), but even that won't fix it?)
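For reference, the interposer idea looks something like this (an untested sketch, nothing git-specific: build with "gcc -shared -fPIC fakestat.c -o fakestat.so -ldl" and run with LD_PRELOAD=./fakestat.so git log, though as noted it apparently wouldn't help here because git is ALSO checking SUDO_UID):

  // fakestat.c: claim the caller owns every file stat() looks at.
  #define _GNU_SOURCE
  #include <dlfcn.h>
  #include <sys/stat.h>
  #include <unistd.h>

  typedef int (*stat_t)(const char *, struct stat *);

  int stat(const char *path, struct stat *st)
  {
    static stat_t real_stat;
    int rc;

    // Modern glibc exports stat() as a real symbol; older versions
    // routed it through __xstat(), which would need the same wrapper.
    if (!real_stat) real_stat = (stat_t) dlsym(RTLD_NEXT, "stat");
    rc = real_stat(path, st);
    if (!rc) st->st_uid = getuid();  // lie: you own everything
    return rc;
  }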
I am so tired of myopic git developers. The reason I stopped maintaining kernel.org/doc is when kernel.org had a breakin (because one of the devs was ssh-ing in from a windows machine, and once you'd logged in the server wasn't that secure internally across users) they locked the barn door after the horses had escaped by removing generic ssh support, including the ability to rsync over ssh. I pointed them at a way to make ssh explicitly call rsync, with forced prefixes and everything, but they weren't interested: they'd homebrewed some horrible wrapper tool that ONLY let ssh run git (nothing else), so to update the website I had to check everything into git, including things like the gigabyte video file (USB driver writing tutorial) they'd removed after the breakin which I wanted to put back online. If the file then moved elsewhere (it was eventually uploaded to youtube) it would STILL be taking up space in the .git directory both in my local copy and on the server, forever. But they were deep in "If all you have is a hammer, everything looks like a thumb" territory, and I gave up and moved on...
Aha: I asked on the #git channel on freenode, and the magic invocation is "git config --global --add safe.directory '*'", and they pointed at a reference for why the stupid happened. (And they confirmed it's checking SUDO_UID, which is just wrong.) Yay, found a way to make it stop.
Long argument with somebody online who claims that statically linking an initramfs into the kernel is weird, and that in 20 years of messing with Debian and Ubuntu they've never encountered it, so obviously nobody does it because their experience is universal. And apparently contradicting them was considered insulting. (Insisting that having used debian and a derivative of debian makes your experience universal, thus everyone else is weird, was apparently NOT insulting. Go figure.)
I can't link to it because it wasn't cc'd to a mailing list, only to Jeff. Mike texted to yell at me about "burning bridges", which means Jeff forwarded it to him. I'm also informed that Jeff has apologized to them on my behalf, which was not something I'd asked for (or been aware of at the time).
A lot more "huh, Google can't find this" instances during said exchange, which kinda undercut my point that what I was doing is not unusual. The deterioration of Google is getting alarming. But I did manage to at least dig up a few interesting numbers, which I can cut and paste here:
My perspective is skewed. Not just because I'm the guy who wrote the initramfs documentation in the kernel back in 2005, but I also maintain the command line utilities of Android and used to maintain the command line utilities used by Linux routers. That means I hear from those communities a lot, and they're orders of magnitude bigger than desktop Linux. That's not hyperbole: this page estimates there are 33 million active Linux workstations, and 1.6 billion active Android devices. Add in ~750 million routers, of which around 91% run Linux ("somewhere over 500 million" seems a conservative guess), bringing the embedded total from just those two sources over 2 billion active installs.
So I regularly hit things that are "weird" for 33 million installs and "normal" for a couple billion. It's hard for me to convince the developers who make those billions of devices to show up even briefly on linux-kernel because they got tired of being called weird, and being seen as pushy when they try to explain. And if they won't show up, out of sight out of mind. (They think _I'm_ weird for still engaging with the kernel community at all.)
(I didn't even go into the PC hardware space, where Red Hat claims to have a 33% share of the "worldwide server market", although that's in terms of who's paying for their OS, not installs. In terms of seats, all conventional Linux distros together are collectively 2.1% of desktop installs, behind ChromeOS at 2.2%. Windows is over 74%, Mac is 15.3%, and neither NOTICES those two. And in the PC "cloud" space... it's still Windows at 72%.)
Working on toybox stuff today instead of Jeff's thing, but I no longer feel safe huddling in my hotel room (and they're cleaning the room today anyway), so I went to the Hello Office, and when Jeff arrived he got mad that I didn't immediately stop working on the toybox thing and start working on his thing instead, and he left abruptly and angrily. I took the train to Akihabara to try to find a coffee shop there (it's the other part of tokyo I'm familiar-ish with), but didn't bring an umbrella and got caught in a rainstorm. (I have an umbrella in my hotel room and we found FOUR cleaning up the hello office, I am not buying ANOTHER ONE.) Holed up in a random not-mall space, but it didn't have good seating to use a laptop, so I mostly watched stuff on my phone. Eventually a lull in the rain let me take the train back, except I was most of the way to Shibuya before I realized I was going the wrong way down the Ginza line. The train was too crowded to pull out a laptop there either. Got back to the hotel eventually.
Yesterday's entry was too long so I moved the battery tech description here. I have actually learned a lot about battery technology this trip:
Step 1 was to stare at boards taken out of the OLD system being salvaged and repurposed. (Well, diverted. The batteries were ordered for another project but never actually installed. The pandemic messed with shipping logistics or some such. I think they were going to be used at a wind farm in another country, but didn't make it there?) The existing system has thousands of batteries in over a hundred big cube things, each cube sort of an industrial garden shed meant to live outdoors. It's a proprietary Chinese design made from imported western chips whose programming specs require an NDA from the (german?) chip vendor, just to figure out what we can salvage and what we have to reimplement. Each battery management board attaches to a case containing 52 Lithium Iron Phosphate prismatic cells: big blue rectangles with two terminals on top like a car battery, each roughly 15x10x4 centimeters and weighing a little over a kilogram.
The resulting pack of 52 is in a big (aluminum?) case, kind of a horizontal silver version of the Monolith from 2001, which is too big to fit in the elevator to the office, and most of them are a 2 hour car ride away anyway. (Mike has a car but Jeff and I don't. The guy who salvaged the batteries bought a plot of cheap land out in the countryside to store them. It's not anywhere near a train line, and does not have much in the way of hotels either. You know the parts of rural japan that have a lot of abandoned houses and entire towns with no one younger than 65 because of the declining birth rate? Yeah that. Jeff and Mike went out there and fetched stuff a few days before I flew in to Tokyo, but did not bring back an actual battery. Just an assortment of easily removable electronics and lots of photographs.)
So terminology: "battery" is a collection of cells, and "cell" is an individual anode/cathode pair (with electrolyte and separator), in this case those big blue rectangles ("prismatic cells"). The battery cells are wired in series because Lithium Iron Phosphate chemistry produces 3.2 volts (plus or minus ~10% depending on how charged the cell is; the voltage rise/drop is actually how you tell when you're done charging and discharing the battery). So each prismatic cell holds a LOT of power (over a hundred amp-hours) but produces a tiny voltage. Wiring them in series adds up the voltages of each battery, so 52 x 3.2 = 166.4 volts, at a LOT of amps. (Each monolith is roughly like a Ford E-transit battery I think? Same ballpark anyway, I don't have the numbers in front of me.) And each cube has a couple dozen of them: it was a VERY big battery farm.
So the board attached to the front of each of these 52-cell battery monoliths has four AFEs, which stands for "Analog Front End". It's a big analog to digital converter that measures voltages, and each AFE has a 28 pin connector hooked up to it through a zillion little resistors and capacitors. Those pins come from the batteries; the general idea is to have a connection before/after each cell so you can measure the voltage put out by just that one cell, and if it's higher than it should be while you're charging the battery, you can route current around it through the same pins so it doesn't charge up as much as the others in the string. Except the AFE chip can only divert like 1% of the charge current around the cell, so it's just a LITTLE bit of balancing, but it can happen each time you charge, and you can choose to stop early on either the charge or discharge if some cells are hitting an end stop and others aren't yet, sacrificing collective capacity to avoid damaging any of the individual cells.
This is how all battery management systems work, people do youtube videos about this. You start with balanced cells when you assemble the battery pack, and then do a tiny amount of balancing each time you charge them to _keep_ them balanced. If they're all from the same production run and have been linked together since, they should only really get UNBALANCED due to slightly uneven heating. But for the home users making their own battery walls by mixing and matching scavenged cells with very different origins and histories, the battery management system has a LOT more work to do, and may not be able to keep up. Hopefully not an issue here.
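The per-charge decision is conceptually tiny. A toy model of it in C (the 10 millivolt threshold is made up, and this is emphatically NOT the NDA'd firmware, which nobody outside the vendor gets to see):

  #include <stdio.h>

  #define CELLS 52
  #define BALANCE_MV 10  // hypothetical "ahead of the pack" threshold

  // Bleed a little current around any cell noticeably above the pack
  // average: the bypass resistor only diverts ~1% of charge current,
  // so this just nudges the cells back together each charge cycle.
  void balance_pass(const int mv[CELLS], int bypass[CELLS])
  {
    long avg = 0;
    int i;

    for (i = 0; i < CELLS; i++) avg += mv[i];
    avg /= CELLS;
    for (i = 0; i < CELLS; i++) bypass[i] = mv[i] > avg + BALANCE_MV;
  }

  int main(void)
  {
    int mv[CELLS], bypass[CELLS], i;

    for (i = 0; i < CELLS; i++) mv[i] = 3300;  // millivolts, mid-charge
    mv[7] = 3344;                              // one cell running ahead
    balance_pass(mv, bypass);
    for (i = 0; i < CELLS; i++) if (bypass[i]) printf("bleed cell %d\n", i);
    return 0;
  }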
So anyway, this chinese design is 4 copies of the AFE chip vendor's reference design board glued together (literally; they sell it on their website and it looks _identical_), with a different 5th board on one end (haven't found what they copied that from yet), and then the whole thing laminated under at least a millimeter of plastic to keep moisture out (and maybe for electrical insulation). The 4 AFE chips are daisy chained together talking some SPI/serial variant, and then for the 5th board the SPI connection goes to a microcontroller. The microcontroller is from the same company that makes the AFE chip, and it's more or less a motorola 6800 from the 1970s with a bunch of SRAM and flash bolted on.
The 6800 is an 8-bit predecessor of the 32-bit m68k from the Amiga and Macintosh. The MOS Technology 6502 in the Commodore 64 and Apple II was to the Motorola 6800 what the Zilog Z80 was to the Intel 8080: in both cases, engineers who worked on the earlier chip left to form their own company. So a 6800 SOC with 128k of sram is roughly equivalent to a Commodore 128, albeit clocked a bit faster 30 years later.
So that 5th board section is the controller, and at the far end of THAT board is a CANBUS connection to the outside world. CANBUS came from the car industry and is also used in manufacturing automation. The problem is, CANBUS is just a "read value from address, write value to address" protocol that tells us nothing about what's being said. The chinese manufacturer's board design is all under NDA (we recognized the AFE reference board they copied because there's a picture of it on the chip manufacturer's website), and the proprietary-is-good chinese company built their stuff out of chips that are all themselves under NDA. (There are perfectly good non-NDA AFEs on the market, but that's not what they chose to use.)
Even if we felt up to reverse engineering an assembly dump of the biggest program that could fit in a Commodore 128, we can't get at it because this NDA SOC has a fuse you blow to prevent reading the flash back out. (What this system is doing is generic and well understood, and the patents on Lithium Iron Phosphate batteries themselves expired last October (although that's mostly about manufacturing more cells, not managing them). But every design decision in the electronics has been about obfuscating what they're doing to protect largely nonexistent intellectual property. They want to SEEM unique and magic because you can't tell what they're doing, or interoperate with any of their existing stuff to repurpose it.)
So EITHER this control SOC is a dumb translator passing on the AFE info to whatever is at the other end of the CANBUS connection, OR this is where the battery management program lives that's measuring the voltages and making the bypass decisions and reporting how "full" or "empty" the battery is so it doesn't overcharge or undercharge and damage any of the cells. It's either a passthrough or it's the brain, no idea which.
Solution: build a new one and replace the whole board. Which also has the advantage that when these repurposed wind farm batteries run out, we can order more prismatic cells and put them in our own case (Jeff found an aluminum fabricator in japan that would do nicely), and make our own battery systems.
Oh, the old battery cases are also water cooled. (Well, half water half ethylene glycol.) The big cube shed things have an elaborate climate control system, and the spec sheets say these batteries operate within about a 3 degree celsius temperature range. This is partly because they packed a LOT of batteries tightly together into each cube (and run a LOT of power through them), but also because they apparently didn't want to do the math about how the batteries behave differently at different temperatures. (Which Jeff has papers on with graphed curves and math... but the chinese engineers apparently didn't bother.) Which is funny because the 28 pin connector only NEEDS 14 wires to measure the battery voltages (52/4 = 13 cells per AFE, which is 14 taps: 12 between the cells plus 2 at the ends of the string), and most of the rest are probably temperature sensors? If we're using the same case and wiring harness for the initial deployment we want to reuse those temperature sensors, but... no documentation. Gotta go poke at stuff with voltmeters and crack one open to find what's actually there and look up data sheets...
All this stuff has to get re-certified to be hooked up to the grid, so we need to find or create documentation for the parts we keep. (Jeff also wants to find a "current shunt" for measuring the whole battery, because adding up the individual cells isn't good enough. It's apparently somewhere in all this.) Personally, I'm uncomfortable mixing enough electricity to run a car with conductive liquids, but it's what's already there. Cracking them open and then deploying the result is a thing we would rather not do, so we want to just replace the electronics on the front without opening the case. (Also, Lithium Iron Phosphate is WAY SAFER than Lithium Ion. I would not want to do ANY of this with Lithium Ion. The downside is LiFePO4 only has half the energy density of the best Lithium Ion, but it can go through a LOT more charge/discharge cycles without losing capacity. "Lasts fifty times longer and puncturing a single membrane doesn't result in a three hour fire water won't extinguish" is rather a nice trade-off.)
Anyway, coming up with a plan for what to do with all that was "Milestone 2". Initially the goal for that was "design a demonstration prototype unit" (and milestone 3 is building/delivering like 3 prototypes they could show to people), but we wound up debugging through enough of the original electronics (and out the other side) that we came up with a scalable manufacturing plan for all-new replacement parts.
At which point I assumed we would actually start making stuff, but so far...
Jeff wanted me to come along to Shibuya for a meeting with Mike and PK today so we can go over business plans for fundraising. Because of course. I mentioned not wanting to talk to Mike, and Jeff went through several variants of "that's not good for me", "you can't do that", "get over it", and "suck it up and deal" (none of them phrased _quite_ that way), and I went along rather than argue.
I think Jeff's position here is "ha ha, Mike just _threatened_ to have you arrested and presumably deported and barred from the country, it didn't actually happen, so no harm no foul". My position is "Mike showed me who he is". There's probably some divergent neurochemistry in there, what with me being an ADHD poster child and all. (Growing up I was diagnosed "hyperactive and gifted". They hadn't invented ADHD yet. This isn't exactly rejection sensitive dysphoria, because I never wanted Mike's approval: he's a friend of Jeff's whom Jeff finds useful for running the business that gives us the opportunity to work on the interesting tech. That's not the relationship Jeff wants me to have with Mike, and "I work on interesting tech with you" is not the relationship Jeff wants me to have with his business, either. But I started working for him in October 2014, it's 8 and 1/2 years later, and I can't think of a single thing we worked on that actually got deployed. We have not shipped ANYTHING to a customer except prototypes and demonstration units. As with Linux on the Desktop and making Android self-hosting, I keep grinding away and want it to work, and we get closer. There's a bunch of good side effects. But I am no longer trying to organize my household finances around it succeeding: for the moment Google is paying the bills so I can focus on toybox, and THAT is plenty of challenge for me. I don't know how long that situation will last, and am trying to make the most of it. This is VACATION TIME from that.)
I want to learn tech stuff from Jeff, and he's got a lot of great projects to do. I was hoping this trip we might reopen the VHDL work to implement the barrel processor and fourier engine functionality we were talking about before Covid happened. Making an ASIC work through Sky130 _or_ ArtAnalog/TSMC would be great too. Jeff's also talked about doing a fresh j-core implementation, starting over with a tomasulo/scoreboard design so it can do multi-issue. But we're not working on any of that, because we have more important things to do... as in chase money. I always get lured here with promises of tech work, and then we do a big fundraising document.
This time the document was called "milestone 2", and there was at least a lot of high level technical design work involved as we worked out how to recondition a load of batteries someone wants to repurpose from storing power at a windmill farm into individual combini and factory power walls. Only 6 of Japan's 17 nuclear reactors have reopened since Fukushima, meaning they leaned hard back into fossil generation without really PLANNING for that to happen, and in the past couple years the cost of Japanese electricity has tripled as fart gas got expensive due to Vladimir Putin's dick being too small. Load shifting from overnight to daytime is now potentially a big cost savings, and that's a market with legs anyway as wind+solar ramps up, so let's get into the battery management system business! Sure, why not, sounds like fun. I've been watching prudetube videos from Will Prowse and such about this sort of thing for years anyway, I'd love to learn more.
When I first went to work for Jeff in 2014 his company Smart Energy Instruments was trying to retrofit the electrical grid with sensors so we could feed a lot more wind and solar into it. I'm big into renewables and getting off fossil fuel, this IS my idea of fun. I very much want to see this project succeed, and grow into a sustainable business.
In the first ~2 weeks of this trip I learned how battery management systems work, although I doubt I could quite reproduce all the math myself. We've confirmed we can do a new one from off the shelf chips that don't require an NDA, plus technology Jeff has lying around from previous projects. Yay! The result was... a document that goes to somebody who gives Jeff money for having completed the project milestone. (But that somebody is not a "customer" and Jeff was angry that I kept calling him that.)
At the end of my original trip it looked like we were just about to actually start building stuff, so I agreed to stay a couple more weeks. Then as soon as the trip was extended, we did the Open Project stuff to make gantt charts, and today's meeting I didn't want to attend was about preparing for a fundraising round.
After the meeting in Shibuya, Mike wanted to talk to me about setting up a meeting where he and I go over a new contract for me to come back to work for Jeff's company full-time. I was noncommittal. I'd rather NOT be arrested and deported before the 17th.
Travel arrangements for the potential talk in taiwan this summer went a bit off the rails yesterday, because when they originally asked about airport selection I thought they were talking about the DESTINATION airport, not the source (I didn't recognize airport code IAH)... so they booked me out of Houston. Oops. That's a 2.5 hour drive from Austin (if I still had a car; a bit longer by bus). I asked if they could add a connecting flight, since a quick check of commuter flights from Austin to Houston shows a bunch for around $70... and they offered to refund me the $70 when I got there. I was too tired to cope and thought I'd try again in the morning. (Actually I went "maybe I could just send them a video of my talk, and not go in person, because even if they can't amend the itinerary I'm sure they can still get a refund this far in advance...", and was up until 3am doing an outline.)
The money isn't the problem. The USA's insane security theater is the problem. Airports these days, you're supposed to budget 2 hours to get through security, and if I arrive on a different itinerary than I'm continuing on I have to go through security AGAIN (and collect my luggage and re-check it), which means my one hour layover adds 2 more hours, and I still have to arrive 2 hours early in Austin, plus the actual austin to houston flight, plus me getting up and going to the airport in Austin, which means my ~1:30 pm departure out of Houston is now something I should leave home for around 6 am. It's now a red-eye flight with something like 17 hours of travel before I arrive in a strange country to deal with a new kind of customs check and trying to find the hotel. (Tokyo I more or less know my way around now, and can recover from inevitably getting lost more than once on my way anywhere. Taiwan I've never been to, I'm assuming it has a rail system or buses of some sort? To... a hotel? Somewhere?)
What I'd MEANT to work on last night back at the hotel (instead of outlining a talk I don't have to give for months yet, as a way of venting "dowanna deal with this right now" anxiety) was getting Jeff an initramfs that extracts a tarball into a subdir and then does a proper switch_root into that. Which means teaching switch_root that it's not only partition boundaries that block file deletion: it should also skip the destination directory. Oh, and it wasn't doing the mount --move on the existing partition mounts; I thought it was, but apparently I hadn't implemented that.
Doing this means the j-core developers don't have to rebuild the kernel each time to change the root filesystem contents. (They never taught the j-core bootloader to load an external initrd.gz file, because doing that with device tree involves either patching an existing device tree on the fly or doing the device tree overlay thing, and we just never got around to it.) Upgrading switch_root is a good toybox thing to add to the release I'm preparing anyway, so it seemed like a good thing to prioritize.
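The skip logic itself is a small predicate, something shaped like this (a hypothetical sketch, not the code that went into toybox: rootdev is the old root's st_dev, newroot is the directory the tarball extracted into):

  #include <stdio.h>
  #include <string.h>
  #include <sys/stat.h>

  // While erasing old root contents, skip anything on a different
  // filesystem (mount boundary) AND anything under the new root dir.
  static int should_skip(const char *path, dev_t rootdev, const char *newroot)
  {
    struct stat st;
    size_t len = strlen(newroot);

    if (lstat(path, &st) || st.st_dev != rootdev) return 1;
    if (!strncmp(path, newroot, len) && (!path[len] || path[len] == '/'))
      return 1;
    return 0;
  }

  int main(int argc, char **argv)
  {
    struct stat st;
    int i;

    // usage: ./skiptest NEWROOT PATH... (prints which paths survive)
    if (argc < 3 || stat("/", &st)) return 1;
    for (i = 2; i < argc; i++)
      printf("%s: %s\n", argv[i],
        should_skip(argv[i], st.st_dev, argv[1]) ? "keep" : "delete");
    return 0;
  }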
So after about 3 hours of sleep, I got up and started working on that, and an hour later Mike called up to yell at me that I was making "us" look poor by arguing over $70 (the conference that invited Jeff and me to speak had offered to cover travel, I took them up on it, I hadn't yet responded to the email offering $70 instead of amending the itinerary). When I told him my position on that, he shifted to yelling I hadn't left the hotel in days (not true?) and that I didn't have the work for Jeff done yet which he had CALLED INTERRUPTING ME WORKING ON. (And which I do best in the hotel because our Hello Office is hot, stuffy, windowless, and full of trash.)
They're paying for my trip but not paying me a salary. I like tokyo, hadn't been here in years, and from my point of view I got a free vacation to Japan and am helping out friends, but I've GOT a day job working on toybox, and am trying to keep up with that while I'm here. I want to see them succeed, but this ain't paying the mortgage (or the $8k to clean the mold out of the vents back home).
I told Mike that his call had interrupted me doing the thing he had called to yell at me about not doing, and that I was going to go back to doing it, and hung up and muted my phone for a bit. Twenty minutes later I noticed his Signal message: "Rob pick up the phone or I will get you arrested by the police". (Did I mention Mike is a japanese citizen and I'm not? On an unrelated note, when I first got ADHD meds I didn't know they weren't allowed in Japan: different schedule levels and not honoring foreign prescriptions and so on. When Jeff and I cleaned up the office I found all sorts of old pre-pandemic things, many of which I'd asked Mike about the status/location of before said cleanup. Back when I left the apartment in Japan expecting to return _before_ the end of the pandemic, I left behind two suitcases worth of stuff which Mike moved to the office when the lease on that apartment finally expired and they chose not to renew it.)
So at that point it was me trying to get the code done and sent to Jeff before the police showed up. I got it done, tested, checked in, and an updated image (with build procedure) emailed to Jeff. I then left the hotel room (still sans police), met Jeff at the Hello Office, and walked him through the build so I was sure he could reproduce what I'd done.
I'm sure other things happened later that day, but I couldn't tell you what they were. No, police didn't show up. (I dunno if Mike was bluffing or merely changed his mind. I was actually looking forward to maybe getting to go home early, I'm kind of regretting extending my stay from the 3rd to the 17th if it's going to be like this, and NOW the complication with the Taiwan itinerary thing has switched to "I asked them to fly me in from the USA and out to Tokyo, but I'm no longer sure I could/should come back to Tokyo"...)
Cleaned out the spam folder again. For some reason, gmail has decided that all of Elliott Hughes' posts to the musl mailing list were spam. (Not the REST of the threads he's replying to, just him. And not his posts to me or to the toybox list.) Yes, he's a google employee emailing from a google.com address. *jazzhands*
Finding stuff wrong while doing release notes, as you do. (Nothing highlights gaps and weird cornercases like writing documentation.)
I should record my talk proposal writeups instead of just entering them into various call for papers websites, where the ones that aren't selected vanish. (I mean, I SHOULD do them as online talk videos, but it's really hard to motivate myself to talk to a camera by myself in a quiet room. As the twelfth doctor said in the episode Heaven Sent: "I'm nothing without an audience". I was doing ELC, but the Linux Foundation's really done a number on that one and I haven't been able to brace myself for it since 2019.)
Slogging away at toybox release notes. I've done a new entry skeleton that I may leave commented out at the start of the page, because I haven't been entirely consistent in my category headings. I should ALMOST CERTAINLY switch the index.html link to point to the "about" page (which tries to explain what the project is and why) instead of the "news" page (which is long technical gibberish release notes and not a good first impression; yeah it's proof of life, but it's a vertical cliff rather than a gentle slope to ascend).
Oh goddess. You know how news coverage and articles always seem authoritative until you read something you already know about, and then there's multiple obvious errors? I just read the Wikipedia[citation needed] article on Bionic. Lots of "that's just wrong", "that's almost a decade out of date", "Elliott fixed that because of _me_", "no you can just do this instead"... I may need to go lie down. (And I'm not even a Bionic developer!)
Jeff and I cleaned up the Hello Office by dragging most of its contents out into the conference room down the hall and then putting 3/4 of it back and throwing out the rest. (Well, right now it's a pile of trash in the middle of the office because the building's trash room is only open for an hour in the mornings, and we didn't find a box cutter for the boxes so may need to buy one. But I'm calling it a win anyway. Three hours of lifting and hauling. I do NOT get enough exercise.)
During the shoveling I unearthed a mysterious CD which turned out to have the professional photo I had taken (japan still has that service, like Sears used to) for my now-expired zairyu card, which means my own attempt was not strictly necessary.
Woo, 5000th commit to the toybox repository! I feel I should have some sort of celebration. (I bought an instance of the famed famichicki from famimart. Tasty, but very greasy. I prefer their teriyaki grilled chicken breast to their fried offering. I guess both technically qualify as famichicki, but the fried one is the meme.)
I am REALLY TEMPTED to add a new option to toybox echo so it can split the arguments it's printing with newline instead of space. There's a lot of "ls blah/*/blah | xargs | blah" to glue things together, but the other way has to use "echo blah/*/blah | tr ' ' '\n'" which is awkward (quoting both arguments!), but not awkward ENOUGH to add "echo -N blah/*/blah" and open the whole "should I try to teach busybox and coreutils about that" can of compatibility worms.
Yes, I added "test -T 37" recently, because I care that the file descriptor is OPEN, not that it's necessarily a terminal. And I couldn't figure out how to do that otherwise... Sigh, figured it out: I can just do "2>/dev/null <37" instead and the shell will error if the filehandle isn't open (because the dup2() fails). Alright, remove that then. (This is why releases take so long: writing documentation reveals needed code changes, and both blogging and release notes count.)
Under the weather today. The kebab place Mike likes to go has very spicy food, and I killed a roach that crawled over the table while I was eating. This morning my digestion was not happy. Go figure.
Working on closing tabs for a toybox release. So many tabs...
I need to rebuild toolchains with musl-1.2.4, but I never did bisect why sh2eb won't build under newer gcc, and I can't ship new toolchains without that. (Well, ideally I want to rebuild the hexagon llvm too, which had its own version skew a while back. I also want to redo mcm-buildall.sh to not need mcm, at which point I can probably stick the new replacement in mkroot.)
And I need to finish mkroot/README and update the mkroot faq.html entries...
Some text I cut out of my reply, as "not helpful". (I have a venting-about-lkml budget I try to stay under, and that message already had plenty. Here, no such limit. Well, I spent so much of the rump administration venting about what the ruling nazis were doing that I forced myself to only do it on odd numbered days and left the even ones for technical stuff, but A) much higher limit, B) that was people's lives rather than just niche drama.) Anyway, what I wrote was, with URLs moved into actual links because blog instead of non-HTML email:
I was going to point you at the last kernel commit with "oppenlander" in it so you could confirm which email to use, but I have a repository here going back to 0.0.1 and his name's not there despite the patch submissions. He's not a regular in the clique, so nothing he submitted ever got in. That's linux-kernel for you.
(For all my faults, I historically _have_ managed to get code into linux-kernel. Largely because I'm really old so have been around longer than a lot of the grognards gatekeeping these days, and I was even technically the Documentation maintainer between commits 01358e562a8b and 5191d566c023. And I understand the whole "if you want to get anything done you have to complain until you're blue in the mouth" Dead Parrot aspect of the project, which the author of Squashfs eloquently explained ("a closed community which know everyone worth knowing by sight") ten years ago when Linux Weekly News asked if the Linux Foundation had completed its purge of all hobbyists from the open source development process, which it had. They've ossified a LOT more in the 10 years since Philip Lougher wrote that...)
So yeah, happy to submit patches to someone who will actually talk about the code and not the bureaucracy+politics (he says, venting about the bureaucracy+politics).
Jeff and Mike are turning a big todo list I made into Open Project Work Items. I'm sitting with my laptop doing other stuff, but available in case they need to ask questions about the todo list.
I have an old rant about open source being unable to do user interfaces, and it's about how any time it's faced with a user interface issue the process melts down into one of three distinct failure modes. I know I blogged about it but couldn't remember which year off the top of my head, so I googled for "landley three distinct failure modes"... and then put quotes around "landley" because I recently learned that it's silently substituting in random misspellings for words it doesn't think are popular enough... and my blog STILL does not show up in ANY of Google's hits. Nor does the copy of the rant I put into the aboriginal linux about page, which I was reminded of when I looked at the talk version of the rant I gave years ago at ELC and I had that about page version up on the screen.
Google found NONE OF THAT. Despite all three containing the phrase "three distinct failure modes", and two of them being on landley.net. Google search is not healthy. It's kind of concerning: twitter going away is one thing, but Google Search will be _missed_. (They're panicking about chatgpt, but NOT about rapidly losing competence at their original core business. It seems to have started about when they laid off those 12,000 people.)
Today I learned that Open Project (and presumably whatever the generic name for crap-like-jira is) has "stories" and "epics", and an epic is a collection of stories. (Like the Epic of Gilgamesh... which seems kinda unique, and nobody ELSE calls a collection of stories an epic? It's usually a series when it's not a trilogy. Kevin Feige is trying to brand the MCU iron man to endgame collection as "the Infinity Saga".) This "epic" naming is pretentious enough I'm actually slightly nauseous. I would go out of my way to avoid meeting the people who decided on that naming.
Still getting emails for the "Austin Tech Happy Hour", which was a vaguely interesting thing many years ago. It seemed like a good idea to maybe meet some people on the same side of the planet, now that all the local LUGs I knew broke up. (I went three or four times; don't think I actually met anybody I wound up seeing again.) But at some point it grew a cover charge to keep the riff-raff out, and I really don't feel the need to pay $10 to attend a gathering of people I don't know at a bar, thanks. (Meeting random strangers with shared interests in-person is what giving talks at conferences is for. And science fiction conventions. In THEORY it's what meetup.com was about, but all the ones of those I tried to attend were "oh no, you're not allowed to enter the building without paying" nonsense too. And all those SxSW events that supposedly didn't require a badge, I stopped trying those YEARS ago because I never once got in. They were either full to capacity from preregistrations I couldn't access without a badge, or just plain "it said it didn't need a thousand dollar badge but does". As with the twitter blue checks, it's not the ability to afford it that's the problem, it's that the kind of people you're selecting for means I don't want to meet them.)
Alas, my normal daily schedule involves sitting quietly in various corners reading and/or writing things, with the occasional long walk by myself. I often have _extensive_ correspondence with people at least a thousand miles away, but have to go out of my way to exchange ten consecutive words with anybody in the same town who I don't actually live with. There's a reason I founded more than one science fiction convention back in the day. :)
Darn it, glibc's wcwidth() is returning at most 1 for every character toybox feeds it, never 2, even though when you cat tests/files/japan.txt it's all hiragana characters of width 2 (visibly measurable against an ascii text line above it). I'm trying to rewrite fold.c to do unicode properly and the glibc apis don't work.
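The first thing I'd rule out is locale initialization: glibc's wcwidth() answers change with the active locale, and a process that never calls setlocale() is stuck in plain "C" where nothing is wide. A quick test (assuming the environment supplies a UTF-8 locale):

  #define _XOPEN_SOURCE 700
  #include <locale.h>
  #include <stdio.h>
  #include <wchar.h>

  int main(void)
  {
    wchar_t hi = 0x3072;  // hiragana "hi", 2 columns on a terminal

    printf("before: %d\n", wcwidth(hi));  // "C" locale: -1, not printable
    setlocale(LC_ALL, "");                // adopt the environment's locale
    printf("after: %d\n", wcwidth(hi));   // 2, if the locale is sane
    return 0;
  }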
Jeff is deeply enamoured of a pointy haired management thing called "OpenProject", so we spent HOURS yesterday setting it up so he can do gantt charts in it. Except the admin account doesn't work because it immediately goes "cross site scripting!" which turns out to be because the browser doing https is not enough, the openproject application ALSO has to have access to your let's encrypt keys. (Why? I don't know. Third base.)
This thing is the kind of "open source" you see when a corporation produces regularly updated abandonware. It has no community. There is no Libera Chat channel for it. Googling for things about it produces hits on their site and nowhere else (although with the sad state of google search I'm not sure what that proves).
A recurring error in our attempts to set up OpenProject is that their git integration breaks apache, which refuses to start because "OpenProjectGitSmartHttp" is a made up word its config file parser doesn't know. Googling for that word finds a closed bug report on the Let's Encrypt website where the Let's Encrypt people say "this is not our bug, ask openproject". There's also a bug report on the OpenProject website where somebody said it broke, and someone else replied "yeah it broke for me too", with no response and no fix. The bug report is from 3 years ago.
We EVENTUALLY figured out that the magic word is exported by the subversion integration code, so if you enable git integration WITHOUT also enabling subversion integration, it CAN'T WORK. (I repeat, this project has no developer community except employees of the company producing it, and THEY want you to run their magic docker where everything is preinstalled for you and you do not touch their proprietary inexplicable secret sauce "open source" code that you're crazy for trying to install/configure yourself.)
And of course if you enable the svn integration it breaks apache for a DIFFERENT reason, so we just switched them both off for now.
I also noticed that the gmail account Jeff set up for me years ago, which I'm only logged into on my phone, hasn't been inactive like I thought. When I open the gmail app on my phone (only thing logged into it), it says "auto sync is off", and I have to pull down to load to see if there's new mail. This is why I haven't gotten a new mail notification from it since last year. BUT if I try to turn auto sync on, I get a full-screen pop-up saying this doesn't apply to just gmail but will also flush all my photos to google's cloud so they can scan them on behalf of ICE and the TSA. There's no obvious way to enable "tell me when I get new email" without "send my contact list and location history to the governor of texas whose boomer supporters can sue you for a million dollars if they think your wife had a miscarriage". Hell no. I don't want to sync my photos, contacts, location history, I don't want it uploading (let alone retaining) the voice samples from speech-to-text (which I KNOW it can do locally because it does it in airplane mode, I don't know what Rossman is on about? Or is this one of those "it _can_ operate independently but there's no way to tell it not to upload everything anyway" things like I'm having with the email client?)...
So I'm writing a new unicode aware fold and I'd just like to say that posix really needs to move past the Y2K bug and enter the 21st century at some point. They have a "-b" meaning "interpret as bytes", but do NOT really handle the "not that" case.
Backspace is defined as reducing the column count by one, but unicode characters can have variable width (including zero for combining characters which should logically come BEFORE the character(s) they combine with but don't because somebody REALLY STUPID was on the unicode committee, I'm assuming from Microsoft). So in THEORY backspace should remove the number of columns consumed by the last printable character.
In practice, the flush-and-forget approach to output when toybuf fills up is a problem because we may have to backspace into it... unless we record how wide each column of output was? I mean, that's just a malloc of length -w (or shorter if we want to get fancy), AND avoids having to back up through utf8 to find the last printable unicode character.
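A sketch of that bookkeeping (just the idea, not fold.c): one byte per output column recording how wide the character starting there was, so backspace can retreat by the right number of columns without re-parsing UTF-8 backwards:

  #include <stdio.h>
  #include <stdlib.h>

  static char *colw;  // per-column: width of the char that starts here
  static int col;     // current cursor column

  static void emit_char(int width)  // width as reported by wcwidth()
  {
    if (width < 1) return;  // combining chars don't advance the cursor
    colw[col] = width;
    col += width;
  }

  static void emit_backspace(void)
  {
    if (!col) return;
    while (--col && !colw[col]);  // walk back to the char's start column
    colw[col] = 0;
  }

  int main(void)
  {
    colw = calloc(80, 1);  // the "malloc of length -w"
    emit_char(1);          // 'a'
    emit_char(2);          // a double-width hiragana
    emit_backspace();      // backs over both columns of the wide char
    printf("cursor at column %d\n", col);  // 1
    free(colw);
    return 0;
  }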
Talked to Jeff about whether I should bump my flight back. We're getting a bunch done, and Fade and Fuzzy don't... strongly object. I enjoy Tokyo, and get about as much toybox work done here as I do elsewhere (the better work environment balancing out the extra demands on my time, although I hope fixing the mold in the vents back in texas changes that going forward).
I'm basically getting a free vacation in Japan, modulo not really seeing much of it outside of late night walks (tried to walk south to the beach; there's no beach, it turns into an industrial harbor sort of thing. Oh yeah, city. Right...) but I'm STILL trying to learn nontrivial amounts of the language.
Everybody keeps saying "the food is so much healthier here, you will lose weight", but between 24 hour combini with tuna mayo rice balls and sweet milk tea EXACTLY the way I made it growing up (which was dismissed as absurd by everyone else; where's the Fools I'll Show Them All lightning when you need it, VINDICATION!!1!ichi!), it hasn't exactly worked out that way. I have two relevant Claire Ting videos queued up, but I'd have to shuffle luggage for multiple minutes to clear space... so, long walks. I wonder if there are any swimming pools available?
I'm eyeing a toybox release. It's overdue, I know, but there are so many things I'd like to get IN said release. Still, 6.3 came out and I should more or less try to stay synced with kernel releases...
The downside of this lovely office is I have no ADHD meds here, because they're not legal in Japan, and I'm really starting to notice.
Sigh, SMB_SUPER_MAGIC is still lying around, and it got moved to staging in commit 2116b7a473bf and then removed in 939cbe5af5fb in 2011, which was TWELVE YEARS AGO, and yet the debris is still not just in the kernel tree, but in the header files exported to userspace. Oh hey, and USBDEVICE_SUPER_MAGIC is gone too (commit fb28d58b72aa back in 2012), but the symbol's still exported in the header. Oh, and last time I was poking at this, Novell Netware went away in commit bd32895c750b but they still have NCP_SUPER_MAGIC in the header.
The observation in The Cathedral and the Bazaar that "with enough eyeballs all bugs are shallow" has been demonstrably untrue of linux-kernel for at least that long. There are not enough eyeballs because the kernel community is unwelcoming of newbies who would go over the obvious with fresh eyes and thus point out stuff like that. Just like the geriatric Unix community it replaced, it's now all old farts with long white beards and suspenders telling everyone who will listen about the glory days ~25 years ago.
Huh. I've got "fat" as 0x4006 in my list, and can't find that in the kernel source (not current, not 4.0, 3.0, 2.6, 2.4, 2.2, or 2.0). It came from a patch from Hyejin Kim but I have no idea where he(?) got that from? There's "msdos" and "vfat" (both 0x4D44), but no "fat" using 0x4006... And the 4d44 constant was added in linux-0.9.7 in 1992.
Right, posted the patch with a jazzhands comment, poked the github request to see if that fixed it for them, and punted on a BUNCH of questions. (If I identify smb do I say "cifs" or "smb3", both of which are driver names you can mount it with but... different behavior? msdos vs vfat is another but there's never reason NOT to use vfat these days that I'm aware of...) What I should really do is come up with a Horrible Sed Invocation that just extracts this data from the kernel source so I can regression test, but I'm not up for it right now. (In part because grep -ho 'register_filesystem[(][&][^)]*)' -r * | sort -u | wc -l says there's 97 of them, and in part because the first one grep finds is in arch/s390/hypfs/inode.c and I really can't bring myself to care about that one at the moment. And because grep 'static struct file_system_type .*_fs_type = {' -r * | wc returns 83 hits rather than 97 meaning this is NOT quite regular enough to make it easy.)
Fade finally tried one of the cans of The Dintiest Moore I left behind (I asked her to order a flat while I was there), and is a fan. It's the only american product I've encountered that makes serious use of Demi-Glace. (I don't know what non-demi glace would look like. Full glace? Ask the french. Highly boiled cow.)
Watched a frustrating history of gasoline video, which both had good historical information and repeated debunked lies out of old industry press releases verbatim.
A hundred years ago, Standard Oil worked out that mixing about 10% ethanol into gasoline prevents engine knock. All the lead in tetraethyl lead EVER did was make it PATENTABLE, because ethanol (which is the kind of alcohol humans have been drinking for thousands of years) already existed. The lead served no other function in the mixture EXCEPT to make it patentable. Tetraethyl lead is four ethyl groups connected to an atom of lead (resulting in a molecule shaped like a swastika), and when you heat it you get back four ethyl radicals, plus a free radical of lead which goes out the tailpipe. It otherwise behaves EXACTLY like mixing ethanol into the gasoline would (which was the goal of developing the compound), and when it was finally restricted by the EPA they replaced it in gasoline with plain ethanol. Old engines that COULD use leaded gasoline (because they didn't have a catalytic converter, which the lead binds to, covering over the catalyst surfaces that otherwise break down incomplete combustion products like carbon monoxide and nitrogen oxides), all those old engines worked JUST FINE with "unleaded" gasoline, and people only thought the stuff with lead was "better" because of years of advertising lying to them and causing placebo effect performance evaluation.
The airborne lead also made people exposed to it measurably more stupid, which is combining badly with senility in the current Boomer generation, as age-related neurological degeneration overwhelms their ability to compensate for a lifetime of nerve damage from massive pediatric and chronic lead exposure. (This is why everyone fled the cities to the suburbs: they moved upwind so they could breathe! But it was only RELATIVELY better, the air of the ENTIRE PLANET was poisoned. Airborne lead was like acid rain and the CFCs that caused the antarctic ozone hole.)
Keep in mind that organic lead compounds are generally even worse than metallic lead, because the human body is better at absorbing organic compounds and bringing them inside cells. So both tetraethyl lead itself and the free lead radicals going out the tailpipe in a cloud of superheated moist carbon monoxide and so on... that may have poisoned the Boomers WAY more than the largely inert residue it's broken down into 30+ years later. Some compounds are worse than others: the movie Erin Brockovich talks about hexavalent chromium being WAY MORE TOXIC than other chromium compounds, and the research chemist Karen Wetterhahn was killed by a couple drops of dimethylmercury poisoning her through her glove. The leaded gasoline profiteers were intentionally putting lead into volatile organic compounds that people would inhale, and the neurological damage the Boomers suffered from this is manifesting VERY STRONGLY in their senior years.
Seriously, I wrote about this at length, with citations to multiple articles about it. Water samples taken in the middle of the pacific ocean had 20 times as much lead near the surface as the same location a few hundred feet down. Blood lead levels were SIX HUNDRED TIMES higher than samples from ancient egyptian mummies, and children absorbed 5 times as much as adults did. The Boomers were the first generation to grow up surrounded by cars, and it HURT THEM BADLY. In their 20s they could mostly compensate. But as they slowed down in their 40s the brain damage really started to show, and now that they're turning 70 two thirds of them are losing all touch with reality. This is not a case of oligarchs being better at manipulating people than the Railroad Robber Barons of the Gilded Age of the late 1800s, this is a population of lead poisoned vegetables ripe for elder abuse. Ten years ago they were falling for nigerian prince email spam, and now it's fascists finding them useful political cannon fodder. If even the rich and famous regularly suffer from elder abuse, imagine what the wider population of brain damaged Boomers is undergoing. Boomerdom going full nazi is because they literally have brain damage, which means our best chance to pull out of it and clean up afterwards is to outlive them.
Back to the frustrating video: when he later goes on to talk about "oxygenates" like ETHANOL... he does not connect the dots. This was not a new discovery. Thomas Midgley and his bosses understood this JUST FINE a hundred years ago. They chose to poison LITERALLY BILLIONS OF PEOPLE around the world entirely for profit. And then when the oil industry stopped needing "the Ethyl Institute", the think tank reorganized itself into The Tobacco Institute to defend poisoning OTHER people for profit. And when that ran out, they reorganized into a bunch of global warming denialist think tanks to continue to kill people for profit.
Billionaires love to profit from fascists, and gerontocracy collapses into fascism, and we're suffering from both right now. On the gerontocracy thing: Hitler came to power in Germany because the previous President of Germany, World War I commander Paul von Hindenburg (then 85), made him Chancellor in January 1933 to shut him up (ahem: in hopes sharing power would appease him). Hindenburg was then manipulated into signing an emergency declaration ONE MONTH LATER giving Hitler's edicts the force of law, not subject to judicial review for the duration of an emergency that lasted until Hitler said it was over. When Hindenburg died in August 1934 at age 86, Hitler appointed himself president AND chancellor. 86 is roughly Dianne Feinstein's age (and she's still in the senate), 3 years older than Nancy Pelosi (who is still in congress), and the same age Biden would be at the end of the second term he just announced he's running for. At least all those guys are OLDER than the pediatric (but not chronic) lead exposure from gasoline.
Oh good grief, now the guy in the video is on about ethanol coming from plants that absorb carbon dioxide: STOP IT. All that matters is whether it's fossil carbon or not. Plants taking carbon out of the atmosphere for SIX MONTHS before it goes right back into the atmosphere does not change the amount of carbon in the atmosphere in any meaningful way. Mining operations that take carbon that's been underground for millions of years and release it into the atmosphere, THAT is what permanently increases atmospheric carbon. I do not care about rearranging deck chairs on the titanic: either you're mining fossil carbon or you aren't. (The problem with "carbon sequestration" is finding someplace to put it. A trillion dollar industry digging up carbon from miles underground is kinda hard to run in reverse at the DESIGN level...)
Reading press releases is not research.
Sigh, archive.org decided to commit seppuku during the pandemic (let's aggro every major publisher by putting their books online for free!) so I should definitely mirror the institutional memory post in my own computer history archive before it goes away. (Yes IP law is stupid the same way car-centric cities are stupid, but running out into traffic is not the answer.)
Finally got the turtle board running the 6.3 kernel and current toybox (increasing my kernel patch stack to 10 patches in the process), and... there are bugs. For some reason, ctrl-C doesn't work in the console which means oneit isn't doing the switch from /dev/console to /dev/ttyS0 (well, ttyUL0 there) properly. Another problem is that "ps" produces no output, even though I can cat files out of /proc and see the raw data it should be transforming into output.
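The ctrl-C half smells like a controlling terminal problem: the kernel only turns ^C into SIGINT on a process's controlling tty, and /dev/console never volunteers for that job. The handoff needs something shaped like this (a sketch of the general technique, not oneit's actual code; real init code forks first, since setsid() fails for an existing process group leader):

  #include <fcntl.h>
  #include <sys/ioctl.h>
  #include <unistd.h>

  int main(void)
  {
    int fd;

    setsid();                          // new session, no controlling tty
    fd = open("/dev/ttyUL0", O_RDWR);  // the turtle board's serial console
    if (fd == -1) return 1;
    ioctl(fd, TIOCSCTTY, 0);           // adopt it: ^C now delivers SIGINT
    dup2(fd, 0); dup2(fd, 1); dup2(fd, 2);
    if (fd > 2) close(fd);
    execlp("sh", "sh", (char *)0);     // hand the console to a shell
    return 1;
  }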
Alas, I still don't have a proper nommu test system set up under qemu, and sneakernetting sd cards over to the turtle board for compile/install/test cycles is... really hard on the fingernails. I burned out (gummed up?) one SD card adapter already and bought a new one that's REALLY TIGHT, and have been slowly chipping bits of plastic off the ridge at the end getting the sd card back out. (The turtle board itself does the push-to-click thing but the laptop end uses a microsd-to-sd adapter, unless I want to dig up a USB adapter which is worse. I've already trimmed my fingernail to be less pointy in hopes of chipping out LESS plastic, but it's a question of degree.)
Sigh, at some point I need to do this dance with QEMU's virtual cortex-m board so I have a nommu test environment that runs under qemu, which should make regression testing this a lot easier. The problem is I don't have a "what success looks like" reference version there. Maybe I can beat one out of buildroot? (Or make puppy eyes at Geert Uytterhoeven about coldfire, that's a nommu target qemu theoretically supports as well, although I recall getting a kernel/board config to match with nontrivial amounts of RAM and useful peripheral devices didn't line up last I checked. Sigh, I should learn to modify QEMU, but just haven't got the spoons.)
The magic to stop vim from intercepting the mouse, thus preventing the terminal from letting me copy and paste text between a screen session at the far end of ssh and a local window, is the colon command "set mouse=" with nothing after it. There may have been a small rant.
I'm not just merging the j-core turtle board config into mkroot and cleaning up mkroot in general in preparation for cutting a toybox release, I'm also testing the 6.3 kernel. Of course there's kernel config weirdness. Kernel commit 3508aae9b561 memorializes a lot of config changes back around v5.8 that I wasn't paying much attention to at the time. IOSCHED_CFQ became IOSCHED_BFQ, IOSCHED_DEADLINE seems to have replaced the NOP one (always configured in), and MMC_BLOCK_BOUNCE went away because you can't switch off the bounce buffers anymore. MTD_M25P80 got merged into MTD_SPI_NOR.
Dirty trick: I can detect NAME=VALUE in the mkroot microconfig format and automatically insert lines other than =y or =m without needing the separate KERNEL_CONFIG mechanism... Except that the value can in theory have a comma in it. (None of the ones I'm using yet do, but they CAN.) Hmmm, I suppose I can come up with an escape mechanism for the comma? And then NOT have an obvious example of it in the file. Hmmm... The alternative is keeping the second mechanism for passing in raw lines despite nothing in the file currently using it. Or waiting for somebody to complain, which... isn't really better here because said complaint is likely to turn into "oh I can't use this" rather than "I'd better report this to the maintainer". Hmmm...
(I can backslash escape quotes and spaces, but can't backslash escape commas because the escape gets eaten before that parsing happens. I could transpose it with another character but that's black magic. I could say an assignment has to be the last thing on the line so it eats commas but I've already got multiple assignments in one config. Hmmm...)
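For the record, a doubled comma would survive the existing backslash pass. A purely hypothetical sketch of that split (yes, this is the "transpose with another character" black magic, but contained to two lines; the \x01 sentinel and everything else here is illustration, not what mkroot currently does):

line='CMDLINE="console=ttyUL0,,115200",NET=y'
sep=$'\x01'
hidden="${line//,,/$sep}"               # stash escaped commas out of the way
IFS=, read -ra fields <<< "$hidden"
for field in "${fields[@]}"; do
  printf '%s\n' "${field//$sep/,}"      # restore literal commas per field
done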
And the air conditioner service guy back in Austin found mold in the vents. So we have to make an appointment with a Mold Remediation Specialist. Great. Well, that explains why I'd feel so tired five minutes after getting home, and had so much trouble getting a good working environment there and preferred to do all my work out at a fast food table or at the university.
(This is exactly why we had very expensive specialists come after every flood with HUGE BLOWERS and refrigerator-sized dehumidifiers drilling holes and spraying gallons of chemicals into the walls: DID NOT WANT MOLD. Didn't really think of the air conditioner vents, where condensation is kinda normal. What's in there for them to eat, anyway? Dust, I guess...)
Anyway, I'm here in tokyo, where mold smells completely different anyway. (The clothes I left hanging to dry in the apartment needed some serious re-washing.)
It's so easy to just spend the ENTIRE DAY in an APA hotel room, and ignore the outside world. I shouldn't, because it's Tokyo out there and I really like tokyo, but it's SO QUIET. (APA is apparently the middle three letters of Japan, at least on their posters... Which is weird because here it's Nippon. Medieval Dutch and Portuguese traders asked OTHER countries what those islands over there were called and "Japan" seems to have emerged via consensus from multiple languages playing telephone, and then the same insane map makers who named two whole continents after Amerigo Vespucci went sure, "Japan", sounds great).
I figured out why Jeff can't stand them, or the windowless Hello Office: the pandemic gave him claustrophobia. Being in enclosed spaces too long gradually increases his stress levels and he needs to go OUT somewhere. I can relate, but am personally experiencing the opposite here. Let me work!
The main limiting factor is Jeff calling me up and pulling me in to his projects, but he is paying for the trip.
The old j-core ethernet driver was just too messy to submit to mainline. It's no secret, but Jeff outsourced it to some cheap Russian programmer (the lowest bidder) years before we ever met and only like 1/3 of it is actually relevant. It's got all sorts of debris from IEEE time synchronization and such that was never completed. We should really write a new one, but never got around to it.
That said, the last time it got forward-ported was 5.8, and we'd like to use it on current (6.3) kernels, and I bisected the FIRST build break to commit adeef3e32146, which made a field const and added a gratuitous new API to change it. There's a bunch of commits (bb52aff3e321, 0f98d7e47843, 9a962aedd30f) converting drivers to the new API, so it wasn't too hard to fix it up. The other breakage (b48b89f9c189) removed an argument from a function, and was easy to fix up.
Bisected the "Turtle works now" bug to commit 5d1d527cd905, which was a rewrite of the RCU plumbing for the networking code that starts "Using rwlock in networking code is extremely risky..." So yeah, I'm willing to leave that part to the professionals. The symptom they saw was soft lockups; it fixed our boot hang. Calling it good and moving on.
I have discovered LaserPig on Youtube, who answers the question "What if Sheogorath, Daedric prince of Madness from Skyrim, did youtube videos about the war in Ukraine in character as an extremely drunk farm animal with strong opinions about military history and equipment". I discovered him via a team up with the "oh god you've reinvented trains again" guy who keeps photoshopping Elon Musk into a clown outfit.
Trying to have mkroot more gracefully straddle the patched vs unpatched kernel issues, and also get the init script to work nicely in both QEMU and a chroot/container. Added test -T to check if stdin is open; test already has -t, but I don't care if it's a tty because a chroot with redirected stdin/stdout (or piped through something) is fine and does not need to be replaced with /dev/console.
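In the init script that boils down to something like this sketch (assuming -T takes a file descriptor argument the way -t does):

test -T 0 || exec </dev/console >/dev/console 2>&1   # leave working fds alone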
Sat down to figure out why the current vanilla kernel broke on turtle, and... it's fixed? It smelled like an alignment issue (unaligned access), and maybe it got perturbed so it's aligned again? (Or else there was a bug that hit somebody else and they fixed it?) Either way, that means the revert commit in Rich's j-core patches is no longer needed although I'm still gonna track down what fixed it so I know. (If bad alignment got perturbed into place again, it'll be back.)
And now that there's a new arch/sh kernel maintainer I'm looking at those to see what's still relevant. Vladimir Murzin's commit was merged into vanilla. There's one adding extra percpu memory... why? Rich does not actually provide descriptions with his patches, so I have no idea what actual PROBLEM he was trying to fix, and the kernel's attempts to describe this plumbing are not enlightening.
Ok, generic commits that still apply: 4c7333b0fb9e, 53ac9fc75ae0, and 262e1e5884da, which could maybe go upstream as-is. Commit 155d2abffb8b is jcore-specific (the clock thing), but I think it's generic-ish and could go into the vanilla tree? (Need to test that the turtle board as-is still works with it.) The ethernet is 186e1d80a89b and 666583fa6d5d, gratuitously split in two for no obvious reason.
I'm test building on my turtle board, by repeating:
for i in ../toybox/000[34578]*.patch; do echo $i; patch -p1 -i $i || break; done
sed -Eis '/select HAVE_(STACK_VALIDATION|OBJTOOL)[^_]/d' arch/x86/Kconfig
patch -p1 -i ../linux-sh/0001-percpu-km-ensure-it-is-used-with-NOMMU-either-UP-or-.patch
patch -p1 -i ../linux-sh/0001-revert-790eb67374-to-unbreak-j2.patch
mkroot/mkroot.sh CROSS=sh2eb LINUX=~/linux/github
sudo bash -c 'mount /dev/mmcblk0p1 /mnt && cp root/sh2eb/linux-kernel /mnt/vmlinux && umount /mnt'
sudo microcom -s 115200 /dev/ttyACM0
Bash command line history comes in handy there: cursor up a few times and hit enter, take out the sd card, put it in the holder, put it in the laptop, run the thing, take it out of the holder, ka-click it into the board, plug in usb, frown at boot messages, rinse repeat.
Bisecting stuff is awkward (why the above [34578] skips some numbers and a couple patches are broken out as individual lines). Much annoyance because git insists that old is good and new is bad. You're never searching for where something got FIXED, only for where a bug was introduced. Therefore, to find the commit where the turtle board started working again (without reverting commit 790eb67374 in the patch above, that's the "it just started working again" thing) I had to call the one that does not boot "good" and the one that boots to a shell prompt "bad". Because git.
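So the bisect session ends up deliberately backwards, ala (the hashes here are placeholders):

git bisect start
git bisect good $HANGS   # the old commit that does NOT boot: "good", sure
git bisect bad $BOOTS    # the new commit that boots to a shell: "bad". Because git.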
For extra fun, I have to build each one without the revert first to see if I get output, and then build it again with the revert to make sure I DO get output and it's not a different bug. So the cycles are a bit slow.
And mkroot is set up to always do a full build. I can do incremental builds out of tree, but then I have to hack the config file to point to the initramfs directory via absolute path and I dowanna.
The hotel rooms in Japan are lovely. Jeff says they drive him nuts, but they're giving me something I haven't had nearly enough of: quiet well-lit isolation with a desk and an outlet and internet access where I can get work done without interruption. (Especially now I don't have to be out of them by 10am, at least 3 days out of 4.) A 5 minute walk away there's tea the outright weird way I grew up drinking it (cold, sweet, with milk: confusing BOTH sides of the atlantic), and cheap tasty rice triangles (onigiri) which I finally figured out the intended way to open so the seaweed goes around the rice. (There's a pull tab that peels away to split the packaging with a plastic thread down the middle, and then you pull it equally off both sides so the seaweed stays in place. Gets seaweed crumbs on the desk, but otherwise works great. The point is the seaweed and rice are separate until you open them so the seaweed stays dry and crispy.)
I hadn't actually _installed_ the qemu targets I rebuilt back at the end of march, and now that I'm trying to test them "mips" still isn't working. And I don't remember what specifically I built (I can see the git log but it doesn't help), so I think I need to pull and rebuild just mips, which is probably still "./configure --target-list=mips-softmmu"? (The QEMU devs have a terrible habit of breaking their API for no obvious reason.)
I locally checked in the "move mkroot to its own directory" stuff and pulled it into my main tree, but haven't pushed yet. I should add a "Hey! This moved!" stub when you try to run the old one, but the #!/bin/echo command line gets the name of the script you're currently running as an extra argument, and "This script moved to mkroot/mkroot.sh scripts/mkroot.sh" is... not clear.
I need ldd to fix up mkroot/root/dynamic, which runs after populating the airlock. Alas Elliott strenuously and repeatedly objected to toybox containing an ldd capable of running that loop, because it wouldn't invoke glibc's dynamic linker to find where something is currently loaded into memory when you haven't loaded it into memory. How a cross compilation running ldd on a mips binary is supposed to tell you where that library is currently loaded into an x86-64 system, I couldn't tell you, but the FSF keeps making their binaries fatter and fatter. The one in uclibc never did this.
Back in 2020 a third article about the greying of Linux came out, but it's already fallen off the web and you have to fish it out of archive.org because "Linux has gone the way of Unix, maintained by crotchety greying grognards who scoff in all directions outside their insular little niche" isn't really NEWS anymore.
It's one of those days where I'm skittling along a giant dependency chain doing twenty minutes work on one thing and then going "but first I need to do X" and getting five things deep before I go "what can I actually FINISH AND CHECK IN RIGHT NOW".
The most recent email with the guy who needs me to update scripts/root/dynamic wound up with him being able to use the prebuilt musl toolchains I provided (he was saying Linux= instead of LINUX= but it's case sensitive), but I should still poke at the "dynamic" target because mkroot shouldn't REQUIRE musl, which led me to asking whether static linking in bionic is working yet (mkroot didn't used to be able to use it because of the "segfault with no stdin" bug hitting PID 1, which got fixed upstream but hadn't made it into the NDK yet, and there's a new NDK (r25c) so I downloaded that and extracted it and went "huh, creating the cc symlink I've been doing works but seems silly because there's no OTHER tools prefixed like that in that directory anymore, where did they go? There are a bunch prefixed with llvm- except there's no llvm-ld in there...") and so on down a rathole I've also parked and stepped away from because NOT RIGHT NOW.
But I tried to build with that in a fresh directory and defconfig of course barfed with bionic because it hasn't got the shadow password plumbing, except I redid lib/password.c and friends in my tree (the new one doesn't USE the shadow.h nonsense awkwardly bolted alongside the original user/group stuff by shadow-utils back in the 1990s) and I really need to test and check that whole rewrite in, except it's big and intrusive so copying it to a new fresh directory for proper testing required some investigation: "git diff lib" says that lib.h, password.c and pending.h (which I deleted as part of this work) are the changed files to marshal over, but the new password.c has three functions (get_salt, read_password, update_password) which grep says are used by: passwd.c, su.c, login.c, mkpasswd.c, chsh.c, groupadd.c, groupdel.c, sulogin.c, useradd.c, and userdel.c. And in my big dirty working tree the changed files are passwd.c, mkpasswd.c, chsh.c, groupadd.c, and groupdel.c, so that's what I should copy to the new tree and try to build.
Except to test this stuff I need a mkroot build (not letting it write to my /etc directory as root just yet, thanks), and I ALSO have a toybox directory where I'm moving mkroot out of scripts/ and into its own mkroot/ subdirectory (where I can give it its own README), and there are two edge cases I'm not sure whether I should move: mcm-buildall.sh and record-commands.
Design-wise scripts/mcm-buildall.sh remains a rough edge because it populates the ccc/ directory at the top level, not under mkroot/. The problem is once again "lifetime rules" (you don't rebuild the toolchains every time you rebuild mkroot). So... does it stay in scripts/ or does it move to mkroot/ with mkroot.sh and test_mkroot.sh and the scripts/root directory? It's not really part of toybox, it's an important dependency for mkroot (CROSS= there is what expects the ccc/ directory), and mkroot is what has the plumbing to download external packages (via mkroot/root/plumbing) so it kind of _does_ need to be in there... But if it IS in there then the README is hard to write, because the logical sequence of scripts is then 1) cccbuild.sh, 2) mkroot.sh, 3) test_mkroot.sh. But 99% of the time, you don't RUN cccbuild.sh. Heck, most newbies will probably download binary toolchains because building them is a pain.
The other thing is I want to rewrite mcm-buildall.sh so it doesn't use Rich's musl-cross-make repository anymore and is its own standalone cccbuild.sh instead, because Rich doesn't reliably maintain musl-cross-make (the last commit to it was just over a year ago), and it's really not helping much anyway. The Linux From Scratch partial build script I posted to the toybox list last month builds a gcc variant without jumping through that many hoops, and I'm leaning towards just doing my own build directly rather than working out how to feed configuration stuff through Rich's plumbing to the gcc build. I've already added a couple of my own patches to his that he won't take, and have a couple more queued up that I poked him about but he ignored. (That said, I believe he and his family are still touring Indonesia? I type this from Tokyo, can't throw glass houses at anybody, but I try to stay in touch. He's been insufficiently communicado for a while now.)
And then there's the whole "llvm toolchains" can of worms I need to reopen at some point, which musl-cross-make is no help at all about... I suppose the pending rewrite is a good excuse to leave the old one in scripts/ for now?
ANYWAY, I'm trying to write up the new README, starting from the ancient README back when it was a standalone project, and the FAQ entry (which is another thing I need to update before checking in the move; I should probably leave a symlink from scripts/mkroot.sh to ../mkroot/mkroot.sh in the tree).
Oh hey, today _is_ the every-fourth-day that they clean the room. When I asked the guy at the front desk what time I had to be out by, he said it was tomorrow.
And lo, I have my laptop available again (yay adapter), a quiet hotel room (APA is now only cleaning the rooms every 3 days so I can stay in it all day if I like), and rather a largeish todo backlog. Let's see:
Upgrade test suite so gentoo can run it. Request filesystem type, umount -l.
ldd chroot https://github.com/landley/toybox/commit/e70126eabef8
Finish lspci -x fallout.
Check compression? https://github.com/landley/toybox/issues/386 http://lists.landley.net/pipermail/toybox-landley.net/2023-April/029520.html
Finish cgroup stat support. https://github.com/landley/toybox/issues/423
Yifan Hong's continuing tar weirdness: https://android-review.googlesource.com/c/2536710
Peter Maydell qemu Malta patch?
Tom Lisjac (and previous guy) want scripts/root/dynamic https://github.com/landley/toybox/issues/418
David Legault, fold tests. (Promote fold?) https://github.com/landley/toybox/issues/424
vmstat for zhmars https://github.com/landley/toybox/issues/422
sizeof(toybuf) https://en.cppreference.com/w/c/language/_Alignas
fix sh2eb mkroot build (toolchain and kernel)
gzip --rsyncable, implement deflate, implement rsync...
Ongoing cleanup of mdev.c started on plane due to /sys/block poke. http://lists.landley.net/pipermail/toybox-landley.net/2023-April/029525.html
Finish the cp -s work so I can do install -T.
Try to beat a multi-console thing out of mkroot+qemu to test oneit change http://lists.landley.net/pipermail/toybox-landley.net/2023-April/029531.html
Pretty sure I've missed multiple things there. Plus I _was_ planning on cutting a release before visiting Tokyo. And there's the Linux From Scratch automation script so I can go back down the aboriginal path of making a self-hosting toybox environment...
Ah right, there are no three prong outlets in Tokyo. And I brought a three prong laptop charger. That's inconvenient. My plan to program all morning until the sun came up (what with being waaaaay off this timezone in my sleep schedule) hit a bit of a snag there.
Met with Jeff in his office, unboxed, disassembled and reassembled the oscilloscope, talked about his battery project, went out to dinner with Mike and some of Mike's friends in Shibuya where we went to a chinese-run restaurant that allows smoking indoors, where I found out that after a few years of not trying to eat while breathing cigarette smoke I've lost my tolerance for it. (As in "mouthful of food and lungful of air combines to convince my brain I've got a mouthful of cigarette ash, and forcing myself to swallow triggers a nausea reaction that lasts all night." That was not fun.)
Air travel moved the clock forward 12 hours and more or less ate today. Went to bed at 8pm local time anyway, which was something like 5am relative to where I got up this(?) morning, after getting maybe an hour of sleep on the plane. (Sitting bolt upright. Horrible neck cramp.)
But at least I have delivered the giant oscilloscope box to Jeff, who dumped it in the office. Tomorrow I need to reclaim the giant pile of laundry and books and such I left in the apartment I couldn't get back to during the pandemic.
Onna plane. Got up at 5am to go to the airport. Flying from Minneapolis to Toronto (which is the wrong direction?) and then Toronto to Narita airport in Tokyo. Between the layover and the going the wrong way part, it's like 17 hours of travel before I even get to customs at the far end.
It's a lot easier to get programming done on a plane that ISN'T 100% full. Getting up at 5am after finally adjusting back to a day schedule doesn't help either. I had grand plans for the 14 hour uninterrupted block, but don't have the focus.
Forgot to eat this morning (caffeine yes, food no), was quite appreciative of the first meal on the international flight at like 1pm minneapolis time. That may be a contributing factor to the lack of focus...
Huh. Given the way adler32 works, if you're just looking for a run of zeroes at the bottom and it's 16 bits or less... you don't need the whole algorithm. It's just "add up the bytes modulo the largest 16 bit prime" (65521).
That really seems unreliable? I mean... ok, fast. But "runs of zeroes" are legitimately a thing? If you compress all zeroes it's just gonna reset every minimum window size (4k)?
I still want to figure out how to do the rolling adler32 of the top part. I KNOW I worked this out before, my blog says I did it in 2001 and again in 2013, and it would be nice if I'd actually DONE it back then rather than restarting every 10 years.
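A sketch of just that low half, in shell since it's handy (od feeds in decimal byte values; this ignores the rolling-window part, and the 12 bit check is an arbitrary example):

lowsum() {
  local a=1 byte   # adler32 starts its low half at 1
  for byte in $(od -An -vtu1 "$1"); do
    a=$(( (a+byte) % 65521 ))
    (( (a & 0xfff) == 0 )) && echo "reset point"
  done
}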
Of course today's interrupt is updating the filesystem type detection list, which is tricksy because the kernel isn't consistent. I already have one more small patch (basically a repeat of the v850 removal patch) to send to lkml, but they'll just ignore it. (I need to reply to Andrew Morton, but "you guys no longer take obvious one line fixes" is hard to say POLITELY.)
[Editorial, April 15th as I'm fixing this up to post and replacing the [LINK] with an actual link... WOW Google search is imploding fast. Googling for "linux landley v850 patch" does not find that patch, nor does adding "elf" before patch. Adding "remove" before v850 finally found one copy of it in mail-archive.org, which is not the kernel's own lore.kernel.org/lkml nor is it the iu.edu one that's been there since 1995, nor did it find a copy in any of the archives in the vger list for linux-kernel. Google search is blind to all of those. I got the above link out of my preferred archive by checking the date on the post in the one copy Google DID eventually find after all those retries, and then going to lkml.iu.edu and manually navigating there from the top down. Remember when it was easier to google for stuff than bookmark things? Not anymore...]
To avoid preparing for my flight I've been stress baking, using up the half-finished ingredients in the fridge of types Fade doesn't use, to produce food she'll eat. She tends to make big batches of what she calls "kibble" once a week and put it into individual plastic tubs, and then have the same thing for the majority of her lunches and dinners until it runs out. Generally "pasta or rice with stuff in it". I'm leaving her such a cheese pasta tomato casserole sort of thing, and a chicken rice dish, and a large pile of steamed green beans, and a meat pie.
The household's standard meat pie recipe (which I learned from Fade but I'm the one who always cooks it now) is cook and drain one pound of ground beef, add a can of condensed cream of mushroom soup (as-is), a can of sweet baby peas (drained), a significant amount (most of a pound?) of shredded cheese, a dozen or so shakes of Penzey's "california seasoned pepper", stir it all together and decant into pie crust, bake at 375 for half an hour. Her pie tins are smaller than the ones I use, so trimming the extra off the bottom pie crust leaves enough for stripes of pie crust across the top, which I can bridge with torn up cheddar slices to get two pies from one pair of pie crusts. (Which is good because premade rolls of pie crust are like $5 a box now.)
And Jeff got back to me about the Tokyo trip with less than 36 hours before the plane takes off, because of course he did. Ok, the long-delayed trip to Target for 2 more pairs of pants and a new pair of shoes needs to happen tomorrow, because Tokyo hasn't got anything in gaijin sizes. (Hopefully I can get new glasses in Tokyo the same place I got glasses last time, they're much better than you get through Zenni. I think it was somewhere in a Tokyu Hands, but that doesn't narrow it down that much. They're sort of vertical shopping malls, and there's at least 3 of them we went to in Akihabara and Asakusa and possibly Shibuya?)
Sitting down to actually implement gzip --rsyncable, I'm hitting the problem that the USE I'm making of the zlib stuff is "pass off an fd and it returns when done", meaning my code doesn't get to read the data and partition it. I could do a wrapper that reads the data and passes it along, and probably will eventually, but that seems kinda silly?
Weekend. Hung out at my Sister's, saying hi to the niecephews.
The one of the four that maintains their original gender (despite whatever their father's new wife does to them every week that they refuse to talk about but are very unhappy about) got screwed over by his father in a DIFFERENT way, apparently if you've ever gone for mental health counseling even once, the navy's nuclear submarine training program will happily give you a "waiver" to get through boot camp (because they're SO not making their recruitment numbers), but will then kick you out right afterwards even if you come in 6th in your class (because you volunteered to drop a spot so somebody else who was coming in as an E1 could get promoted to E2 by being in the top 5).
Personally, I'm not a fan of career paths which you aren't allowed to quit that could order me into combat when we're not at war, especially when I know multiple people who wound up permanently disabled in military "incidents" that weren't even combat related. (Shinga got crippled in a training accident, and spent the next decade plus having to deal with VA underfunding. Remember my "apprentice" Nick from 10 years ago? Her dad got poisoned working near a burn pit in Iraq, degenerative neurological something or other, I got to watch him get worse every time I visited...) But I also didn't grow up hand-to-mouth poor. (I've done what I can to help, but it's intermittent and from far away. Kris could never move out of Minnesota without losing custody of the kids to their father's new wife...)
Honestly, put Jon Stewart in charge of it. (The six minute video in there counts as "nailing an interview" to me...)
We switched the household slack to Discord. Let's see how that goes. I have so many "send notes to self" entries in my DM-to-me slack channel, which was a scratchpad I could easily access on both my phone and laptop, and now I'm laboriously copying the to-laptop ones by hand since I've lost access in that context. Gotta do that before deleting the slack accounts and uninstalling the app. (Alas, the phone side doesn't have a selection option that can highlight more than one entry. On the web side I could mouse drag and scroll and grab multiple pages of stuff into a text file at once, but the android UI doesn't have anything similar. Press-and-hold to highlight a single entry. No obvious shift-click to highlight another entry without deselecting the first. And so, I laboriously type one entry at a time into the keyboard. I'm back to February...)
Doing gzip --rsyncable kinda implies doing rsync. According to the rsync wikipedia[citation needed] page the checksum in question is adler32, which seems simple enough, although I'm squinting at the modulus: I'm pretty sure that can happen at the END as long as the input length is less than 256? Sigh, wikipedia[citation needed] keeps saying "look at the zlib source code to see a more optimized version" rather than just saying WHAT THE MORE OPTIMIZED VERSION IS. This is a five line algorithm! Obviously it's moving the modulus to the end. Alright, let's do a for loop here to see where the overflow is... If the starting value is already ffffffff and you add ff each time you'll overflow a 32 bit counter after... 5552 entries. So page sized inputs are fine. I can add a comment.
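Here's that five line algorithm as a shell sketch (bash math is 64 bit, so deferring the modulus to the end stays safe well past the 5552 bytes that 32 bit C arithmetic allows):

adler32() {
  local a=1 b=0 byte
  for byte in $(od -An -vtu1 "$1"); do
    a=$((a+byte)); b=$((b+a))   # modulus deferred to the end
  done
  printf '%08x\n' $(( ((b%65521)<<16) | (a%65521) ))
}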
Ok, so the theory here is you do a running checksum on the input, and when the bottom X bits are all zero, you reset the deflate stream. (When Hayase Nagatoro invented blockchain there was a lot less originality than I thought: the post proposing --rsyncable for debian came out YEARS earlier.) Since deflate is designed to work on concatenated archives, I don't even really need to communicate with the encoder, this is a "close and reopen, append results together" situation. Probably you want some minimum amount of input before checking the results, and maybe initialize the CRC to something other than 0 so a run of zeroes doesn't leave it zero? (Or does the "minimum amount of input" test cover that case?)
The next question is "how many bits of zeroes" and "what's the minimum block size", and the original paper isn't even using adler32 like rsync is, so I don't want to take its answers? Unfortunately nobody seems to actually document what gzip --rsyncable is actually doing here, let alone how many bits it considers worth resetting for. I hate looking at gnu crap both for licensing reasons AND because it's always TRULY HORRIBLE CODE, but I'd like to be at least somewhat compatible? And in the absence of ANY sort of documentation, hold my nose and see what's publicly available on github's web view... Looks like they're using 4096. And they're using MODULUS on a POWER OF TWO to check for the zeroes.
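My reading of their gist, as a sketch (reset_deflate is a hypothetical stand-in for "close this deflate stream and start a new one", and I may well be misreading their window handling):

sum=0 len=0
for byte in $(od -An -vtu1 "$1"); do
  sum=$((sum+byte)); len=$((len+1))
  if (( len >= 4096 && sum % 4096 == 0 )); then
    reset_deflate; sum=0; len=0
  fi
done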
That's just sad. I need to step away from the keyboard for a long walk.
The toybox test suite is a bunch of shell scripts in tests/*.test (one for each command name) that get run by scripts/test.sh (which calls scripts/runtest.sh). The actual tests are shell functions that look like:
testing "name" "command line" "expected output" "file input" "stdin input"
I.E. each test has five arguments: 1) the name to print when running the test, 2) the command line to run for the test, 3) what the test is expected to produce on standard output, 4) what to write to a file named "input", and 5) the input to pipe into the command line's stdin.
There's some complexity: each test gets run in an empty directory (generated/testdir/testdir) with the $PATH set up so it's testing the right command(s). Arguments 3, 4, and 5 are run through echo -e to resolve escapes, and if there's a newline at the end you have to explicitly state it. (There's almost always \n on argument 3.) If argument 4 is an empty "" string then no input file is created (and "input" is deleted between tests) so it's not there messing up ls output and so on. If any command fails to produce the expected output, the script exits (unless the environment variable $VERBOSE contains the string "all" somewhere in it) so later tests don't even get run. But that's the basic idea of the test suite.
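A made-up example, assuming it lives in tests/ls.test (note the explicit trailing \n in the expected output):

testing "-d" "ls -d ." ".\n" "" ""

On success that prints "PASS: ls -d", since the command name gets prepended to the test name (more on that below).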
There's a few more corner cases, such as the checks that conditionally skip tests (shell functions like "toyonly" and "optional" and "skipnot", which set the $SKIP environment variable), and a whole second set of testing apparatus providing the "txpect" function (txpect NAME COMMAND [I/O/E/Xstring]...) that works like 'expect', listing a series of inputs to stdin and expected outputs (on both stdout and stderr) and eventually an expected exit value, ala:
PS1='$ ' txpect 'shell hello' 'bash --norc --noprofile -i' E$'$ ' I$'echo hello\n' O$'hello\n' E$'$ ' I$'exit 3\n' X3
And someday maybe I need to figure out how to hook that up to pty master/slave plumbing (or do they call it dom/sub now? Hey, that's a consensual relationship...) so I can query cursor position in a virtual screen (testing stuff like "top" in an automated fashion is REALLY nonobvious). But implementing "expect" in pure bash was hard enough...
Possibly the most complex part of all this, from my perspective, is that Android doesn't use my scripts/test.sh, it just uses scripts/runtest.sh. All the shell functions are defined in runtest.sh, but the test.sh script is the one "make tests" and "make test_sed" call to set up the generated/testing/testing directory and work out whether we're testing a single toybox command (calling scripts/single.sh to build it and install it into generated/testing if so), all the toybox commands (calling scripts/install.sh to put all of them into generated/testing so they're all in the $PATH at once), or running the tests against the host commands (which is testing the tests themselves, not testing toybox's command implementations: I haven't proved much if I pass tests I wrote but nothing ELSE passes those tests. Alas the host is a moving target and each time I upgrade devuan some tests that used to pass start failing because their output changed, there's some regex fuzzing I can do but it's a red queen's race...) I'm never entirely sure what will and won't break android's testing when I fiddle with my test plumbing, but Elliott pokes me when I do and I can fix it after the fact.
So anyway, I recently added "ls --sort" which needs tests, and first I was converting the existing ls.tests from "testing" to "testcmd", which is another wrapper in scripts/runtest.sh that supplies the name of the command being tested so you don't need to start each command line string with the same command. (Not just to eliminate redundancy, but so it can force testing the _toybox_ command instead of shell builtins and alias trickery by providing the absolute path to the command as necessary. Otherwise testing "echo" under bash isn't testing toybox, it's testing bash.) I didn't want to change the base "testing" function to do that because sometimes you want to be explicit, ala "VAR=VALUE $COMMAND --blah" or "for i in a b c; do $COMMAND $i; done", and besides: switching them over requires editing the command string to remove the command name, which gives me an excuse to review tests using my standard lazy approach. (Same general idea as the college study advice that taking notes helps because when you write it down you remember it.) But converting entire test files from "testing" to "testcmd" is generally good because the result is shorter and less redundant, and often avoids wordwrapping. In THEORY it's a nice low-brain activity I can do when I'm not feeling up to much... (Except that when I do review, I tend to find stuff and go off on tangents. It's basically horror movie logic here, there WILL be something. But I'm getting ahead of myself...)
Another entry in the "there's some complexity" pile above is that 1) the name of the command being tested is prepended to the name of the test, so you don't have to repeat it each time, 2) if the first argument to testing is an empty string, then the second argument gets used as the name of the test. So if I say testing "" "-R" ".." ".." ".." it'll go "PASS: ls -R" in the output. (Or FAIL: or SKIP: depending on what happened. They're all the same number of characters so the output lines up either way. And when going to a tty, it's color-coded.)
I converted "testing" to always prepend the command name because I never want it to NOT do that (or at least couldn't think of any use cases), but when I left the first argument of testcmd blank I wound up with output like "PASS: ls ls -R" (or worse, "PASS: ls /big/long/path/to/ls -R") because they were BOTH adding it, in a way that was nontrivial to untangle.
So that's where I went off on a tangent and parked the "add ls --sort tests" todo item last time. And this was AFTER my previous excursion into fixing the plumbing ls was using (so the tests actually ran in an EMPTY directory, so ls didn't keep having to duck into a subdirectory to avoid showing debris; this is the downside of me saying "I can always use more tests" when people ask me "how can I help". Divya meant well but left me with some technical debt to shovel out. The real problem is I'm a perfectionist acting like I'm making faberge eggs, but if I'm not doing a BETTER job than what's already there why bother? I mean ok, "licensing", but that's not sufficient reason by itself...)
ANYWAY, with all THAT sorted, I converted the ls tests from "testing" to "testcmd" and now I'm looking at a few of them I noticed are kinda weird. The -N test was actually testing -q, which means back when it went in I didn't review it enough, even though I found one issue right off (which seems obvious: -q isn't the default and -N switches off -b but not -q, so you have to be able to switch -q on first to tell?). And now that I'm going back to try to PROPERLY test it (it turns off -b but not -q because gnu/dammit, of course), I hit:
$ ls --show-control-chars $'hello \rworld'
'hello '$'\r''world'
$ ls --show-control-chars $'hello \rworld' | cat
world
What is this "shell escaping but only to tty" nonsense? Does THAT have a new command line option? I actively do not want to implement it, because it's STUPID. What is the POINT of doing SHELL ESCAPING ONLY TO TERMINAL OUTPUT? And if you're going to do a $'' wrapper why not just have that be around the whole thing? Why have MORE THAN ONE QUOTE CONTEXT IN THE SAME OUTPUT? What is WRONG WITH THESE PEOPLE?
I miss the days when the gnu/dammit clowns failed to add -j support to tar for 5 years because gnu development had completely stopped. Everybody just had a standard patch they added and it was all good. That was the period during which the gnu tools became actually popular in Linux, when they WEREN'T CONSTANTLY BREAKING STUFF.
And I rebooted my laptop, losing all my open windows. Not because of any hardware or OS thing this time, but because I was working out test plumbing (to fix gentoo's inability to reliably run the toybox test suite when they build the package, by letting tests request which filesystem they run under), and I did "mount blah.img sub && cd sub && umount -l ." and SOMEHOW instead of unmounting the new loopback filesystem the debian host umount command unmounted my /home partition out from under every running desktop process. So that's nice.
That's a "press the power button and hold it down until it does the unclean override sudden power down" thing. THIS is why I like to test this sort of infrastructure in qemu instances. And of course thunderbird doesn't retain emails in the process of being composed the way kmail did, nor do my 8 gazillion terminal windows+tabs restore their state to show what I was in the middle of...
The design reason I was doing that is a test should be able to go "force_filesystem ext234" at the start, and if "stat -fc%T ." says that's not what we're currently using then it should dd if=/dev/zero up an image (because loopback mount can't use sparse files so truncate -s won't work here, although it's transparent to an emulator so qemu can eat one just fine), mke2fs it, loopback mount it in a directory, cd into that directory, delete the loopback file and lazy umount the directory (so both the file and the mount point get freed when the last process using them does; the mount pins the file's inode, and our test process has the mount pinned as its cwd, but no matter _how_ the test exits it can't leave the mount lying around on the host afterwards), and then the rest of the test proceeds as normal and does a cd out of the directory as part of normal cleanup.
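A minimal sketch of that shape (assuming the plumbing runs with enough privileges to mount; the sizes and everything other than the force_filesystem name are illustrative):

force_filesystem() {
  [ "$(stat -fc%T .)" = "$1" ] && return 0   # already on the right fs
  dd if=/dev/zero of=fs.img bs=1M count=64 2>/dev/null &&
    mke2fs -Fq fs.img && mkdir -p fsdir &&
    mount -o loop fs.img fsdir && cd fsdir || return 1
  rm ../fs.img   # the mount pins the inode, contents freed at umount
  umount -l .    # lazy: actually goes away when the last user cd's out
}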
Except of course the gnu/dammit stat command says "ext2/ext3" instead of a proper driver name. Gotta add filters to the parsing because they got "clever" in an actively harmful way. Toddlers "helping" in the kitchen, minus the learning part.
And now that I've rebooted, chromium stopped working with slack, which is now a fullscreen "you can switch to a supported browser or you can install our data mining app but otherwise fuck you" page. I asked on the #devuan channel which says it's a #debian issue because chromium 90.0.4430.212 is the newest version in the "oldstable" category (which devuan beowulf lines up with), so now I'm asking on the #debian channel. There is a "backports" repository, but this package isn't in it.
I've heard that chromium is near-impossible to compile from source, so I'm not TOO surprised that debian can't get it to build in older environments. The standard Google problem of their code both having a zillion dependencies and being completely unportable to the point they care about the specific dot-release of each dependency. Sigh. (And hermetic builds are technically a move to be LESS accepting of variations in build environment. You deploy the one true build environment to target, because building in an emulator on a provided image is slow. I care very much about reproducibility from first principles. This does not appear to be a common viewpoint.)
But needing to do a major OS version update to regain access to my household slack on my laptop? Sigh. I might need to start caring about firefox again. (It's a "household slack" with Fade and Fuzzy, but half of what I use slack for is cut-and-paste of URLs to my phone, and running another OS in a vm to run chromium in there means I'd have to get cut and paste working in kvm, which... enough ratholes for one day, thanks.)
Had to go to the hospital to have a piece of glass professionally removed from my foot. Not my most productive day otherwise.
One of those "spent all day trying to get in the headspace to do productive work, and didn't" days. Cooking and cleaning and generally being Fade's housewife worked out ok. And I did actually invoice the middleman. (Yay!)
(Why does avoidance productivity either put me into DO ALL THE THINGS mode or else completely stop all work, with no middle ground? This wasn't even tax paperwork, it was "resubmit an invoice". Which yes I had a bad bureaucratic experience with 3 months ago, but seriously...)
The mkroot dynamic build (which there's a waiting user for) SEEMS simple, but the current script is using a "cc --print-search-dirs | xargs cp -a $TARGET" approach that winds up populating the target with over a dozen gigabytes of crap which will NEVER fit in a ramfs, and is big enough that repeatedly doing that build seems likely to noticeably shorten the life of my laptop's SSD. (And that's after I fixed the "it's copying symlinks" problem that cropped up in an OS version upgrade: before the fix it wasn't that big, but it didn't reliably work either.)
My first stab at cleaning that up was "copy everything to target then sort the hashes, use them to compare files and hardlink together what's identical". Which cuts the space in half but the result is still multiple gigabytes and doesn't reduce the disk thrashing at all (the files still get copied before being discarded).
Now I want to dig up my old "run ldd on each file on target to get the list of libraries actually in use and copy just those into the new chroot" approach, which I had a bash script for even back in the busybox days, code which I recently removed from toybox in theory because mkroot superseded it, and in PRACTICE because I need ldd in the $PATH to make that work, and when I mentioned my desire to add that to toybox Elliott had kittens. (I still don't understand WHY. He doesn't have to enable it for android, but it's a thing I personally have an immediate use case for. Copy this and the library files it needs, and the library files THAT needs. Do it recursively but skip ones already present on target, which prevents endless loops. My first script to do that was in 2001, back when I first put together a tiny boot image with binaries harvested from the distro I was running. I want to say... Red Hat 6?)
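The old script's shape was roughly this (from memory, so very much a sketch: $ROOT stands in for the new chroot directory, and real ldd output parsing is fiddlier than one grep):

copylibs() {
  local lib
  for lib in $(ldd "$1" 2>/dev/null | grep -o '/[^ ]*'); do
    [ -e "$ROOT$lib" ] && continue       # already copied: ends the recursion
    mkdir -p "$ROOT${lib%/*}" && cp "$lib" "$ROOT$lib"
    copylibs "$lib"                      # a library's libraries come along too
  done
}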
Sigh, need to invoice the middleman. I can do it on monday. (It's not EXACTLY rejection sensitive dysphoria, "submitted paperwork that got bounced because weird politics, reluctant to do it again" is at least in PART me wondering "do they want to hold on to the money because their finances are dire enough that having it in their bank account reassures THEM, and if so will they actually pass it on once the arbitrary limits they invented are satisfied?" A bank limiting withdrawals to $3000/day is not a healthy bank. Their behavior is NOT A GOOD SIGN, and I'm reluctant to put pressure on the broken thing and see if it hurts... but I'm trying not to invent an unnecessary crisis here either, and "I last got paid in October and we just sent more money to the IRS than I've ever paid for a car" is ticking audibly...)
I suppose the flood of "april fool's" nonsense online shows that people are feeling better? It went away pretty much entirely during the Rump administration, because geriatric fascists shouting "fake news" to discredit their opposition by loudly and repeatedly asserting that anything they didn't like simply couldn't be true (the previous nazis called this the "big lie") made anybody ELSE not scrupulously telling the truth and fact checking everything they could... kinda inadvisable. It's still really annoying, and seldom even slightly funny. Oh well.
Going through the github requests trying to find simple things to close, but I've just been overcomplicating stuff recently.
For example somebody requested the "shuf" command yesterday, which I added, but I spent a while arguing with myself over whether it should use random(), lrand48(), or getrandom(). In theory the randomest is getrandom() but each call consumes kernel entropy, which seems overkill for something like this? And it's _wasting_ entropy because it returns whole bytes and I then have to chop it down to just what I need, the common case of which is what, 500 entries?
Initializing a prng from a proper entropy source is a classic middle ground for a reason, and the _easy_ way to chop a randomness source down to a specific integer range is modulus, which will introduce bias unless the modulus is MUCH smaller than the range of the random number (so the uneven coverage of the last wraparound is statistically insignificant).
In the end I just went with srandom(millitime()) and then random()%count which is good enough. (And the trick to make it efficient is lines[ll] = lines[--TT.count] because if you don't care what order the not-yet-used entries are in, swapping the last one down into the hole you just left avoids the memmove() you'd do to close the hole while keeping them in order, or any sort of usage bitmap nonsense.)
What?
FAIL: chmod 750 dir 640 file
echo -ne '' | chmod 750 dir 640 file && ls -ld 640 dir file | cut -d' ' -f 1 | cut -d. -f 1
--- expected 2023-04-01 02:57:10.424197685 -0500
+++ actual 2023-04-01 02:57:10.428197685 -0500
@@ -1,3 +1,3 @@
--rwxr-x---
 drwxr-x---
 -rwxr-x---
+-rwxr-x---
Sigh, I broke ls with the new --sort stuff, because when I reused the -A and -d flags I didn't UNSET them from the base set, so ls -d no longer produces output in the same order. Oops.
Of course I left myself a todo about this: a pet peeve of mine is --longopts without a corresponding short option are un-unixy, and the new short options I defined for the new --sort types that didn't already have any were -! and -? (except ? is a wildcard and CAN occasionally misbehave, a pet peeve of mine with "qemu-system-mips -M ?" to list available machines is you need to quote the ? if there's a single character file in your current directory; the reason the magic . and .. files don't count here is wildcards won't match hidden files unless the first character is an explicit period... Ahem, anyway maybe -~ would be a better option since tilde is only special to the shell as the _first_ character of an argument, and ~ means approximately anyway so "case insensitive" shouldn't be TOO hard to remember.)
Using punctuation like that means I'm MUCH less likely to conflict with existing or future gnu nonsense. (The cut -DF support STILL isn't upstream in coreutils, last I checked. I should poke them again...)
So I need to grab the extended argument parsing plumbing I added WAY WAY BACK while working on mkdosfs, which wanted -@ to set the offset and I added a whole mess of lib/args.c and scripts/mkflags.c plumbing to allow that. Which I checked in and tested and everything, but the only user of it is still out of tree in my local pending directory because I got distracted and still haven't finished mkfs.vfat. So, how does it work:
Take your ascii character value (@ is hex 40), set the high bit to turn it into "high ascii", turn that into a good old K&R C octal escape circa 1976, and include the octal escape in the option string: for -@ it's "\300". The FLAG macro you get is FLAG_X followed by two hex digits, in this case FLAG_X40 which means FLAG(X40) should work.
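Quick sanity check of that math from the shell:

$ printf '\\%03o\n' $((0x40|0x80))
\300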
I didn't get a haircut before I left Austin, and Fade's suggestion is next to her office which is a half-hour walk each way, but I could use the exercise. That and visiting the Tiny Target next door ate the whole morning.
Fade pointed me at a lovely little study room off the side of one of the light courts in the apartment, which would be perfect for recording tutorial videos. It's also a nice place to get away from Endlessly Barking Dog.
I want to write up a quick mkroot explanation for the qemu guys (who are testing with my mips and mipsel mkroot images, yay!) but alas, it's not quick and simple. It SHOULD BE, but I'm having Pascal's Apology again ("I would have written a shorter letter, but I did not have the time"). Which is why I need to record a video tutorial for this. (Hearing it out loud helps me get a written version to be concise and intelligible too.)
Whole bunch of work, that.
Recovery from travel.
A QEMU thread has me rebuilding all the qemu targets again, which is a bit of a time sink.
It's kind of hilarious that Ubuntu doubling down on Snap and SuSE doing a flatpak distro came out the same day. Snap is Ubuntu's proprietary version of flatpak the same way Ubuntu had the upstart init system, unity desktop, mir 3d compositor thingy... Ubuntu is run by a white male billionaire from south africa, not a whole lot of "listening" or "following" going on there. It's a real pity that their move to replace /bin/sh with the Defective Annoying SHell was swallowed by Debian, but Debian also switched to systemd which was approximately as stupid. Being at the mercy of a billionaire's whims can't be comfortable. (Debian has an unfortunate history of FSF-adjacency, which means its development got so flamewar constipated the project almost died with many years between "debian stale" releases, and Canonical hired at least one full-time developer to shovel out the mess on the engineering side of _debian_ (not ubuntu), because the open source project he'd overlaid his proprietary project on going under would have been embarrassing. This was also the period where fleeing Debian developers squashed Gentoo, which that distro never really recovered from...)
Travel day. Onna Airplane.
I'm carting two pieces of checked luggage, the first of which is a suitcase inside a suitcase so I can fill one with Japan Loot for the return trip. I've probably missed Milk Seafood Ramen season (to Fuzzy's great disappointment), but I also left a bunch of clothes and books and stuff in the apartment when I left (meaning to return) over the pandemic, and it got packed away into storage and I should reclaim it.
The second piece of luggage is the GIANT CARDBOARD BOX with the oscilloscope Jeff sent me in the middle of the pandemic. Apparently the stuff they made 50 years ago is way better than the stuff you can get today, because nobody makes analogue waveform storage anymore and the digital equivalents are hundreds of thousands of dollars IF you can find something sufficiently high resolution. So when one comes available at a good price (usually because the people who knew how to operate it retired or died, and the inheritors don't value something they can't use), he snatches them up. This sort of thing can record signals for DDR3 and USB3 busses. We've done LPDDR2 and USB2 already because the cheap digital stuff can keep up with that, but anything faster gets expensive rapidly to see what's actually going across the wire.
Analog storage is the same general idea as a mercury delay line: giant capacitor that reproduces the wiggles of the input in its output, then you can loop it back on itself to retain the signal for a while. It is INSANELY high accuracy, calibrated with NASA-style equipment that sadly doesn't exist anymore. The downside of the analog stuff is A) the stored signal only lasts a few minutes, B) there's a hard cap on the SIZE of the capture because the delay between input and output is fixed and you can't record more than that at a time.
(When Scotty stored himself in a transporter pattern buffer for decades, the technobabble description was a bit like this. And in the Dr. Who episode Timelash, the 6th doctor used a McGuffin based on this principle to hit the bad guy with his own zap gun. Modulo "no, it's still totally murder when you're counting down like that while pointing the output at him, you could have just turned it to face the wall; there might be a limit to self defense when the dude is literally begging", but that's the kind of writing Colin Baker suffered under. At least he never had to deal with Yellow Kangs or the Kandy Man.)
Anyway, giant box under my desk in the bedroom became giant box in the bedroom closet which is now giant box in Fade's apartment, which I hope to convert to giant box somewhere in japan that is no longer my problem. MAILING it to japan would cost hundreds of dollars (more than Jeff paid for it, that's why the seller only offered US shipping, not even to Canada), but it's just under the weight cap for checked luggage, and they don't charge extra for it being _bulky_. (The weight limit is a health-and-safety thing: the maximum weight workers can be expected to individually lift between conveyor belts lots of times per day. Anything heavier than that requires two people to lift for liability reasons, and thus special labeling and handling procedures, and generally gums up the works trying to load and unload the plane quickly.)
No idea when Jeff plans to fly me to Japan, but hanging out with Fade until then. Disposing of Giant Box is a nonzero portion of the reason I agreed to the Japan trip. (I also really LIKE tokyo, and it would be nice if the stuff I worked on for years actually got launched out into the world, although toybox comes first these days thanks to Google.)
Bunch of errands today. Four bus rides.
I've meant to switch credit unions for years now, and as long as I was going to be down at UT after 9AM anyway... I'm seldom still there when I walk because I head back around sunup or I get all hot and sweaty, plus I can't see my phone display in full sunlight. And then I needed to go from UT up to The Domain to close the old credit union account, because that's the closest remaining Amplify location. University of Texas credit union has 2 locations and 2 ATMs within a half hour walk of my house, and two more at the university from a bus that picks up within sight of my driveway. The closest remaining "Amplify used to be the IBM Texas Employees Federal Credit Union but renamed itself" location is eight miles away, a two bus minimum each way or something like a $40 round trip on lyft WITHOUT surge pricing.
Fresh full backup of my laptop to USB drive. This SSD is old enough I'm occasionally checking dmesg to see if it's started to get unhappy about stuff. (Shouldn't, but I can be hard on things...)
Huh, corner case in the toybox test suite. So the general theory of toybox tests is a file full of testing 'name' 'cmdline' 'result' 'infile' 'stdin' lines (each is a call to a bash function) where the first argument's the name of the test to print on the PASS: line, the second argument's what to run, the third is the stdout output to expect, the fourth is data to write into a file named "input" (which only gets created when that's not blank), and the last is what to feed into the command's stdin.
Two complications to this: 1) the 'name' has the name of the command being tested automatically prepended to it so you don't have to repeat it every time, 2) there's a wrapper function testcmd which inserts the name of the command we're testing into the start of the 'cmdline' argument so we don't have to repeat it either (and it makes sure we call it out of $PATH instead of a bash builtin by providing the absolute path when necessary); if the 'name' argument is blank it uses 'cmdline'.
The problem is that if you leave 'name' blank in testcmd it prepends the command name TWICE. Once when testing() prints the PASS/FAIL/SKIP line, and once in the testcmd() wrapper.
Got tired of waiting for Jeff to actually schedule a trip, and got a plane ticket to visit Fade up in minneapolis. (If I'm flying to tokyo, it should be from there.)
This means I have SO MUCH TO DO before then. Laundry! Fresh full backup of my laptop! Toybox todo items I should flush up to github... And it means I should NOT walk to the table tonight, because then I won't get anything done during daylight hours tomorrow because sleep schedule. Alas...
Fuzzy's birthday was on the 20th and we ordered her an Oculus 2 so she could play beatsaber, and it does not work. So we're returning it, which means I need to drop it in the return box at the Amazon lockers in Gregory Gym (the building with two first names), and I thought that was my excuse to do my 4 mile nightly walk watching anime on my phone despite the earlier "I shouldn't do that for schedule reasons"... but the building doesn't open until 9am. (I'm currently on a night schedule. The flight tuesday's noon-ish. Gotta impedance match between now and then.)
If I'm planning to be at the university during daylight hours I should get a new credit union account at the UT credit union on Guadalupe. Which means I should also close down my old Amplify account (which used to be IBM Texas Employees Federal Credit Union before they moved entirely out of Austin up into the northern suburbs. The closest location left is in <snootiness>The Domain</snootiness>, which is 8 miles from my house, an hour away by bus or bicycle. All their closer locations closed years ago.)
I deleted the Google Maps app off my phone screen back when it turned into all advertising all the time and stopped showing me black owned businesses (such as the haircut place I regularly go to in Hancock Center) even when I zoom in all the way, but sometimes I still need to see how far it is from point A to point B and what bus to take (and/or when things open, which it's never been quite right about since the pandemic), and when I do I use the web version on my phone. Here's the SERIES of bugs I just hit in Google Maps' web version: enter the two addresses, hit the arrow on the keyboard to actually search and... it doesn't do anything. Plus it's scrolled itself to the right in a way that won't let me scroll back left so I can see the start of what's written on the page. And when I rotate it from landscape to portrait mode in hopes it resets itself... it loses track of the addresses I entered to ask directions about. It loses track of the location I was looking at, and instead resets itself all the way back to zoomed-out full city view. That part's trivially reproducible, does it every time: ask directions, type in the first address, rotate the phone, and the page undergoes a hard reset losing all context. Bravo Google. Your own browser on YOUR PHONE can't handle your website. That's... *chef's kiss*.
Anyway, from UT to The Domain is one bus (the 803). Yay. I should do that. (I don't want to give Patreon and such the banking info for the household account. I'm still paranoid about combining "money" with "internet".)
I got the ls --sort stuff checked in but not properly tested. Confirmed it didn't cause any obvious regressions in the test suite, but then got distracted by the whole Microsoft Github clusterfsckery trying to check it in. Had to delete the man-in-the-middle key four times before it stopped complaining. (IPv6 is not fun.)
Hmmm, tests/ls.test is ugly. Each test is bracketed with "cd lstest && $TEST && cd .." because otherwise the "expected" and "actual" files wind up in the current directory listing, and hence the output of most tests. The first being the output the test is expected to generate (argument 3 to testing()) and the second is the file output is currently redirected to, which are files so we can diff them and naturally get useful labels on the results. There's a fourth file, "input", but these days that's only created when testing() argument 4 isn't blank.
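A hypothetical line in the current style (not quoted from the actual file), showing the bracketing:

  testing "ls -1" "cd lstest && ls -1 && cd .." "file1\nfile2\n" "" ""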
I suppose I could move them up a directory level? Because the action's taking place in generated/testdir/testdir, with the first "testdir" being where temporary binaries we're testing live. Since none of them are called "expected" and "actual" it shouldn't conflict if I use it as a work directory. (Modulo whatever Android's doing to use this test infrastructure, I THINK it should be ok? They use my scripts/runtest.sh but not my scripts/testing.sh which sets this up... Sigh, I should poke Elliott, shouldn't I?)
Walked to the bat bridge instead of UT, 25k steps total instead of just 10k, but my back was killing me when I sat down on the couch in Jester Center and I didn't get anything done. (They're kind of terrible faux leather couches on the second floor, mostly there for show I think, and my lower back's been unhappy since I slept on it wrong a few days ago, like a crick in the neck but older and more decrepit. I _really_ don't want this to become chronic because it wouldn't just suck, it would be CLICHE. The difference between being 15 and being 50 is problems resolving in about 8 minutes vs problems resolving in about 8 days. Lots easier to fall behind on cumulative wear when it's not clearing itself nearly as fast as it used to.)
Finally dug up an old-style micro-USB cable that WASN'T a charger cable but actually did data, so I can see the serial output on the turtle board. It works fine once I got a cable, but the linux-kernel I built and released last time does not work at all. (No output to serial once the bootloader hands off to it.) The one the sdcard had on it was linux-5.10 (dunno if I tested something newer since, that's just the reference version I know works), so there's some bisecting to do.
Huh, the musl-cross-make toolchain rebuild I did with gcc 11.3 earlier this month didn't build the sh2eb cross compiler because libgcc/unwind-pe.h had an error: '_Unwind_gnu_Find_got' was not declared in this scope which... I mean clearly it's a gcc bug, but what exactly broke? (It built sh4. Is this a nommu thing?) How do I track that down... What I _want_ to do is bisect it in the git repository, which is tricksy. It's slow to build gcc at the best of times, and mcm with my wrapper script doesn't do partial compiles.
I'm kinda tempted to compare the Linux From Scratch chapter 5+6 build script with musl-cross-make and just do a toolchain build script. If I have to fish out my own patches to make the build work _anyway_... I did that in aboriginal linux, this time it should probably be a proper project all on its own.
That's already 2 nested tangents from what I'm TRYING to do.
Got the LFS chapter 5+6 script building to the end. No idea if the result's actually useful yet, haven't done the chroot and started the second script. For some reason following the current LFS instructions, half the new commands _aren't_ in the /tools directory? They're in the normal paths. What's the point of the airlock step if you do that? I has a CONFUSED...
Alas, my initial naive attempts to run record-commands to get a log of the host commands called for this build script... did not work. I need to update scripts/record-commands until it works right out of the box even when I haven't looked at it in 6 months and don't remember how I'm "supposed" to use it. (For one thing, it calls scripts/single.sh to build the log wrapper. It should check if "toybox" is already in the $PATH and symlink logwrapper to that if so, and only try to build it if it can't. Otherwise, you can't use it from anywhere OTHER than the toybox directory...)
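Something like this, with $WRAPDIR as a placeholder for wherever the wrapper directory lives (the real script's variable names differ):

  # Reuse a toybox already in the $PATH, and only build one when we can't.
  if TOYBOX="$(command -v toybox)"
  then
    ln -sf "$TOYBOX" "$WRAPDIR/logwrapper"
  else
    scripts/single.sh logwrapper || exit 1
  fi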
I also have a github bug request from somebody who did scripts/mkroot.sh and then couldn't "ping" anything because glibc is crap at static linking. Um, yeah. That's why I added a "dynamic" script, but I've updated devuan since the last time I poked at that and now it's copying a bunch of symlinks into the target, including absolute paths outside the chroot. Unfortunately, when I add a -L to the cp -a the result is 1.7 gigabytes of usr/lib space because glibc is an insane pig, so I need to hardlink them back together to get the size down to a dull roar.
Back at the table again (I've missed this), putting together a Linux From Scratch 11.3 build script, so I can do the old trick of substituting in toybox commands one at a time and comparing the output to make sure nothing changed. (I should probably diff the config.log as well. To get consistent results I should do single processor builds, but I'm having the script make -j $(nproc) and then I can just "taskset 1" to force that single threaded later.)
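Meaning the script itself stays make -j $(nproc), and pinning the whole run to processor 0 (cpu mask 0x1) serializes it after the fact; the script name here is a placeholder:

  taskset 1 ./lfs-build.sh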
Jeff thinks he might wind up flying me to tokyo on monday, but the hard part is working out hotels. It's cherry blossom viewing season there, which coincided with spring break in the states, and it's the first time in 3 years Japan's been open for tourists. The hotel room shortage has not eased up at all yet. Still a big staff shortage. They've announced plans to allow more foreign workers, but it apparently hasn't manifested results yet...
At the table, with a can of checkerboard tea. It's been a while. (Ok, I'm at one of the tables NEXT to the original one, working on battery because the outlet's blocked off, and ignoring the construction fencing. But still: same porch, same lighting, same comfortable seating.)
Poking at dd.c because I had the tab open, and... ok, that's a kind of painful use of TAGGED_ARRAY. There's nothing BUT the strings and the position indicator for the strings. This makes me sad. There's gotta be a better way to do that. I'm not sure what that better way IS, but this is ugly...
And now distracted by the half finished ls --sort plumbing, which I have now finished and the result compiled and failed the very first test in "make test_ls". Great.
I have done something to my back while sleeping. It's like a crick in my neck, except lower back, and I'm on something like day 3 of this. Reeeeally hoping it doesn't go chronic.
There's an i2c bug report on github that's been... badly explained repeatedly. I think the submitter doesn't have english as their first language, and I have no i2c domain expertise, nor do I have a test environment, which is why I haven't done the normal level of cleanup on this command, which ALSO means I haven't done as much review.
Because writing code is easier than reading code, I tend to rewrite as I go, to utilize my far-more-practiced writing-code muscles to help with the reading. Yes I know it's a bad habit, and sometimes I throw away the result because it's just marking stuff up in red pen, but that seems a waste with toybox? If I'm gonna clean up the code and think the result is an improvement, I want to check it in, but I can't test for stupid thinko/typo regressions if I don't have a test environment, and ANY change can theoretically introduce a regression. I've borked semicolons or bracket nesting levels in code refactoring before (back in my tinycc fork), and the result compiled but subtly misbehaved. Gotta test. CAN'T test. It's a problem. I've USED i2c tools on various boards over the years, but it was all at contracts where I left the hardware behind with the job. My laptop hasn't got it. I don't THINK the turtle board does either, but when I just tried to boot it up I didn't get serial console... it uses the old pre-USB-C cables and I think this one might just be a charger cable, not data? (Why do they DO that?)
I'm poking at qemu to see if that has a good test environment for i2c somewhere, but none of the ones I built did because the kernel hasn't got CONFIG_I2C enabled, and when I switched that on (and CONFIG_I2C_CHARDEV because that's not enabled by the first thing for some reason), then there's DRIVERS: I2C_SCMI, I2C_CBUS_GPIO, I2C_GPIO, I2C_OCORES, I2C_PCA_PLATFORM, I2C_SIMTEC, I2C_XILINX, I2C_MLXCPLD, I2C_VIRTIO... Plus whatever I2C_HELPER_AUTO is for... protocols? I suppose I could just switch it ALL on and see if any of the QEMU board emulations bind to something? Whatever I come up with should probably be added to scripts/root/tests so I can build regression test systems that do this automatically, but first I need to make it work _once_.
You can sing "closing tabs" to "closing time".
Trying to collect old superh patches for Glaubitz (the new arch/sh maintainer in Linux), but... there's so much old debris here and I have no idea what's still relevant. I collected lots of groups of 4 or 5 patches at a time and sent them to Rich when he was nominal maintainer, most of which never got applied, but I didn't exactly archive them again afterwards. (Checking back email in my sent box is one of the avenues of investigation here...)
Hah, scp-ing my blog file and corresponding rss file up to the website takes LESS THAN A SECOND with the new router. The old one took long enough I usually tabbed away and came back.
I keep meaning to find a way to post these blog entries to mastodon so people can reply there. This thing is a text file I edit with vi and periodically rsync, with a python script that generates an rss feed based on the lines that start each entry being regular enough (mostly thanks to cut and paste) that the text parsing to chop stuff out and plonk it into wrappers is pretty simple. But there's no WAY I'm turning that into an activitypub feed any time soon, and Mastodon can provide an rss feed but won't let you FOLLOW an rss feed, or easily convert an rss feed into mastodon posts at some known @user@server account. (If you google for it there's dozens of weird little projects on github or do-it-as-a-service websites that seek to address this, but no real winner emerges, and Google's search ranking to indicate which ones to look at first has deteriorated into uselessness over the past few months. My wife regularly complains about google becoming useless and she's not a techie.)
Blah, I need network block device tests, which is fiddly both because it's a client/server thing requiring root access AND kernel support for the /dev nodes, and because the server and the client test against each other: "make test_nbd_client" would build just the client and then try to grab the server out of the $PATH, where it most likely isn't. As with the tar --xform stuff needing toybox sed, the test is looking at a _combination_ of toybox commands, which... the current test suite isn't really set up to do. (Well, "make tests" that tests ALL of toybox at once can, but not in a more granular fashion.)
I can have the nbd-client test check that nbd-server is there and fail to run if it isn't, but... the tests are mostly the same on both sides? Sigh, what are the tests:
Hmmm, so far most (all?) of the toybox servers are inetd style. I should probably find some way to indicate that in the help text? Ok, sntp isn't because that's a UDP protocol, and figuring out when a UDP transaction is _finished_ is AI-complete. By which I mean "C3P0 could do it, but I wouldn't trust chatgpt near it". There's one of those P=NP things going on with this AI nonsense, where closing the gap is likely to take multiple lifetimes if it can be done.
Possibly I need more lib/net.c code to do a server wrap thing that takes a callback function? Except then my httpd and nbd_server need more command line arguments to indicate the server and port to bind to, which is a UI issue. Hmmm, I need to revisit httpd anyway to add the rest of cgi support. And nbd_server already says "ala inetd" which is funky since I don't have an inetd in toybox.
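Something shaped like this, maybe. This is NOT lib/net.c's current API, just the idea sketched in plain POSIX (a real version needs vfork() for nommu, error checking, and child reaping):

  #include <netinet/in.h>
  #include <sys/socket.h>
  #include <unistd.h>

  // Accept loop that makes each connection look like inetd spawned us:
  // the command's existing handler just reads stdin and writes stdout.
  void loop_server(int port, void (*handle)(void))
  {
    struct sockaddr_in sa = {.sin_family = AF_INET, .sin_port = htons(port)};
    int one = 1, fd, sock = socket(AF_INET, SOCK_STREAM, 0);

    setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
    bind(sock, (void *)&sa, sizeof(sa));
    listen(sock, 16);
    for (;;) {
      if ((fd = accept(sock, 0, 0)) < 0) continue;
      if (!fork()) {
        dup2(fd, 0);
        dup2(fd, 1);
        close(fd);
        handle();
        _exit(0);
      }
      close(fd);
    }
  }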
Long ago the samsung guys contributed tcpsvd to pending, which doesn't share any code with netcat. It does do a number of things netcat doesn't: limits on simultaneous connections, sets a bunch of environment variables... it also doesn't support nommu (which netcat server mode does), and combining vfork() with the -h option to look up remote connections (which can take an arbitrarily long time) does NOT sound like fun. Um, wouldn't the -b N thing be rendered irrelevant by kernel syncookie support? It's been YEARS since I've looked at that, where does -b get used... no FLAG_b and it's not TT.b it's... Sigh, count the arguments: TT.bn. Am I going to have to clean this thing up just to properly EVALUATE it? Grumble grumble... I really dislike duplicate infrastructure, but at the same time netcat doesn't track multiple children. Plus this one hasn't got the "cat" part, it's always setting up filehandles and leaving the reading and writing of them to a child process.
Hmmm... I suppose I could clean it up and potentially merge them _later_? They have a hand-rolled hash table implementation. It's doing an error_exit() on recvfrom() errors. Is there a UDP packet you can send that DOSes the server? (I remember TCP out of band data, but not UDP?) Why is it using sigemptyset/sigsuspend instead of just pause()? Does tcpsvd MEAN to write a trailing nul byte on the message part of -C COUNT:MESSAGE or is this an accident? (What do other implementations do? Is there a spec? Sigh, break down and look at what busybox does: no, they do not have a trailing NUL byte, and they use nonblocking send() instead of write(), which seems kind of important. Although I could probably fcntl(F_GETFL/F_SETFL) to set O_NONBLOCK, but why when send() exists? I could also check MTU length vs the message, but again... simple thing.)
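For reference, the two flavors (standard POSIX, sketched rather than lifted from either implementation):

  #include <fcntl.h>
  #include <sys/socket.h>

  // Per-fd: every later write()/send() on this fd fails with EAGAIN
  // instead of blocking.
  void set_nonblock(int fd)
  {
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL)|O_NONBLOCK);
  }

  // Per-call: just this send is nonblocking, fd flags untouched.
  ssize_t send_nowait(int fd, void *buf, size_t len)
  {
    return send(fd, buf, len, MSG_DONTWAIT);
  }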
Oh this looks like a long cleanup. And learning domain expertise. Why am I opening another can of worms when I'm trying to CLOSE TABS again?
Got the new toolchains built with gcc 11.2, the patch worked and I should poke dalias about merging it into musl-cross-make. (It's a backport, this should not be controversial to upstream? But then I felt that about the kernel patches. Oh well, it's on the #musl backscroll, maybe he'll notice...)
Built scripts/mkroot.sh CROSS=allnonstop LINUX=~/linux/github followed by scripts/test_mkroot.sh and everything except sh4 and the "no kernel" targets (armv4l armv7m microblaze mips64) passed, and the sh4 problem is the qemu+kernel clock issue (that emulated board isn't getting a battery backed up clock, and I ran the test without my laptop connected to the net so it can't set the clock from NTP).
So the new toolchain's working as well as the old I guess? More warnings, such as 'sprintf' argument 4 may overlap destination object 'ifs' in sh.c which... Ok, I can see an "even more optimized" version getting that wrong and I should maybe switch that to memmove() but first I should refresh my "what data is in which variable" mental working state which implies I should have a LOT more comments here (and possibly rename some variables) but reading through this code I did a couple quick simplifications but NO I have like FIVE DIRTY VERSIONS OF THIS FILE to collate already (I was working on this at Fade's last month, where was that... I _just_ dirtied the toybox/toybox file which had previously been clean, where's the recent... not in clean, not in kleen... it's in kl2). Ahem: NOT NOW...
Yay, somebody who seems to know this i2c stuff finally piped up on the confusing bug there. I still haven't got a test environment, and "get a raspberry pi working" is not ideal there. (I've been meaning to do that forever: their bootloader needs horrible proprietary blobs to bring the system up, the hdmi+keyboard setup in front of the TV is awkward and the connections buried, I haven't got a hdmi monitor for the desk in the bedroom (been trying to get out to Discount Electronics to buy one for months, but they moved 5 miles further away, up near where Fry's used to be), and the only non-broken pi case I have is in use on my turtle board.) Sigh, I should sit down and do it anyway. So many tangents.
Speaking of tangents, the recent "cpio -i extra garbage arguments" thing really SHOULD have them be extract filters, and opening cpio I see I have that "cpio skip NUL" test still not passing on the host, and a TODO about hardlink support since that's what the TRAILER!!! entry actually flushes (the cached hardlink detection), which means I really should try to get the other mother to sew buttons onto the hardlinks in a test directory to confirm what the output looks like, and then confirm the kernel's consuming it the same way AND poke the people who were talking about adding xattr support to initramfs...
Sigh. Pull a thread in this jenga tower...
One of my early posts to mastodon was reminiscing about how the INTENDED use of a Tardis in Dr. Who seems to be for very long-lived Time Lords to bog off to deep space or some deserted beach for a few years while they catch up on STUFF, and then return 5 minutes after they left, actually caught up on all their reading and browser tabs and todo lists, without the society around them moving on, so they haven't missed anything or accumulated new todo items while they were gone. And the Doctor got in trouble because using one to go to a planet and interact with people was Doing It Wrong.
Yeah, 500 years between regenerations (both because the second doctor said he was about 450 years old and the eleventh lasted about that long in his little exile town), the ability to pause the world for a decade at a time in a nice quiet workshop area with kitchens and libraries and swimming pools and long corridors to walk down... I can definitely see the appeal.
The coreutils guys have got their knickers in a twist about new gcc releases breaking trying to build existing packages again, and rather than go "our code didn't change, yours did, this is your bug", they're capitulating because gnu. And providing horrible emacs examples.
Anyway, I should probably try newer gcc so I'm at least not surprised and can have -fno-stupid-thing workarounds prepared for fresh compiler bugs from C++ loons? The current musl-cross-make git version has gcc 11.2.0 as its newest toolchain... And it broke. The new version can't even do a canadian cross:
from ../../../../../src_gcc/libstdc++-v3/src/c++17/floating_to_chars.cc:31:
build/i686-linux-musl/i686-linux-musl/obj_gcc/i686-linux-musl/libstdc++-v3/include/fenv.h:58:11: error: 'fenv_t' has not been declared in '::'
The line in question is "using ::fenv_t;" which can't possibly be a good idea.
The fix is to tell the libstdc++ build not to include the standard C++ headers in its search path. No really! Adds a compile flag. (And according to heat on the #musl irc channel, that's what got merged upstream.)
No wonder each new release breaks. It failed to build itself with itself, and THIS SHIPPED.
Weekly call with the J-core engineering team. Still no word about actually going to Tokyo. The tourists are back, it's sakura season. You'd think the overflow of hotel rooms from the olympics would mean they aren't all full, but having plenty of ROOMS does not mean having plenty of STAFF to service those rooms, and everybody got laid off during the pandemic. Japan does not have extra people in general these days (under the age of 60, anyway), and the covid restrictions allow tourists back but not yet foreign workers to run cash registers. It's apparently a problem, but there are worse problems. (The "nobody has any money, everything's going out of business" problem has at least been arrested by the return of the hordes of tourists. Although a lot of individual shops didn't survive.)
The new router arrived, and we eventually got it set up without installing any apps. It is SO much faster than the little white circle from Google (and the signal strength bar is green rather than yellow when my laptop's on the desk in the bedroom), although we haven't gone all "office space printer" and smashed the google circle with a hammer yet because we're giving it a few days.
The fiber connection itself is actually quite nice: the router SUCKED. The _service_ is mixed: why can't we get a static IP for less than twice what we're paying for the connection now? It's LITERALLY THE SAME SERVICE with a trivial config tweak. Wasn't the whole point of IPv6 that even if you can't get a stable IPv4, everyone everywhere could have a stable IPv6? But no, they want to capitalism at us.
Downloaded a fresh LFS book, and the magic all-in-one source tarball which should probably be better documented, and I should automate another build and then try to get mkroot to do it. I can insert a toybox dir at the start of the $PATH and switch over commands one by one, just like I did with busybox back in the day.
Alas I'm not feeling inspired, because I have too many open tabs. Closing tabs tends to be hard because they're all only still open if I didn't get them closed last time I sat down at it. But starting anything NEW just makes it worse. And it's deep into "if I work on anything specific I'm not doing anything ELSE" territory. Generally a sign I'm still undervolt. (The cedar pollen is not helping.)
Still not feeling great, but I should do stuff.
I reached the point of editing and uploading blog posts where the entire entry for Feb 22 is "Oh god, kernel people." I know exactly what that's about but... really don't WANT to expand it? For the same reason I stopped replying to the kernel threads. Can I just use old kernels? I want them to stop breaking stuff that USED to work.
Sore throat, couldn't sleep. Spent most of the day huddled on the couch.
Tried to watch the "campfire cooking" isekai with Fuzzy, which Did Not Work because of the stupid Google router continuing to die. (I tried associating my phone with Google's router to save bandwidth for like five minutes when I got back, and then undid it again because even when T-mobile is throttling me for going over my 50 gigabyte monthly quota it's STILL WAY FASTER THAN THAT STUPID ROUTER.)
This was finally enough for us to break down and get a new router. It was $50 cheaper to overnight a netgear from Amazon than to buy the exact same router at the Best Buy a fifteen minute walk from here. So far it looks like it needs an app installed on somebody's phone to set it up (the card in the box says what app to install, or gives a URL to talk to a support being; no other instructions), so we haven't actually swapped it in yet, but there SHOULD be a way to talk to it directly...
Sore throat. Kind of lurgy-ish. Trying to figure out if this is allergies or dryness or microorganisms. Possibly it's a team effort. And, of course, I'm old.
Jeff got his contract signed, which means I may be heading back to Tokyo to help him organize the giant archive of stuff we did so it can get spliced together into a new product. Historically speaking, I can do toybox stuff from tokyo MORE easily than from Austin (he hates Apa hotel rooms, I find them just about my platonic ideal of a work environment, with a conbini downstairs for lunch rice balls), so...
We rebooted the Google Fiber router yesterday because it had become unusable again. Today it's already bad enough that reloading the household slack tab (after a "pkill -f renderer" because chrome was taking up too much memory again) did the ?cdn_fallback=1 thing, then added ?force_cold_boot=1 for the third attempt, and then timed out saying it couldn't contact slack.
I don't mind google.com taking 7 seconds to load nearly as much as I mind being completely unable to use some sites, or thunderbird pausing for ~3 seconds between each email it downloads via pop3 (meaning a 400 message download takes over 10 minutes, so downloading my ~1500 daily messages is a background task that takes over half an hour).
Capitalism's really BIG failure is externalities. Engineers should be forced to dogfood their own products. I want THIS router put on the desk of the person who designed it, with all their traffic going through it, and to be forbidden from rebooting it for a week.
And yes, I'm happy to dogfood toybox. The main reason I don't already is I want a feel for what the other versions do so I can make toybox roughly match it. (When your frame of reference is your own output, it's really easy to spiral off into the weeds.)
Much wrangling with cpio, trying to fix three different issues. Got two of them fixed, calling it good enough since the third isn't a regression and nobody's waiting for it. (That's the "TEST_HOST fails, when did that start?" Moving targets...)
Pondering (st.st_mode&S_IFMT) == (mode&S_IFMT) and wondering if the compiler is smart enough to turn that into !((mode1^mode2)&S_IFMT) or if that's even a win. (3 operations vs 3 operations, although ! is only an operation sometimes? It could also go r1 = S_IFMT; r2 &= r1; r3 &= r1; branch-not-equal r2,r3 or some such. The repeated constant is PROBABLY something the compiler can handle for me: I don't need to go "that's redundant, I could rephrase it in a way it's not stated twice" and then ponder whether or not that's actually an improvement.)
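For the record, the two spellings side by side (provably the same test: the masked values are equal exactly when no bit under the mask differs):

  #include <sys/stat.h>

  // Mask both modes down to the file type bits and compare.
  int same_type1(mode_t a, mode_t b) { return (a&S_IFMT) == (b&S_IFMT); }

  // Xor first so the constant's only stated once: any bit set under the
  // mask means the types differ.
  int same_type2(mode_t a, mode_t b) { return !((a^b)&S_IFMT); }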
Ahem: premature optimization. Back away slowly.
Ok, I _think_ for the help fixes: "toybox --help COMMAND" should print Elliott's advertising line and "toybox help command" should not, and "toybox --help" is equivalent to "toybox --help toybox", but "toybox help" is equivalent to "toybox help help".
This is all UI stuff, so there isn't a right answer, but I'm trying to come up with an answer that makes sense for me without obviously disappointing anybody else.
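So the target behavior looks something like this (a sketch, not a captured transcript, with the usage lines abbreviated):

  $ toybox --help ls               # advertising line, then the help text
  Toybox 0.8.9 multicall binary (see https://landley.net/toybox)
  usage: ls [...]
  $ toybox help ls                 # just the help text
  usage: ls [...]
  $ toybox --help                  # same as "toybox --help toybox"
  $ toybox help                    # same as "toybox help help"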
I have 8 zillion accumulated 80/20 patches where I've done most of the work and then hit "does this cover all the cases, what ARE all the cases, and what are all the test cases I need to put this through to prove that?" and I can't quite work that part out. I'd very much LIKE to check all this stuff in, but making sure it's _right_ is hard.
The sad part is I keep trying to grab low-hanging fruit, finding out the thing is not low hanging fruit, parking it at a good "almost finished but not feeling up to finishing just now" parking spot, grabbing OTHER presumably low hanging fruit, and then coming back a couple weeks later and having to reconstruct my mental state from scratch.
The external bug reports are actually easier to field because somebody else is waiting for me to finish and I can tell whether or not I've fixed their test case.
I'd like to have a nommu test system that runs under qemu, and "coldfire" (an m68k variant) is the oldest of the lot. The problem I had back under aboriginal linux is none of the nommu board emulations had the complete set of hardware devices I wanted (256 megs RAM, battery backed up clock, serial I/O, two block devices, network card), but most things have a serial console and I can fake the clock with sntp or an environment variable, and if I have a network card I can use network block devices. It's not ideal, but it's _something_. Alas I can't use swap on nommu so a board with only 64 megs ram isn't running modern gcc on anything complicated.
Alas qemu is terrible about labeling its boards (it's getting better, but there's no docs/system/m68k yet). I can go "qemu-system-m68k -M ?" and I THINK the first two coldfire boards there (as opposed to the with-mmu ones that Linux inexplicably won't let me build a nommu kernel for) are an5206 (Arnewsh 5206) and mcf5208evb, the latter of which is the default board. As far as I can tell (from reading through hw/m68k/an5206.c and hw/m68k/mcf5206.c) the 5206 has 128 megs ram but no hardware except a serial port? The 5208 has one network card, which is at least something.
So, back to the linux source: arch/m68k/configs has a file m5208evb_defconfig so let's build that and see if I can feed it to qemu-system-m68k -nographic -no-reboot -kernel vmlinux and hey: boot messages! Panicking because no initramfs. And the -no-reboot is ignored, implying this board doesn't know how to reboot or power off, which is... sigh.
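For my own notes, the whole sequence boils down to roughly this (the cross compiler prefix is a placeholder for whatever m68k toolchain is lying around):

  make ARCH=m68k CROSS_COMPILE=m68k-linux-musl- m5208evb_defconfig
  make ARCH=m68k CROSS_COMPILE=m68k-linux-musl- vmlinux
  qemu-system-m68k -M mcf5208evb -nographic -no-reboot -kernel vmlinux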
Memory goes from 40000000-41ffffff which... echo $((0x1ffffff)) is 32 megs ram. That's a bit squished. And it ignores qemu's -m option to try to give it more, which beats the cortex-m boards that were erroring out when you gave it any -m value other than the default. (QEMU may be undocumented, but at least its behavior is inconsistent.)
What else is in these boot messages: ttyS0 is the "mcfuart" driver. A dozen TCP/IP layer boot messages about hash table initialization and such but no line about the actual network card initializing itself. (Doesn't mean it didn't, which messages happen at which printk verbosity level is kinda potluck in embedded board drivers.) Oooh, mtd probe address, we've got a Memory Technology Device which means flash chip. Data storage onna block device, which QEMU might be able to stick a host file under. /dev/mtdblock0 which the "initramfs didn't work" root= fallback logic tried to mount as ext2... because apparently the default kernel command line (from qemu? built into the kernel?) is root=/dev/mtdblock0 and WHY does it bother saying /dev/ there? Honestly, what's the alternative?
Ok, I got a kernel to boot and spit out messages to serial port, which means I MIGHT be able to get an initramfs to boot to a shell prompt with serial console, even if I don't have any other I/O devices working yet. Assuming I can figure out how to get musl to...
Ah, darn it. I did this a year ago. And why did google not find musl's official web mirror on openwall? Google searches are getting RAPIDLY less useful, it's very annoying. I manually navigated to the right place but for some reason Google can't find that. Do THEY have a borked robots.txt? No, looks sensible. This is just Google increasingly sucking. I hope they recover.
Anyway, yeah, that's why I didn't do this earlier. Puppy eyes at Rich time again, I guess?
Took ADHD meds _and_ a store brand Zyrtec AND a prophylactic ibuprofen this morning, just for good measure. Actually able to concentrate for once, at least so far.
And Elliott's having build trouble on mac, which... how slow is it to launch executables on mac? Is it just a homebrew thing, or are all mac binaries latency spike city? And yes, I should have realized old version of bash without "wait -n" isn't just a centos thing, it's also a mac thing. So my centos hack is insufficient if you care about the mac build being well-supported, which Elliott does.
Checked in the fixes for the warnings from yesterday.
Grrr, tests/files/* is design-level wrong, but it would take a largeish rewrite to make it right. I need generally better organization for "not the actual toybox source" files: scripts/make.sh and scripts/mcm-buildall.sh and scripts/mkroot.sh and scripts/root are all slightly different categories.
Cycling back to the "help" redo...
Dear compiler loons:
toys/posix/ls.c:393:16: warning: too many arguments for format [-Wformat-extra-args]
printf(" "+FLAG(m), 0); // shut up the stupid compiler
But if I yank it, llvm goes:
toys/posix/ls.c:393:16: error: format string is not a string literal (potentially insecure) [-Werror,-Wformat-security]
printf(" "+FLAG(m));
So one compiler warns if you give it an extra argument, the other warns if you DON'T give it an extra argument, and in NEITHER case is it an ACTUAL PROBLEM. (Sigh, switching it to xputsn() but still. This is unsuppressable false positive noise. Stop it.)
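For anyone who hasn't seen the idiom, " "+FLAG(m) is just pointer math on a string literal; spelled out it's:

  // flag is 0 or 1, so this returns either the space or the empty string
  // (a pointer to the literal's NUL terminator).
  const char *sep(int flag_m)
  {
    const char *s = " ";

    return s + !!flag_m;
  }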
Meanwhile, gcc is also going:
toys/posix/cat.c:31:32: warning: the omitted middle operand in ?: will always be 'true', suggest explicit middle operand [-Wparentheses]
int i, len, size = FLAG(u) ? : sizeof(toybuf);
It's not "true", it's a constant 1. Guaranteed by C99. I WANT it to be a 1. When the flag returns 0, I want to replace it with sizeof(). That's what that code is DOING. (This warning showed up when I added the !! because previously it was (integer&mask) which coincidentally was 1 but gcc wasn't treating "1" and "true" as different because THEY ARE NOT DIFFERENT IN C, THAT IS A C++ THING AND C IS NOT C++.
On the bright side, flag position is less important, so less of a lurking land mine. The cost is that gcc's rapacious stupidity triggers on more irrelevant crap. I'm sad I can't just compile with old toolchain versions from Before The Stupid, but I did that in aboriginal and there was a limit.
Garrett (the uclibc++ guy I worked with at timesys way back) drove through Austin and met me for lunch, and we wound up talking for 5 hours. (Rudy's no longer has the reasonably sized reusable plastic cups, it's styrofoam now. Oh well, I've still got like 10 of the old ones.)
Too tired to do more programming after that, although I'm not sure how much was the truly insane quantity of cedar pollen in the air today. Yesterday's apocalypse du jour dropped the temperature 30 degrees, which always wakes up the cedar trees this time of year and gets them bukakkeing their needles off. The recent apocalii also left us with a large pile of broken branches out front, between the ice storm and the tornado warning, and it's a race between mail-ordering a hatchet to make firewood and municipal brush collection to see who gets them first.
How limp did I go after getting back home? I was 3 days behind on reading my webcomics.
Sigh, I was so _amazingly_ spoiled by the speed of Fade's internet connection. I'm back here with Google Fiber and pages are taking 30 seconds to load, and email is downloading at one message every 3 seconds. (In batches of 400. It takes a bit.)
Finally applied Elliott's pending patch. (Saw it in the web archive yesterday but hadn't downloaded enough email to grab a local copy; yesterday I took my laptop to Wendy's and HEB and neither offered net access. Phone tethering drains the laptop battery fast, and the radio signal situation in Hancock center is appalling: it can't see my WIFI access point if I lay the phone on the keyboard, and around the Corpse of Sears (which Wendy's is across the parking lot from) my bluetooth headphones need my phone within 6 inches of my left ear to avoid dropouts. Plus t-mobile did the "you have used 48gb of your 50gb gratuitous metering quota before we throttle the hell out of you" ping in the airport, and it doesn't reset until the 5th...)
So FLAG(x) now uses !! to force the return value to 0 or 1, which gets optimized away when it's used as a logic value. Audited all the users to remove a bunch of existing VALUE*!!FLAG(x) that are now redundant, and removed several subtle dependencies on a flag having a specific value along the way (some of which were commented, some weren't, including at least one subtle bug introduced by a commit that moved flags). There's still several VALUE*!FLAG(x) which now turns into VALUE*!!!(x&y) but the extra ! also get optimized out.
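The shape of the change in illustrative form (the real macros live in generated headers, all the names here are made up):

  #include <stdio.h>

  unsigned optflags = 4;                        // pretend -x is bit 2
  #define FLAG_x 4
  #define FLAG_OLD(x) (optflags & FLAG_##x)     // evaluates to 4 here, not 1
  #define FLAG_NEW(x) (!!(optflags & FLAG_##x)) // always 0 or 1

  int main(void)
  {
    // The old macro makes VALUE*FLAG(x) depend on bit position: prints "40 10".
    printf("%d %d\n", 10*FLAG_OLD(x), 10*FLAG_NEW(x));

    return 0;
  }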
Whole lot of little style fixes as long as I was doing a review pass: spaces around the = in assignments, removing inconsistently used parentheses, str = FLAG(x) ? "" : "K" becoming str = "K"+FLAG(x), etc. A few cases of "FLAG(x) ? TT.x : other" becoming "TT.x ? : other", which is actually subtle: sometimes you check the flag to see if it was set because it's an argument that only takes collated arguments, so --blah=abc sets TT.blah to "abc" but --blah leaves it NULL. But I checked that this wasn't the case, and switched to the "test only one value and hopefully it's still in a register" version. (That said, I _kept_ one in patch.c because TT.p is numeric and could legitimately be -p 0, which is different behavior from not saying -p, so we need to check the flag not just the value.)
Whole lot of other "verification" that each VALUE*FLAG(x) _previously_ used the rightmost flag, and didn't hide a *4 or something. (The one case where it did had a comment.)
While I was there, I normalized todo and Todo to TODO so it's easier to grep for. (Can't just grep -i because "todo" shows up in comments and at least one local variable name.)
This wasn't (intended as) a micro-optimization to shave a few bytes off the code, this was "remove some conceptual land mines", but I did run bloatcheck a few times in hopes it wasn't making the result noticeably larger.
Oh goddess, this chunk of tar.c:
  do {
    TT.warn = 1;
    ii = FLAG(h) ? DIRTREE_SYMFOLLOW : 0;
    if (FLAG(sort)|FLAG(s)) ii |= DIRTREE_BREADTH;
    dirtree_flagread(dl->data, FLAG(h) ? DIRTREE_SYMFOLLOW : 0, add_to_tar);
  } while (TT.incl != (dl = dl->next));
Is assigning to ii but not USING it; the argument to dirtree_flagread() recalculates one of the flags and leaves the other zero. How is this passing the test suite? Would fixing it _break_ the tests?
Fixing it does not break the existing test suite. I'm gonna fix it and see who (if anyone) complains? (I think it might only affect sorting at the top level, which might not be a thing since even when the top level is a directory that's one entry. I need to think through it and come up with a test, which I dowanna do now because this is big and I want to get it CHECKED IN.)
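The presumed fix (pending a test proving what it actually changes) is just passing the flags we computed:

  do {
    TT.warn = 1;
    ii = FLAG(h) ? DIRTREE_SYMFOLLOW : 0;
    if (FLAG(sort)|FLAG(s)) ii |= DIRTREE_BREADTH;
    dirtree_flagread(dl->data, ii, add_to_tar);  // use ii instead of recalculating
  } while (TT.incl != (dl = dl->next));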
I have SO much half-finished crap in my tree I need to FINISH and FLUSH. The recent help plumbing changes aren't quite done yet. My most recent bout of shell work. The lib/passwd.c rewrite. I get a bunch done but don't make it over the hump, so it accumulates instead of reducing. Need to CLOSE TABS...
And I think even tar --sort isn't going to sort the command line arguments? They get processed in the order provided; the CONTENTS get sorted. Which still gives us the stable ordering, which is what they were after... Ahem, NOT FOLLOWING THE TANGENT RIGHT NOW.
Still kinda collapsed. I have pending email from 2 people to reply to, and spent most of the day not doing it. So many tabs to close...
Alright, the design issue with the --help output is when should it have the toybox summary line? Going with my most recent release binary, it looks like "toybox --help ls" prints it but "toybox help ls" does not? I can work with that...
Sigh, there's a lot of THINGY*!!FLAG(x), and Elliott's most recent patch also modified code that assumes FLAG(x) produces 1, which is an artifact of position. (There's a comment about making sure the flag is in the right place in the optstr. That's... more brittle than I like.)
Possibly the FLAG() macros should have the !! built in? I should check whether the optimizer is smart enough to produce the same code. (No, I am not going to start using the "boolean" type.) Time to dig out make baseline and make bloatcheck! Which don't quite work here, because changing toys.h at the top level doesn't get dependency checked to cause a rebuild, and "make clean" deletes the baseline out of generated/unstripped. Workaround: rm -rf generated/obj before make bloatcheck. The result is not _quite_ the same output (with gcc, anyway):
- In do_sha3sum() it's because we care about the flag position, which should be masking instead of using the FLAG() macro anyway.
- In do_gzip() it's because we're passing the value to a function which does not appear to be getting inlined, so even though it's only ever used as a logic value the status doesn't propagate far enough.
- In cp_main() FLAG(f) and FLAG(n) are being assigned to local variables which are then used as logic values... which shouldn't make a difference to code generation, but does? Ha! When I yank those local variables and just use the FLAG() macros directly, it shrinks 34 bytes!
- In touch_main() it's another "we care about flag position" thing saving 3 bytes: I'll live.
- cpio_main() is another "flag is 1" with a comment, and it also assigns FLAG(t) to a variable which only cares that it's nonzero, but the variable's incremented a couple times later (to make it nonzero), so... take the hit.
- In cksum_main() FLAG(L) is passed as a function argument, so "zero or nonzero" must become "0 or 1" (and crc_init() is in lib/ so I don't expect it to be inlined across compilation units).
- Still kinda surprised su_main() isn't in pending because that whole subsystem is still unfinished, but reset_env() takes FLAG(l) as an argument, and it lives in lib/ so isn't inlined and doesn't see the value's only used as true/false.
- In pidof.c print_pid() returns FLAG(s), and that function isn't being inlined because its function pointer is passed to names_to_pid().
- Ha: nl_main() was doing another "depend on the flag being in position 1" but did NOT have a comment about it... and there's about 5 more of those. Sigh.
Huh, patch -R looks broken: apply_one_hunk() did reverse = FLAG(R) and then part of the "allow fuzz" test was c=="-+"[reverse], which means it depended on FLAG(R) being 1, but when Elliott added -s in commit 6f6b7614e463 I didn't catch that he put it at the end and moved R to 2, meaning in the reverse case it'll be comparing against the NUL terminator instead of the '+'. And we don't have a test for autodetecting fuzz. So adding the !! would actually _fix_ this.
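A standalone demonstration of the breakage, with the flag value hardcoded:

  #include <stdio.h>

  int main(void)
  {
    int reverse = 2;  // what FLAG(R) returned once -s moved R's bit

    printf("%d\n", "-+"[reverse]);    // 0: walked off onto the NUL terminator
    printf("%c\n", "-+"[!!reverse]);  // '+': what the comparison meant

    return 0;
  }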
Alright, I think I want to audit all the FLAG() uses in toys/*/*.c because there's a lot of !! I can now remove, and I should be consistent about not parenthesizing VAL*FLAG()|VAL*FLAG() because * is higher priority than |. It's a pity there's no "make test_dmesg" to make sure I didn't break that. I expect this is gonna come up a lot in a treewide audit...
Day after travel. Collapsed.
Flying back to Austin.
Oh hey, another email in my inbox this morning about what somebody thinks I SHOULD be doing instead of what I am doing. (Watched a good video on "autistic inertia". I've mentioned before that I work based on momentum, and there you go. Having something Looming can either be extremely motivating (avoidance productivity: I will do SO much cat waxing to "virtuously" avoid the Looming Thing), or extremely demotivating (loss of momentum and traction because I can't muster the executive function to address picking up that piece of paper, it just won't budge).)
Anyway, ignoring the "linux-kernel community is so broken" pile, the question du jour in my email was:
what is the official toybox opinion on rust being added to toybox?
And "My gut reaction is "Oh goddess not again" and I've been actively ignoring it?" was too short, so Pascal's Apology kicked in, and I replied:
Define "added"? I'm not putting a rust compiler in toybox, if that's what you mean?
If you mean "should I implement some commands in Rust and some in C", having a single simple context everything is done in the same way is part of toybox's design goals? Early in Toybox development the build needed Python, and I cleaned out that build dependency so it's all C and bash, and I'm implementing my own bash compatible shell so toybox builds under toybox. Early on I even had some commands implemented as shell scripts, and I wound up removing them again and doing them in C even though I planned to ship a shell interpreter, because I wanted the whole thing to be a single file with no external dependencies which you could statically link and drop into an empty chroot directory and have it just work.
If you mean "rewrite the whole project from scratch in a different language", long ago I was thinking of rewriting the whole of toybox in Lua but the problem I hit is that Lua doesn't ship with a standard set of posix bindings so I had to install something like 7 different prerequisite packages just to manage things like "wget", let alone implementing "mount" or "ifconfig", and if I had to implement/ship my own new Lua bindings written in C (and cross compile those to every supported target architecture) I might as well just do everything in C. (Which is a pity, Lua was quiet elegant, but their deployment strategy was too minimalist to be usable on its own.)
If you mean how would Rust affect my variant of countering trusting trust then having the project be in multiple languages again kinda defeats the purpose of a minimal installable base capable of reverse engineered binary auditing.
If you mean coming up with a replacement tiny system written in a single language that's both learnable the way minix and xv6 are _and_ scales up to actual load bearing deployment in real world usage (the way Linux 0.95 through about 2.2 did)... I'm still trying to make that work in _C_ (well a non-GPL one; I had it working with busybox but the insane FSF poisoned that well so thoroughly with GPLv3 around 2007 that I wound up starting over). I'm told the Rust compiler is now written in Rust and dunno what its system call binding approach is, but I still await a Rust kernel that actually ships in a product. (Even a VxWorks level of kernel: I decided to wait for that when I saw that blonde lady's Rust talk at linuxconf.au in 2017 and I'm still waiting. Heck, even something as silly as Fuchsia, just something somebody somewhere actually used for something in a non-demonstration manner. There's a dozen different My Little Kernel variants people have done, but nobody actually seems to do real work in Rust? It's all either reimplementing stuff that already exists because "Ew Icky C", or "here's how we're going to change the language governing bureaucracy" and "here's how we're going to add yet more complexity to the language", and it's very tiring...)
Show me a serious attempt at a system that rebuilds itself under itself from source code, all written entirely in Rust with no C anywhere, and I might start to care? ADDING Rust on top of existing complexity is just more xkcd standards layering. (Yes, you have garbage collection and bounds checking like Java did in the 1990s. Yes you have native compilation to binaries like Java did with IBM's Java Native Compiler back in the 1990s. Yes you have a Big Marketing push and drive to rewrite everything in this one language like Java did in the 1990s. Yes you have a strong argument that C++ is a terrible language like Java did in the 1990s, which is not the same as C being a terrible language but try telling any C++ developer that. From a safe distance. Bring popcorn.)
If you mean the "Rust is inevitable, the same way Hillary Clinton was in 2008 and again in 2016", I note I've lived through the following:
- 1990: We don't bother to teach C in college, we teach Pascal instead.
- 2000: We don't bother to teach C in college, we teach Java instead.
- 2010: We don't bother to teach C in college, we teach C++ instead.
- 2020: We don't bother to teach C in college, we teach Rust instead.
I learned C in 1989, spent about 1992-1995 doing C++, and then was all in on Java as my main programming language from 1996 until about 2000, caught the Python 1.x->2.x transition and then bowed out again when staying on 2.x actively offended the 3.x developers... The C I learned way back when remains relevant. If I tried to write new code in the ~1995 version of any of those other languages it wouldn't build in modern environments.
I was part of the "rewrite everything everywhere in Java" crowd for about 5 years. My bug report was the reason the ability to truncate a file was added to Java 1.2. I worked on IBM's port of JavaOS to the PowerPC in 1997, taught Java at the local community college in 1998 and 1999, designed a hard realtime garbage collector... It was really exciting. (I wrote a little about how that ended in my blog.)
Any time someone goes "why aren't you using Rust" as an accusation, I treat it the exact same way as the C++ and Java people doing that before them. I had 20 years of Windows people asking why I didn't do windows (smoothly transitioning from rejecting OS/2 to rejecting Linux). I don't care if "everybody's doing it", I've never had a Facebook account either. It's not _my_ job to "be convinced". Lua had "here's cool stuff Lua does better", which appealed to me enough to take a look. I have yet to see arguments in _favor_ of rust, they've all been _against_ C. "C bad, icky and dangerous, we blame you for perpetuating it, you must stop now". No thanks.
A big reason I keep coming back to C is I can stay 10 years behind on the standards without a problem. Heck, I can still compile K&R stuff from 1978 if I really need to. The main deficiency of ANSI C from 1989 is that the first 64 bit processor came out in 1991 so the 64 bit "long long" type was a widely implemented compiler extension that worked its way into the standard later. I only moved toybox from C99 to C11 recently because of like ~3 minor convenience features (typecast array/struct literals, the "has_include" macro, and an alternate "inline" syntax that let us work around an llvm bug that's probably since been fixed).
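For reference, roughly what those conveniences look like (illustrative code, not quoted from toybox, and the inline syntax workaround is omitted):

  #include <stdio.h>
  #if __has_include(<unistd.h>)  // probe for a header without dying if absent
  #include <unistd.h>
  #endif

  struct point {int x, y;};

  int main(void)
  {
    struct point p = (struct point){.x = 1, .y = 2};  // typecast struct literal
    int *arr = (int []){1, 2, 3};                     // typecast array literal

    printf("%d\n", p.x+arr[2]);  // prints "4"

    return 0;
  }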
Rust still hasn't settled down and decided to be nearly that stable: from a distance it looked to me like the first decade or so of the language was just WILD THRASHING leaving the language unrecognizable 5 years later, and now it sort of knows what it is, but still changes?
Has anybody made Rust work on a nommu system? Or only XIP from read only storage with 256k of sram? (Which Linux has been made to do, for example. Good luck pulling that off with garbage collection...) If not, your argument is "we'll still need C, but just less of it, so a smaller pool of people will have less expertise and age out without replacement". That's kind of Tesla's version of the self driving car argument: 99% of the time it'll drive for you just fine, and the remaining 1% it will crash and/or kill pedestrians and we're calling that the driver's fault but the driver won't be paying attention and may be way out of practice assuming they ever knew how to drive in the first place. How this is supposed to be a net improvement, I couldn't tell you.
Is there a Rust version of tinycc? What's the smallest, simplest Rust compiler out there? (Tinycc could happen because the language wasn't a moving target. If I decided to pick it back up and bang on it again the old stuff I did is still theoretically relevant. Is even a 5 year old version of Rust still relevant?)
If you want to implement commands in rust yourself, you can stick them in the $PATH and it should just work. Is there an obvious reason this should have anything to do with toybox? The "start over and rewrite everything in rust" approach like I was poking at doing with Lua would mean getting all four packages written in Rust. And preferably a stable version of Rust where a newbie could grab an existing system deployed 10 years ago and not touched since then, fire up the old build, reproduce it, understand it, and be able to modify it. As far as I can tell, this isn't a thing the Rust community _wants_, let alone is actively trying to achieve.
Sigh, I haven't got anything _against_ Rust, any more than against Ruby or PHP or Lisp or Prolog. I just don't care. Nor was I _offended_ by the people submitting forth and lisp interpreters (yes, plural) to toybox over the years. (In the absence of toysh, people have decided it needs a programming language.) I understand this guy's interest, and would like to politely decline... except I DO have something against projects like systemd that don't give me a graceful option not to participate, and the push to rewrite the linux kernel in rust without forking it is exhausting in the same way the build requiring perl was exhausting.
This guy didn't exactly knock on my door with a rust version of The Watchtower to tell me the good news about our new savior, but... I'm not getting "live and let live" vibes from this community either.
(I have a youtube video bookmarked, which claims to explain Rust in an hour. It's on my giant to-watch heap. I'm not AGAINST Rust. I just... still don't see the point?)
Fiddling with toybox help plumbing. Kinda spiraled.
So "toybox --help toybox" wasn't producing output, because of fallout from changes to prevent "toybox toybox toybox" stacking arbitrarily deep (and blowing the stack now that Linux doesn't necessarily enforce environment size limits even on mmu systems). So I started poking at that, but the show_help() flags API did the old "this argument was a yes/no boolean, then it grew a second bit, then it grew a third bit, and now it needs #defines" thing that I hadn't cleaned up yet. And while I'm there, "help -au" should print the usage lines for all commands, but calling help as a shell builtin does unique filtering so what happens when you "help -u" on the builtin? And the "See:" logic isn't filtering right as a builtin (redundant lines). And this whole "Toybox 0.8.9 multicall binary (see https://landley.net/toybox)" line at the start (which wasn't my idea, but then calling Linux "linux" wasn't Linus's idea either) should only be output SOME of the time and when is that some?
I keep trying to do quick fixes that wind up touching a half-dozen different files and leave off unfinished after hours of work and then it just ADDS TO THE MESS.
Flying back to Austin on tuesday. Not up for programming stuff today. Reading fanfic on AO3 instead.
Some months back I posted an observation about the Tardis to mastodon, which is why I want one. Just catch up on everything and come back when you're feeling up to it.
I wrote up an email reply which is a bit rambling and off topic for the toybox list (see "not up for" above, combined with pascal's apology for writing a long letter, substituting "spoons" for "time") so here it is instead. The context is that Michael Kerrisk, the man-pages maintainer, retired and handed the project off to a new guy, and didn't properly announce it (quietly added a co-maintainer to the git repo and then ghosted everybody), and now that we've finally figured out what HAPPENED we're trying to adjust.
On 2/24/23 11:46, enh wrote:
> > Possibly the new maintainer needs to poke Konstantin to get access to update the
> > directory, and then put stuff under the actual kernel.org page? (Or you could
> > put some under an android.org location? Either way they'd be up to date with the
> > repo instead of a couple years behind...)
>
> yeah, that's one of the options... generate the html and stick it on one of the
> android-specific sites, but that seems a bit odd (people are already confused by
> places where the man pages are actually only talking about glibc; hosting them
> on an android site would only make that worse) and there are already a lot of
> links to man7.org out there in the wild, that it would be
> unfortunate to see go stale. (though if no-one has access to man7.org
> any more, there's nothing we can do about that anyway.)
The downside of depending on individuals is you're inconvenienced when they cycle out. The downside of depending on organizations is they're all just a bunch of individuals who get together and collectively pretend, so things go just as pear-shaped when the people actually doing the work leave without a proper handoff to someone else who will actually do the work well, but you tend not to notice as fast (before _or_ after: see the Linux Foundation's consumption of the Free Standards Group and thus the Linux Standard Base). This lack of warning isn't necessarily an improvement.
Ahem: man7.org was offered as a community resource but is actually Michael Kerrisk's personal page and he is not handing it off to the next guy. (The maintainer of landley.net does not get to throw stones here, although all the toybox.net variants are camped by people who want thousands of dollars.)
The responsibility for the man-pages git repository was handed off (resulting in the repo effectively moving to a new URL, which nobody seems to really care about), but not the website or the release announcement email list. (Haven't gotten one since; is it still having releases?) Whether it's a good idea for the project to move to more of a "package deal", where there's a repository+website+mailing list that can be passed to a new maintainer as a group, is sort of a design issue.
Jeff Dionne set up the original uclinux project, which I believe busybox.net was modeled on: after Lineo ended in the dot-com crash and the kernel parts got merged upstream, Erik Andersen kept the busybox+uclibc subset of uclinux going as a personal project. He handed busybox.net off to me in 2005 by giving me a login to the server (it moved from the DSL line in his basement to osuosl, but Erik still pays for the domain renewals). When buildroot forked off of uclibc, I'm the one who abused my root login on the shared server image to create a new mailing list and kicked the buildroot traffic off to the new list. (Alas, too late to save uClibc.) Buildroot has since separated itself the rest of the way from uclibc (its own VM with its own domain), so it's not inconvenienced by shared infrastructure going down (as has happened a few times, what with uclibc being dead and all), which also means that handing over the keys to a new maintainer is a thing buildroot could potentially do if necessary.
Sigh, somebody should write up a non-stream-of-consciousness "handing over the keys of an open source project to a new maintainer" document. Do you even have a manifest of what all the project's resources ARE for something as big as Android? Not that Google's ever going to hand off Android. I remember when Red Hat set up Fedora, which pretended to be independent until Red Hat finally admitted it was just Red Hat Enterprise Rawhide. (So an independent Centos emerged... and Red Hat bought it.) Anyway, the point is when people/management change, the project's gonna wobble no matter what the corporate structure says, because it's people who do things and know things and remember things.
> > Let's see, how hard is it to produce html output from this git repo... it's got
> > a top level Makefile to do exactly that as its default target, but it wants a
> > package called "man2html". And installing that on my laptop installed apache
> > which LAUNCHED AN INSTANCE ON LOOPBACK. Why on EARTH would... that's just sad.
> >
> > But ok, I can uninstall it again after building... looks like it populated
> > tmp/html with files? No top level index. Let's see, the first file under "man3"
> > is __after_morecore_hook.3.html which seems to be a synonym for malloc_hook (not
> > symlinks or hardlinks, just redundantly generated files). The "Return to main
> > contents" link goes to file:///cgi-bin/man/man2html which does not exist. The
> > #include link goes to file:///usr/include/malloc.h which ain't gonna
> > work on a web server either...
> >
> > Looks like there's the start of something workable here, but it needs a bit of
> > shoveling? (Or at least digging into how to configure it?)
>
> yeah, and one problem with being part of a large bureaucracy is that the docs
> folks and the branding folks will all want a say in making it look "right" if
> it's on an android site!
My first really well-paid consulting gig was working at a dot-com that was managing a rewrite of IBM's mainframe pricing and sales system. Various departments within IBM had wrestled for control of the project so extensively that upper management had outsourced our bit of it so NONE of them had it. Taken the ball away and given it to someone else entirely so they'd stop fighting.
Which meant my job was to be on an 8am conference call with IBM Europe (Böblingen, Germany: initial deployment) and IBM USA (Poughkeepsie and Dallas, one did frontend one did backend), a 6pm conference call with IBM USA and IBM Australia (Worldwide Integration and Test, it was _not_ in didjabringabeeralong because that's a Discworld reference but don't ask me what city it WAS in, somewhere that was simultaneously under water and on fire at one point but that came later), and when I needed Australia and Europe to talk to each other that was a 3am call and I slept under my desk AND BILLED FOR THE TIME. (The dot-com manager told me to.) I don't think I authored a line of code (for them) the entire contract, the _technical_ part of my job was matching up defect reports that told us to do one thing and defect reports that told us to do the exact opposite (or explicitly NOT to do that thing) and bringing up pairs of them in the meeting.
Somebody eventually explained to me that a specific manager in Dallas (Ken somebody?) had figured out how to get promoted by sabotaging projects: during the design phase he demanded to know why implementation hadn't started yet, then when they started implementing an unfinished design he'd demand to know why it wasn't being tested yet... The answer was always "because we're not ready", but he'd make a stink and get it started anyway, and the reputation was Ken Got Things Done. It wasn't happening before he made it happen. It all collapsed into chaos the moment he left, but that just showed how vital he'd been, didn't it?
So this project had fundamental design changes coming in regularly, requiring not just multiple complete rewrites years into the project, but constant changes to the test plan. ("Why can we never get real database data to test with?" "It's their strictest trade secrets." "What is this system for anyway?" "Pricing 360 mainframes." "How much do those usually cost?" "That's not how it works, the salesman figures out how much the customer is able to pay, and then they produce an invoice that adds up to that amount." "So this whole system is a giant bullshit generator that emits nonsense to produce a predetermined result?" "The invoice has to be reproducible and comply with a bunch of legal and regulatory clearance issues, you have to word things right for the technology to be exportable to various jurisdictions..." "You didn't answer my question." "No I did not.")
Eventually the Australians did a slimy (ok, clever) political thing to extricate themselves from this clusterfsck death march, by declaring that one of the endless thrashing "release candidates" they'd been given had PASSED THE TESTS: they certified it as deployable, closed out their budget, and scattered the testing staff to the winds (er, reassigned them to other projects). Completely ignoring the fact that the testing they were doing was useless (something nobody else could call them on because nobody could, for political reasons, admit it to be true in so many words). They took a random passing snapshot in time of the vague contradictory specifications they'd been given and ran the red queen's race fast enough to catch up just long enough to call Bingo. They'd been given an impossible job and claimed to have done it, because ignoring the "it's just busy work until we're ready for you" nature of the thing and instead declaring victory meant they could STOP DOING IT. Which immediately clogged up the pipeline leading to them, because NOTHING MORE COULD BE TESTED, an existential constipation crisis leading to ALL THE PHONE CALLS.

That's about when my 6 months were up, at which point the consulting company all this had been outsourced to offered me a 50% raise to just STAY AND BE ON THE CALLS... and I just couldn't. I couldn't put into words WHY, this was almost 15 years before David Graeber wrote his first article on "Bullshit Jobs". But at the start of the contract, the existing employee who'd been doing it had used all his accumulated vacation time AND some family emergency under the family and medical leave act to take a solid two month sabbatical, forcing them to reassign the project to ANYONE ELSE BUT HIM. They'd thrown money at a passing junior dev to just Sit In The Chair And Be On The Calls, and he'd left me a pile of useless printouts to "get up to speed" with. There WAS no documentation. The job was babysitting, and the burnout was just insane if you didn't understand that and tried to actually accomplish anything ever. I found myself physically unable to just shut up and take the money longer than I'd already done.
Anyway, tl;dr there are sometimes political advantages to having something live outside an organization.
> > > nope... that's still
> > > https://thephd.dev/c-undefined-behavior-and-the-sledgehammer-guideline
> >
> > I apply the sledgehammer to the compiler. (Push back against the abuser causing
> > the damage, don't make the victims endlessly escalate ever-changing "compliance"
> > that's never good enough. Danegeld encourages the dane.)
>
> read the link (or listen to what i've been telling you for the best part of a
> decade) --- the problem is that the compiler folks don't believe we're their
> customer. they don't care about "is it useful?", they care about "microbenchmark
> line goes up?". or, in your analogy "the law is currently on the abuser's side".
Oh sure. I know. Doesn't mean I'm going to stop fighting. (There's a reason I was poking at tinycc/qcc.)
Steven Universe's "that's why we can't fight them", "that's why we have to fight them" line works in here somewhere.
Got an email from Andrew Morton which would have been great if it was the first one, but after Thomas Gleixner's repeated replies ignoring the code and talking about bureaucracy (I actually MET him, and later recommended his company to Taylor Simpson at Qualcomm for handling Hexagon's kernel patch review and upstreaming, nice guy back in the day...), and that Japanese guy going "we voted on this stupid unnecessary API so adding code that renders it irrelevant would highlight how stupid it had been and embarrass us all"...
I'm trying to scrape up the politeness to answer Andrew's questions in a constructive manner. Rather than an honest one. "Who do I expect to merge this?" Nobody. I do not expect the kernel clique to be functional enough in 2023 to merge external contributions from individuals. All of this code was submitted to the list before, and ignored. This is a roundup for people outside the kernel. If people build their own kernels, this can add to their patch stack. If lawyers give me guff I go "look, I submitted it to them, they chose not to merge it for their own reasons". But linux-kernel being a functional place to discuss patches? That's LONG gone.
But I can't just SAY that. It's not USEFUL. I'm not sure what would be, and dealing with them makes me SO TIRED. (Andrew is being polite and constructive! I should do the same! I really should. I'm just out of spoons for kernel "community".)
Many moons ago I was trying to add cortex-m support to mkroot but seem to have lost my notes. (I want a qemu nommu target so I can more easily test nommu support without copying stuff to my turtle board. Yeah, I can tell toybox to enable nommu support anywhere and use the nommu codepaths, but that doesn't prove nothing LEAKED, and that the result actually WORKS on a nommu system.)
My old blog entry from the time just says I was working on it, but doesn't provide useful context like what QEMU board or Linux defconfig I was trying to make work. So we start over.
According to qemu-system-arm -M ? | grep '[-]M' the list of QEMU Cortex-m boards includes "stellaris" (64k sram), the bbc microbit (no obvious Linux target), and two stm32 boards (vldiscovery and netduino), neither of which implements ethernet or block devices. That leaves mps2.
The mps2-an500 and mps2-an511 each have 16 megs DRAM, and qemu's hw/arm/mps2.c has a gratuitous explicit test for an -m trying to increase it and then refusing to do so: if (machine->ram_size != mc->default_ram_size) error_report("Invalid RAM size, should be %s", mc->default_ram_size); (Which seems silly, there's space in the mapping? Oh well...)
Linux has an mps2_defconfig build. I need a kernel config, QEMU board emulation, and compiler that all agree on the target, where "compiler" includes both gcc tuple and musl support. I have a static PIE toolchain for armv7m (I.E. thumb 2 I.E. cortex); I'd like an fdpic toolchain but haven't made that work yet because support hadn't been merged upstream yet last I checked. (gcc, binutils, and linux all need it, not sure about musl?)
Standard ELF has absolute memory addresses hardwired into it, which means you could only run at most one instance of each ELF binary on a nommu system (it's kind of the same problem a.out had with shared libraries: in practice the ELF loader just isn't allowed on nommu). Position Independent Executables (PIE) use relocatable Position Independent Code (relative addresses from a base pointer kept in a register), which is basically building your executables the same way you build your shared libraries, so they can be loaded anywhere in memory. It's slightly less efficient but the security nuts love it because exploit shellcode hasn't got known absolute addresses to use on the target system. FDPIC takes that concept and expands it to make all four of the standard ELF segments (text, data, rodata, bss) independently relocatable, which means your program doesn't require one big contiguous chunk of memory to fit into, but can instead fit into four smaller chunks (which is very useful on nommu systems, where memory tends to get fragmented over time), AND it means the read-only segments can be shared between program instances (five copies of bash can all use the same text and rodata, each one just needs its own data, bss, stack, and heap), but the downside is you need 4 registers to store the 4 base pointers (or have your base pointer point to an array of 4 pointers with an extra dereference on most memory accesses). But that's ALSO something the security guys like, because foreign exploit shellcode can't even know where rodata is relative to text in a given running binary, so it's even fiddlier to exploit.
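To make that indirection concrete, here's a conceptual sketch (NOT the real per-architecture ABI structures, just the shape of the idea): under FDPIC a "function pointer" doesn't point at code, it points at a little descriptor pairing the code address with the data base the callee expects in its reserved register.

/* Conceptual sketch only: the real descriptor layout is defined by each
 * architecture's FDPIC ABI. */
struct fdpic_desc {
  void *entry; // where this function's code landed (text base + offset)
  void *base;  // that object's data/GOT base, loaded into the base register
};

/* An indirect call conceptually becomes: load desc->base into the reserved
 * register, then jump to desc->entry. Code never hardwires where data lives
 * relative to text, which is why the segments can load independently. */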
You'd think the FDPIC loader would be the standard one by now (since it can handle normal ELF binaries just fine: FDPIC is ELF with an extra flag in the header, giving it the OPTION to make the segments non-contiguous but not the obligation to do so). But as with the ext2/ext3/ext4 drivers, the kernel guys went "no, fork it and have a completely separate file that will get out of sync with the other one", and then years later it's a mess...
Oh god, kernel people.
[That's all I wrote for this entry at the time. It's now March 19 and I haven't edited and uploaded past this entry yet because I just don't have the emotional energy to deal with that toxic waste dump, but we're coming up on a month behind, so here goes:]
Thomas Gleixner replied, ignoring the actual code parts and instead having a multi-part exchange entirely about the bureaucracy, where I didn't cc: the right people (I cc'd who get_maintainer.pl said!) and my subject line was wrong and UGH, my DESCRIPTION, and also there's something in some thousand line documentation file I missed but he literally won't specify what it was because the onus is on me to FIGURE IT OUT. And he's also arguing that if a dependency is EVER needed then it's ALWAYS needed, so my patch to be able to build without objtool is conceptually wrong because SOME configurations need that dependency, therefore EVER building without it is a crazy thing to want to do.
Meanwhile, Masahiro Yamada is literally saying that my patch to try the "cc" name before falling back to "gcc" and thus autodetecting llvm in both native and cross compilers (with no other behavior change I am aware of) can't go in because, and I quote: "In the discussion in the past, we decided to go with LLVM=1 switch rather than 'cc'. We do not need both." (With a link to the previous vote.) This was his REPLY to me pointing out that the name "gcc" is like "gawk" and "gmake" (and "gsed" on macos homebrew) and that just about everything else uses the generic name where possible. What's his logic here, "we voted, therefore the topic cannot be revisited"? I just...
So tired.
Got my patch series posted to linux-kernel. The oldest patch in that series was first submitted over 15 years ago, albeit in a different form then. Another one fixes a minor bug I myself introduced 10 years ago, which nobody else has bothered to fix since even when I pointed it out to them.
If you're wondering why I'm tired of dealing with the kernel clique...
Dear bash, up yours:
$ bash -c $'cat << EOF\nthingy'
bash: line 1: warning: here-document at line 0 delimited by end-of-file (wanted `EOF')
thingy
$ echo $?
0
I'm trying to test ERROR PATHS to make sure they exit gracefully instead of throwing ASAN allocation errors, and you have WARNINGS? The shell has errors and the shell has success, having WARNINGS is new territory. (Still exits with 0... I do not have a syntax_warn() function.) Comparing with the Defective Annoying SHell... that accepts it without a warning and also exits 0. Fine, change the error path to be a... strange sort of success? (Dash does not append a newline to thingy, bash does. Dash not doing it strongly argues in favor of doing it, so always newline it is.)
Going through the HERE document parsing logic, I hit a "this can't work" bit (comparing two pointers only one of which gets incremented in the loop), and tried a simple ./sh -c $'cat << EOF\nhello\nEOF\n' test and sure enough it didn't (never recognizes EOF), and tried to find the last place it did and gave up in 2021...
That can't be right. There's a regression test suite. I know it doesn't make it through all the tests, but... I had this working at one point.
Sigh, symptom of swap thrashing: sh is a big command that requires a lot of focus and I've had to do it in small increments with all the other demands. Now that I'm focusing way more on toybox, there's still lots of bug reports that spawn off tangents that SEEM quick but aren't. (I spent a couple days basically SCOPING mkisofs. I need to cycle back to diff. I still haven't set up a test environment to check in the lib/passwd.c rewrite that's actually buildable and testable with the bionic NDK...)
I've fixed a lot of sh bugs that were in front of me, which broke other stuff, and I either hadn't made it through the test suite (because of the expected failures from still missing features) or the relevant test isn't in the test suite yet. So I need to grind away at fixing stuff for a sadly large, hopefully uninterrupted, block of time.
I miss the 36 hour programming sessions of my youth. These days I look up after 4 and need a long walk...
Sed bug report came in while I was poking at the shell double free, and of course the sed thing is another object lifetime rule issue, introduced by the sed speedups which added extra cacheing. Got it sorted I think?
The double free is in an exit path, where the cleanup does not match the assumptions. The HERE document logic adds the EOF marker to the end of the ARG list: not a COPY of the marker, the actual pointer to the original string we parsed earlier. The sh_pipeline variables "count" and "here" let us know we're in HERE document accumulation mode so each time parse_line() gets called it moves the marker and discards it when matched, but the cleanup function called by the exit path isn't looking at that.
There's also something I called "bridge segments" where additional commands that do NOT have HERE documents attached to them get parsed before the line continuation logic fetches the body of the HERE document(s), ala:
$ cat<<EOF; echo hello
potato
EOF
potato
hello
In that case the pipeline segment that "echo hello" parses into would get marked as a bridge (its ->count set to -1) so the parse_line() entry path knows to back up through it and look for uncompleted HERE document segments. Once they're completed it works its way forward unmarking completed segments until it can either return "we have a complete thought, you can execute it now" or finds another reason to ask for line continuation (being in the middle of a for loop or if statement, for example).
The PROBLEM is that when you DON'T complete the HERE document, that extra entry indicating what EOF string we're looking for shouldn't be freed, because it exists earlier in the pipeline (in whatever statement had the redirect) so if you free it in both places... double free.
Alas, while fiddling with this I found MORE wrong cases. For example, if the redirect ISN'T attached to a statement, it gets freed early (when the NOP statement is freed) and thus the HERE document can't be concluded, for a different reason, ala bash does:
$ <<EOF; echo hello
potato
EOF
hello
And toysh can't handle that yet because free("EOF") happens after parsing the first line and then the HERE document fetching use-after-frees it.
I think I need to just xstrdup() it. Premature optimization strikes again, "I don't need to copy this, the original's lifetime is longer than HERE document parsing by definition"... "yes I do to be CONSISTENT".
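The fix is basically a one-liner; here's a minimal sketch with a stand-in type (not toysh's actual structures), just to show why the copy makes cleanup safe:

#include <string.h>

struct sh_arg {char **v; int c;};  // stand-in for toysh's argument list

// Attach the HERE document EOF marker to an arg list. Sharing the parsed
// string's pointer meant two lists both "owned" it: freeing both = double
// free. A private copy lets free_pipeline() free every entry unconditionally.
void attach_marker(struct sh_arg *arg, char *eof)
{
  arg->v[arg->c++] = strdup(eof);  // toybox would use xstrdup() (exits on OOM)
}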
Sometimes "progress" is just adding yet more tests the existing code doesn't pass.
I've been meaning to post my patch stack to linux-kernel for weeks (not because I think they'll merge it but so it's not my fault that they haven't), and hey: Linus did an -rc8 so this isn't the merge window week. Yay extra time, but I sat down to do mkroot builds of 6.2 anyway and... I broke the shell. Darn it. One of those fixes for Eric Roshan-Eisner's fuzzing bugs introduced a strcmp(ex, blah) without a test for ex being NULL, and running mkroot's init script triggers that codepath and segfaults. Stupid thinko, but have I really not tested mkroot in a month? Sigh.
Oddly enough, I'd already hit this and fixed it up in the shell work I did yesterday, but getting that to a good stopping point so I can check it in is tricksy. (It started with an attempt to add the read builtin and there's a lot of half-finished debris lying around the tree.)
Went to Walgreens early this morning and bought earplugs. Much less painful work experience. (I am not a dog person. Never developed the skillset. If I lock Adverb in the bedroom when he's not alone in the apartment, he claws at the door endlessly and will damage it. If I let him out into the center room (combination living room and kitchen), he barks at the front door for maybe ninety seconds every time somebody else in the apartment complex walks through the hall. Fade sits on the bed with her laptop and closes the door, and Adverb thinks that's the correct way to be home and keeps trying to lure me there, but I'm used to a table and chair and this room has better lighting.)
If you're wondering how my day is going, my attempt to add a shell "read" builtin has diverged into reverse engineering my ${variable} expansion code to figure out what all the corner cases are, which led to reading the relevant part of the bash man page, which led to me restarting the bash man page from the beginning, which led to redoing sh_main() flag parsing and adding tests for sh -cs "arg" thingy vs sh -c "arg" thingy, which led to me changing the logic so -c "arg" isn't an arg.c colon attachment (because it isn't in bash: it reinterprets the first argument as a command instead of a shell script, but sh -c -s "echo hello" prints "hello" instead of trying to run -s, and yes I need a test for it), which circled back to me trying to get all the existing tests to run under ASAN, which means tracking down why sh -c '<<0;echo hello' was faulting, which is because TT.ff->pl = xrealloc(TT.ff->pl) sometimes ALSO needs to update TT.ff->pl->end and now I'm trying to work out when that's true. (The only realloc of an existing pipeline segment is when attaching HERE documents to one, which expands the arg[] array at the end, but I need to update ALL the pointers.) And then once I added a loop to check all the pl->end in the pipeline and update them if necessary (which SHOULD happen before function bodies get moved, so it should all be in the one doubly linked list), that revealed a double free error I need to track down.
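The underlying pattern, as a generic sketch (hypothetical names, singly linked and simplified; toysh's list is doubly linked so ->prev would need the same treatment): realloc() is allowed to move the block, so every pointer that aimed at the old address has to be re-aimed at the new one.

#include <stdlib.h>

struct seg {struct seg *next, *end;};

// Grow the list head in place. If realloc() moves it, walk the (new) chain
// retargeting any pointer that still aims at the freed old block.
struct seg *grow_head(struct seg *head, size_t size)
{
  struct seg *old = head, *pl, *neu = realloc(head, size);

  if (neu && neu != old) for (pl = neu; pl; pl = pl->next) {
    if (pl->next == old) pl->next = neu;
    if (pl->end == old) pl->end = neu;
  }

  return neu;  // caller stores this back, ala TT.ff->pl = grow_head(...);
}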
None of this was what I planned to do next, but with Android in feature freeze it seems like a good time to make a dive back into shell stuff...
Adverb has been barking continuously throughout this. Fade's dog is unhappy when Fade isn't here, and expresses it when he's not alone. (If he barks at the front door long enough, clearly I will bring Fade back. It's worked every day so far, after enough hours. I have headphones, but need earplugs. I have escaped the clingiest cat to visit a neurotic dog.)
I've taken a break from caffeine here at Fade's, which has resulted in some very long naps. As in more than one unexpected 8 hour nap. Not the most productive, but eh, it's a weekend...
Gentoo's "make tests" is failing on du because overlayfs lies. My first instinct was to mount a tmpfs when run as root, ala if [ $(id -u) -eq 0 ]; then mount -t tmpfs tmpfs .; cd "$PWD"; umount -l .; fi (the lazy unmount means it's still there on the current directory while we're in it, but automatically unmounts as soon as we cd out or exit the process).
Unfortunately, the results from tmpfs are very different from the ext4 I developed it on: mkdir allocates a 4k block up front on ext4 but in tmpfs directories are always size zero (because the dentry cache doesn't take up space in the page cache). And I can't convert the tests to what tmpfs produces unless I'm going to _require_ it to run under tmpfs, which you can't do as a normal user. I think I need to do:
dd if=/dev/zero of=ext2.img bs=1M count=1 status=none
mke2fs -b 4096 ext2.img
mount ext2.img .
rm ext2.img
cd "$PWD"
umount -l .
Which should get me a filesystem that behaves like the one I'm developing on. (How does "dd" manage to get "unix" so wrong? Success should be silent so your pipeline isn't full of trash, but dd needs status=none to manage that... I'm blaming IBM, they got ebcdic in there somehow. I'd use "truncate -s" but you can't loopback mount a sparse file...)
Attempting to close tabs: the gentoo locale thing should be fixable by having it try C.UTF-8 (which macos hasn't got) before en_US.UTF-8 (which gentoo hasn't got). My reading of "man 7 locale" says it should try C.utf8 in its search path (feed it the "official" name and it tries four different variants: upper and lowercase, with and without dash)... gentoo still didn't work. I tried to run it under strace to see why, but "emerge strace" doesn't work on last Sunday's LiveCD because /etc/portage/make.profile is a broken symlink. Emailed a "huh?" at Patrick Lauer...
Oh goddess, whatever Horrible Gnome Thing gentoo's livecd is using as its terminal (or is it a Horrible KDE Thing?) is FLASHING the broken symlink at me. Causing KVM to gratuitously eat CPU doing perpetual screen updates just so the display can cause ADDITIONAL EYESTRAIN manages to be counterproductive on multiple levels. (And I haven't dug into figuring out how to make the background be actually black instead of dark grey, because they decided "less contrast, that'll help".) Cleared the terminal and my CPU usage graph no longer looks like a heart monitor.
Onna plane. Heading to Minneapolis, visiting Fade until the end of the month. (Flying back on the 28th, which is as far as February goes this year.)
Haven't blogged for the past few days, felt under the weather ever since the ice storm. (It _really_ threw off my sleep schedule.) Made a few notes about "huh, I should blog about that" and then didn't. (Sigh, I should backfill but mostly the things I thought about blogging were when I wasn't in front of the computer, so said notes would be in Austin and I'm onna plane.)
What did I do: aggroed the bash maintainer into a coreutils thread. (Still subscribed because cut -DF still hasn't been merged or rejected.) The arch/sh maintainership transfer is still up in the air. Started researching mkisofs. Did NOT post my kernel patch stack to lkml yet.
On the toysh "read" builtin front, bash's behavior is subtle: read -p hello > /dev/null doesn't work because the prompt is output to stderr not stdout (just like the $ prompts). If I go read "" it exits with an error immediately (because "" is not a variable name it can assign to), but if I read potato "" it reads a line of data, splits it, assigns the first part to potato, and THEN exits with the error. I don't understand why it only checks the FIRST value for validity before reading input? (Why check it at all before reading if you're not going to check the rest...)
$ read -p % ""
bash: read: `': not a valid identifier
$ read -p % potato ""
%one two three
bash: read: `': not a valid identifier
$ echo $potato
one
$ read -p % potato ""; echo $potato
%blat
bash: read: `': not a valid identifier
blat
$
First time it doesn't even output the prompt, the third read shows it's not a syntax error (just a normal error exit). So that's good to know. I should add tests...
And then of course, after all that bashing my head against input granularity, sitting down to write "read" I'm hitting OUTPUT granularity. Namely: you can list multiple variable names on the read command line and it does IFS splitting to put a word in each variable the way it does for $1 $2 $3 etc for commands and functions... but if there are fewer variables than words it STOPS splitting early, and puts the rest of the string into the last variable, not having consumed the remainder's $IFS characters. Meaning read A B <<< "a b c" will preserve a run of multiple spaces or whatever space/tab/space combo was between "b c" when assigning to $B. Which is NOT the "split and glue back together with the first $IFS character" logic of "$*" nor the "glue back together with specifically space regardless of what IFS says" behavior I implemented SEMI_IFS for in "eval" and "case"...
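Standalone sketch of just that splitting rule (ignoring backslash escapes and trailing-whitespace trimming, with a hypothetical assign() callback standing in for variable assignment): every name but the last gets one IFS-delimited word, the last gets the raw remainder with leading separators trimmed but internal runs intact.

#include <string.h>

// Split line among n variable names the way read does: n-1 words, then
// whatever's left (unsplit) goes into the last name.
void read_split(char *line, char **name, int n, char *ifs,
                void (*assign)(char *name, char *val))
{
  int i;

  for (i = 0; i < n-1; i++) {
    char *end;

    line += strspn(line, ifs);      // skip leading separators
    end = line+strcspn(line, ifs);  // find end of this word
    if (*end) *end++ = 0;
    assign(name[i], line);
    line = end;
  }
  line += strspn(line, ifs);        // trim separators before the remainder,
  assign(name[i], line);            // but keep embedded runs verbatim
}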
The problem is, my function that does all this work is expand_arg_nobrace() which is already taking six arguments, the last two of which are usually zero. I'm reluctant to add a third "usually zero" argument, especially since the last one that's currently there is "long *measure" which seems like it could be repurposed, but what it currently does is "set it to a character to search for a bit like $IFS but this one's a hard stop where you write the offset at which you found this character into *measure and return early", which is used to reliably find the semicolons in ((math;moremath;evenmoremath)) regardless of quoting and ${thingy#$((blah))} nesting levels. Totally different from "set NO_SPLIT in flags after argument 3".
(I also hate $IFS as a concept, and spent months wrapping my head around the details of what does and doesn't become a separate argument with "" and ""$EMPTY and """$*" when there are no arguments, and how x() { echo $#;}; x """" should print 1 not 2... and looking back through this code I remember that there ARE a bunch of special cases but not WHAT they all were, which is why I made so many tests/sh.test cases for it, and I dowanna touch this forest of nested horror that laboriously jenga-style made them all work, but I have to find exactly the right place to drop in a state change with no state inappropriately crossing the change point... and I dowanna.)
Setting *measure to a negative number is uncomfortably magic.
Adding an IFS flag to change the meaning of *measure would let me avoid changing all the callers to add another zero, but it has a naming problem: the common prefix of almost all the existing flags is NO_ as in NO_SPLIT and NO_IFS to disable something expand_arg() would otherwise be doing. (Which isn't great either, but EXPAND_NO_SPLIT is too long when you're or-ing together five of them). I already violated that with SEMI_IFS and dowanna do so again or I've just got a bunch of random #defines floating around the code.
I made a quick stab at adding an expand_arg_nobrace() wrapper calling expand_arg_nobrace_raw(). After all, the original API is expand_arg(), which handles ab{c,d} processing and then passes on to expand_arg_nobrace(). But two of the calls ending in double zeroes are recursive calls within expand_arg_nobrace() itself, and I'd need to provide a function prototype (with seven complex arguments to keep in sync if anything changes) to let those two call each other, which is exactly the kind of nonsense I'm trying to avoid with the ever-widening API on this sucker as I find new corner cases.
Of course make tests breaks on gentoo, why wouldn't it?
Fixed tar yet again. Here's hoping it sticks this time.
I am now researching mkisofs implementation. (I actually made the mythical "bootable hard drive image" one of the pages said they can't find an example of, back in the yellowbox days. Took some fiddling to get the machine's BIOS to accept it, what with all the legacy hard drive types. Probably why it didn't get used as widely as "floppy image", which had a lot less variants.)
I'm amused by Hyrum's Law. (It's the API version of "with enough eyeballs all bugs are shallow". With enough users, all observable behaviors of your system become "the API" and changing it breaks somebody. That's why my spec for toysh is "what bash does" and then run a bunch of existing scripts through it to see what breaks.)
While emailing somebody I checked to see if I'm still in the first page of Google results for "patch penguin", and the answer is "no, but creepy".
The minor discomfort is Google search no longer produces a paged interface, it's one of those perpetual scroll things that loads more as you scroll down. I didn't ask for this and actively don't want it, but they wanna be fancy javascript nonsense. (If I switch off javascript for google.com will I get pages back?)
The MAJOR discomfort is I scrolled down something like a hundred entries and it's ALL ADVERTISEMENTS. Every entry is a product and the google summary gives a price in dollars at the bottom, and half of them say "in stock". And it's a special line that's a slightly different shade of grey than the other lines: Google has a "product" category in the search and is showing me almost entirely products. I don't want products. I confirmed I had NOT selected the "shopping" tab, but 2023 Google weights shopping pretty much to the exclusion of all else. I can't EXCLUDE "shopping" from my search, because they don't want me to and I'm "the product not the customer"...
(Um, since Google is apparently determined to become useless now: the Charged Vacuum Emboitment mentioned above was space technobabble the Tardis passed through in the 4th doctor episode Full Circle to wind up in "E-space" instead of the normal universe. "Emboitment" is apparently a mangled french word meaning something like "to put in a box". All TLAs have bad collisions in the modern world, and my brain tends to lock onto the one I encountered first. Mitre is as far as I can tell an NSA front organization, so I guess it's nice the US government is collecting and publishing security vulnerabilities, but I'm always confused when something I do is considered important enough to mention? But I guess I should finish the httpd Common Gateway Interface functionality.)
Wait... really? There's a toybox CVE for httpd? (Yeah I remember fixing that bug, but was it really worth a Charged Vacuum Emboitment?)
So I came up with an fpfix() function that does the fseek(ftell(fp)) thing (and should PROBABLY also do the fcntl(O_DIRECT) thing with maybe a stat() determining which is appropriate), and I inserted a call to it in both save_redirect() and unredirect() doing if (fd<3) fpfix((FILE *[]){stdin,stdout,stderr}[fd]); and then ripped it back out again because... that's not right. The extra syscalls are expensive if they'll happen a lot, so I want to make sure they happen at only the necessary places. (Yes, it's lifetime rules again. No, garbage collection wouldn't help. Which made me start wondering how rust or go intend to apply to nommu systems until I got a headache and had to walk away for a bit.)
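For reference, the fseek(ftell()) trick boils down to something like this sketch (modulo the O_DIRECT/stat() elaborations above):

#include <stdio.h>

// Seek to where we logically already are: stdio discards its readahead
// buffer and lseek()s the underlying fd back to match, so raw fd consumers
// (child processes, other builtins) don't see a gap in the input.
void fpfix(FILE *fp)
{
  long pos = ftell(fp);

  if (pos >= 0) fseek(fp, pos, SEEK_SET);
}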
I'm 95% certain we ONLY care about "fixing" stdin, because that's what uses getline(). For everything else toysh is using file descriptors, so our stdout and stderr global FILE * instances should never _get_ out of sync if we just avoid ever using them. (Is THIS why each dprintf() call on glibc does a gratuitous lseek(fd, 0, SEEK_CUR) before doing a write() of the appropriate data? It's mildly annoying that dprintf() on glibc has such noisy strace output, and you'd think that fileno() would do it too if so, but no...)
I can only think of two actual stdin consumers on toysh: get_next_line() and the "read" builtin can each eat extra data because of FILE * readahead, and then when we run child processes those can inherit a gap. So there are three cases in need of potential adjustment, but the further complication is there are two TYPES of adjustment: seekable file descriptors can get fixed up with seek after the fact, but if it's a pipe we want to set O_DIRECT preferably before the producer _writes_ data into it (because once the pipe buffer's collated we've lost the blocking information).
So toysh needs to fixup each pipe() it creates, and _maybe_ sh_main() should fixup the stdin we inherit? Hmmm, what about "read < /dev/tty"? That says we SHOULD set O_DIRECT on nonseekable save_redirect() input? (Or maybe expand_redirect() should do it when opening the redirect file? Grrr...) I really want an elegant design chokepoint everything has to go through rather than trying to whack-a-mole every entrance and exit. Three consumers of the data, two types of fixup, SHOULD be six total cases, but pipe() vs < /dev/tty isn't in that paradigm.
Ok, toysh needs to O_DIRECT incoming pipe inputs as soon as possible (so sh_main() and expand_redir()), and also set that flag on outgoing pipes at creation time before we write anything to them. The seekable kind can need to set back to the right place when we're done reading them, which does NOT belong in get_next_line() but instead should go at the start of run_line() so multiline reads get optimized (line continuations don't have to re-read the input, so scripts can load chunks), and also on the exit path of each read builtin (because we assume we're going to run at least one command on what we read).
Alright, that SEEMS to make sense...
I'm trying to read through the musl source to see what its getline() block read size is... it really looks like that's doing single byte reads too? src/stdio/getdelim.c is repeatedly calling getc_unlocked(f), and getc_unlocked.c is this strange little wrapper function doing int (getc_unlocked)(FILE *f) { return getc_unlocked(f); } which is explained by src/internal/stdio_impl.h, which has #define getc_unlocked(f) ( ((f)->rpos < (f)->rend) ? *(f)->rpos++ : __uflow((f)) ) (and thus the parentheses around (getc_unlocked) aren't some weird function pointer syntax, they keep the preprocessor from recognizing the name as the function-like macro... and then the body DOES expand to that macro. Me, I would have PUT A COMMENT THERE.) Anyway, this __uflow(f) is in src/stdio/__uflow.c (yes with two underscores on the filename) which is basically doing f->read(f, &c, 1) except... that read() function pointer takes a FILE * as its first argument, not a file descriptor. Where is the function pointer set? Well, one of them is the function __stdio_read() which... is doing crazy things with an iovec that I am NOT puzzling through right now ("len - !!f->buf_size" again needs a COMMENT) but it looks like it might be reading buf_size, whatever that is.
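For future me, the comment I'd have put on that wrapper (same two musl lines as quoted above, annotated):

// getc_unlocked() is ALSO a macro in stdio_impl.h: function-like macros
// only expand when the name is directly followed by '(', so parenthesizing
// the name below keeps the DEFINITION from expanding, while the body still
// expands to the inline fast path. Net result: an out-of-line version of
// the macro exists for anyone who takes the function's address.
#define getc_unlocked(f) ( ((f)->rpos < (f)->rend) ? *(f)->rpos++ : __uflow((f)) )
int (getc_unlocked)(FILE *f) { return getc_unlocked(f); }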
I no longer care about the numbers. (If I need to know I can run a test program under strace.) I very vaguely remember from years ago it was 512 in at least some cases? Anyway yes, it can maybe read ahead with block size big enough to reasonably amortize the system call overhead. And thus needs some serious unget to pass the file descriptor to other users. No, I am not trying to look at bionic just now, not after that.
Oh goddess fsetpos() is a stupid API, isn't it? The classic ftell() returns long which is signed 32 bits on 32 bit systems, and files are bigger than that these days, but instead of doing some sort of lftell() which returns long long (and an lfseek that accepts it) they invented a new gratuitous fpos_t type which they pretend isn't just a typedef for "long long", and then created two new libc functions with completely unrelated names: int fgetpos(FILE *fp, fpos_t *pos) and int fsetpos(FILE *fp, const fpos_t *pos), both of which are FUCKING STUPID.
WHY does fsetpos() take a POINTER to pos? If you just passed it the value, you wouldn't need to say "const" would you? Yes the get function that WRITES the value is taking a pointer, because they decided these need to return 0 or 1 to indicate error instead of returning -1 when there's an error like the previous one did (since that's not a valid file position), which is itself stupid. (The old way was smarter.) But the set function has ZERO REASON for its pos argument to be a pointer. Feed it the value, then you don't need to annotate it with "restrict" or "auto" or "static" or anything because IT IS A NORMAL ARGUMENT. (Symmetry is not an argument here, the functions DO DIFFERENT THINGS. You don't printf("%d", &i) because %n can write to i and thus needs a pointer, therefore the arguments should ALL be pointers. That would be INSANE.)
The C++ clowns who took over C development make me sad. Ken and Dennis and Doug McIlroy and Brian Kernighan were very smart. The people they handed off to... not so much. (I did NOT point out that gnu would have made rm -rf be "filesystem-modifier remove --no-prompt --recurse-into-dirs-newer-than=all --ignore-read-only", and that unix was all about individual commands that "do one thing and do it well" and connecting commands with pipes, instead of "git subcommand" or "ip subcommand" or "systemd subcommand" or...)
Simple systems survive. Increasing complexity eventually collapses under its own weight. Alas, "this too shall pass" does not usually do so on timescales I get to personally benefit from. There are a lot of "marsupial rat" versions of unix out there (including the 8 zillion posix RTOS variants) because it _works_. Linux wandering away from unix says bad things about LINUX, not about unix.
You can get a full understanding of a unix RTOS in a couple years, although xv6 sadly has the minix problem. (Ken Thompson taught his working Unix system to a generation of grad students who created BSD from it, but ivory tower academics zealously guard their abstract teaching tools from being fouled by any feedback from real world use: patches decidedly unwelcome.)
Which is odd because a complete course on something like vxworks could easily happen in high school, it's CLEVER but not that big and not that complicated, and it's a multitasking posix system with the standard bells and whistles. (NFS over USB? Out of the box, and fits comfortably in 2 megabytes...) Not remotely unique either, that one's just 36 years old and still going so it's easy to talk about. You'd think Linux would have knocked out all the proprietary unixes, but Linux is a PIG that hasn't fit comfortably in 2 megabytes RAM since the 1990s.
Yes it's entirely possible to come up with a brand new replacement paradigm, but it would have to be equally simple and elegant to persist nearly as long. Java/JavaOS tried 20 years ago (back when I taught classes in it at the local community college), but it was an uphill battle even before Sun trashed that quite thoroughly. And then oratroll happened: the other problem with Java was IP entanglements. Technology advances when patents expire, not when they're granted. Unix escaped AT&T early and laboriously purged itself of lingering corporate taint in the early 90's. Anything trying to replace unix has to reckon with late stage capitalism's relentless embrace-extend-extinguish clearcutting and strip mining. The settlers come in and find a carefully curated land with a bounty of buffalo and passenger pigeons and american chestnuts, and all of it's dead and gone within a few decades. The descendants of britain's imperial capitalism do the same thing to any resource that can't defend itself from rapacious unsustainable exploitation as they did to their own people before metastasizing into a global empire, and they are 100% convinced that ideas are property. The livejournal->myspace->twitter->mastodon cycle is about communities as property being embraced, extended, and extinguished, their members fleeing to a new territory the would-be owners haven't conquered yet. France solved this problem with guillotines.
As SCO proved, there's no money in suing modern Unix. (The Mormon activist behind the lawsuit still managed to take advantage of Novell's founder's descent into Alzheimer's to elder-abuse away all his money and use it to make the handmaid's tale a reality, eventually achieving success under the Trump administration, a misogyny the octogenarian democrats are happily complicit in sustaining to this day.)
Yes this is a cultural thing, the native americans who were here for 36,000 years before the white man came terraformed the place to be full of food you'd just reach out and pick. They modified their environment to make hunting and gathering _easy_, and were also a lot cleaner than europeans. (The ubiquitous "road dust" that medieval europeans brushed off their cloaks was powdered horse manure, which is a health hazard even with modern sanitation, and don't get me started on the cows and pigs and chickens, and it somehow managed to be even worse in the cities...) The highly contagious European settlers who came here and killed almost everyone they met (start watching this charlie brown thanksgiving episode at 18 minutes and 10 seconds, it's educational) didn't realize they were wandering through the equivalent of Kew Gardens, they thought it was wild and that nobody needed to maintain it, and smashed up enormous salmon runs and screwed up controlled burns and just made a mess of the place. Capitalism has ALWAYS been unsustainable. It's just that "expanding until you eat the whole world" was a viable strategy until quite recently, when capitalism predictably ran out of world.
This is why the GOP wants to ban "critical race theory", by the way. When even 1960's Charlie Brown episodes go "we took this land by literal genocide"... the German nazi party literally sent study teams to america in the 1930s to learn how to codify racism in law and get away with mass murder, in response to which president Roosevelt put Japanese americans into american concentration camps, which they could only escape by joining the army to fight in the war. Today we call "plantation owners" billionaires. Might want to maintain some awareness of this general cultural context.
Darn it, fseek() is underspecified. If I lseek() on a file descriptor I know what happens, and what error conditions to check for if the fd isn't seekable. But if I fseek() back a few bytes, is it doing an lseek() on the underlying file descriptor or just adjusting the buffer in the FILE * object? If I fseek() on something that isn't seekable does it cause a problem for future reads?
I just fixed head.c, but toysh's read builtin also needs to put back extra data it read for the corresponding test to work right, and lseek(fileno(FILE)) would leave the FILE * readahead buffer with leftover trash in it, so in THEORY I want to do fseek() but in practice I dunno how much I can trust it? (More debris from the C specification people pretending file descriptors don't exist so they don't need to interact with them, and posix refusing to go far enough specifying the interaction.) Honestly, "fseek() shall fail... IF the call to fseek() causes an underlying lseek() and [error happens]" means calling fseek() is by no means guaranteed to cause an actual lseek() updating system status. (Grr, do an fseek() AND an lseek(fileno(FILE)) maybe? I'm not convinced this is BETTER than just doing single byte reads of the input so we never get ahead...)
Sigh, time to read multiple libc implementations...
Ok, from musl and bionic it LOOKS like fseek() is generally implemented as a wrapper around lseek that flushes and drops the FILE * internal buffer data when the seek works, and the ambivalence about whether or not it actually does that is because fmemopen() and friends exist, so some FILE * objects AREN'T a wrapper around a file descriptor. And those are weird, but I don't have to care about them here.
Ha! If I feed the O_DIRECT flag to pipe(2) then in THEORY that prevents multiple writes from being collated in the pipe buffer, meaning "while true; do echo $((++x)); done | while read i; do echo $i; done" shouldn't skip any numbers even if it creates and destroys a separate FILE * each time through. (Which it still shouldn't for stdin/out/err, but I need to throw in whatever the read equivalent of a fflush() is each time we redirect stdin.)
Hmmm. There's a gratuitous artificial limitation on fcntl(F_GETFD/F_SETFD) which ONLY lets it change FD_CLOEXEC and NOTHING ELSE. Why even have the API then?
Wow, glibc is truly craptacular. If I go over to my freebsd-13 image and include unistd.h and fcntl.h and do pipe2(fds, O_DIRECT); it works fine. And it works fine built with musl-libc too. In bionic, they have O_DIRECT but not pipe2 because their unistd.h has an inexplicable #ifdef IA_IA_STALLMAN_FTAGH around the prototype. (And I still haven't figured out how to #ifdef for the presence of a function prototype.) But if I do that on glibc it complains about pipe2 _and_ O_DIRECT both failing to be exported from the header files I included without #defining about how RMS sleeps in R'lyeh. Guys: pipe2() was introduced in 2008 and O_DIRECT has been in Linux for more than 20 years (and grew its pipe2 meaning in Linux 3.4, released May 2012), it is a Linux system call, not a gnu thing.
Linux is not and never has been part of the gnu project, and RMS explicitly objected to the existence of Linux before he switched to trying to take credit for it, and yes his explanation at that link is a big lie because Linux forked off minix not gnu, which is why the early development was all done on comp.os.minix, and he had a famous design argument with Minix's creator (when said professor returned from summer break) who kicked him off minix's usenet newsgroup and made him start his own mailing list. I collected some interesting posts from the first couple years on my history mirror: note the COMPLETE lack of Stallman or FSF participation in any of it, and if you boot 0.01 under an emulator, the userspace ain't gnu either. Stallman was 100% talking out of his ass: Linux was inspired by (and developed under) Minix with the help of printed SunOS manuals in Torvalds' university library, and it incorporated a bunch of the BSD work going on at the time. The gnu project was one of MANY unix clones happening in the wake of the 1983 Apple vs Franklin decision extending copyright to cover binaries and inspiring AT&T to try to close and commercialize Unix after 15 years of de facto open source development (and the FIRST full Unix clone shipped in 1980). By the time Linux happened, the GNU "project" had been spinning its wheels for eight years. When Linus's 1991 announcement said it WOULDN'T be like gnu, he was MOCKING WIDELY KNOWN VAPORWARE, like a game developer referencing Duke Nukem Forever or Daikatana.
Anyway, the point is the glibc developers have had PLENTY OF TIME to get these symbols into the darn userspace headers, and the only reason they haven't is the same reason Stallman tries to take credit for Linux, which has led to bad blood in both directions. (Stallman also tries to take credit for the existence of FreeBSD, but they just point and laugh at him. He had nothing to do with Wikipedia or project gutenberg either. The term "Freeware" was invented by Andrew Fluegelman years before Stallman's GNU announcement. Magazines like Compute's Gazette had BASIC listings in the back every month dating back to the 1970s. Dude can shut up and sit down already, that sexist privileged white male Boomer has Elon Musk levels of taking credit for other people's work going on, and needs to just stop.)
Aha! There's a SECOND fcntl(F_GETFL/F_SETFL) API which CAN toggle O_DIRECT. That's just _sad_, but sure. Assuming I can reliably beat a definition of O_DIRECT out of the headers, which I can't really #ifdef/#define myself because it varies by architecture. But I can get that from everything except glibc, and maybe I just don't care about it working with glibc? There's only so persistently stupid you get to be before I leave you behind. Define it to zero when glibc's broken headers don't provide it and let the call drop out: you get unreliable behavior due to a libc bug. I will not, ever, define stallman because my code is not part of the gnu project. One of its many goals is to provide an antidote to gnu.
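So the plan looks something like this sketch (error handling elided; when the header didn't provide O_DIRECT the zero define makes the fcntl() a no-op and pipes quietly stay in collating mode):

#include <fcntl.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0  // broken headers: flag drops out, behavior degrades
#endif

// Create a pipe in packet mode (Linux 3.4+) so each write() stays a
// discrete read() chunk, without needing glibc to admit pipe2() exists.
int packet_pipe(int fd[2])
{
  if (pipe(fd)) return -1;
  fcntl(fd[0], F_SETFL, fcntl(fd[0], F_GETFL)|O_DIRECT);
  fcntl(fd[1], F_SETFL, fcntl(fd[1], F_GETFL)|O_DIRECT);

  return 0;
}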
Huh, it's surprisingly easy to get derailed into half an hour of closing tabs. Something like a hundred accumulated open terminal windows in desktop 7 (email) which are mostly just "type exit, hit enter" in each one because it's some man page I was looking at or command line tests I can confirm I finished with (or "pulldown->move to another workspace" and send off to desktop 2 (toybox) or 6 (linux/qemu/mkroot, and my kvm instance running freebsd hangs out there too), a bunch of "last thing here was pushing to git" or git show $HASH, or running some simple command like "pkill -f renderer" or df /mnt (shows me what if anything is currently mounted on it) or doing math with $((123*456)), or grepping for a symbol in /usr/include or the output of something like "aptitude search thingy" (an apt-get wrapper with better syntax) where I recognize and can discard the results but switched away from that window once I had my answer. When vi is editing a file, exiting out and doing a git diff shows me whether I was browsing or actually made changes.
And lots and LOTS of "vi was editing a file and then got killed", because when you fire up vim on a file that's already being edited, it tells you the PID of the old vim instance but doesn't have an obvious way to just kill the old one and let you inherit the editing session. Instead you have to "kill PID" manually if it's still running (or search around to try to find the tab, but good luck with that), then :recover and if the file's changed write it out under a new name to see if the changes are interesting, then rm the temp file and the .file.swp and THEN you can go back and edit it normally. Wheee... If I'm feeling posh I can even go collate windows that got moved to the proper desktops (you can not only drag and reorganize tabs within a window, on xfce you can drag and drop them between terminal windows. If you haven't got a tab, open a new tab to force the tab bar to show up, then exit the new tab when it's the last one in the window.)
Heh, here's the directory where I was re-ripping some CDs (usb DVD drive still works, cdparanoia still works, most of the CDs are still in the right cases) and hitting them with flac to scp up to my website so I could download them to my phone. (Long ago I had big youtube music playlists, but youtube became 100% useless without paying. Not just two ads between each song, but interrupting longer songs in the middle to play ads. Digging out old CDs and mp3 collections it is...) Pretty sure I can rm *.wav in there, I could zap the .flac files too but eh, I'm not short of space just now. (2 terabyte ssd covers a multitude of sins. Or at least allows them to quietly accumulate.)
Here's the window I downloaded and filed my twitter archives in (both for my original account, which I then deleted, and the backup account Fade made me during all those years I refused to give @jack my phone number, which I still have but haven't posted to even once since making that archive, because downloading a fresh archive wants to do 2FA through Fade's phone in Minneapolis which is just not worth it). (I check a couple individual feeds there about as often as I remember to check Charles Stross' blog or Seanan Mcguire's Tumblr. I don't have an account on either site...)
That's the EASY part of tidying one's desktop, of course. Browser tabs have gone beyond the timesink event horizon. Chrome remembering them between restarts is both a blessing and a curse, but at least "pkill -f renderer" keeps the memory usage down to a dull roar. It would be nice if it could save each inactive tab to a zip file under .chrome somewhere so that tab didn't have to reload from the website as if freshly opened whenever I get back to it, but hey. I've learned I basically never look at bookmarks again, and I _do_ periodically revisit and finish/cull old browser tabs. Not as fast as they accumulate, but still...
The ice storm has REALLY screwed up my sleep schedule. Woozy. (Couldn't work, couldn't go out, the lights were off all day, and it was stressful.) My internal clock is flashing 12, doing the whole "Too tired to focus but lying down does not result in sleep" thing...
It's hard for me to get worked up about "yoda conditions" when it's THE SAME COMPARISON. 1 == x and x == 1 are equivalent, but the one on the left can't be mistaken for/typoed into an assignment. "Correcting" everything to the one on the right because it's not "mentally comfortable" is something I'm having trouble sympathizing with? (My mental arithmetic apparently does not have "handedness". This is a thing the language has always allowed you to do, and there is a reason to do it and zero reason to do the other one. Arguing "but it's not a _strong_ reason to do it" vs having literally zero reason other than aesthetic preference... Sigh.)
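For anyone who hasn't had the typo bite them, a minimal illustration (variable name invented; the middle line is the one the compiler rejects, which is the whole argument):

  int x = 0;

  if (x = 1) printf("oops");  // typo for ==: compiles (a warning at best), assigns, always true
  if (1 = x) printf("oops");  // same typo: hard compile error, can't assign to a constant
  if (1 == x) printf("ok");   // the "yoda" order, identical semantics to x == 1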
Darn it, my clever "while read" combo hack in toysh has a problem.
So getline is a glacial CPU-eating slog without caching, and FILE * is the caching layer ANSI C decided to provide back in the day and (eventually) implement getline() on top of, and if you're just reading from stdin then the "read" builtin can use the stdin global constant (as get_next_line() is currently doing), and my THEORY was that for anything else (either read -u or read < source) I could fdopen() a FILE * object and cache it in the struct sh_blockstack instance for the enclosing loop (adding a field to the control flow data), and thus not lose cached readahead buffer data by destroying and recreating the FILE * wrapper each time the read command ran and exited.
BUT: read -u $VARIABLE is not guaranteed to be the SAME filehandle each time through the loop. I guess I can call fileno() on the FILE * and compare the fd we're trying to operate on, and tear down the old one and replace it when they change it?
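Something like this sketch, maybe (not actual toysh code, the "fcache" field name is invented):

  // tear down the cached wrapper when the fd changed out from under us
  FILE *ff = block->fcache;

  if (ff && fileno(ff) != fd) {
    // note: fclose() also closes the underlying fd, which the script may
    // still want open, so real teardown needs a dup() or similar care
    fclose(ff);
    ff = block->fcache = 0;
  }
  if (!ff) ff = block->fcache = fdopen(fd, "r");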
while read -u 37 i; do for x in {1..10}; do read -u 37 j k l; echo $i $j $k $l; done; done
I can come up with a bunch of test cases I don't care about OPTIMIZING, but I'd prefer they didn't actively break. (But why would anyone do that? "for i in a b c d; do read a b c < $i; do_stuff; done" could happen. Hmmm, but then it's doing an open/close on the file object in the read context, so caching the FILE * object in the flow control would be wrong. Grrr. Lifetime rules!)
Hmmm... alright, there are two cases here: read from a tty and read from a file. In the tty case, the input (should) come in chunked so the block reads are short and shouldn't readahead much anyway. (If you've ever typed stuff before a shell was ready and the input got lost... that. Password prompts are notorious for it, but it happens elsewhere.)
The other case is "while read... < file.txt" where it will very much read all the way ahead, and if you ever discard extra buffer you deterministically lose bits of the file. Which says (oh goddess) I need a reference counted cache of FILE * wrappers for file descriptors >2 (stdin, stdout, stderr have persistent globals) but bump the reference increment/decrement to the enclosing loop block object (if any), which STILL won't work with "while read x; do command that also reads input $x; done < file.txt" because the FILE * will read ahead and then pass the filehandle to the command which starts reading after whatever the FILE * ate.
$ while read i; do echo =$i; head -n 1; done <<< $'one\ntwo\nthree\nfour\nfive'
=one
two
=three
four
=five
How. HOW? Is it doing single byte reads from input?
$ echo -e 'one\ntwo\nthree\nfour\nfive' | while read i; do echo =$i; head -n 1; done
=one
two
Ah. It gets it right when the input is seekable. Of course.
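(My assumption about the mechanism on seekable input: read a block, then seek backwards to just past the newline you consumed, so the unread data is still there for whoever reads next. A sketch, not bash's actual source:

  // needs unistd.h and string.h
  ssize_t read_line_seekable(int fd, char *buf, size_t max)
  {
    ssize_t len = read(fd, buf, max);
    char *nl = (len > 0) ? memchr(buf, '\n', len) : 0;

    // seekable input: hand back everything after the newline we consumed
    if (nl && lseek(fd, (nl+1-buf)-len, SEEK_CUR) != -1) len = nl+1-buf;
    // a pipe fails with ESPIPE, so it's one-byte-at-a-time reads or
    // eating data the next reader needed

    return len;
  }

Which would explain the pipe behavior above: no seek, so somebody ate the rest.)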
$ while read i; do echo =$i; toybox head -n 1; done <<< $'one\ntwo\nthree\nfour\nfive'
=one
two
And it's at least partly "head" doing extra work, and toybox is getting it wrong. (New test!)
RIGHT.
And this says that FILE * is generically borked in the presence of fork/exec _anyway_, because the inheritor of our fd 0 won't see the data read ahead into the FILE *stdin buffer. I'm more familiar with this problem as it relates to stdout flushing, because glibc's gotten that very wrong before, and that was just trying to make flush on exit() reliable, let alone exec without exit.
The two big problems in computer science REMAIN naming things, cache invalidation, and off by one errors.
For my birthday, an ice storm knocked out the power from before I woke up in the morning until sometime after 10pm. I had some battery in my laptop, but didn't fire it up because if it drains all the way I lose all my open windows, and with more freezing rain predicted tonight I didn't know if power would be restored before thursday. (Plus our geriatric cat's heating pad was off, so sat on me for hours instead.)
Luckily it got just cold enough to sleet instead of more freezing rain. None of the trees that could have collapsed on my house did, although two on the block dropped some quite big chunks, and one such tree has drooped significantly and is resting half its branches on our roof, but in a bend-not-break sort of way. (One around the corner has bent basically in half and is resting its branches on the _ground_, which I find impressive. Pecans are survivors.)
So yeah, not a productive day, but way better than it could have been. No flood damage, no hurricane scouring the paint off a corner of the house...
Sigh. The very nice glasses I got in Japan shortly before the pandemic are finally wearing out. The lenses were outright scratchproof for a good three years, but the coating's weathered enough they're starting to scratch. They've been WAY more durable than anything I got from Zenni, and I dunno whether Zenni's still functional at all now that the whole "outsource to china" strategy has collided with china's covid lockdowns, the container pileup, and now wolf warrior diplomacy and reshoring? (I didn't get my prescription checked in Japan and instead handed them an old pair of glasses to copy the prescription from, and I've passed them off as "reading glasses" ever since. That was intentional: I'm not driving so I care more about reading up close for long periods, and glasses that focus more naturally at that length cause less eyestrain.
I _have_ newer/stronger glasses somewhere, but about 5 years ago I worked out that my eyes are adjusting to my normal usage patterns (staring at up-close things for hours at a time), and the whole reason my vision sucks is years of a correct-and-adapt cycle I probably could have just avoided if I hadn't been reading comic books all morning before the school eye test back on Kwaj. I'd never needed glasses before, but the roofline was a touch blurry... because my eyes took a couple hours to swing back to looking at far away stuff. I'm a lot older so it takes my eyes a lot longer to move their overton window, but even today it still happens: if I stop wearing glasses for 8 hours or so, far away things are WAY sharper when I finally do put them back on. I just... hardly ever do that? No phone, no lights, no motorcars, not a single luxury... Sometimes I take them off on long walks to the table while listening to podcasts, but that's about it.)
Honestly, WHY does qemu keep gratuitously changing its user interfaces? Once again the old one was simple and straightforward, the new one is insane, and removing the old simple API serves no obvious purpose. They broke tcp forwarding, they broke -hda, they broke -bootp... Stoppit.
It occurs to me I can test the lib/passwd.c rewrite under a debootstrap chroot instead of waiting for mkroot, because it's just twiddling files rather than poking at syscalls or /proc the way route and insmod do to actually change the host kernel's system state.
In theory, it's "debootstrap beowulf beowulf" (for devuan anyway) and then when that's finished copy a stripped down version of mkroot's "init" script in there and sudo env -i USER=root TERM=linux SHELL=/bin/bash LANG=$LANG PATH=/bin:/sbin:/usr/bin:/usr/sbin unshare -Cimnpuf chroot beowulf /init and... in PRACTICE it's being stroppy. I dealt with this for Jeff some months back, but apparently didn't blog about it enough, and can't find my notes? Hmmm... I remember tracking down a weird bug involving accidentally running the Defective Annoying SHell instead of bash, hence the SHELL= export there, and that's the kind of thing I WOULD have blogged about, but no?
I might have tweeted about it, in which case it's lost to history because of the muskrat's midlife crisis. (For his quarter life crisis he bought a company that makes shiny red sports cars. The bald Amazon billionaire bought a newspaper, the south african emerald brat tried to pretend he wasn't copying him by instead buying the latest iteration of aol/livejournal/myspace. Because SpaceX clearly isn't in a dick measuring contest with Blue Origin. A company named after the X-prize, which he lost -- Paul Allen sponsored Burt Rutan to win -- is clearly NOT about competition and ego, it's an entirely original thing that emerged fully formed from his very large brain, which is in no way a cry for help.)
Alright, FIX WHAT'S THERE in dirtree. BREADTH traversal means dirtree_recurse() needs to iterate through the child list of stored entries (if any), which calls handle_callback() which frees the node when the callback didn't return DIRTREE_SAVE. The problem is, we're recursing through that list and free(node) doesn't remove it from the list. We're only told AFTERWARDS whether or not it saved it (did handle_callback return a pointer or NULL). So I need to fetch the next entry _before_ calling handle_callback so we can iterate without read-after-free list traversal, but I need to update and advance the saved-node pointer _after_ calling handle_callback, making sure it always points to valid memory.
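Shaped something like this, I think (a sketch; relies on dirtree_handle_callback() returning NULL when it freed the node):

  // grab ->next before the callback can free the node, and only advance
  // the link pointer afterward if the node survived
  struct dirtree *child = node->child, *next, **link = &node->child;

  while (child) {
    next = child->next;
    if (dirtree_handle_callback(child, callback)) link = &child->next;
    else *link = next;  // freed: unlink it so the list never dangles
    child = next;
  }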
Dear C++ developers who have hijacked gcc development:
In file included from ./toys.h:69,
                 from lib/dirtree.c:6:
lib/dirtree.c: In function 'dirtree_recurse':
./lib/lib.h:71:35: error: label 'done' used but not defined
 #define DIRTREE_ABORTVAL ((struct dirtree *)1)
                                   ^~~~~~~
lib/dirtree.c:174:21: note: in expansion of macro 'DIRTREE_ABORTVAL'
   else if (new == DIRTREE_ABORTVAL) goto done;
                     ^~~~~~~~~~~~~~~~
lib/dirtree.c:154:18: warning: unused variable 'entry' [-Wunused-variable]
   struct dirent *entry;
Bravo on the warning and error message generation. Exactly what I would expect from people who think C++ is a good idea. (And yes, that is a single processor build with no output interleaving. I double-checked. And yes, those were the first output messages before it had a chance to get itself good and confused, which it did and complained just as uselessly for quite a while after that. For the record, I had an extra } on line 177, a few lines AFTER all that nonsense. The compiler was no help whatsoever in finding it.)
Ok, got sort checked in. It uses -s as its short option which is a bit questionable (as far as I can tell the gnu/dammit one has -s produce the behavior it was already _doing_ for extract and throws an error if you try to use it with create: bravo guys), and my --sort can take an optional =thingy argument for compatibility but only implements sort by name. (Again, there's no "rm -r --switch-off-r" so --sort=none seems useless, and --sort=inode is a micro-optimization for 1980s vax systems without disk cache? It claims a performance improvement but extract ain't gonna care (it's not USING the old inodes) and create has to read all the directory entries in order and then do a second pass to open them when it sorts ANYTHING, and then using inode number as a proxy for disk layout is optimizing seek time on uncached spinning disks which is also assuming they're regularly defragmented in a way that doesn't get the file locations out of sync with the inodes AND which assumes the disk was basically empty when all the files were created so the on-disk file locations correspond to the inode numbers, AND assumes a filesystem that's allocating inodes sequentially instead of using them as hash values... seriously, this was a marginal idea in 1989, trying to do it on a VM using virtfs to talk to a host storing data in btrfs is just NONSENSE.)
The request was just for generating stable tarballs. I'm a little "eh" about mine vs gnu/dammit producing different output because I'm using strcmp() and the FSF loons are probably listening to the locale information and doing the same "upper case sorts mixed in with lowercase" nonsense that forces everybody to go LC_ALL=C before calling 'sort' out of the host path, but I can't control that and "stable output produced with the same tool" is presumably the goal here.
Yes, the test I added for --sort is not using "professional" names. No, I'm not cleaning it up to look presentable. Possibly I should have left sed as it was and let the culture catch back up...
Grrr, the design of dirtree.c isn't right. And I've known it isn't right, but it's hard to GET right. There are THREE interlocking functions (dirtree_add_node(), dirtree_recurse(), and dirtree_handle_callback()), plus a fourth wrapper function dirtree_read() you generally start out by calling, and that's way too complicated.
The job of dirtree_add_node() is to stat a directory entry and populate a struct dirtree instance from it, which is fine. That's good granularity. That's the only one of the lot that ISN'T crazy, although possibly that assumption is what needs to change to let me fix everything...
When each dirtree instance gets created a callback function can happen, with behavior that happens in response to that callback's return code. That's what dirtree_handle_callback() does: you feed it a dirtree instance and the callback function, and it calls one on the other and responds to its return code. Possibly dirtree_add_node() could just take the callback as another argument... except what I was trying to avoid was recursing into subdirectories causing the function to recurse too. I don't want NOMMU systems with tiny unexpandable stacks to have unnecessarily limited directory traversal depth. Although I don't think I've got that right NOW either, so...
The dirtree_recurse() function handles recursion into subdirectories. Badly. Right now it opens a filehandle at each level to use the openat() family of functions, meaning directory traversal depth is limited by number of filehandles a process can open simultaneously. Instead I need to traverse ".." from the directory I'm in to get back to the parent directory, and then compare the saved dev/ino pair in the cached stat structure to see if that's the same node, and if not traverse back down from the top again. (And if THAT doesn't work, prune the traversal. That's "mv a subdir while archiving" levels of Don't Do That. SECDED memory falls back to DETECTING an error it can't correct, quite possibly this is xexit() time.)
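Roughly this shape (a sketch; assumes the parent node's cached stat is still around):

  // climb back up without keeping a filehandle per level: reopen ".."
  // and check it's really the directory we descended from
  int up = openat(dirfd, "..", O_RDONLY|O_DIRECTORY);
  struct stat st;

  if (up == -1 || fstat(up, &st) || st.st_dev != parent->st.st_dev
      || st.st_ino != parent->st.st_ino)
  {
    // directory moved mid-traversal: re-walk down from the top, and if
    // that can't find it either, prune (or xexit() if feeling strict)
  }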
The linked list of dirtree structures is less of a problem than the recursion stack depth because a linked list doesn't have to be contiguous, you can fragment that allocation all you want.
Sigh, the real outlier here is ls.c. Everything else just calls dirtree_flagread() and gets callbacks, but ls micromanages the traversal because it had weird sequencing requirements. So I need to refamiliarize myself with the ls weirdness to make sure a new cleaner dirtree implementation could provide the callbacks it needs (quite possibly it _is_ the new DIRTREE_BREADTH semantics) so I can stop exporting dirtree_recurse().
Grrr, but Elliott pinged me about a new android code freeze and I wanna get him --sort before that goes in. I should debug what's THERE instead of redesigning it, but it's REALLY hard to get the object lifetimes right with multiple functions passing stuff off between them in a loop like it is now.
I think I need two functions: dirtree_add_node() and dirtree_read() that does all the callback handling by non-recursively traversing the tree (adding/removing nodes as it goes if/when the callback says to). Hmmm, but what would the arguments be? There isn't a global "tree" object that can hold things like "flags", and I want to be able to traverse on a path _or_ under an existing struct dirtree *node... Maybe dirtree_read(char *path, int flags, function *callback) which is a wrapper for dirtree_traverse(dirtree_add_node(char *name, int flags), int flags, function *callback)... except the reason dirtree_add_node() needs the parent pointer is for parentfd due to the openat() stuff, that's why the caller can't just set it after it returns. Right...
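So maybe (hypothetical signatures, nothing like this is checked in yet):

  struct dirtree *dirtree_add_node(struct dirtree *parent, char *name,
    int flags);
  struct dirtree *dirtree_traverse(struct dirtree *root, int flags,
    int (*callback)(struct dirtree *node));

  // and then the convenience wrapper is roughly:
  struct dirtree *dirtree_read(char *path, int flags,
    int (*callback)(struct dirtree *node))
  {
    return dirtree_traverse(dirtree_add_node(0, path, flags), flags,
      callback);
  }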
Fiddly. Hmmm...
When I'm done, all this plumbing SHOULD look so simple that it's all obvious and trivial and seems like I didn't do anything. Getting there is usually a flaming pain, and a lot of the time I DON'T manage it and have to ship something overcomplicated, which says to ME that I'm not very good at this. Alas, the reason I _don't_ have impostor syndrome is the rest of the industry turns out, on average, to be even worse at it than me.
Trying to debug tar --sort and it's being stroppy. I'm not sure I've got the design right, which is sad for something so seemingly simple?
Sort of regretting having implemented --no-ignore-case. It's the default, just don't specify it when you don't mean it? I didn't have sort check it, and am going "eh...". (The extra code to check it is bad. Having it and NOT checking it here is bad. Grrr. NOT PICKING AT IT. I haven't figured out how to make lib/args.c gracefully handle this category and I'm trying NOT to go down a rathole of spending 3 days on the design of something relatively unimportant. Not a fan of ---longopts at the best of times, and having extra options to put the behavior BACK to the default is worse: rm -r does not have a turn-off-r-again option because it DOES NOT NEED TO.)
The gnu/dammit clowns are BAD AT UNIX, Stallman only cloned unix after ITS died because his community had collapsed under him and he wanted to hijack an existing userbase, he hated and fought unix until he was forced by circumstance to join, and was an outsider who never properly understood WHY it worked.
The old history writeup I did on this years ago didn't even MENTION Digital Equipment Corporation's Project Jupiter, which was the proposed successor to their 36-bit mainframes (the PDP-6 and PDP-10, which packed six 6-bit characters into each word). The Jupiter prototype system was used to render part of the graphics in the 1982 disney movie Tron, but DEC pulled the plug on development in April 1983, and THAT's what caused Stallman to give up on ITS and start over cloning Unix. He'd backed the wrong horse, the hardware platform he'd inherited (after everybody else who worked on it graduated and moved on with their lives, he stuck around as a perpetual college student) died out from under it, and NOBODY ELSE CARED. He was forced to move because the university was going to unplug the old hardware and throw it away. This wasn't a decision, this was a forced REACTION. RMS was always a conservative reactionary working to prevent change, who took the smallest steps possible each time the legacy position he defended became untenable. As with all ultra-conservatives, he mistakes this for "visionary thinking" and talks himself up, but it's the same "looking back to a largely imaginary golden age" you see so much of from any other privileged old fogey complaining about kids these days.
Stallman couldn't even predict the obvious near future: 6 bit systems inevitably lost to 8 bit systems as memory got cheaper because the whole POINT had been that you could fit a third more text in a given amount of memory using 6 bits per symbol instead of 8... with glaringly obvious limitations. With only 64 combinations you just couldn't fit everything: 26 upper case characters, 26 lower case characters, and 10 digits left only TWO symbols for space and newline -- you couldn't even end sentences with a period. If you wanted ANY punctuation, you had to sacrifice digits, or make everything uppercase, and different compromises meant incompatible encodings.
The first 7 bit ASCII standard was published in 1963. With twice as many symbols there was no need to compromise -- after upper, lower, and digits half the space was still available for punctuation and control characters -- so every 8-bit system could use a compatible encoding for all documents. Gordon Moore's article describing Moore's Law was published in 1965, predicting exponential increases in memory availability for the foreseeable future. Clinging to a 6-bit system almost 20 years later (after all his classmates had already abandoned it) was head-in-the-sand levels of stubbornness on Stallman's part.
DEC had introduced its first system with 8-bit bytes (the 16-bit PDP-11) in 1970, 13 years before canceling Jupiter, and its 32-bit successor the VAX came out in 1977. In DEC's entire history it only ever sold about 700 of its 36-bit PDP-10 mainframe systems. DEC sold almost a _thousand_ times as many PDP-11s, and shipped a dual-processor VAX the year before canceling Jupiter.
Stallman is the exact opposite of "visionary". He's just another classically educated white male with decades of practice retroactively justifying what he's already decided to do by constructing a convincing shell of logic around his emotional motivations, and it is just as exhausting dealing with his fanboys as it is dealing with the fanboys of muskrat or jordache peterman or the ex-Resident or any of the others.
Jeff's flying back to Japan. I am jealous. But Fade made a flight reservation for me to visit her from Feb 10 to 22, so that's nice. (Her dorm apartment thingy still has the second room empty and locked, so it doesn't bother anybody if I stay more than a couple days.)
Last year I ordered a cthulamp for the desk in the bedroom (one of them "five positionable metal tentacles with a lampshade at the end of each" deals), but couldn't figure out how to assemble it properly and then wound up flying off to Fade's and finishing the contract from there. Took another stab at assembling it today and figured out what I got wrong this time (the little plastic not-washer thing with the raised inner bit was both on the wrong side of the shade AND rotated 180 degrees, so it fit perfectly but then the light bulb didn't), and WOW that desk is a nicer workspace with 5 more LED bulbs right next to it.
Finished and checked in --wildcards. Needs more tests in the test suite, but it didn't cause obvious regressions and should be enough to unblock the android kernel guys?
Implementing tar --sort next.
I tried Chloe Ting's "5 minute warmup" video.
Made it to the end this time.
Everything hurts.
(It wasn't even one of her proper EXERCISE videos. I did the WARMUP and am still in pain an hour later. It turns out slowly walking 4 miles a night 3 or 4 times a week does not exercise a wide variety of muscle groups.)
Elliott emailed me asking for a bug report if I could reproduce the adb compatibility issue, because he says the policy is the developer kit should be backwards compatible all the way back to kit kat, including ADB working. I apologized and acknowledged it's been a while since I've tried the distro version of ADB. (For file transfer I scp files to my webserver so my phone can download them, and attach stuff to myself in slack going the other way. I installed an ssh app on my phone but haven't bothered to use it in forever.)
Back when I was running Devuan Ascii, _many_ things out of the repo didn't work (llvm was too old for the packages I was trying to build, ninja was too old, I finally upgraded to Beowulf because building qemu from source demanded a newer version of python 3...) The adb in Ascii having been broken probably wasn't surprising. I got in the habit of downloading a new version of the android tools rather than trying the distro version, and haven't checked if I still NEED to in a while...
My current phone's a Pixel 3a that end-of-lifed on Android 12 (the system->update menu has a big "regular updates have ended for this device" banner, with the last one 10 months ago), so isn't exactly a moving target anymore anyway. (Yeah, I should upgrade my laptop to Devuan Chimaera, but nothing major's broken yet that I've noticed?)
At a guess, debian breaking adb is like debian breaking qemu: I always build that from source because debian's distro version never works. Even when the theoretically exact same release built from source via "./configure; make; make install" works fine.
Alright, where did I leave off with wildcards: --{no-,}wildcards{-match-slash,} --{no-,}anchored --{no-,}ignore-case, and this is why I got so distracted by trying to automate the no- prefix in the plumbing. Right, just explicitly spell out all 8 flags for now and clean it up later. What are the USERS: inclusion vs exclusion, creation vs extraction, command line arguments vs recursively encountered arguments: that's 8 combinations. No, 16 with and without case sensitivity. (This is assuming extract and test behave the same.) Each of those can have wildcards default to enabled or disabled: case sensitivity is the global default, exclusion defaults to wildcards no-anchored match-slash. Not everything can be enabled in every position, for example --wildcards does not affect command line arguments when creating an archive. (That's one of the tests I wrote back in October.)
I'm also annoyed at --show-transformed-names and --show-stored-names because it should just pick one. I'm also reminded that --verbatim-files-from exists and I think that's what I'm doing already? (Need to compare with busybox...)
Sigh, it's so easy to find -K and -N and go "I could implement that" but nobody's ASKED for it and if you go down that road even ignoring crap like -n (not implementing multiple codepaths to do the same thing, thanks) and --sparse-version there's gratuitous complication like --owner-map (not the same as --group-map) and the $TAPE environment variable and twelve --exclude variants that really could be done via "find" ("find -print0 | xargs -0" covers a multitude of sins, fairly portably) and then just nuts stuff like --hard-dereference that... what's the alternative? Linux doesn't let you hardlink directories, and a file with more than one hardlink is A FILE. Would --ignore-command-error apply to the compressor or just programmatic output streams?
Busybox NOT implementing stuff for a long time is a useful data point: they got a couple decades of people poking them and going "I need this". If it didn't happen (strongly enough for them to react), that's informative.
Except I got asked (on github somewhere) to support appending: -r and -u and maybe -A? (Which appends an existing archive, which you don't need tar for...? I mean, it cuts off the trailing NUL blocks I guess. There's an -i option which... I don't know why that always being on would be a bad thing? Probably some historical reason...)
The existence of "lzip", "lzop", and "lzma" makes me tired. None of which are "xz". (It's like being back in the days of arj and zoo.)
Ahem: ok, back up to the motivating use case: tar --directory={intermediates_dir} --wildcards --xform='s#^.+/##x' -xf {base_modules_archive} '*.ko'
Oh yes, and with gnu/dammit tar --wildcards affects patterns AFTER it but not before it in the command line. Sequencing! Right.
Ok, wildcards can be switched on for extract but NOT for create, because creation isn't doing a directory search but is opening (and recursing into) specific command line thingies, so there's no comparison being done: there's no readdir() in that codepath, the open(argv[x]) either succeeds or fails. Comparisons are done for creation exclusion (while recursing?), extraction inclusion, and extraction exclusion... which corresponds to toybox tar's 3 existing calls to filter(), with add_to_tar() calling filter(TT.excl), and then unpack_tar() doing both filter(TT.incl) and then filter(TT.excl). Both TT.excl calls should default to --no-anchored --wildcards-match-slash but the TT.incl call shouldn't (but currently does because I only implemented one filter behavior). The man page implies incl should default to --anchored --no-wildcards --no-wildcards-match-slash...
Sigh, I can just compare my argument with the global variable to distinguish the two cases, and pick the matching defaults. It's ugly, but having the caller (redundantly!) specify the defaults is also ugly, and so is having an extra argument to distinguish the modes when I can just test for it... Wanna get this finished and move on to the next thing.
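(For concreteness, the identity-compare trick would be something like this; wildmatch() and the WILD_* flags are stand-ins, not real toybox plumbing:

  static int filter(struct arg_list *lst, char *name)
  {
    // exclusion lists default to --no-anchored --wildcards-match-slash,
    // chosen by which global list we were handed instead of an extra arg
    int defaults = (lst == TT.excl) ? WILD_NOANCHOR|WILD_MATCHSLASH : 0;

    for (; lst; lst = lst->next)
      if (wildmatch(lst->arg, name, defaults)) return 1;

    return 0;
  }

Same function, both callers, no redundant argument.)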
It's been a while since I've had a significant visual migraine.
The experience is not raising any positive nostalgia.
Not a productive evening.
Checked in [LINK] the probably correct but not actually tested DIRTREE_BREADTH code (which at least didn't cause regressions in the test suite) this morning, but haven't used it to implement tar --sort yet because I still have 2/3 of --wildcards in my tree. Which is actually a half-dozen options because there's --no-wildcards-match-slash and so on.
Urgh, why is tar.c using a bare numeric constant instead of FNM_LEADING_DIR? I did not leave myself a comment about WHICH build environment barfed on this. The fnmatch.h header is in posix but this particular constant isn't. It's unsurprisingly in glibc, it's in bionic (which says it got it from openbsd), it's in musl. Boot up freebsd-13 under kvm... that's got it too. And Zach got me a mac login... it's there as well.
Ok, is it a 7 year time horizon thing? The date on the line according to git annotate is 4 years ago, so most likely 7 years has expired by now if that was the case? (It's not a kernel thing, it's a libc thing. Annotate on musl's fnmatch.h says it's from 2011, that's a full dozen years ago.) Eh, try the macro in place of the constant and see who complains...
Oh wow. It's glibc that complains. It wants #define ALL_HAIL_STALLMAN to provide the constants, but on Bionic and FreeBSD and MacOS they're just there without magic #defines. And it's the same constant value everywhere. Right, #ifndef in portability.h time, maybe posix will catch up somewhere around 2040...
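Which presumably boils down to the usual three lines in portability.h (the value is the one glibc, musl, bionic, and the BSDs all agree on):

  // glibc hides this behind its magic #define, everyone agrees on the value
  #ifndef FNM_LEADING_DIR
  #define FNM_LEADING_DIR 8
  #endif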
Yay, dreamhost fixed it. My two posts about it to the list didn't wind up in the web archive and I was all ready to take up my sword again... but it's because I sent the message and the reply to "lists@landley.net" which is not a real address. Hopefully google and archive.org will start populating again at some point.
That tar --xform test failure which only happens on musl is because musl still doesn't have regexec(REG_STARTEND). So it's just a new manifestation of a known failure, eating another round of debugging time because 10 years ago Rich explicitly refused to implement something even the BSDs have.
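(For reference, REG_STARTEND makes pmatch[0] an _input_ to regexec(), giving it a start/end range so you can match inside a buffer without NUL-terminating the chunk first:

  regmatch_t match = {.rm_so = start, .rm_eo = end};  // input here, not output

  regexec(&preg, buf, 1, &match, REG_STARTEND);  // match within buf[start..end)

Assuming a compiled regex_t preg, of course.)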
Sigh. I'm eventually either going to have to fork musl or drop support for it. I should just switch that date test back on. There are multiple "yup, musl and musl only is broken, this even works on BSD" cases already. The test suite needs a MUSL_IS_BROKEN flag on tests, or something...
A tech writer recently boggled at the pointless "undefined behavior" in C compilers written by C++ developers. And here's a rant I edited out of a post to lkml:
The C language is simple. The programs you write aren't, but the LANGUAGE is. C combines the flexibility of assembly language with the power of assembly language: it's basically a portable assembly language, with just enough abstraction between the programmer and what the hardware is actually doing that porting from x86 to arm isn't a complete rewrite. You manually allocate and free all resources (memory, files, mappings) and all sorts of stuff like endianness, alignment, and word size is directly whatever the hardware does. In C, single stepping through the resulting assembly and matching it up with what your code does isn't that unusual. I've gone looking at /proc/self/maps on a sigstop'd binary and objdump -d on the elf executable to figure out where it got to, and in C you _can_ do that.
C++... isn't that. The language is DESIGNED to hide implementation details, all that stuff about encapsulation and get/set methods and private and protected and friend and so on is about hiding stuff from the programmer. Then when implementation details leak through anyway, it tries to fix everything by adding more layers (ala "boost") on top of a broken base, but that's like adding floors to a skyscraper to escape a cracked foundation. It's still static typing with static allocation (they're insanely proud of tying stuff to local variable lifetimes and claiming that's somehow equivalent to garbage collection) and it's GOING to leak implementation details left and right, so they have buckets of magic "don't do that" prohibitions which they cargo cult program off of. Most of C++ is learning what NOT to do with it.
C was simple, so C++ developers hijacked compiler development and have worked very hard for the past 15 years to fill C with hidden land mines so it can't be obviously better than C++.
C is a good language for what it does. C++ is a terrible language. The C++ developers have worked tirelessly to make C and C++ smell identical, and as a result there's a big push to replace BOTH with Rust/Go/Swift and throw the C baby out with the C++ bathwater.
Haven't heard back from dreamhost, so I've submitted ANOTHER support request:
http://lists.landley.net/robots.txt prevents Google from indexing http://lists.landley.net/pipermail/toybox-landley.net/
I did not put http://lists.landley.net/robots.txt there and cannot delete it.
The contents of http://lists.landley.net/robots.txt are:
User-agent: *
Disallow: /

Would you please delete this file, or change it to allow Google to index the site? I do not have access to it.
Here's hoping THAT is explicit enough for them to actually do something about it. Third time's the charm?
Properly reported the qemu-mips breakage. That list may be corporate, but it's not the wretched hive of scum and villainy linux-kernel's turned into, so maybe... (Yay, there is a patch, and it Worked For Me.)
So what DIRTREE_BREADTH _should_ look like is something like...
Hmmm, instead of checking for DIRTREE_BREADTH a lot the "populate children" loop should just pass a NULL callback while accumulating children... Sigh, I need to stress test DIRTREE_ABORT to make sure A) it returns from anywhere, B) it doesn't leak memory. Except most of my actual users don't choose the abort path, they continue on despite errors: tar, rm, cp...
We have a dishwasher again! Exact same type as last time, so it looks like nothing has changed but so much work went into this. (Ah, that old story.) The install guy set it doing an empty practice run first, but then we have so many dishes to wash...
Jeff is trying to set up an sh4 development environment so he can come up with mmu patches and send them to linux-kernel, and I've been feeding him the trail of breadcrumbs I've laid out with mdm-buildall and mkroot and so on. Even using my prebuilt binary system image tarball the network didn't work for him, and that's because I'm using an older qemu version than he is.
Building QEMU from source recently broke network support for all platforms by splitting it out into a separate package your distro has to install for you. Because obviously the ability to talk to the network is not a standard thing a VM would want to do. This now requires "libslirp". There's an existing slirp package, for the serial line internet protocols slip and ppp, which has nothing to do with libslirp that I can tell. Luckily devuan has a "beowulf-backports" repository alongside all the others, which I can add (why didn't the OS install do that?) to get this libslirp-dev package. I'm still annoyed the IBM mainframe guys who took over QEMU development when kvm displaced xen as Linux's standard "hypervisor" are suddenly demanding it, but at least I can get Jeff unblocked now.
Mainframe punched card culture should not be allowed to turn functional software into bloated "enterprise" crap: qemu-system-arm64 (ahem, I mean qemu-system-aarrcchh6644) is A HUNDRED AND TWENTY FIVE MEGABYTES. Dynamically linked! That can't be right. You can tell Fabrice Bellard moved on long ago, and was replaced by a committee.
And test_mkroot.sh says mips is still broken... because the ethernet hardware isn't binding even WITH the library installed. And that's... because an endianness "fix" broke big endian for pretty much the entire PCI bus. Sigh. Vent about it all and move on...
Ok, tangent du jour beaten back down, let's circle back to the toybox design issue I'm frowning at. What notes did I leave myself:
why are recurse and handle_callback split?
dirtree_add_node(): clear design, yay - maybe add callback as argument to dirtree_add_node()?
dirtree_handle_callback: stages:
  fetch dir, initial callback: returns DIRTREE_BREADTH
  fetch children, via recurse with BREADTH. problem: closed fd already? (don't close for BREADTH)
  breadth callback: returns DIRTREE_RECURSE
  traverse children, now call handle_callback on each?
Which means: DIRTREE_BREADTH isn't that hard to implement, but the existing code has three functions that really seem like they shouldn't be split that way?
dirtree_add_node(dirtree *parent, char *name, int flags) - creates a struct dirtree from a file. Handles the flags FOLLOW, STATLESS, and SHUTUP. Returns a new node with ->parent connected but not ->child.
dirtree_handle_callback(dirtree *new, function *callback) - calls callback(new) and handles the return value: flags RECURSE, COMEAGAIN, SAVE, and ABORT. (And I'm trying to add BREADTH here.)
dirtree_recurse(dirtree *node, function *callback, int dirfd, int flags) - most of the plumbing.
One sharp edge is that handle_callback() is opening the dirfd for recurse, but then recurse is closing it, which is NOT a happy lifetime rule.
I think the reason for all this tangle in the first place is I was trying to recurse the data structure without making the FUNCTIONS recurse, so it didn't eat an unbounded amount of stack when descending into a tree of unbounded depth? (Especially nasty on nommu.) Except that pretty much means having all three of them be a single function, because otherwise they're calling back and forth between each other. Or having one function that calls the others in a loop, which isn't what it's currently doing.
In any case, "implement breadth first search" and "reorganize this to not be designed wrong" really need to be two different passes, otherwise I'm here for a while...
Ha! The dirtree.c plumbing shouldn't have separate DTA_BLAH flags for the "again" field to distinguish different types of callbacks, it should reuse the existing DIRTREE_COMEAGAIN, DIRTREE_STATLESS, and DIRTREE_BREADTH bits. (The "again" field is a char so can only hold the first 8 flags, but I can reorder the DIRTREE flag list as necessary so the ones that cause callbacks are all at the start. Nobody else cares which flag is which, that's why there's macros.) This way, the again bits are the same as the reason for the callback: no flags is the initial "we found and populated a struct" callback you always get when callback isn't NULL, then BREADTH is "finished populating a directory with implicit DIRTREE_SAVE but did not descend into it yet, so now would be a good time to sort the children", and then the COMEAGAIN call would be the final call on the way out of the directory after handling all children. (STATLESS doesn't cause a separate callback, but is set on any callback when stat isn't valid.)
I should rename DIRTREE_COMEAGAIN to just DIRTREE_AGAIN (it was a Simpsons reference), but my tree's too dirty for comfort, need to check other stuff in first.
For BREADTH, child callbacks are deferred until traversal: if the initial no-flags callback on the directory returns DIRTREE_BREADTH the plumbing should populate all the child structures without making any callbacks on them yet, then it does a callback on the same dir again with DIRTREE_BREADTH, then traverses the child list doing normal callbacks but freeing all the non-dir children after each callback returns, and then traverses the now-shortened list again handling the directories it needs to descend into...
Hmmm, that's not what gnu/dammit tar is doing, though. It's populating and sorting the list, then traversing it but descending into each directory as it encounters it in the traversal. Which isn't a true breadth-first search, it has ELEMENTS of breadth-first but... Ok, the return codes from the callback functions need to control order. Maybe if the DIRTREE_BREADTH callback returns DIRTREE_RECURSE then we descend into it now, and if not we do the second pass thing? Hmmm. I've got DIRTREE_SAVE, DIRTREE_RECURSE, and DIRTREE_BREADTH, and can return a chord of any of them to mean what I need it to, the question is what's the most obvious way to signal what I need it to do? What ARE the use cases?
This needs some pacing and staring into the distance....
Sitting at HEB with a stack of beverages I just bought (refill on blueberry energy cylinders, the checkerboard teas are back in stock, and there was a good coconut water coupon today)... but no snacks.
I miss japanese grocery stores and conbini. The conbini of course had rice balls and steamed buns and even microwaveable hamburgers if you wanted to get serious. The grocery store near the office had lovely little 100 yen sandwiches, which were just two pieces of cheap white bread with some filling (I usually got the strawberry jam or tuna varieties), crimped in some variant of a panini press that cut off the crusts and sealed the edges, and then presumably run through a nuclear reactor to sterilize them so they have a multi-week shelf life. (Like mythbusters did to sterilize those tortilla chips in the "double dipping" episode: a conveyor belt moves the product past a strong radiation source, basically a non-heating microwave that kills all the bacteria with a few seconds of intense gamma radiation. The expiration date on the package is when the sandwich dries out slightly and is less tasty, I never had one actually go bad.) We could totally do that here in the states, we just don't: some variant of laws, culture, inclination, and capitalism optimizing for profit over all else.
Ok, tar --sort needs DIRTREE_BREADTH to do breadth first search. I could instead do DIRTREE_SAVE to populate the whole tree up front, then sort the whole tree, and then traverse the resulting whole tree, but don't want to because A) directories changing out from under us are less icky if you do it all in one pass, B) I've already got the openat() directory-specific filehandles for local access (I can open "file in this directory") in that initial pass. A second traversal has to either re-establish the openat() filehandles, or create/open big/long/path and potentially hit PATH_MAX issues. Since I don't have existing plumbing to do either of those yet, as long as I have to write new plumbing ANYWAY I might as well implement the DIRTREE_BREADTH stuff I have some existing design stubs for.
DIRTREE_BREADTH brings up the DIRTREE_COMEAGAIN callback semantics: to enforce a specific traversal order I need to sort each directory's contents before descending into it. I reserved a DIRTREE_BREADTH flag back at the start but never implemented it, and I now have _three_ users of this plumbing that I'm aware of (ls, find, tar) so sounds like time to implement it. (Whether or not I poke ls.c with a stick afterwards remains an open question.)
Looking at find -depth is... sigh. The toybox find help text describes -depth as "ignore contents of dir" and the debian man page describes -depth as "Process each directory's contents before the directory itself" and I don't remember if posix even has -depth and I probably need to spend an hour or two on this rathole, but I haven't got spare cycles right now. (And I've already REVIEWED this one multiple times, so 99% likely I wouldn't be fixing the code but just updating my memory of it.) Anyway, -depth existing implies that _without_ it find is doing a breadth first search... which it demonstrably isn't in simple testing. Ok, find is NOT doing breadth first search. I thought it had an option for this, but no. It has an option to tell it what order to _act_ on what it's traversing, but it still descends into each directory it encounters when it encounters it. The ls.c code is taking manual control of the traversal by having the callback return DIRTREE_SAVE without DIRTREE_RECURSE so the traversal populates a directory's children, then it converts the linked list to an array, sorts the array, uses the array to re-link the list objects in the right order, then it iterates over the sorted list and calls dirtree_recurse() again on each directory entry.
So I want dirtree_recurse to assemble the list, call a sort callback on the directory that can reorder the children, and then traverse them and descend. Which is a different callback from the current DIRTREE_COMEAGAIN callback? Do I need a third dirtree->again flag value? It's got 1 (callback on directory after processing all contents) and 2 (DIRTREE_STATLESS returning a file we can't stat), which are set/used as constants without macros defined for them. A third means macros, what would... DTA_AGAIN and DTA_STATLESS maybe?
Hmmm... but IS this callback a different one than DIRTREE_COMEAGAIN? It sounds like DIRTREE_BREADTH means: 1) DIRTREE_SAVE a linked list of a directory's children without recursing, 2) call the DIRTREE_COMEAGAIN callback on the directory, 3) traverse the saved list... doing what exactly? When are these freed? If we free them in the step 3 traversal, how do they ever get used?
Ok, I think I do want a third flag: DTA_DIRPOP lets you sort a directory after it's populated, and then we call with DTA_AGAIN on each entry right before we free it. Except the find -depth question comes in: does the directory count as occurring before or after its contents? That's a question for the sort function... ah, ok: while traversing the list, do a DTA_DIRPOP call before descending into it, DTA_DIRPOP|DTA_AGAIN after populating it, and then DTA_AGAIN without DTA_DIRPOP before freeing it. Silly, but it gives the callback multiple bites at the apple while still having generic infrastructure actually do the traversal.
And this is basically a wrapper function before the existing add_to_tar() dirtree callback that checks the flags and does sorting stuff as necessary, but otherwise calls the other callback. And you only insert the second callback when doing --sort. Ok, that seems feasible?
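Something like this, maybe (the DTA_DIRPOP flag and sort_children() don't exist yet, names illustrative):

  // only installed as the dirtree callback when --sort was specified
  static int sort_callback(struct dirtree *node)
  {
    // directory's children just populated but not yet descended into:
    // reorder the child list, then hand off to the normal callback
    if (node->again & DTA_DIRPOP) sort_children(node);

    return add_to_tar(node);
  }

  // and at the call site, roughly:
  //   dirtree_flagread(path, flags, FLAG(sort) ? sort_callback : add_to_tar);

Which keeps add_to_tar() oblivious: it never knows the traversal order got rearranged out from under it.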
Implementing is easy, figuring out WHAT to implement is hard.
Darn it, one of the commands that came up in need of tweaking when I change dirtree semantics is chgrp... which was never converted to FLAG() macros. But chgrp.tests needs root to run meaning I want to run it under mkroot and that whole BRANCH of development is... several pops down the stack.
My _development_ plan has circular dependencies. Gordian knot cutting time, let's do it "wrong" for a bit just to clear some things...
My sleep schedule has been creeping forward towards my usual "walk to UT well after dark and spend the wee hours at the university with laptop", but I got woken up at the crack of dawn by sirens, flashy lights, and engine sounds right outside my window because the big house on the corner caught fire, and between something like 7 fire trucks and the police blocking off the street at both ends it was Very Definitely A Thing even from bed. I got up to make sure there wasn't incoming danger to us, and then I was up...
Kind of out of it all day as a result. Got a nap later, but "5 hours then FORCED UP" is something I may be too old to handle gracefully...
1pm call with Jeff to go over the Linux arch/sh patches, and the mmu change that apparently motivated the latest round of dickishness.
Elliott wants --sort=name, so looking at that. The man page has a short -s right next to it, which... "sort names to extract to match archive". What does that _do_ exactly? I'm already going through the archive in the order the names in the archive occur. There's not much alternative with tar. You can pass a bunch of match filters on the command line, but it's going to encounter them in the archive it's extracting, and thus extract them, in the order they occur in the archive. Tar != zip, it's not set up to randomly seek around, especially when it's compressed.
Sigh, my tar tree still has 2/3 of a --wildcards implementation in it, and does not currently even compile. Plus a bunch of test suite tests the host passes but my version doesn't. Need to finish that or back it out...
And when I do full tests against the musl build, tar is failing the "xform trailing slash special case". Which I don't notice when it's skipping the xform tests because it's using non-toybox sed (as happens on "make test_tar" unless I do special $PATH setup), and which I don't notice when testing a full glibc build because it works there. 95% likely it's musl's regex implementation, but... what specifically is diverging?
I would have an easier time with this if I remembered exactly what the "xform trailing slash special case" IS. October wasn't that long ago, but I checked this in as part of a large lump after days of work and there were a bunch of tests? It's searching for "^.+/" which... ^ is start of string, . is single character wildcard, + is * except "one or more" instead of "zero or more", and then / is presumably a literal / except it says "special case" here... Sigh, was this in the tar manual?
The example at the very end of that page is about specifying multiple sed transforms on the same command line, the first of which is NOT TERMINATED PROPERLY. (I.E. --transform='s,/usr/var,/var/' is missing a comma at the end.) And they repeat it twice the same way. Is this a doc mistake they cut and pasted, or does their implementation accept that? I'm afraid to check, and have NO idea how to deal with it if their implementation DOES allow it but normal sed doesn't. Maybe circle back to --xform after implementing the new stuff...
Ok, here's how I could cheat on the toysh "read" builtin: the case I care about optimizing is "while read", and the "while/do/done" block has an entry/exit lifespan. I can have the "while" cooperate with "read" to cache a FILE object. The read has to save it because "-u fd" is a read argument, but the while gives it someplace TO save it with a longer lifespan than the individual read call, and passing out of the "done" lets us know when to free the FILE *. Hmmm, I could store it in sh_blockstack's char *fvar with an evil typecast, that's not used by while... I'm dubious. Need to pace about it more. Probably best to implement just the slow path first. (There are SO many read options... timeout, length with and without terminator, -s puts the terminal in raw mode... I'm gonna need to go back and implement array variable support in everything at some point? How do I stage this sanely...)
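The evil typecast version would be something like (sh_blockstack and fvar are real toysh names, the rest is the hack under consideration, not checked in):

  // "while" doesn't use fvar (the for-loop variable name), so borrow it
  FILE *ff = (FILE *)block->fvar;

  if (!ff) block->fvar = (char *)(ff = fdopen(fd, "r"));

  // ...and when the loop block pops at "done":
  if (block->fvar) fclose((FILE *)block->fvar);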
Oh hey, Greg KH is _also_ yanking most of the classic graphics drivers from linux-kernel. It REALLY sounds like linux-kernel development is collapsing and they're throwing code overboard as fast as they can. I hope that's NOT the case, I really thought we had another 5 to 10 years before that happened, but if Linus has decided to retire early because his daughters are all off to college... Let's see, his and his three daughters' birthdays are the easter egg values in "man 2 reboot" LINUX_REBOOT_MAGIC2, which are:
$ printf '%x\n' 672274793 85072278 369367448 537993216
28121969
5121996
16041998
20112000
So Linus is 53 (december 28, 1969) and his _youngest_ daughter is 22. Yeah, he's probably recently become an empty nester, and may be "quiet quitting" to go do other things with his life. And Greg has been waiting DECADES for the opportunity to do to Linux what Elon Musk is doing to twitter. Like an alcoholic buying a distillery. Sigh.
My annoyance with current linux kernel development is "Stop breaking stuff. Can the things that used to work still work?" And the reason we CAN'T have a stable kernel that doesn't shed features is... Greg Kroah-Hartman! Who many years ago proudly wrote a document named stable-api-nonsense about how the concept of Linux EVEN HAVING a stable driver API, so you could keep portable drivers between versions the way Windows did for many years... Greg said that's a crazy idea that Linux would never ever do. Userspace can still run a static binary from 1996, the kernel can't load a module from 9 months ago. Partly because GPL, and partly because Linux MUST be free to completely rewrite all the plumbing every 18 months to gain half a percent performance improvement with worse latency spikes. And now Greg's deleting a bunch of working drivers that are too hard to maintain under his insane regime. Wheee...
Sigh. Speaking of spiraling narcissists, did you know that Elon Musk got the idea of going to mars from a science fiction book the Nazi rocket scientist Wernher von Braun wrote in 1949, in which the emperor of Mars was named "Elon"? Back in the 1950s the reason Musk's grandparents gave for leaving Canada for apartheid south africa was they perceived a "moral decline" in Canada (Wikipedia says "Most of the recorded student deaths at residential schools took place before the 1950s", so Musk's grandparents left Canada about when the mass kidnapping and murder of native children declined, and instead they traveled halfway across the world to participate in Apartheid). So there's a nonzero chance Musk was named after that character in the 1949 German book, since his family was VERY familiar with a wide range of otherwise obscure nazi materials. So of course various Musk fans are now going "famous rocket scientist predicted Elon would be emperor of mars!" and I'm going "you have the causality here exactly backwards". Why do people keep thinking the man's ever had an original idea? That's NEVER been how he works...
My grandfather also interacted with Von Braun, they worked together on the Apollo program. (My parents met on the apollo program, because my father dated his boss's daughter.) The story grampa told me was that Von Braun's most important contribution to the US space program was statistical failure analysis. Grampa never mentioned the NSA until he got _really_ low while my mother was dying of cancer in the other room, shortly after my grandmother had died of her chronic lung problems (emphysema and eventually lung cancer, from years of smoking back before I was born). They'd had three kids, I never met Uncle Bo who volunteered to fight in vietnam over Grampa's strenuous objections and died there when his helicopter was shot down. Grandpa was now outliving a second kid and not taking it well, and started by complaining about how his hearing was shot and his big interest that got him into electronics had been crystal radios and audio. He was telling me how the allies recorded sound on magnetized metal wire but it got cross-talk when spooled and you couldn't splice it if it broke or got a kink, but they captured desk-sized audio reel to reel tape recorders from nazi bunkers which were a MUCH better design: built-in insulation between the magnetic layers in the spool and the tape could be cut cleanly and attached together with scotch tape on the back, and some of the GIs shipped a couple of them to the states where Bing Crosby paid to have them reverse engineered (and vastly simplified) so he could ship tape reels around to radio stations instead of constantly flying to give live performances, and this became the company "Ampex". Grandpa also told me how he did cryptography during the war creating one time pad vinyl records of "music off the spheres" radiotelescope recordings of ionized particles hitting the upper atmosphere sped up to be good random static which completely drowned out the voice unless you had the exact same record to do noise cancelling on the other side (stacks of these records were carried across the atlantic via courier, each one smashed and the pieces melted after one use). Churchill and FDR used these to securely talk to each other live over transatlantic cable, and this proceeded naturally to grampa venting about being blackmailed into joining (what became) the NSA after the war because they were going to draft him and put him on the front line in Korea if he didn't "volunteer", and then not being able to get out for decades until some idiot almost got him killed in Iraq in the 1980s by trying to hand off intelligence to him in his hotel room while he was there as a technical expert for General Electric upgrading (and bugging) the Iraqi phone system. (Apparently the various spy services are the best technical recruiters, finding you companies to work at. Well, they were decades ago, anyway. My take-away was "don't get any of that crap on you, you'll never get out again", and I learned it from my father's simple defense contracting.)
Oh hey, Dreamhost replied. They escalated to somebody who DID NOT BOTHER TO READ MY SUPPORT REQUEST. Not even the subject line, which reads "Re: The robots.txt you put on lists.landley.net (which you won't let me log into) blocks google."
On 1/15/23 23:48, DreamHost Customer Support Team wrote:
> Hello,
> Thank you for contacting DreamHost Support! My name is XXX and I'd be happy to assist you with your concerns.
> With regards to the discussion list service, the last time this service was touched was last year in July when we had a maintenance on it to where we upgraded the services to new hardware. This didn't change much of how the service functions, though, as we're still running the same Mailman version as before under 2.1.39.
The robots.txt file is not technically part of mailman. Mailman runs in a web server, and that web server is serving the robots.txt file.
> About the http://lists.landley.net/listinfo.cgi page, that page has been disabled for a long time now.
I noticed. I complained about it at the time.
> The list overview page for the discussion list service was disabled over 5 years ago, actually.
Yes, as I told you in my last email. Closer to ten, really: https://landley.net/notes-2014.html#20-12-2014
> So, that page posted the "The list overview page has been disabled temporarily" message for a very long time now.
What does the word "temporarily" mean here?
> Unfortunately, that cannot be edited, but you already have your list archives set to public, so they can all be accessed here: http://lists.landley.net/listinfo.cgi/toybox-landley.net
Yes, I know, they are linked from https://landley.net/toybox on the left. But if people go to the top level lists.landley.net page, they do not get a list of available lists, and every couple months (for many years) people ask me why, and I tell them "because Dreamhost is bad at this".
For comparison, if I go to http://lists.busybox.net I don't need to remember the exact URL of the list I want to look at, because there is a navigation link right to it. That is not true of the toybox project, and I can't fix it, and my stock response to everyone who asks is "because Dreamhost is bad at this". Your service makes my open source project look bad to the point it's a FAQ.
The top level index page is especially nice if I'm sitting down at a different machine than I'm normally at and using a standard web browser to see if there are new posts, because remembering the full URL with the "dash between toybox and landley but the dot between landley and net and also a dot between listinfo and cgi"... that's tricky to do from memory.
> Since it's public, clicking the "Toybox Archives" link will open up the archives for that list for anyone that finds it.
I know how mailing lists work. I use them. If you looked at the mailing list in question you'd see I last posted to it on Thursday. The "enh" posts are from Elliott Hughes, the maintainer of the Android base operating system for Google. He's the second most active poster to the list in question. I used to have other mailing lists for other projects, but they ended or moved off Dreamhost "because Dreamhost is bad at this".
> As for the robot.txt file,
It's robots.txt.
> your 'lists.landley.net' sub-domain for the list does not use a robot.txt file.
Because it's robots.txt, as defined in the IETF RFC documents: https://www.rfc-editor.org/rfc/rfc9309.html
Point a web browser at: http://lists.landley.net/robots.txt
Do you see the file there? The file is wrong. The result returned by fetching that URL (which I CUT AND PASTED INTO MY LAST MESSAGE TO YOU) prevents Google from indexing the site. I do not have control over this file, for the same reason I had no control over the "temporarily disabled" message. It is a thing Dreamhost did on a server I do not have direct access to.
> In fact, on the mailman server, the services are not actually under the list sub-domain. That's just the sub-domain that all of your lists are managed under.
Do you see the "which you won't let me log into" up in the subject: line of this email, from my original support request?
In the message you are replying to I explained that "landley.net" and "lists.landley.net" are on different servers and I don't have access to the lists.landley.net one to fix what's wrong with it. You are repeating my problem statement back at me.
> But, on the mailman server, each list has its own set of configurations and files. For example, the stuff for the 'toybox' list is under the 'toybox-landley.net' location on the mailman server and has no robots.txt file.
When you fetch the URL from the server, there _is_ a robots.txt file. (Which you spelled properly this time.) The text "temporarily disabled" probably wasn't in the toybox-landley.net subdirectory either. The mailman install has templates for shared infrastructure.
This implies that it's a global setting, and you have thus blocked google search on EVERY mailman domain that Dreamhost serves. (Which I suspected was the case but don't know what other server pages to look at to confirm it.)
> It's just a sub-domain DNS record that points to the list server for where the list is managed.
Yes, I know. I managed my own DNS for the first few years you hosted my site, until I took advantage of your free domain renewals as part of the bundle.
I'm sure there was a little "yes I am experienced at web stuff" radio button selector when I submitted this help request? It did not have an "I ran my own apache instance for about 10 years, have also configured nginx, and even wrote my own simple web server from scratch in three different languages" option, but still. (The httpd implementation I wrote last year is at https://github.com/landley/toybox/blob/master/toys/net/httpd.c because I needed something to test the new wget implementation with, so I did a simple little inetd version. Haven't wired up CGI support yet but it's got about 1/3 of the plumbing for it in already.)
The problem isn't that I don't know what's wrong, it's that I do not have access to fix it. I thought I'd explained this already, but I can repeat it.
I can SEE the robots.txt file. So can google. It is there. It should not be.
> And lastly, I'm afraid that our list services are not configured to run through HTTPS and there are no plans on getting that updated at this time, unfortunately.
Yes, I know. But that isn't _fresh_ breakage, so I'm living with it as part of the general "dreamhost is bad at this" Frequently Asked Question.
But Google _could_ find my mailing list entries a year or so back, and can't now, so Dreamhost adding a bad robots.txt is fresh breakage. (Dunno how long the google cache takes to time out when a new deny shows up?)
Given that the project I'm maintaining through that mailing list is Google's command line utilities for Android (I.E. their implementation of ls/cat/set etc as described in https://lwn.net/Articles/629362/ ) that's especially embarrassing.
> This would be quite the project as it would require an upgrade of Mailman, likely to version 3, which is quite different from version 2. So, the list admin page can only be accessed through HTTP. I'm very sorry about that.
Eh, I'm used to it.
I don't _think_ Android has entirely dropped support for non-encrypted URLs yet, only for certain api categories. (Which sadly broke my podcast player: upgrading to Android 12 no longer let it load http:// podcast files, only https.) I think you still have a couple more years before your mailing list infrastructure becomes entirely inaccessible from phones: https://transparencyreport.google.com/https/overview?hl=en
That uptick to 100% in the chart when Android 13 came out is a bit worrying, but I haven't bought a new phone in a few years and mine is only supported through 12. _I_ can still access it. (And from my Linux laptop, of course. No idea if random windows or mac users still can though. Safari's policy and chrome's policy and mozilla's policy don't advance in lockstep, but I hear rumblings.)
Most websites have put mandatory http->https forwarding in place where accessing http just gets you a 301 redirect to https for _years_ now. Try explicitly going to "http://cnn.com" or "http://google.com" in your browser, it will load the secure page. It can't _not_ do so.
The rise of "let's encrypt" (nine years ago according to https://lwn.net/Articles/621945/ ) was what finally let people start deprecating the old protocol in clients, because sites no longer have to pay for a certificate so even the third world organizations running a solar powered raspberry pi on their cell phone towers can afford https now.
> I hope that helps to clear things up.
No, it doesn't. The robots.txt file excluding * for / still needs to be removed so Google can index my mailing list posts like it used to do.
> Please contact us back at any time should you have any questions or concerns at all. We're here to help!
The concern I expressed in the subject line is still not fixed.
I'd guess they did this because they didn't have any other way to manage server load, and their servers are underprovisioned. I suppose if they're truly this incompetent and have no other solution, I can set up a cron job to scrape the lists.landley.net web archive and mirror it under landley.net? It's EXTREMELY SILLY to do so, but I can just add that to the FAQ I guess?
Oh hey, Greg Kroah-Hartman is also removing the RNDIS driver from Linux, which is how Android does USB tethering. I wonder when Linus stopped being maintainer? The glory hound's been trying to drag the spotlight onto himself for decades now, but used to get told "no" a lot for hopefully obvious reasons. Honestly, he's half the reason I don't post to lkml anymore. Al Viro was less abrasive: I'll take honest disdain over two-faced self-aggrandizing politics any day.
I have some domain expertise with USB ethernet: a couple years back Jeff and I implemented CDC-ACM USB ethernet hardware for the Turtle boards, which could talk to Linux and MacOS but not Windows because Windows doesn't support CDC-ACM. It's a reference implementation from a standards body, but does NOT have a generic Windows driver because Microsoft wants money from each hardware vendor to be "supported". To test it we got a beta of a driver from somebody that made it work for half an hour at a time (before you had to unplug it and replug it because the driver was an "evaluation" version that timed out), but Microsoft charged $30k to sign a driver for Windows, and each is specific to a vendor ID and model number. Microsoft chose to have no generic driver for the protocol, only drivers for specific devices, so each hardware vendor had to pay microsoft $30k each time they needed to update their driver. (They claim they eliminated unsigned drivers for "security", but it's a profit center for them.)
Everybody Jeff talked to suggested we implement the RNDIS protocol instead, which is something Microsoft invented but both Mac and Linux supported out of the box, and that one DOES have a generic driver in Windows that doesn't require $30k periodically sent to microsoft. Switching our hardware to RNDIS didn't look hard, we just hadn't done the research to make sure there weren't any lurking patents. (PROBABLY not? https://web.archive.org/web/20120222000514/http://msdn.microsoft.com/en-us/library/windows/hardware/gg463298.aspx says "updated 2009" and "assumes the reader is familiar with Remote NDIS Specification; Universal Serial Bus Specification, the Revision 1.1" but that document has been carefully scrubbed off the internet, the oldest I can find is 5.0. Because implementing against the old version is a prior art defense, so the old version is yanked.)
The protocol was all in the FPGA bitstream, the actual USB chip we'd wired to the FPGA pins was just a fancy transceiver that didn't even know about packets, and USB 2.x "bulk" protocols are all the same packet transfers with different header info. We never got around to prototyping it, we ran out of time shortly after we got the CDC-ACM version working (including our own TERRIBLE userspace driver that just spun sending data to/from a memory mapped I/O interface into the kernel's "raw packet" plumbing; improving THAT was our next todo item but the benchtop prototype was 2x SMP so the driver eating a processor affected power consumption but not performance). Jeff and I both flew out of Tokyo, and a year and change into the pandemic the funding for that project ran out, so it got mothballed without doing a proper production run, and we just didn't get back to it. But using RNDIS was the easy fix, and it's what everybody ELSE in the industry did, including Android's USB tethering.
Now Greg KH seems to be saying "we're losing features left and right, our collapsing development team can't maintain the stuff we've already got, so let's flex OUR market muscle to out-influence microsoft". Or something?
I suspect Android's response will be "USB tethering is no longer supported on desktop Linux then, oh well, here's a Linux driver for RNDIS if you want to make it work". I haven't asked Elliott, but I remember when USB file transfer between my Linux laptop and android phone used to be really simple... and then it was replaced by some Microsoft protocol I could theoretically install an elaborate Gnome program for which never worked. (Or I could install the Android Development Kit, enable the debug menu in my phone, and use ADB file transfer from the command line. I've had to download a new copy of the android tools from their website every time I've needed to get that to work, because version skew.) Linux on the Desktop is not a commercially significant target market, we get _courtesy_ at best.
Even years from now, it would still be WAY easier for the J-core guys to ship an out-of-tree Linux kernel module than externally add a driver to Windows without paying them $30k annually-ish. Stuff like the Steam Deck could 100% use an out of tree driver if they needed to. Greg is making vanilla linux development smaller, but who's really surprised? He was the author of the kernel's "Code of Conflict" after all, and Linus was the one who apologized on behalf of the community and very publicly went to therapy to dig the community even a little way out of that hole, not Greg. The aging development community was emitting distress signals in 2013, and again in 2017, and now it's 2023...
(Yes I know Greg wrote "Android has had this disabled for many years so there should not be any real systems that still need this." My phone's running Android 12, I just tethered to check and dmesg said "rndis_host 3-1.2:1.0 usb0: unregister 'rndis_host' usb-0000:00:1a.0-1.2, RNDIS device". Oh, and hey, there's a more convenient way to configure it than I've been doing. I honestly don't know if Greg is clueless or lying, but does it matter? He is a Confidently Wrong White Male.)
USB 2.0 shipped in 2000 so it's fairly recently gone out of patent (hence predictable badmouthing from for-profit manufacturers TERRIFIED of commodity competition from cheap generic hardware; the instant anything becomes available for open royalty-free implementation it MUST BE DESTROYED). As I said above, the oldest RNDIS documentation I could find says "updated 2009" (not authored, updated, it's older than that), and the 1.1 specification it assumes familiarity with has been carefully scrubbed off the internet because implementing against the old version is a prior art defense. It is entirely possible that RNDIS recently DID go out of patent... and thus must be destroyed. How that idea made it from one of the Linux Foundation's largest contributors to one of the Linux Foundation's most prominent employees, I couldn't speculate, but he's sure confident about it.
RNDIS isn't tied to a specific USB generation (it's a packet protocol going across a transport), but USB 2.0 should be out of patent now (the spec is dated April 27, 2000) and that chugs along around 40 megabytes per second, which is still a quite useful modern data rate: over 20 parallel 4K HD netflix streams, over two gigabytes per minute, just under 7 hours per terabyte. It's about 1/3 the _theoretical_ max rate of gigabit ethernet (which I never get), and we were implementing it full speed on hardware running at... 60mhz I think? Either 4 bit or 8 bit parallel bus into and out of the chip, moving multiple bits per clock. A USB-powered device talking USB-2.0 RNDIS ethernet isn't hard to implement. Our CDC-ACM implementation fit in an ICE-40 with space left over.
I'm grinding through some of those email files from yesterday, trying to identify all the patches sent to the linux-sh list (grep '^+++ ' seems a reasonable first pass there once they're in individual files), but thunderbird saved all the files with the current date so it's not easy to filter for relevance. So I'm doing for i in sub/*; do toybox touch -d @$(date -d "$(toybox dos2unix < "$i" | sed -n 's/^Date: [ \t]*//p;T;q')" +%s) "$i" || break; done (as you do, yes gnu/dammit date gets unhappy with \r on the end of a date string, apparently), and I get an error message:
date: invalid date ‘Mon Sep 29 01:50:05 2014 +0200’
And I'm going... wha? Cut and paste that string to toybox date and... yes, it fails too. First guess: click back in xfce's little calendar widget, September 29, 2014 was a... Sunday. Seriously? Sigh. Ok, FINE. Oddly, that date's not from the headers, it's from an inline patch, which means... how is my T;q on the sed not triggering? (Back before I added that, date was complaining that multiple concatenated dates with \n were not a valid date...)
Ah, my sed is wrong. The s/^Date: [ \t]*//p pattern expects a literal space right after the colon, and that message has a tab in the headers, so it skipped the real header and pulled a date from a "git am"-ish patch in the body of the message. Ok, move the space into the bracket expression (something like s/^Date:[ \t][ \t]*//p) and check that they all convert... yup, now they do.
Huh. You know, _technically_ netcat UDP server mode could handle one packet and then go back into "listening for new connection" mode, which would solve the 'locks itself to a specific source port' issue. That wouldn't work for child processes: the reason it's handling UDP packets the same way as TCP connections is so we can pass off stdin/stdout filehandles to child processes. Which is where the "no notification of when a connection with no flow control closes" problem comes from: we'd need some sort of keepalive packet and there's no obvious place to insert that (if the kernel hasn't got a flag we'd need a Gratuitous Forwarder Process of some kind). The reason I didn't do that before is I don't want two codepaths to implement the same thing. Really, my use case here for interactive mode is "Linux net console". Does that send from a consistent source port even across reboots? Hmmm...
At this point I honestly expect healthcare.gov to KEEP sending me emails after today: "You missed the deadline, it was yesterday! How could you!" Yes I did try Obamacare one year, but at the moment I have the classic "VERY NICE health insurance through spouse's work" arrangement, in this case Fade's graduate program through the end of the summer, and then maybe we'll do that Common Object Request Broker Architecture thing to extend it a bit if she hasn't found a job yet, at which point it's _next_ healthcare.gov enrollment period. Alas there's no obvious way to tell obamacare's automated system that A) I'm currently good, B) you are basically useless in Texas because republican assholes bounced the subsidies and sabotaged implementation, C) I schedule doctor's appointments when I visit my wife up in Minneapolis because the hospitals _here_ are collapsing unless all you need is a $100 visit to a nurse practitioner in a strip mall to get regulatory permission to purchase pills from a pharmacy, which are all over the place now. (How much of that collapse was covid and how much was foretold in legend and song is an open question. To answer the follow-up questions: 1) Yes it's intentional, 2) if you don't work for a billionaire-owned company getting a UTI costs more than a car and potentially more than a house so you will put up with ANYTHING to keep your job and they have less competition from small businesses and independent contractors. Guillotine the billionaires.)
I wonder if there's some way to get mastodon to do the green check mark thing? If you view source, I've had the link up top for a while now, with the magic rel="me" thing that's apparently an important part of it, but it just doesn't register? (I was reminded by updating the page links for the new year...)
Always odd when I get a request to do a thing I'm in the middle of doing. Yay, I'm on the right track? Not quite sure how to reply... "Um... yeah."
While Fade was here, heading out to poke at my laptop usually meant I'd use like a quarter of my battery then head back. Getting the low battery warning comes as a surprise after a few weeks of Not Doing That.
Dreamhost forwarded my support request to a higher level tech. That's nice. Unlikely to hear back before Monday.
I am once again impressed by how broken Thunderbird is. This needs some context. So Rich Felker theoretically maintains Linux's arch/sh but he hasn't updated his linux-sh git repo in over a year, and Christoph Hellwig unilaterally decided to delete Linux support for the architecture despite plenty of people still using it and having an active Debian port and so on. He didn't just suggest it, but posted a 22 patch series to remove it. (The charitable explanation is he's doing a "don't ask questions, post errors" thing and putting the onus on US to object loudly enough.) Of course Greg KH immediately jumped up and went "I am deciding" because he's like that, but in THEORY Linus still has the final say here, and has not weighed in last I checked? And of course the motivations for the removal are contradictory: the primary complaint is it hasn't been converted to device tree (which is true of a lot of stuff), so the reply is to be sure to remove the stuff that IS using device tree. Thanks ever so much.
The guy who maintains the Debian fork has tentatively volunteered to become the new maintainer, and one thing he'd need is all the patches that Rich chronically hasn't applied for years now. (Jeff informed me of this, and has volunteered to help the new guy, but will NOT say so on the list, and I quote: "Not going to engage with LKM toxicity in any way, got permanently away from that way back in 2002." So I connected them in private email and am very tired of doing that. But I still haven't posted this set to the kernel list myself, so can't exactly blame him?) So a useful concrete thing I can do is grab the accumulated linux-sh patches that have gone by on the list. So I'm giving it a go.
The first problem is gmail is crazy, and only ever keeps ONE copy of a message when I'm sent multiple copies with different headers, which means when I get emails cc'd to linux-kernel and linux-sh I only get one copy and which list-id it is is semi-random. (Usually linux-sh because that server has fewer subscribers so sends my copy out faster, but not reliably.) In each mbox in which I _don't_ get a copy, reply threads get broken, and if I ever wanted to put together a canonical toybox history mbox file (and start a quest chain to eventually insert it into Dreamhost's web archive to fix the gaps) I'd have to check my toybox folder AND my inbox AND my sent mail (because I don't get my OWN messages sent back to me through the list either). But that's not FRESH stupid.
So I've done a search on my linux-kernel folder in thunderbird for messages with linux-sh in the "to or cc" field, which defaulted to searching subfolders but ok. Some of those subfolders are architectures or subsystems I follow (linux-hexagon and such, linux-sh is a separate top level folder in my layout because I checked it regularly), but most of those subfolders are previous years of linux-kernel that I moved messages out to because thunderbird not only melts if you try to open a folder with a few hundred thousand messages in it, but email delivery slows down because the filters appending email TO those large mbox files somehow scale with the size of the mbox file they're appending to, and having linux-kernel as a regular destination gets noticeably slow every 3 months or so, and email fetch CRAWLS after 9 months without reorganization. So I have to periodically do maintenance to keep thunderbird running by moving messages into yearly folders to fight off whatever memory eating N^2 nonsense is in thunderbird's algorithms (a name and an offset in an mbox file doesn't seem like THAT big a struct, but it is in C++). Thunderbird's "click then shift click to highlight a bunch of messages, right click move to other folder" plumbing ALSO scales badly with lots of messages (the "swap thrash" threshold is somewhere around 40k messages, which is much faster with an SSD but really not good for it, and the actual OOM killer kicks in somewhere in the 60k-90k range. There's something N^2 in their algorithm maybe? Yes the right click menu popping up even with 20k messages selected can take 30 seconds; it's a chore). But again, that's not FRESH stupid.
Thunderbird's search results window presents a list of messages but doesn't let me right click and DO anything to them. (I can double click one at a time to open in a new window, but not what I want here.) Instead I have to create a virtual "search subfolder", which has a pink icon and populates itself slowly as it re-performs the search (of each subfolder) each time you go into it, but otherwise seems to act as a regular folder. Fine. And after it had stopped adding messages, clicking on the last message in the list pegged the CPU for 45 seconds before it showed me its contents. FINE. So eventually I manage to highlight all the messages in the pink folder, right click and get a menu, tell it to "save as"... and the resulting destination pop-up doesn't give the option of making a new mbox file, it wants to save them as individual messages. Ok. So I give it an empty folder to do so in, and...
Here's the FRESH stupid: a thousand empty files show up at once, with no contents yet, 30 seconds later the contents fill out in the filesystem but I also get a pop-up saying "couldn't save file". Because it tried to open all the files it was writing IN PARALLEL and ran out of filehandles. (Or maybe loop to open them all, loop to write them all, loop to close them all? Why would anyone do that? The default ulimit -n is 1024, the default HARD ulimit is 4096 filehandles per process without requiring root access to increase. Don't DO that.)
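For reference, the limits it blew through are easy to check (a standalone illustration, nothing to do with Thunderbird's own code):

  // Prints the per-process filehandle caps mentioned above: typically
  // soft 1024 and hard 4096 on desktop Linux.
  #include <stdio.h>
  #include <sys/resource.h>

  int main(void)
  {
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl)) return 1;
    printf("soft %llu hard %llu\n", (unsigned long long)rl.rlim_cur,
      (unsigned long long)rl.rlim_max);

    return 0;
  }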
Remember how I said Mozilla was not a real open source development organization? They are BAD AT THIS. So is the Free Software Foundation, the Linux Foundation, and even Red Hat. Capitalism mixes badly with open source development, even when it's a nominal foundation claiming to shield developers from capitalism. Red Hat inflicted systemd on the world for profit (we're not allowed to opt out), the FSF zealots became as bad as what they fought, and the Linux Foundation and Mozilla did that weird 501c6 trade association thing (a for-profit nonprofit tax dodge) where they're endlessly fundraising to provide exclusive members-only benefits.
Fade is on the airplane (as is This Dog).
Sigh, I do not have time for kernel shenanigans, but I guess I need to make time. Grrr. (There's a maintained Debian port, Hellwig. Stop it. I didn't even post the bc removal patch to the list! I should do so, every release from here on...)
And Wednesday's question has been answered: the reason the middleman couldn't manage to pay my invoice THIS time is because "our policy is to pay invoices in arrears rather than in advance". Which is news to me because I was previously doing quarterly invoices (not risking TOO much of the money to a single transaction) and they paid Q4 in October. This time I invoiced for 2 quarters at once (hoping not to go through a multi-week debugging/negotiation process QUITE as often), and... Sigh. (They have literally one job. This is the fourth time it has not gone smoothly.)
The Executive Director of the middleman went on to suggest "if there is a strong reason for invoicing in advance, please do let us know, and we may be able to make specific arrangements for this — such a binding you to an agreement as a consultant."
I replied:
I invoiced for 2 quarters this time largely because each of the previous 3 invoices had some sort of multi-week issue. I honestly did not expect this one to go through smoothly either, but was at least hoping to deal with the problem less often. (I left a good chunk of the sponsorship money in there because I'm still not ENTIRELY convinced it won't vanish in transit again and maybe not come back this time.)
Now that I know the fourth roadblock is bureaucracy, let me know when and how much I'm allowed to invoice for to conform with your procedures, and I'll do that then. I'm assuming invoicing for Q1 would still be paying me in advance, so... March? (In previous quarters I got paid for the quarter we were in, but now that I'm aware of "policy" I'm assuming that no longer works either. Can I invoice for Q1 now and get paid March 1, or do I have to wait to submit and approve the invoice?)
As for whether I'm a flight risk, I've been working on https://github.com/landley/toybox/commits/master for 16 years (ala https://github.com/landley/toybox/commit/13bab2f09e91) which is longer than github's existed (hence https://landley.net/hg/toybox). Every commit in that repo was applied to the tree by me, and I personally authored... grep says 3642 of them. Even the current mailing list goes back to 2011 (http://lists.landley.net/pipermail/toybox-landley.net/) and dreamhost is terrible at mailing lists (https://landley.net/dreamhost.txt and https://landley.net/dreamhost2.txt and no I don't know where the threading info went back at http://lists.landley.net/pipermail/toybox-landley.net/2013-August/thread.html but after https://landley.net/toybox/#12-21-2015 and https://landley.net/toybox/#23-07-2022 and such I'm not asking).
Most of that time toybox was a hobby project I did around various day jobs (https://landley.net/resume.html). Google decided to use toybox in late 2014 and I kept working on it as a hobby for another 7 years. I am very grateful to them sponsoring me to focus on this project, and have said so publicly multiple times including in the release notes (https://landley.net/toybox/#12-08-2022). Disclosure: before the sponsorship I did get the Google Open Source Award twice, which came with a $200 gift card each time.
I suppose I could always get hit by a bus or have a stroke or something, but I'm not sure how signing a contract with you would affect that?
How WOULD the middleman perform oversight? Do they have any idea what success looks like? The only other guy who gets cc'd on this sort of thing is Elliott, and even I can't reliably find stuff like that again 6 months later. (Would they assign somebody to read my blog? Would that HELP?) Eh, KLOCs I suppose. Judge the value of a car by the weight of metal added to its construction...
While trying to google for a link writing the above, I noticed that lists.landley.net is no longer visible via google at all, and traced it to Dreamhost adding a robots.txt blocking... everything. I didn't change anything, and don't have ACCESS to change anything (remember: it's a shared server and they don't let me log in directly, everything happens through a web panel). I have opened a support ticket.
Oh goddess:
Subject: The robots.txt you put on lists.landley.net (which you won't let me log into) blocks google.
Hello Rob,
Thank you for contacting the DreamHost support team, I'm sorry you're having this issue, I will be happy to help. After checking your site under landley.net, I was not able to find the robots.txt you've mentioned, so to check the rules and offer you solutions. Have you deleted the file to prevent blocking Google crawling your site?
Please, have a look at our article on how to create a robots.txt file that is convenient for you https://help.dreamhost.com/hc/en-us/articles/216105077
I hope this troubleshooting and information was useful for you. Please, don't hesitate to contact back the support team in case you need it.
They didn't even read the TITLE of my support request, did they?
My reply:
> After checking your site under landley.net, I was not able to find the robots.txt you've mentioned,
Because lists.landley.net is not the same web server as landley.net. Your mailing lists run on a different (shared) server which I don't have direct access to, and which I can only interact with through your web panel.
Your server has been mildly broken for years, such as refusing to give a list of available mailing lists under https://lists.landley.net (which has been "temporarily disabled" for over a decade).
But sometime in the past year or so the robots.txt on lists.landley.net (which is not landley.net) changed, so that:
https://www.google.com/search?q=site%3Alists.landley.net
Says "no information available on this page", and when I click "learn why" under that it goes to:
https://support.google.com/webmasters/answer/7489871?hl=en#zippy=%2Cthis-is-my-site%2Cthe-page-is-blocked-by-robotstxt
> Have you deleted the file to prevent blocking Google crawling your site?
I would love to get access to lists.landley.net to fix stuff there, but the lack of that has been a persistent issue dealing with you for some years now:
https://landley.net/dreamhost.txt
https://landley.net/dreamhost2.txt
https://landley.net/toybox/#23-07-2022

I haven't even bothered to ask where the thread information for older months went:
http://lists.landley.net/pipermail/toybox-landley.net/2013-August/thread.html
(It used to be able to indent those, but not anymore.) But far and away the BIGGEST problem with lists.landley.net is you can't access it via https but only http, which means mailing list administration sends a plaintext password across the internet for every page load. (Because the Let's Encrypt certificate for landley.net isn't available to the shared lists.landley.net server.)
> Please, have a look at our article on how to create a robots.txt file that is convenient for you https://help.dreamhost.com/hc/en-us/articles/216105077
I know what a robots.txt file is. But I do not have access to change any of the files at https://lists.landley.net. I can only ssh into the server that provides landley.net (ala www.landley.net) because the different domain name resolves to a different host.
> I hope this troubleshooting and information was useful for you.
Not really. Here is the issue:
https://landley.net/robots.txt - 404 error
https://lists.landley.net/robots.txt - ERR_CONNECTION_REFUSED
http://lists.landley.net/robots.txt

User-agent: *
Disallow: /

Meaning Google cannot index the site. It USED to index the site, but it stopped sometime during the pandemic, because of you.
I hate having to explain people's own systems to them. It's embarrassing for both of us. I also dislike having to reenact the Dead Parrot Skit. ("If you want to get anything done in this country you've got to complain until you're blue in the mouth.") I feel there should be a better way, but I'm not good enough to find it.
Walked to the table for the first time in a while. (I was hanging out with Fade in the evenings, and mostly staying on a day schedule with her.) Pulled up the list of pending shell items... and then spent the evening editing old blog entries since new year's.
My blog plumbing (such as it is) has a slight year wrapping issue: I switch to a new filename for 2023. The rss feed generator takes the html file as input and emits the most recent 30 entries in rss format, using the stylized start of new day html lines to split entries. Which means if the December entries aren't appended to the new year's file they'll prematurely vanish from the RSS feed, but when I DO append them I keep forgetting to delete them and I think some previous years might STILL have the previous December at the end?
There's always a temptation to cheat and not edit/publish January for a week or two, so that when the RSS feed updates everybody's had plenty of time to notice the old stuff, and anybody new checking it won't see a questionably short list. Not that I need MORE incentive to procrastinate about a large time sink...
Fade flies home tomorrow, mostly spending time with her.
The ls.c stuff (fallout from this) is harder than it needs to be because ls --sort has a lot of long options (and I added a couple more). The current help text looks like:
-c ctime -r reverse -S size -t time -u atime -U none -X extension -? nocase -! dirfirst
And I went "well of course that should be comma separated values with fallback sorts, just like sort itself does!" and that's... tricksy. Each of those can be specified as a short option (which doesn't save the order it saw them in), and you can presumably mix short and long options, and I dowanna re-parse the list each time because that feels slow but I don't have a good format to put it in?
Eh, data format's not hard: array of char that's the offset of the flag value for the sort type. Break the comparison out into its own function and feed it either toys.optflags or 1<<sort[i] in a loop. If I ensure flag 0 isn't interesting (it's currently -w, not a sort option) then it's even a null terminated string (of mostly low ascii values, but still). But the design and user interface parts are still funky: the longopts would accumulate as fallback sorts and the single character sort flags should switch each other off? No, it's more complicated than that: you can do ls -ltr which is reversed mtime, so they DO chord at least sometimes... Actually, "reverse" is specifically weird. Sticky. It should ALWAYS go last because otherwise it has no effect.
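Something like this, probably (a sketch of the idea, NOT current ls.c code; compare_one() is a hypothetical helper that compares two entries using whichever sort flag bits are set in its last argument):

  // Sketch only: sort[] is the null terminated array of flag bit
  // positions accumulated from --sort (which is why flag 0 can't be a
  // sort option), so one comparison helper serves both the fallback
  // chain and the chorded short flags in toys.optflags.
  static char sort[16];

  static int compare(struct dirtree *a, struct dirtree *b)
  {
    int i, ret;

    for (i = 0; sort[i]; i++)
      if ((ret = compare_one(a, b, 1ULL<<sort[i]))) return ret;

    return compare_one(a, b, toys.optflags);
  }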
No, the really WEIRD chording is -cut with or without -l. (I was working on this before, I know I was, and I document COMPULSIVELY, but it's not in the blog. Is it on the mailing list? One of the myriad github pages? A commit comment? Did I make the mistake of typing it at someone in IRC and setting the little mental "it's been written up!" flag? Who knows...)
(Once upon a time the #uclibc channel on freenode, where all the busybox developers hung out back in the day, was logged to a web page, which I believe went down when the guy hosting it went through a bad divorce, and in any case freenode got bought by a korean billionaire who did to it what the muskrat is doing to twitter. I still sometimes think "written in irc" means "I can find it again later", but have mostly trained myself back out of that these days.)
Anyway, the issue is that the ls man page (and behavior) is nuts:
-c with -lt: sort by, and show, ctime (time of last modification of file status information); with -l: show ctime and sort by name; otherwise: sort by ctime, newest first
So -l disables -c's sorting behavior and you add -t to get it back. Same for -u. That's horrible historical nonsense and I need to make it work, but where does --sort ctime and --sort atime work into this mess?
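Spelling out what I think the quoted rule means (toybox-style C using the FLAG() macros; this is me paraphrasing the man page, NOT what's in ls.c):

  // Paraphrase of the man page rule above, not shipping code: -c always
  // changes which time -l DISPLAYS, but only controls the SORT when -l
  // is absent or -t drags it back in.
  static int sort_by_ctime(void)
  {
    if (!FLAG(c)) return 0;
    if (FLAG(l)) return FLAG(t); // -lc sorts by name, -lct by ctime

    return 1; // bare -c sorts by ctime, newest first
  }

Same shape for -u with atime, presumably.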
As always "how to implement" is easy and "what to implement" is hard.
Sigh: [-Cxm1][-Cxml][-Cxmo][-Cxmg][-cu][-ftS][-HL][-Nqb] is tangly. But lib/args.c hasn't got a [-Cxm[1log]] syntax and there haven't been other callers for it.
I have invoiced the middleman! Let's see how it fails to work THIS time.
Staring at the bugs shell fuzzing found. And ls.c. And the shell "read" command. And the shell "command" command, because command -v is where scripts/portability.sh barfs trying to run the test suite under toysh.
Not really making a lot of progress on any of them, but looking. Oh, and I should read that git format documentation...
Finally cut a toybox release. And then updated the date in news.html to the correct day AFTER cutting the release. (The tarball and tagged commit still say the 8th. Always something.)
I have reached the "revert and rip stuff out" stage of release prep: if it's not ready, punt it to later.
Disabled the date test in "make tests" again: not shipping a fresh toolchain this time. Put bc and gcc back in the airlock because not demanding people build a patched linux this time either. More FAQ updates. I can't get the ls.c work in this time...
Right wing loons are flipping out about a possible ban on gas stoves, which means Fuzzy has been involved in an argument online where somebody insisted it was impossible to make proper custard on induction, so we have a big pot of custard now. It's lovely. Peejee has a want. (Cat, YOU may be spry and feisty but your kidneys are 19.)
Peejee had custard.
Found a problem with "make sh_tests" where some of the early tests weren't testing the right shell. There's a context switch before which you can do "sh -c 'test' arguments", and then it switches to having all the tests run through sh -c _for_ you, to ensure it's all being tested in toysh rather than being "eval" under whichever shell the test suite is itself running in. (On Debian, bash. In Android's case, mksh.) You can manually wrap tests before that yourself, but I found a set of tests before the switch that weren't wrapped, and moved them after the switch so they happen in the proper context... and some fail. Now I've gotta fix unrelated stuff before running the test suite gets me back to seeing the failures I was debugging before. Progress of a sort, I suppose. But it puts us firmly into "punt this whole mess until AFTER next release" territory.
Oh hey, right after I tried to pivot AWAY from working on toysh, Eric Roshan-Eisner ran a fuzzer on toysh and found several ways to segfault it. Fixed some low hanging fruit, punting the rest until (say it with me) after the release.
I kiiiinda wanted to get "make test_sh" passing its test suite this release. Not happening, but I have multiple dirty sh.c forks I'm trying to check in and delete. The release is the time to finish unfinished things and clean up what you can.
The next "make test_sh" failure was a simple fix [editorial note: while trying to add the link to the blog I realized I'd checked it in to the WRONG TREE: pushed now] but the next thing that test tried to do is call the shell builtin "read"... which I haven't implemented yet. Taking a stab at it now, but there's a design problem: it does line reads but lets read -u substitute a different file descriptor. Hmmm...
Strings are hard, and that includes efficiently reading lines of input. This is why I had get_line() all those years: byte-at-a-time is slow and CPU intensive, but larger reads inevitably overshoot, and you can't ungetc() to an fd. (Well you _can_ but only to a seekable fd, which does not include pipes or tty devices.) This is why the ANSI/ISO C committee invented the FILE * back in the 1980s: somewhere to store a buffer of data you block read and save the extra for next time. But shells don't USE file pointers, they use file descriptors, both for redirect and when spawning child processes.
This isn't AS bad because pipe and tty devices return short reads with the data they've got, so when a typing human is providing input MOST of the time the computer will respond to the enter key before you press the next key. And piping between programs, each printf() turns into a separate write() system call which sends a batch of data through the pipe, and if the read() at the far end receives that data before more gets sent (and concatenated in the pipe buffer) then it hasn't read ahead part of the next line there either. But if you DO type fast (or something like "zcat file.gz | while read i" happens) then the read gets extra characters that go on the next line, but the read returns and the next read happens only knowing the file descriptor. (If you're wondering why you see "echo $X && sleep .1 && echo $Y && sleep .1" in some places in the test suite... generally this sort of thing. Even that's not ENTIRELY deterministic if the system's heavily loaded enough.)
This same problem would also screw up trying to provide input to a command, such as echo 'fdisk blah.img\nn\np\n1\n\n\nw' | sh because the FILE * stdin used to read the fdisk line will read ahead to the rest of the data in the input block, which is then NOT provided by file descriptor 0 to the fdisk child process, because it was already delivered and is waiting in an unrelated buffer. (I bothered the Posix and C99 guys about querying how many bytes of data are waiting in the buffer so I could fread() them back OUT and pass them along, about like my tar.c does when autodetecting compression type from pipe input. You read data and then need to DO something with it, can't go back into the file descriptor.)
(If you COULD unget data into a read-only file descriptor, that would be a security issue. Unix semantics generally do make sense when you dig far enough, because everybody's had 50 years to complain and usually would have fixed it by now if it was actually wrong.)
All this reminds me I'm ALREADY mixing FILE * and stdin in sh.c because get_next_line() takes a FILE * argument, but that's always been a placeholder function. I need to implement interactive line editing and command history, which I should tackle soon but it's not a _small_ can of worms to open and I want to get the shell more load bearing before putting a big neon "welcome" sign out front. But the interactive stuff should use the input batching trick I introduced to busybox vi years ago to figure out what is and isn't an ANSI escape sequence, which I already implemented in lib/tty.c scan_key_getsize() and is a STRONGER reliance on input batching. (It would be lovely if I could fcntl() a pipe to NOT collate data written to it in its buffers, but this seems to be another "optimization" I can't turn off. It would also make testing dd SO much easier if I could do that...) Anyway, scan_key_getsize() is always doing 1 byte reads to assemble the sequence from a file descriptor without overshooting, because "interactive keyboard stuff" really should not be a big CPU/battery drain on modern systems. (He says knowing that "top" is a giant cpu hog that really needs some sustained frowning to try to be less expensive. I dunno if it's all the /proc data parsing or the display utf8 fontmetrics or what, but something in there needs more work.)
I suppose I could try to lseek() on the input, and do block reads when I can and single byte reads if I can't? The problem is the slow path is the common case. I don't want zcat file | while read i to be an unnecessarily slow CPU hog, and the FAST way is using getline() through a FILE * (or writing my own equivalent; generally not an improvement). Which doesn't work for -u, and if I wrapped that in a FILE * where would I save the cache struct between "read" command calls? How do I know when it's done and I can free it? Can of worms. Redirect and FILE *stdin aren't going to play nice together, but what's the BETTER way?
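For concreteness, the probe idea would look something like this (a minimal sketch under my own assumptions, not sh.c code, with a fixed 4k line cap for brevity):

  // Sketch: block read when the fd can seek (the overshoot gets pushed
  // back with lseek), one byte at a time when it can't (pipes, ttys).
  #include <stdlib.h>
  #include <unistd.h>

  char *read_line_fd(int fd)
  {
    char *buf = malloc(4096);
    ssize_t len = 0, nl;

    if (!buf) return 0;
    if (lseek(fd, 0, SEEK_CUR) != -1) {
      // Seekable: read a block, seek back over anything past the \n.
      if ((len = read(fd, buf, 4095)) < 1) len = 0;
      for (nl = 0; nl < len; nl++) if (buf[nl] == '\n') break;
      if (nl++ < len) {
        lseek(fd, nl-len, SEEK_CUR);
        len = nl;
      }
    } else while (len < 4095) {
      // Unseekable: can't unget, so never read past the newline.
      if (read(fd, buf+len, 1) != 1) break;
      if (buf[len++] == '\n') break;
    }
    if (!len) return free(buf), (char *)0;
    buf[len] = 0;

    return buf;
  }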
Sigh, I'm not entirely sure what the corner cases here ARE. Coming up with test cases demonstrating it causing problems is a headache all its own. And some of those corner cases I'm pretty sure bash suffers from and their answer is 1/10th second sleeps.
I don't WANT two codepaths, one for stdin and one for -u other than 0. That's just creepy.
Elliott didn't like xmemcmp() so it's smemcmp() now. (Yeah, I know he's right, just... trying to get a release out!) Bugfix from somebody at google for sh.c (yay, people are looking at it). FAQ update...
The new dishwasher is not arriving today, supply chains are failing to supply, or possibly chain. New estimate is the 20th. Fuzzy is very tired of doing dishes by hand, so we have purchased paper plates, cups, bowls, and plastic utensils.
Day 2 of Actual Release Prep, which _seems_ like it's mostly just creating a big page of release notes but is actually "go through each commit since the last release and re-examine them", which is second pass review (more design than code per se) and a MASSIVE tangent generator. It always winds up taking multiple days to actually get a release out, and that's AFTER I've done a full mkroot build-and-test recently on the kernel version I plan to release with, using the toolchain I plan to release with. (I.E. no blocker bugs.)
The toolchain issue's a bit wonky this time, because llvm version skew broke my ability to rebuild the hexagon toolchain with the script that worked last time, but I need to rebuild musl to include the patch for the issue Rich refused to fix but which otherwise breaks the date test that runs on glibc and bionic but not musl. (I enabled the date and gunzip tests in "make tests" because they worked fine on glibc and bionic, and only after I'd checked it in realized that date was still disabled because the musl bug was a pending todo item.)
It's a one line fix to musl, but Rich won't do it because Purity or something, and after many years fruitlessly arguing with Rich I just put the simpler workarounds in my code, and otherwise patch musl in the toolchains I build. I wave a patch at Rich once so it's not MY fault when he says no, same general theory as linux-kernel. (Except there I tend to do second, third, even fourth versions when I get engagement. Less so when they're ignored.)
Speaking of which, my musl patches are still inline in scripts/mcm-buildall.sh but I've moved the kernel patches from scripts/mkroot.sh out to a separate repository. I should philosophically collate my patch design approach at some point, but I'm not holding up the release for it now.
In general broken out patches are better in case other projects want to pick them up. Squashfs was widely used out of tree for many years before lkml deigned to notice, and some of the Android stuff still is I think? Back on Aboriginal I had a patches dir with linux-*.patch in it. For this release I'm probably putting 0001-*.patch files for the kernel in the mkroot binaries release dir because "apply these to your arbitrary kernel version" seems easier than collating two otherwise unrelated kernel trees via git. But how much of that is what I'm used to vs what other people are used to? (I mean I HAVE a branch published on github, but have to redo it each release I do a musl thingy on and then can't delete the old ones if they're load bearing, which is non-collated cruft I dowanna encourage/accumulate. "Probably gonna delete this later" is not a good ongoing policy.)
Rant cut and pasted from the list to the blog:
I'm not a fan of over-optimizing compilers. My commodore 64 had a single 1mhz 8 bit processor with 38911 basic bytes free, and it was usable. I'm typing this on a quad processor 2.7 ghz 64 bit processor laptop with 16 gigabytes of ram, and this thing is completely obsolete (as in this model was discontinued 9 years ago: they were surplussed cheap and I bought four of them to have spares).
Performance improvements have come almost entirely from the hardware, not the compiler. The fanciest compiler in the world today, targeting a vintage Compaq Desqpro 386, would lose handily to tinycc on a first generation raspberry pi. Hardware doubled performance roughly annually (cpu was 18 months but memory and storage and stuff increased in parallel) and each major compiler rewrite would be what, 3% faster? The hardware upgrades seldom broke software (rowhammer and specdown meant the hardware didn't work as advertised, but that's an obvious bug we worked around, not "everything intentionally works different now, adjust your programs"). Every major gcc upgrade had some package it won't build right anymore and the gcc devs say we shouldn't EXPECT it to.
Part of this attitude is fallout from the compiler guys back around 1990 making such a big deal about the move from CISC to RISC needing instruction reordering and populating branch delay slots and only their HEROIC EFFORTS could make proper use of that hardware... and then we mostly stayed on CISC anyway (yes including arm) and the chips grew an instruction translation pipeline with reordering and branch prediction.
I'm aware this is a minority view and I'm "behind the times", but if I wanted the tools to be more clever than necessary I'd be up in javascript land writing code that runs in web browsers, or similar.
This difference of viewpoint between myself and people maintaining compilers in C++ keeps cropping up, and I have yet to see a convincing argument in favor of their side. They're going to break it ANYWAY.
I'm currently editing the December 28 blog entry about tidying up the html help text generator, and I realized a corner case I hadn't handled: nbd-client says "see nbd_client" which doesn't exist. (Public dash version vs private underscore version because C symbol generation.) Sigh. Ok, fix the help text generator AGAIN...
I keep nibbling at the release, but... time to start writing release notes. Ok, git log 0.8.8..HEAD and... there have been a few commits, haven't there? Lot to go through. But first, the hardest part: picking a Hitchhiker's quote I haven't already used.
Working to make ASAN=1 make test_sh pass, which is whack-a-mole. The address sanitizer's a net positive, but it's a close thing at times. (Gimme a stack trace of where the problem OCCURRED, not just where the closest hunk of memory was historically allocated.)
Refrigerator dude stopped by and vacuumed the refrigerator coils, which were COVERED with cat hair. Showed me how to do it myself, not that I own a vacuum cleaner. (Tile floors. Big flood back in 2014. The only carpet in the house these days is throw rugs we take outside and do a sort of bullfighter thing with.)
The outsourced washing machine guy called and said the symptoms I'm seeing on the model I have means the circuit board's almost certainly fried, probably had water leak onto it, which with labor is something like $800 to replace (Bosch is reliable, but not repairable), and getting a new dishwasher of the same model and having the old one hauled away is basically the same price, so there's not much point him coming out to look at it and billing us for not being able to help. (Professional repair dude, no shortage of work.) Thanked him and Fade ordered a new one from the people we got the replacement washer and dryer through, they think it'll be here Friday.
Yes, I am aware I did the refrigerator thing because "they're already coming" and then it was two separate servicebeings. There was meta-upsell there, apparently. Unlikely to use Sears again, which is convenient since they only barely seem to still exist as a kind of temp agency.
Finishing up pending toysh fixes. I was redoing math stuff and it's always more work to figure out what I was doing when I leave myself half-finished commits in the tree. I can see what the code there is doing, but have to work out what I MEANT it to do and examine how much I got done to figure out what I left out. The design work is as always the tricksy part: is there anything I didn't think of at the time, or thought of but didn't implement, or thought of then but am not remembering now? (There's no such thing as sufficiently copious design notes that do NOT include a finished implementation. Not that the implementation by itself is always sufficient either, which is why there's code comments and commit comments and blog entries...)
Not gonna manage to merge expr.c in with $((math)) this release, and I'm not 100% sure it's doable (well, worth doing) at all. And the big missing piece seems to be a floating point version of this same code. Python seems to do arbitrary precision math: 1<<256 and 1/1.7 resolve but 1.7<<256 is an error. Multiple codepaths!
Eh, punt for now. Close the tab, move on to the next...
The dishwasher died. As in the power button does nothing, acts like it's not plugged in but the outlet works when we plug in other stuff? (RIGHT at new year's. IS there such a thing as a Y2023 bug?)
Despite Sears having died years ago, Google Maps has a number for "sears appliance services" in hancock center. (In the middle of the parking lot on the map.) And when I called it I got... what's probably an indian call center, but sure. They have our file from when we bought the thing. And want $150 for a service technician to come look at it. Hmmm...
I'm not entirely sure how they upsold me on having my refrigerator serviced, but it was just an extra $50 and nobody's looked at it in ~5 years and I'd rather it didn't go out, so sure. Why not. (Fade thought we might as well, anyway. Long as the dude's already making the trip...)
Happy new year! The first in a while that isn't "the year of hindsight", "last year won", or "also that year". We are finally "2020 free", or at least experiencing a diet version thereof.
Grrr. The recent xmemcmp() changes in file.c left me with an open tab where I WANT to replace a bunch of memcmp(x, "string", 8) with #define MEMCMP(x, y) xmemcmp(x, y, sizeof(y)) so you don't need to specify the length, but unfortunately it won't quite work. Yes, sizeof() treats a string constant as an array and thus gives you the allocation size, including the null terminator. Some of the comparisons in file.c are checking the NULL terminator, and some aren't. Having two #defines for the two different cases pushes this out of net-positive mental complexity savings territory. Subtle enough the NEW thing becomes a sharp edge you can cut yourself on. The other is redundant/tedious but very explicit.
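Concretely, covering both kinds of caller takes a pair of defines, something like this (illustrative only, assuming the second argument is always a string constant; this is the version that did NOT go in):

  // sizeof("BBCD") is 5 because the NUL terminator counts, so one macro
  // can't serve both the callers that check the terminator and the ones
  // that don't. (MEMCMPN is a made-up name for the second case.)
  #define MEMCMP(x, y)  xmemcmp(x, y, sizeof(y))    // compares the NUL too
  #define MEMCMPN(x, y) xmemcmp(x, y, sizeof(y)-1)  // stops before the NUL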
While reviewing them I did find a memcmp(s+28, "BBCD\0", 5) so once again no review is ever COMPLETELY wasted...
Maybe I should rename _mkroot_ to "dorodango"...