Topic: mkroot walkthrough
https://landley.net/talks/mkroot-2023.txt
-------------------------------------------------------------------------------

Speaker: Rob Landley
  email: rob@landley.net
  blog: https://landley.net/notes.html
  mastodon: @landley@mstdn.jp
  patreon: https://www.patreon.com/landley

The toybox package contains a ~300 line bash script that cross-compiles
a Linux system from source code and boots it to a shell prompt on a dozen
QEMU targets, with a second script that automatically smoketests each image
(it boots and runs, and the clock, network, and block device all work)
to regression test new versions of Linux, qemu, and itself.

project source: https://github.com/landley/toybox (see mkroot/mkroot.sh)
documentation: https://landley.net/toybox/faq.html#mkroot

Note: https://github.com/landley/mkroot is OBSOLETE
  Merged into toybox 3 years ago, but nobody reads the notice @ top of README
  You want the mkroot directory in toybox.
-----------------------------------------------------------------------------

Talk sections:

  What is mkroot?
  download and run prebuilt binaries
  build from source code
  walkthrough of mkroot.sh

Introduction (view from orbit):

  what:
    tiny system builder (~300 line bash script)
    2 source packages (toybox+linux), boot to shell prompt under an emulator
      - if you really want busybox, there's a build-time option...
    Support for a dozen architectures (arm, mips, ppc...)
    Builds with host compiler (but glibc sucks), musl-cross-make, Android NDK
      - not hard to add others. I did an llvm hexagon toolchain...

  use binaries:
    apt-get install qemu-system
    wget https://landley.net/bin/mkroot/latest/m68k.tgz
    tar xvf m68k.tgz; cd m68k; ./run-qemu.sh

  build:
    grab linux source (kernel.org or github.com/torvalds/linux)
    git clone https://github.com/landley/toybox; cd toybox
    wget https://landley.net/bin/toolchains/latest/s390x-linux-musl-cross.tar.xz
    tar xvf s390x-*cross.tar.xz
    mkroot/mkroot.sh CROSS_COMPILE=$PWD/s390x*/bin/s390x-linux-musl- \
      LINUX=~/linux
    cd root/s390x; ./run-qemu.sh

    ccc/ and CROSS=
    without LINUX= just build root filesystem, no cpio.gz, kernel or run-qemu.sh
    without CROSS= builds for host (ok on alpine/musl. SUCKS on glibc.)

  source:
    https://github.com/landley/toybox/blob/master/mkroot/mkroot.sh
    three sections:
      setup
      create root filesystem
      build and package kernel
    mkroot/packages
    testroot.sh, tar-for-web.sh, record-commands

------------------------------------------------------------------------------

What is mkroot?

  - tiny system builder
    - creating filesystems you can chroot into, and bootable system images
    - supports a dozen-ish architectures x86,arm,powerpc,mips,superh,s390x...
      - big/little, 32/64, some nommu in there
  - Part of toybox
    - "mkroot" subdir of https://github.com/landley/toybox
    - https://landley.net/toybox/faq.html#mkroot
  - ~300 line bash script
    - mkroot/mkroot.sh is 309 lines, does not use any other file in mkroot dir
    - optional additional build packages that can be listed on the command line
      - but you don't have to, can ignore that. Expandable but not required.
    - testroot.sh (automated smoketests), tar-for-web.sh (package for website)

Why is mkroot?

  "Perfection is achieved, not when there is nothing more to add,
  but when there is nothing left to take away." - Antoine de Saint-Exupéry

  My goal is to build the simplest Linux system capable of rebuilding itself
  under itself from source code, and then building a full Linux distro under
  the result. "Minimal native development environment" for Linux.

  I already did this once (https://landley.net/aboriginal/about.html)
    - Rebuilt itself with 7 source packages
      - busybox, linux, uclibc, gcc, binutils, make, bash
    - It built Linux From Scratch 6.8 under the result.
    - That's why I was busybox maintainer for a while.
    - That work made Alpine Linux possible.
    - That project is obsolete, ancient package versions.

  I left busybox and now do toybox. I'm targeting Android (AOSP) instead of LFS.
  Android merged toybox in 2015: https://lwn.net/Articles/629362/

  I am attempting to make Android self-hosting. Rebuild AOSP on stock phone OS
    - Plug phone into USB, add keyboard+mouse+HDMI (or chromecast)
    - Android is cross-compiled from PC because of software, not hardware
    - having Android phone should be enough to become a first class developer
      - fix need to buy a PC to do "real programming", second class citizen

  Links to backstory:
    https://landley.net/aboriginal/about.html
    http://lists.landley.net/pipermail/toybox-landley.net/2020-July/011898.html
    https://landley.net/toybox/about.html
    2013 talk https://www.youtube.com/watch?v=SGmtP5Lg_t0
    2019 talk https://www.youtube.com/watch?v=MkJkyMuBm3g#t=1m18s
    https://landley.net/aboriginal/history.html

download and run prebuilt binaries

  - Do you have qemu?
      apt-get install qemu-system (generally subtly broken in a bunch of ways)
      or git clone https://gitlab.com/qemu-project/qemu and a lot of arguing

  - Prebuilt binary system images:
      wget https://landley.net/bin/mkroot/latest/
      tar xvzf, cd, ./run-qemu.sh, ls -l /dev, head /proc/cpuinfo, exit

  - ls -l
      linux-kernel - bootable kernel, configured for qemu
      initramfs.cpio.gz - initramfs archive
      run-qemu.sh - script to invoke qemu on the other two files

  - ls -l docs/
      cat docs/README
      KARGS='one=two three=four' ./run-qemu.sh -hda docs/linux-fullconfig
        cat /proc/cmdline; env
        ls /dev; head /dev/sda
      ./run-qemu.sh -m 512; head /proc/meminfo
      ./run-qemu.sh --help shows qemu help

  What else is in docs?
    wc docs/* | sort -n
      README - above
      linux-fullconfig - kernel .config file (2000 lines)
      linux-miniconfig - concise kernel .config (70 lines)
      linux-microconfig - what build uses (3 lines, 640 bytes)
        expanded by (more or less): sed -E 's/([^,]*)($|,)/CONFIG_\1=y\n/g'

mkroot/testroot.sh automation harness

  - Example automation harness. (Advanced "using the system images".)
    - Emulator boots, emulated system performs action, and then exits.
    - test serial console, run hello world, block device, clock, network
      - catches regressions in toybox, emulator, compiler, C library...
  - tricks
    - httpd -> wget
    - mksquashfs, /dev/?da -> /mnt/init (mksquashfs advantage: multiple mounts)
  - test toybox+mksquashfs in path, create scratch dir, write control script
    that emulator will run instead of shell prompt, package with mksquashfs,
    run host httpd with index.html and check, rest is basically:
      for i in root/*/linux-kernel; do do_test $i; done
    - if that file exists, the build completed successfully.

  do_test()
    - get $ARCH from arg, two qemu bug workarounds (lock+SIGTTOU),
      timeout 10 ./run-qemu.sh -hda file.sqf > log
      note V=1 and V=2 output

  loop:
    - command line args vs all targets that built?
    - parallelism magic $(nproc), $COUNT, wait -n with "."
      output
    Report pass, failed, and didn't build

  - 3 different automation approaches
    - run code at boot time -hda and /mnt/init
      - or package and /etc/rc (but initramfs size limits)
    - "expect" style stdin/stdout driver as child process
      - some targets truncate large input at serial buffer size (QEMU bug)
    - Adding "dropbear" package to build lets you ssh into image

Building from source:

  Need three things: 1) toybox source, 2) cross compiler, 3) linux source

    mkroot/mkroot.sh CROSS=sh4 LINUX=~/linux

  host build

    git clone https://github.com/landley/toybox, run mkroot/mkroot.sh (no args)
    - this makes a chroot, not a bootable system image, /init mounts /proc etc
      - unshare chroot root/host/fs /init
    - Ok on musl (alpine, etc) but sucks on glibc, because glibc is broken
      - note the zillion warnings. getaddrinfo() means DNS lookups won't work,
        breaking ping/wget. getpwuid() means ls -l can't show usernames
        breaking ls -l
      - uses dlopen() to load .so at runtime, EVEN WHEN STATICALLY LINKED
        - This CAN'T work: https://www.openwall.com/lists/musl/2012/12/08/4
        - Wind up with two instances of the heap pointer (static, libc.so)
          and if you malloc() from one and free() from the other...
      - Ulrich Drepper hated static linking, intentionally sabotaged it
        - yes, intentionally. google "ulrich drepper static linking harmful"
      - Hello World is 700k
      - To work around this, they invented flatpak/snap, copy a chroot
    - make it work?
      - copy all host toolchain libraries from debian = 1.7G
        - copy from toolchain, not host (see mkroot/packages/dynamic)
          - work with cross compiling
        - squashable to ~400M
      - call "ldd" to copy libraries?
        libraries depend on libraries so you have to do it recursively
        - https://git.busybox.net/busybox/tree/testsuite/testing.sh#n121
        - except that was before invention of dynamic linker scripts
          - /usr/lib/*/libm.so was a text file
      - problem: libraries you dlopen() at runtime aren't visible to ldd
      - haven't yet bothered to code a robust workaround for intentional sabotage

  Cross compile

    - get cross compilers (gcc+musl)
      - prebuilt binary
          wget https://landley.net/bin/toolchains/latest/
            $ARCH-cross.tar.xz (also native.sqf) or ccc.txz
      - or build cross compilers from source (takes a while)
          git clone https://github.com/richfelker/musl-cross-make
          cd into mcm and run ~/toybox/scripts/mcm-buildall.sh
            - 247 lines, not a long read.
            - shout out to magic escape sequence, changes title bar
          wait a very long time
            - builds both cross _and_ native compilers.
      - You can also use Android NDK (llvm+bionic), but that's advanced
          https://landley.net/toybox/faq.html#cross3
      - docs at https://landley.net/toybox/faq.html#cross and #targets
        - also #cross2 and #cross3
    - Build with cross compiler
      - mkroot/mkroot.sh CROSS_COMPILE=/path/to/m68k-linux-musl-
        - note: prefix ends with "-", build appends cc/ld/nm as necessary.
      - mkroot/mkroot.sh CROSS=m68k
        - searches in ccc/ for appropriate cross compiler
        - magic target "all" and "allnonstop" iterates through ccc/*-cross
        - CROSS=help lists available cross compilers under ccc/
      - now DNS lookups work! Static linking doesn't add multiple megabytes!
        far fewer warnings about unsupported features... (Modulo alpine linux.)
    - This is still just a chroot
      - qemu has 2 modes: qemu-$ARCH (app) and qemu-system-$ARCH (sys)
      - distro qemu install sets up binfmt_misc to call qemu application
        emulation, and transparently run non-native ELF binaries.
          file root/m68k/fs/bin/toybox
          root/m68k/fs/bin/toybox
      - application emulation not hugely load bearing though.
          $ root/m68k/fs/bin/uname -m
          m68k
          $ root/m68k/fs/bin/head /proc/cpuinfo
        - it's actually a lot more robust to emulate hardware than translate
          syscalls.
          - hardware has a couple I/O ports and DMA/ring buffers.
            - big VM has maybe a dozen interesting pieces of virtual hardware
          - system calls
            - pass big structures around with dozens of fields
            - there are hundreds of them, translation is non-obvious
            - what they DO is opaque (ala read from /proc/cpuinfo)
            - weird corner cases
              - mmap() on a system with a different huge page size
        - mips binary really wants a mips kernel: use system emulation.

  3) Linux source

    - Vanilla linux should work, but I have patches to simplify it a lot.
    - Add LINUX=/path to command line, ala LINUX=~/linux
      - CROSS= specifies architecture, autodetects if none specified (uname -m)
        - provides a kernel config
        - writes a run-qemu.sh script for target
        - packages "fs" into cpio.gz archive for initramfs (external by default)
    - Actual system image running in a VM.
      - Can do real hardware too (sh2eb is an example)
        - but installing it and configuring bootloaders are up to you

------------------------------------------------------------------------------

mkroot/mkroot.sh walkthrough:

  Three parts: 1) setup 2) build chroot dir 3) build kernel+package it up

  Having it all in one file is a design choice.
    - I could factor "init" out into separate file.
    - and then kernel build could be a package.
    - but "boot to a shell prompt" is one coherent idea.
  - style thing: I make fairly extensive use of && and || for error handling
    - "test && stuff" also shorter alternative to "if test; then stuff; fi"
    - grouped error handling: thing && thing && thing || exit 1

  Variables set from command line
    PKG           - list of extra packages to build this time (set indirectly)
    CROSS_COMPILE - cross compiler prefix, with optional path (or blank)
    CROSS         - short name of the architecture we're building for (or "host")
    LINUX         - path to kernel source (if unset, just build root filesystem)
    KEXTRA        - additional kernel config symbols to set (CSV)
    PENDING       - additional toybox commands to enable (not in defconfig)

  Input directories
    CCC    - $PWD/ccc             used by $CROSS
    PKGDIR - $PWD/mkroot/packages used by $PKG (build scripts live here)

  Output directories:
    TOP    - $PWD/root         output directory
    BUILD  - $TOP/build        all non-shipping build output: logs, source...
    TEMP   - $BUILD/$CROSS-tmp Target-specific work dir (cp -s source snapshot)
    LOG    - $BUILD/log        log file pairs (stdout+stderr, all commands run)
    OUTPUT - $TOP/$CROSS       Where results of build get installed (ship this)
    ROOT   - $OUTPUT/fs        New root filesystem (chrootable)
    OUTDOC - $OUTPUT/docs      README plus saved kernel config in 3 formats

  $PATH wrappers:
    AIRLOCK - $BUILD/airlock           sanitized $PATH for rest of build: toybox+cc
    WRAPDIR - $BUILD/record-commands   logs each command line before calling next
    LOGPATH - $LOG/$CROSS-commands.txt where $WRAPDIR appends each command line to

  Set to disable stuff:
    NOCLEAR   - Don't clear inherited environment variables
    NOAIRLOCK - Don't replace $PATH with toybox and symlinks to toolchain
    NOLOG     - Don't log stdout/stderr
    NOLOGPATH - Don't wrap $PATH with record-commands

  Things the kernel if/else staircase sets for each recognized architecture:
    QEMU KARCH KARGS VMLINUX KCONF DTB BUILTIN KERNEL_CONFIG
      - describe these in part 3 (kernel build)

  Set by mkroot/root/$PACKAGES
    QEMU_MORE - extra stuff to add to run-qemu.sh (see $PKGDIR/dropbear)
    MODULES   - which kernel modules to build (see $PKGDIR/tests)
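
  A rough sketch of how the output directory variables listed above nest
  inside each other. This only follows the listing, it is not copied from
  mkroot.sh, and the real script may differ in detail. CROSS=m68k is just
  an example value, and nothing is created on disk.

```shell
#!/bin/sh
# Sketch only: derive the paths the way the variable listing describes them.
# CROSS=m68k is an example target, not a default of the real script.
CROSS=m68k

TOP="$PWD/root"                     # output directory
BUILD="$TOP/build"                  # non-shipping build output: logs, source...
TEMP="$BUILD/$CROSS-tmp"            # target-specific work dir
LOG="$BUILD/log"                    # log files
OUTPUT="$TOP/$CROSS"                # results of the build (ship this)
ROOT="$OUTPUT/fs"                   # new root filesystem (chrootable)
OUTDOC="$OUTPUT/docs"               # README plus saved kernel configs
AIRLOCK="$BUILD/airlock"            # sanitized $PATH
WRAPDIR="$BUILD/record-commands"    # command-logging wrappers
LOGPATH="$LOG/$CROSS-commands.txt"  # where the wrappers append

# Print a few of the derived paths instead of creating anything.
printf '%s\n' "TEMP=$TEMP" "ROOT=$ROOT" "OUTDOC=$OUTDOC" "LOGPATH=$LOGPATH"
```

  Everything you would ship lives under $OUTPUT, everything disposable
  lives under $BUILD, which is why the two trees are kept apart.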

1) Setup:

  Fiddle with environment variables
    Clear environment.
      - pass through HOME PATH LINUX CROSS CROSS_COMPILE
    Handle any command line arguments
      - "NAME=value" or "packagename"
    set path variables
    define some convenience shell functions
      - announce: change terminal title bar, and write to log with greppable ===
      - die: exit with error message

  What compiler are we using?
    Did they set $CROSS_COMPILE?
      - Set $CROSS to short name (between last / and first -)
    Else did they set $CROSS?
      - Handle magic names "all" and "allnonstop" (call ourselves in loop)
      - else set $CROSS_COMPILE from $CROSS using wildcards to check $CCC dir
      - else list available targets. (Complain if $CCC isn't there.)
    Else default $CROSS to "host"

  Set output paths with $CROSS in them (couldn't before)

  Verify compiler can build static binaries

  Build airlock (NOAIRLOCK=1 to skip)
    Optional step, Google calls this a "hermetic" build environment.
    Resulting system built under the same commands it has in it.
    Halfway to rebuild-under-itself, also insulated a bit from host variation
    This work's actually done by scripts/install.sh (start on line 94)
      - Build and install host version of defconfig toybox into $AIRLOCK dir
      - Symlink in 10 commands toybox doesn't yet provide ($PENDING)
        - expr git tr bash sh gzip awk bison flex make
      - Symlink in 6 toolchain binaries kernel needs ($TOOLCHAIN)
        - I have kernel patches to reduce this to 4
          - replace "bc" script with C code
          - use generic "cc" name instead of "gcc" (bonus: autodetects llvm)
      - loop handles fallback directories to support distcc and ccache
        - also historical ubuntu versions that wrapped gcc with a perl script

  mkdir -p work directories
    TEMP is set here (not overrideable) because NOT rm -rf user supplied data
    ROOT is set to a default value here because we only rm -rf if it was blank

  Wrap $PATH with logpath (NOLOGPATH=1 to skip)
    Optional step: append each command line called out of the $PATH to log file
    mkroot/record-commands does the work
      uses $WRAPDIR and $LOGPATH
      sourced so it can adjust $PATH
    Note adjusting CROSS_COMPILE from abs path to just prefix out of $PATH
      so those calls also get logged

  Start logging stdout/stderr (NOLOG=1 to skip)

2) Create root filesystem

  Directory layout: mkdir, chmod, ln -s

  Write init script (as HERE document, chmod +x at the end)
    mount /dev /dev/shm /dev/pts /proc /sys
      - don't need tmpfs in initramfs, already are
    enable ping (kernel bug workaround)
    If we are PID 1 (QEMU)
      bring up net (ifconfig, route add default gw, static QEMU config)
      ntp clock if board doesn't emulate battery backed up clock
      run package init scripts (/etc/rc/* in sort order)
        source them: so they can export vars, exec to grab PID 1, etc
      quiet printk
      run oneit (simple init program, runs one child, shutdown when it exits)
        - PID 1 is special, controlling TTY easier in child process
    If not PID 1 (chroot)
      run shell as child process
      unmount everything and exit

  Write /etc/resolv.conf with google's 8.8.8.8
  Write /etc/passwd, /etc/group with 3 users/groups: root, guest, nobody

  If we have packages in $PKG (from command line):
    insert package "plumbing" at the start of list
      sets $DOWNLOAD dir to $PWD/root_download
      defines 3 functions:
        download: wget source tarball and compare hash (keeps existing file)
        setupfor: extract tarball and cd into directory
        cleanup: rm -rf source directory from setupfor
    source each package script in order
      not child processes so they can edit vars, ala $QEMU_MORE
      pushd/popd around it so we return to $TOP whatever setupfor did

  Build static toybox and install into $ROOT
    - adds "sh" and "route" to defconfig, two commands that need more work
      (so aren't enabled in defconfig) but mkroot can't work without
      - working on it
    - also $PENDING variable so you can test more commands from pending
      - space separated rather than CSV, yeah that's a rough edge...
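
  The "write init script as a HERE document" step from earlier in this
  section might look roughly like the sketch below. It is heavily
  simplified and is not the init script mkroot.sh actually writes: the
  network, ntp, ping and printk steps are left out, the mount arguments
  are assumptions, and ROOT is a scratch temp dir rather than $OUTPUT/fs.

```shell
#!/bin/sh
# Sketch only: write a minimal /init into a throwaway root directory.
ROOT="$(mktemp -d)" || exit 1

# Quoted 'EOF' so nothing inside is expanded while writing the file.
cat > "$ROOT/init" << 'EOF' &&
#!/bin/sh

# Mount the virtual filesystems (errors ignored if already mounted).
mount -t devtmpfs dev /dev 2>/dev/null
mkdir -p /dev/shm /dev/pts
mount -t tmpfs shm /dev/shm 2>/dev/null
mount -t devpts pts /dev/pts 2>/dev/null
mount -t proc proc /proc 2>/dev/null
mount -t sysfs sys /sys 2>/dev/null

if [ $$ -eq 1 ]; then
  # PID 1 (booted under QEMU): source package init scripts in sort order
  # so they can export variables or exec to take over PID 1.
  for i in /etc/rc/*; do [ -e "$i" ] && . "$i"; done
  # oneit is the toybox mini-init: runs one child, shuts down when it exits.
  exec oneit /bin/sh
else
  # Not PID 1 (chroot): run a shell as a child, then clean up the mounts.
  /bin/sh
  umount /sys /proc /dev/pts /dev/shm /dev
fi
EOF
chmod +x "$ROOT/init" || exit 1

echo "wrote $ROOT/init"
```

  The same file serves both cases (boot under QEMU, or chroot from the
  host) because the only thing it branches on is whether it is PID 1.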

3) build kernel, run-qemu.sh and initramfs.cpio.gz

  - The kernel devs break stuff all the time, each new version needs debugging

  - initramfs is a tmpfs instance with an archive extracted into it at boot time
    https://kernel.org/doc/Documentation/filesystems/ramfs-rootfs-initramfs.txt
    - can be statically linked into kernel or supplied externally via bootloader
    - ramfs != ramdisk
      - ramfs resizes dynamically, deleting files frees memory, writes allocate
        it's basically mounting the page cache as a filesystem
        - tmpfs is fancy ramfs with bounds checking and ability to flush to swap
        - devtmpfs is a subclassed tmpfs that populates itself with /dev nodes
      - ramdisk is fixed size block of memory which emulates a block device
        - you then format it and mount with second driver (ext2, vfat, etc)
        - deleting files does NOT free memory, fixed size preallocated
        - files ALSO copied into disk cache, so two copies of data eat memory
        - ramdisk basically obsolete now we have ramfs and loopback mounts
    - initramfs inherited a lot of old initrd plumbing

  - kernel command line
    https://kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt
    - console= never defaults to serial console output, so needs console=ttyS0
    - panic=1 If kernel panics, reboot after X seconds
      - qemu --no-reboot turns reboot into "exit the emulator".
        otherwise QEMU hangs on panic because that's what hardware would do
    - HOST=$KARGS unrecognized keyword=value sets env variables for pid 1
      - So I can include the target type in shell prompt (eventually)

  function csv2cfg - convert comma separated values to CONFIG_$BLAH lines
    if VAR=VAL output it, else VAR=y

  if no $LINUX, exit immediately

  if $CROSS=host CROSS=$(uname -m)

  if/else staircase for each $CROSS type
    QEMU - chunk of qemu command line for ./run-qemu.sh
    KARCH - kernel build needs ARCH=$KARCH to select arch/hexagon etc.
    KARGS - kernel command line arguments (embedded in console=%s)
    VMLINUX - path to kernel image in the source dir after build
    KCONF - kernel configuration symbols in microconf format (CSV)
    DTB - path to device tree binary in source dir after build
    BUILTIN - if set, statically link cpio.gz into kernel instead of external
    KERNEL_CONFIG - extra string to append verbatim to fullconfig
      - can't currently represent CONFIG_NAME="string,with,commas"
    else die "unknown $CROSS"

  Write out run-qemu.sh
    - real hardware won't have one. See sh2eb for example. J-core turtle board.

  Build kernel:
    cp -sR trick, creates directory of symlinks to original source
      consumes inodes but not space, shares disk cache among parallel builds
      however, kernel "make clean" won't delete symlinks, so tree must be clean
      cheat: cd to source dir, make distclean there, then cp -sR

    Write linux-miniconfig into $OUTDOC
      Only mkdir $OUTDOC when stuff to put in it. (No kernel build, no dir.)
      run csv2cfg on:
        1) symbols common to all architectures
        2) $KCONF for this arch
        3) $KEXTRA (set from cmdline and/or packages)
        4) $MODULES (sets to =m instead of =y)
          - gotta KEXTRA+="MODULES,MODULE_UNLOAD" (see "tests" package)

    Expand to create linux-fullconfig... then edit out some stupid stuff.
      Switch _on_ CONFIG_EXPERT in order to be able to switch OFF some stuff.
      Can't have a CONFIG_STANDARD with a bunch of "selects", oh no...

    if $BUILTIN set, append CONFIG_INITRAMFS_SOURCE="$OUTPUT/fs"
      - kernel will archive it up itself

    compile and install kernel
      make ARCH=$KARCH CROSS_COMPILE="$CROSS_COMPILE" -j $(nproc) all
      Copy $DTB and $VMLINUX to $OUTPUT

    Archive up modules
      - The modules go in the root filesystem.
      - You can rebuild the root filesystem without rebuilding the kernel.
      - This is a chicken-and-egg problem.
      - The cpio format allows you to append archives together and they
        extract as one continuous archive
        - not all cpio implementations support this, but kernel+toybox do
        - (Turns out that TRAILER!!!
          entry just flushes the hardlink cache)
      - so archive up modules directory, then append it to initramfs.cpio.gz

    If $BUILTIN not set, create external initramfs.cpio.gz
      find . | cpio -o -H newc -R +0:+0 | gzip > initramfs.cpio.gz
      - Described in ramfs-rootfs-initramfs above
      - the find gives correct relative paths from top of filesystem
      - cpio -o = output, -H newc is modern format (only one kernel supports),
        -R +0:+0 = force root:root ownership
      - Notice it extracts that modules.cpio.gz and runs it through gzip again.
        So two cpio archives within same gzip session.

  If we got this far, rename log.n to log.y to mark success
    - and delete "$TEMP" dir if empty

mkroot/packages

  - plumbing - download, setupfor, cleanup
  - dropbear - big package build with dependencies (ssh/sshd with zlib support)
    downloads+builds 2 source packages, edits /etc/shadow, adds /etc/rc/dropbear,
    adds $QEMU_MORE, adds ./ssh2dropbear.sh
  - overlay - copies directory into root filesystem, easy way to add more files
    mkroot/mkroot.sh CROSS=armv7l overlay OVERLAY=$PWD/potato
  - dynamic - start of dynamic build (package that copies shared libraries)
    alas, can of worms (about 1/2 hr just to explain the problems)
  - busybox - build busybox and add to the $PATH before/after/instead of toybox
    - slightly bit-rotted but easy enough to polish up again
  - tests - in-progress, working to run the toybox test suite under mkroot
    - lots of tests require root access or specially configured system
    - mkroot/testroot.sh could do a lot more than a simple smoketest

mkroot/tar-for-web.sh
  - fairly straightforward. Adds two README files.
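
Appendix sketch: csv2cfg

  Going back to the csv2cfg helper from part 3 (comma separated values in,
  CONFIG_ lines out, VAR=VAL passed through, bare VAR becomes VAR=y): a
  minimal sketch of that behavior in plain shell. This is an assumption
  about how it could be written, not the function from mkroot.sh, and the
  symbol names in the example calls are only there for illustration.

```shell
#!/bin/sh
# Sketch only: expand "A,B=2" style microconfig into CONFIG_ lines.
# The body runs in a subshell ( ... ) so the IFS change stays local.
csv2cfg() (
  set -f          # don't glob-expand the split words
  IFS=,
  for sym in $1; do
    [ -z "$sym" ] && continue         # skip empty fields like "A,,B"
    case "$sym" in
      *=*) echo "CONFIG_$sym" ;;      # VAR=VAL: output as given
      *)   echo "CONFIG_$sym=y" ;;    # bare VAR: switch it on
    esac
  done
)

# Illustrative symbols, not the list the build really uses.
csv2cfg 'PANIC_TIMEOUT=1,NO_HZ_IDLE'

# Modules want =m instead of =y, which a trailing sed can handle.
csv2cfg 'SQUASHFS,VFAT_FS' | sed 's/=y$/=m/'
```

  Like the notes say, a value containing commas can't be represented this
  way, which is what the separate KERNEL_CONFIG string is for.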