From 35dccebea56fac58131890f30b4d377aafad5839 Mon Sep 17 00:00:00 2001
From: Rob Landley
Date: Sat, 24 Oct 2020 05:58:24 -0500
Subject: Just the FAQ's, ma'am.

---
 www/faq.html | 628 ++++++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 452 insertions(+), 176 deletions(-)

diff --git a/www/faq.html b/www/faq.html
index 3072b80e..6704cc5d 100755
--- a/www/faq.html
+++ b/www/faq.html
@@ -6,11 +6,18 @@

General Questions

Using toybox

@@ -19,42 +26,32 @@
  • How do I install toybox?

  • How do I cross compile toybox?

  • -
  • Where does toybox fit into the Linux/Android -ecosystem?
    -(I.E. What part of the operating system does toybox provide, -and what does it depend on?)

  • +
  • What part of Linux/Android does toybox provide?

  • How do I build a working Linux system with toybox?

  • - -

    Q: Do you capitalize toybox?

    - -

    A: Only at the start of a sentence. The command name is all lower case so -it seems silly to capitalize the project name, but not capitalizing the -start of sentences is awkward, so... compromise. (It is _not_ "ToyBox".)

    - -
    -

    Q: "Why is there toybox? What was wrong with busybox?"

    +

    Q: "Why is there toybox? What was wrong with busybox?"

    -

    A: Toybox started back in 2006 when I +

    A: Toybox started back in 2006 when I (Rob Landley) handed off BusyBox maintainership and started over from scratch on a new codebase after a protracted licensing argument took all the fun out of working on BusyBox.

    Toybox was just a personal project until it got -relaunched -in November 2011 with a new goal to -make Android -self-hosting. This involved me relicensing my own -code, which made people who had -never used or participated in the project loudly angry. The switch came +relaunched +in November 2011 with a new goal to make Android +self-hosting. +This involved me relicensing my own +code, which made people who had never used or participated in the project +loudly angry. The switch came after a lot of thinking about licenses and the transition to smartphones, which led to a -2013 -talk laying -out a strategy to make Android self-hosting using toybox. This helped +2013 talk laying +out a +strategy +to make Android self-hosting using toybox. This helped bring it to Android's attention, and they merged it into Android M.

    @@ -66,25 +63,34 @@ out the GPLv2 baby with the GPLv3 bathwater. Android explicitly discourages use of GPL and LGPL licenses in its products, and has gradually reimplemented historical GPL components such as its bluetooth stack under the -Apache license. Similarly, Apple froze xcode at the last GPLv2 releases -(GCC 4.2.1 with binutils 2.17) for over 5 years while it sponsored the +Apache license. Apple's even +more pronounced response was to freeze xcode at the last GPLv2 releases +(GCC 4.2.1 with binutils 2.17) for over 5 years while sponsoring the development of new projects (clang/llvm/lld) to replace them, -implemented its SMB server from scratch to replace samba, -and so -on. Toybox itself exists because somebody in a legacy position +implementing a +new SMB server from scratch to +replace samba, +replacing bash with zsh, and so on. +Toybox itself exists because somebody in a legacy position just wouldn't shut up about GPLv3, otherwise I would probably still happily be maintaining BusyBox. (For more on how I wound up working on busybox in the first place, see here.)

    -

    Q: Why a 7 year support horizon?

    +

    Q: Do you capitalize toybox?

    + +

    A: Only at the start of a sentence. The command name is all lower case so +it seems silly to capitalize the project name, but not capitalizing the +start of sentences is awkward, so... compromise. (It is _not_ "ToyBox".)

    + +

    Q: Why a 7 year support horizon?

    A: Our longstanding rule of thumb is to try to run and build on hardware and distributions released up to 7 years ago, and feel ok dropping support for stuff older than that. (This is a little longer than Ubuntu's Long Term Support, but not by much.)

    -

    My original theory was "4 to 5 18-month cycles of moore's law should cover +

    My original theory was "4 to 5 of the 18-month cycles of moore's law should cover the vast majority of the installed base of PC hardware", loosely based on some research I did back in 2003 and updated in 2006 @@ -93,59 +99,62 @@ law below the high end systems, and that another 2-3 iterations should cover the useful lifetime of most systems no longer being sold but still in use and potentially being upgraded to new software releases.

    -

    It turns out I missed -industry changes in the 1990's that stretched the gap -from low end to high end from 2 cycles to 4 cycles, and _that_ analysis -ignored the switch from PC to smartphone cutting off the R&D air supply of the -laptop market. Meanwhile the Moore's Law s-curve started bending -down in 2000 and these days is pretty flat because the drive for faster clock +

    That analysis missed industry +changes in the 1990's that stretched the gap +from low end to high end from 2 cycles to 4 cycles, and ignored +the switch from PC to smartphone cutting off the R&D air supply of the +laptop market. Meanwhile the Moore's Law s-curve started bending back down (as they +always do) +back in 2000, and these days is pretty flat: the drive for faster clock speeds stumbled -then died, and -the subsequent drive to go wide maxed out around 4x SMP with ~2 megabyte -caches for most applications. These days the switch from exponential to +and died, with +the subsequent drive to go "wide" maxing out for most applications +around 4x SMP with maybe 2 megabyte caches. These days the switch from exponential to linear growth in hardware capabilities is -common -knowledge.

    +common knowledge and +widely +accepted.

    But the 7 year rule of thumb stuck around anyway: if a kernel or libc feature is less than 7 years old, I try to have a build-time configure test -for it and let the functionality cleanly drop out. I also keep old Ubuntu -images around in VMs and perform the occasional defconfig build there to +for it to let the functionality cleanly drop out. I also keep old Ubuntu +images around in VMs to perform the occasional defconfig build there to see what breaks. (I'm not perfect about this, but I accept bug reports.)

    -

    Q: Why time based releases?

    +

    Q: Why time based releases?

    A: Toybox targets quarterly releases (a similar schedule to the Linux -kernel) because Martin Michlmayr's -talk on the -subject was convincing.

    +kernel) because Martin Michlmayr's excellent +talk on the +subject was convincing. This is actually two questions, "why have +releases" and "why schedule them".

    Releases provide synchronization points where the developers certify "it worked for me". Each release is a known version with predictable behavior, and right or wrong at least everyone should be seeing -similar results where you might be able to google an unexpected outcome. +similar results so you might be able to google an unexpected outcome. Releases focus end-user testing on specific versions where issues can be reproduced, diagnosed, and fixed. Releases also force the developers to do periodic tidying, packaging, documentation review, finish up partially implemented features languishing in their private trees, and give regular checkpoints to measure progress.

    -

    Over time feature sets change, data formats change, control knobs change... -For example toybox's switch from "ls -q" to "ls -b" as the default output -format wasn't exactly a bug, it was a design improvement... but the +

    Changes accumulate over time: different feature sets, data formats, +control knobs... Toybox's switch from "ls -q" to "ls -b" as the default output +format was not-a-bug-it's-a "design improvement", but the difference is academic if the change breaks somebody's script. -Releases give you the option to schedule upgrades later, and not to rock -the boat just now: just use a known working release version.

    +Releases give you the option to schedule upgrades as maintenance, not to rock +the boat just now, and use a known working release version until later.

    The counter-argument is that "continuous integration" can be made robust with sufficient automated testing. But like the -waterfall method, this places insufficient +waterfall method, this places insufficient emphasis on end-user feedback and learning from real world experience. Developer testing is either testing that the code does what the developers -expect given expected inputs running in an expected environment, or it's +expect given known inputs running in an established environment, or it's regression testing against bugs previously found in the field. No plan survives contact with the enemy, and technology always breaks once it -leaves the lab and encounters real world data and use cases, not just -at runtime but in different build environments.

    +leaves the lab and encounters real world data and use cases in new +runtime and build environments.

    The best way to give new users a reasonable first experience is to point them at specific stable versions where development quiesced and @@ -153,10 +162,16 @@ extra testing occurred. There will still be teething troubles, but multiple people experiencing the _same_ teething troubles can potentially help each other out.

    -

    As for why releases on a schedule are better than releases "when it's -ready", watch the video.

    +

    Releases on a schedule are better than releases "when it's ready" for +the same reason a regularly scheduled bus beats one that leaves when it's +"full enough": the schedule lets its users make plans. Even if the bus leaves +empty you know when the next one arrives so missing this one isn't a disaster, +and starting the engine to leave doesn't provoke a last-minute rush of nearby +not-quite-ready passengers racing to catch it causing further delay and +repeated start/stop cycles as it ALMOST leaves. +(The video in the first paragraph goes into much greater detail.)

    -

    Q: Where do I start understanding the source code?

    +

    Q: Where do I start understanding the source code?

    A: Toybox is written in C. There are longer writeups of the design ideas and a code walkthrough, @@ -168,31 +183,33 @@ accomplish, but here's a quick start:

    make; make install". Type "make help" to see available make targets.

    -

    The configure stage is copied from the Linux kernel (in the "kconfig" +

    The configure stage is copied from the Linux kernel (in the "kconfig" directory), and saves your selections in the file ".config" at the top -level. The "defconfig" target selects the +level. The "make defconfig" target selects the maximum sane configuration (enabling all the commands and features that -aren't unfinished, only intended as examples, debug code, etc) and is -probably what you want. You can use "make menuconfig" to manually select +aren't unfinished, or only intended as examples, or debug code...) and is +probably what you want. You can use "make menuconfig" to manually select specific commands to include, through an interactive menu (cursor up and down, enter to descend into a sub-menu, space to select an entry, ? to see an entry's help text, esc to exit). The menuconfig help text is the -same as the command's --help output.

    +same as the command's "--help" output.

    -

    The "make" stage creates a toybox binary (which is stripped, look in -generated/unstripped for the debug versions), and "install" adds a bunch of +

    The "make" stage creates a toybox binary (which is stripped, look in +generated/unstripped for the debug versions), and "make install" adds a bunch of symlinks to toybox under the various command names. Toybox determines which command to run based on the filename, or you can use the "toybox" name in which case the first -argument is the command to run (ala "toybox ls -l"). You can also build -individual commands as standalone executables, ala "make sed cat ls". -(The "make change" target builds all of them, as in "change for a $20".)

    +argument is the command to run (ala "toybox ls -l").

    + +

    You can also build +individual commands as standalone executables, ala "make sed cat ls". +The "make change" target builds all of them, as in "change for a $20".

    -

    The main() function is in main.c at the top level, +

    The main() function is in main.c at the top level, along with setup plumbing and selecting which command to run this time. The function toybox_main() in the same file implements the "toybox" multiplexer command that lists and selects the other commands.

    -

    The individual command implementations are under "toys", and are grouped +

    The individual command implementations are under "toys", and are grouped into categories (mostly based on which standard they come from, posix, lsb, android...) The "pending" directory contains unfinished commands, and the "examples" directory contains example code that aren't really useful commands. @@ -201,16 +218,16 @@ are _not_ selected by defconfig. (Most of the files in the pending directory are third party submissions that have not yet undergone proper code review.)

    -

    Common infrastructure shared between commands is under "lib". Most +

    Common infrastructure shared between commands is under "lib". Most commands call lib/args.c to parse their command line arguments before calling the command's own main() function, which uses the option string in the command's NEWTOY() macro. This is similar to the libc function getopt(), but more powerful, and is documented at the top of lib/args.c. A NULL option string prevents this code from being called for that command.

    -

    Most of the actual build/install infrastructure is shell scripts under -"scripts" (starting with scripts/make.sh and scripts/install.sh). -These populate the "generated" directory with headers +

    The build/install infrastructure is shell scripts under +"scripts" (starting with scripts/make.sh and scripts/install.sh). +These populate the "generated" directory with headers created from other files, which are described in the code walkthrough. All the build's temporary files live under generated, including the .o files built @@ -218,8 +235,8 @@ from the .c files (in generated/obj). The "make clean" target deletes that directory. ("make distclean" also deletes your .config and deletes the kconfig binaries that process .config.)

    -

    Each command's .c file contains all the information for that command, so -adding a command to toybox means adding a single file under "toys". +

    Each command's .c file contains all the information for that command, so +adding a command to toybox means adding a single file under "toys". Usually you start a new command by copying an existing command file to a new filename (toys/examples/hello.c, toys/examples/skeleton.c, toys/posix/cat.c, @@ -229,53 +246,294 @@ new filename), and modifying the help text, argument string, and what the code does. You might have to "make distclean" before your new command shows up in defconfig or menuconfig.

    -

    The toybox test suite lives in the "tests" directory, and is +

    The toybox test suite lives in the "tests" directory, and is driven by scripts/test.sh and scripts/runtest.sh. From the top level you can "make tests" to test everything, or "make test_sed" to test a single command's standalone version (which should behave identically, -but that's why we test). You can set TEST_HOST=1 to test the host versionn +but that's why we test). You can set TEST_HOST=1 to test the host version instead of the toybox version (in theory they should work the same), and VERBOSE=1 to see diffs of the expected and actual output when a -test fails (VERBOSE=fail to stop at the first such failure)

    - - -

    Q: How do I install toybox?

    +test fails. Set VERBOSE=fail to stop at the first such failure.

    + +

    Q: When were historical toybox versions released?

    + +

    A: For vanilla releases, check the +date on the commit tag +or the +example binaries against the output of "toybox --version". +Between releases the --version +information is in "git describe --tags" format with "tag-count-hash" showing the +most recent commit tag, the number of commits since that tag, and +the hash of the current commit.
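    As a rough illustration of the "tag-count-hash" layout described above, the sketch below splits such a string into its three fields using plain shell parameter expansion. The sample version string is made up, and the function name split_version is hypothetical. Note that "git describe" puts a "g" in front of the abbreviated hash, and that a build sitting exactly on a release tag reports only the tag, which this sketch does not handle.

```shell
# Split a "tag-count-hash" string (git describe --tags style) into its parts.
# Assumes all three fields are present; the sample input below is invented.
split_version() {
  v="$1"
  hash="${v##*-}"      # text after the last dash (git prefixes it with "g")
  rest="${v%-*}"       # everything before the last dash
  count="${rest##*-}"  # commits since the tag
  tag="${rest%-*}"     # what remains is the tag, which may itself contain dashes
  echo "tag=$tag count=$count hash=$hash"
}

split_version "0.8.3-45-g35dccebe"
```

    Splitting from the right keeps working when the tag itself contains dashes, since only the last two dash-separated fields are peeled off.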

    + +

    Android makes its own releases on its own +schedule +using its own version tags, but lists corresponding upstream toybox release +versions here. For more detail you can look up +AOSP's +git tags. (The Android Open Source Project is the "upstream" android vendors +start from when making their own releases. Google's phones run AOSP versions +verbatim, other vendors tend to take those releases as starting points to +modify.)

    + +

    If you want to find the vanilla toybox commit corresponding to an AOSP +toybox version, find the most recent commit in the android log that isn't from a +@google or @android address and search for it in the vanilla commit log. +(The timestamp should match but the hash will differ, +because each git hash includes the previous +git hash in the data used to generate it so all later commits have a different +hash if any of the tree's history differs; yes Linus Torvalds published 3 years +before Satoshi Nakamoto.) Once you've identified the vanilla commit's hash, +"git describe --tags $HASH" in the vanilla tree should give you the --version +info for that one.

    + +

    Q: Where do I report bugs?

    + +

    A: Ideally on the mailing list, although emailing the +maintainer is a popular if slightly less reliable alternative. +Issues submitted to github +are generally dealt with less promptly, but mostly get done eventually. +AOSP has its own bug reporting mechanism (although for toybox they usually forward them +to the mailing list) and Android vendors usually forward them to AOSP which +forwards them to the list.

    + +

    Note that if we can't reproduce a bug, we probably can't fix it. +Not only does this mean providing enough information for us to see the +behavior ourselves, but ideally doing so in a reasonably current version. +The older it is the greater the chance somebody else found and fixed it +already, so the more out of date the version you're reporting a bug against +the less effort we're going to put into reproducing the problem.

    + +

    Q: What are those /b/number bug report +links in the git log?

    + +

    A: It's a Google thing. Replace /b/$NUMBER with +https://issuetracker.google.com/$NUMBER to read it outside the googleplex.
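    The substitution above can be applied mechanically to commit text. This is only a sketch: the function name rewrite_bug_links is hypothetical, and the pattern assumes the links appear literally as /b/NUMBER preceded by whitespace, punctuation, or the start of a line.

```shell
# Rewrite /b/NUMBER references on stdin into issuetracker URLs.
# The leading group avoids touching paths such as "a/b/123" or "usr/b/123".
rewrite_bug_links() {
  sed -E 's#(^|[^[:alnum:]/.])/b/([0-9]+)#\1https://issuetracker.google.com/\2#g'
}

echo "Fix for /b/123 and (/b/456)" | rewrite_bug_links
```

    Something like "git log | rewrite_bug_links" would then show the log with readable links.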

    + +

    Q: What is the relationship between toybox and android?

    + +

    A: The about page tries to explain that, +and Linux Weekly News has covered toybox's history a +little +over +the +years.

    + +

    Toybox is a traditional open source project created and maintained +by hobbyist (volunteer) developers, originally for Linux but these days +also running on Android, BSD, and MacOS. The project started in 2006 +and its original author (Rob Landley) +continues to maintain the open source project.

    + +

    Android's base OS maintainer (Elliott Hughes, I.E. enh) +ported +toybox +to Android in 2014, merged it into Android M (Marshmallow), and remains +Android's toybox maintainer. (He explained it in his own words in +this podcast, starting either 18 or 20 minutes in depending how +much backstory you want.)

    + +

    Android's policy for toybox development is to push patches to the +open source project (submitting them via the mailing list) then +"git pull" the public tree into Android's tree. To avoid merge conflicts, Android's +tree doesn't change any of the existing toybox files but instead adds parallel +build infrastructure off to one side. (Toybox uses a make wrapper around bash +scripts, AOSP builds with soong/ninja instead and checks in a snapshot of the +generated/ directory to avoid running kconfig each build). +Android's changes to toybox going into the open source tree first +and being pulled from there into Android keeps the two trees in +sync, and makes sure each change undergoes full open source design review +and discussion.

    + +

    Rob acknowledges Android is by far the largest userbase for the project, +but develops on a standard 64-bit Linux+glibc distro while building embedded +32-bit big-endian nommu musl systems requiring proper data alignment for work, +and is not a Google employee so does not have access +to the Google build cluster of powerful machines capable of running the full +AOSP build in a reasonable amount of time. Rob is working to get android +building under android (the list of +toybox tools Android's build uses is +here, +and what else it needs from its build environment is +here), and he hopes someday to not only make a usable development +environment out of it but also nudge the base OS towards a more granular +package management system allowing you to upgrade things like toybox without +a complete reinstall and reboot, plus the introduction of a "posix container" +(within which you can not only run builds, but selinux lets you run binaries +you've just built). In the meantime, Rob tests static bionic +builds via the Android NDK when he remembers, but has limited time to work +on toybox because it's not his day job. (The products his company makes ship +toybox and they do sponsor the project's development, but it's one of many +responsibilities at work.)

    + +

    Elliott is the Android base OS maintainer, in which role he manages +a team of engineers. He also has limited time for toybox, both because it's one +of many packages he's responsible for (he maintains bionic, used to maintain +dalvik...) and because he allowed himself to be promoted into management +and thus spends less time coding than he does sitting in meetings where testers +talk to security people about vendor issues.

    + +

    Android has many other coders and security people who submit the occasional +toybox patch, but of the last 1000 commits at the time +of writing this FAQ entry, Elliott submitted 276 and all other google.com +or android.com addresses combined totaled 17. (Rob submitted 591, leaving +116 from other sources, but for both Rob and Elliott there's a lot of "somebody +else pointed out an issue, and then we wrote a patch". A lot of patches +from both "Author:" lines thank someone else for the suggestion in the +commit comment.)

    + +

    Q: Will you backport fixes to old versions?

    + +

    A: Probably not. The easiest thing to do is get your issue fixed upstream +in the current release, then get the newest version of the +project built and running in the old environment.

    + +

    Backporting fixes generally isn't something open source projects run by +volunteer developers do because the goal of the project's development community +is to extend and improve the project. We're happy to respond to our users' +needs, but if you're coming to us for free tech support we're going +to ask you to upgrade to a current version before we try to diagnose your +problem.

    + +

    The volunteers are happy to fix any bugs you point out in the current +versions because doing so helps everybody and makes the project better. We +want to make the current version work for you. But diagnosing, debugging, and +backporting fixes to old versions doesn't help anybody but you, so isn't +something we do for free. The cost of volunteer tech support is using a +reasonably current version of the project.

    + +

    If you're using an old version built with an old +compiler on an old OS (kernel and libc), there's a fairly large chance +whatever problem you're +seeing already got fixed, and to get that fix all you have to do is upgrade +to a newer version. Diagnosing a problem that wasn't our bug means we spent +time that only helps you, without improving the project. +If you don't at least _try_ a current version, you're asking us for free +personalized tech support.

    + +

    Reproducing bugs in current versions also makes our job easier. +The further back in time +you are, the more work it is for us digging back in the history to figure +out what we hadn't done yet in your version. If you spot a problem in a git +build pulled 3 days ago, it's obvious what changed and easy to fix or back out. +If you ask about the current release version 3 months after it came out, +we may have to think a while to remember what we did and there are a number of +possible culprits, but it's still tractable. If you ask about 3 year old +code, we have to reconstruct the history and the problem could be anything: +there's a lot more ground to cover and we haven't seen it in a while.

    + +

    As a rule of thumb, volunteers will generally answer polite questions about +a given version for about three years after its release before it's so old +we don't remember the answer off the top of our head. And if you want us to +put any _effort_ into tracking it down, we want you to put in a little effort +of your own by confirming it's still a problem with the current version +(I.E. we didn't fix it already). It's +also hard for us to fix a problem of yours if we can't reproduce it because +we don't have any systems running an environment that old.

    + +

    If you don't want to upgrade, you have the complete source code and thus +the ability to fix it yourself, or can hire a consultant to do it for you. If +you got your version from a vendor who still supports the older version, they +can help you. But there are limits as to what volunteers will feel obliged to +do for you.

    + +

    Commercial companies have different incentives. Your OS vendor, or +hardware vendor for preinstalled systems, may have their own bug reporting +mechanism and update channel providing backported fixes. And a paid consultant +will happily set up a special environment just to reproduce your problem.

    + +

    Q: How do I install toybox?

    A: Multicall binaries like toybox behave differently based on the filename -used to call them, so if you "mv toybox ls; ./ls" it acts like ls. Creating +used to call them, so if you "mv toybox ls; ./ls -l" it acts like ls. Creating symlinks or hardlinks and adding them to the $PATH lets you run the -commands normally by name.

    +commands normally by name, so that's probably what you want to do.

    If you already have a toybox binary -you can install a tree of command symlinks living in +you can install a tree of command symlinks to the standard path locations (export PATH=/bin:/usr/bin:/sbin:/usr/sbin) by doing:

    for i in $(/bin/toybox --long); do ln -s /bin/toybox $i; done

    -

    or you can install all the symlinks in the same directory as the toybox binary +

    Or you can install all the symlinks in the same directory as the toybox binary (export PATH="$PWD:$PATH") via:

    for i in $(./toybox); do ln -s toybox $i; done

    When building from source, use the "make install" and "make install_flat" -targets with an appropriate PREFIX=/path/to/new/directory either -exported or on the make command line -(as mentioned in "make help" output).

    - - -

    Q: How do I cross compile toybox?

    +targets with an appropriate PREFIX=/target/path either +exported or on the make command line. When cross compiling, +"make list" outputs the command names enabled by defconfig. +For more information, see "make help".

    + +

    The command name "toybox" takes its first argument as the name of the +command to run, so "./toybox ls -l" also behaves like ls. The "toybox" +name is special in that it can have a suffix (toybox-i686 or toybox-1.2.3) +and still be recognized, so you can have multiple versions of toybox in the +same directory.

    + +

    When toybox doesn't recognize its +filename as a command, it dereferences one +level of symlink. So if your script needs "gsed" you can "ln -s sed gsed", +then when you run "gsed" toybox knows how to be "sed".
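    The name-based dispatch described in this answer, including the one level of symlink dereference, can be imitated with a small shell script. This is a toy stand-in and not how toybox itself is implemented: the script name "multi" and the commands "hello", "bye", and "ghello" are all hypothetical.

```shell
# Build a toy "multicall" script in a scratch directory.
dir="$(mktemp -d)"
cat > "$dir/multi" <<'EOF'
#!/bin/sh
name="$(basename "$0")"
# Unknown name: follow one level of symlink and try the target's name instead.
case "$name" in
  hello|bye) ;;
  *) if [ -L "$0" ]; then name="$(basename "$(readlink "$0")")"; fi ;;
esac
# Pick the behavior from the (possibly dereferenced) name.
case "$name" in
  hello) echo "hello command" ;;
  bye)   echo "bye command" ;;
  *)     echo "unknown command: $name" ;;
esac
EOF
chmod +x "$dir/multi"

# Install command names as symlinks, then alias one of them.
ln -s multi "$dir/hello"
ln -s multi "$dir/bye"
ln -s hello "$dir/ghello"

"$dir/hello"
"$dir/bye"
"$dir/ghello"
```

    Here "ghello" is not a name the script knows, so it looks at what the symlink points to and behaves as "hello", mirroring the "ln -s sed gsed" case above.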

    + +

    Q: What's this ./ on the front of commands in your examples?

    + +

    A: When you don't give a path to a command's executable file, +Linux command shells search the directories listed in the $PATH environment +variable (in order), which usually doesn't include the current directory +for security reasons. The +magic name "." indicates the current directory (the same way ".." means +the parent directory and starting with "/" means the root directory) +so "./file" gives a path to the executable file, and thus runs a command +out of the current directory where just typing "file" won't find it. +For historical reasons PATH is colon-separated, and treats an +empty entry (including leading/trailing colon) as "check the current +directory", so if you WANT to add the current directory to PATH you +can PATH="$PATH:" but doing so is a TERRIBLE idea.
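    A quick way to see the difference for yourself is sketched below. It works in a throwaway directory, the command name faq_demo_cmd is made up so it won't collide with anything real, and PATH is pinned to /usr/bin:/bin inside a subshell so the result doesn't depend on your own settings.

```shell
# Work in a scratch directory so nothing on the real system is touched.
dir="$(mktemp -d)"
cd "$dir" || exit 1

# Create a tiny executable with a name unlikely to exist anywhere in $PATH.
printf '#!/bin/sh\necho "ran from current directory"\n' > faq_demo_cmd
chmod +x faq_demo_cmd

# No path given: the shell searches $PATH, which here lacks the current directory.
(PATH=/usr/bin:/bin; faq_demo_cmd) 2>/dev/null || echo "not found via PATH"

# Explicit path given: no $PATH search happens, the file is run directly.
./faq_demo_cmd
```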

    + +

    Toybox's shell (toysh) checks for built-in commands before looking at the +$PATH (using the standard "bash builtin" logic just with lots more builtins), +so "ls" doesn't have to exist in your filesystem for toybox to find it. When +you give a path to a command the shell won't run the built-in version +but will run the file at that location. (But the multiplexer command +won't: "toybox /bin/ls" runs the built-in ls, you can't point it at an +arbitrary file out of the filesystem and have it run that. You could +"toybox nice /bin/ls" though.)

    + +

    Q: How do I make individual/standalone toybox command binaries?

    + +

    A: After running the configure step (generally "make defconfig") +you can "make list" to see available command names you can use as build +targets to build just that command +(ala "make sed"). Commands built this way do not contain a multiplexer and +don't care what the command filename is.

    + +

    The "make change" target (as in change for a $20) builds every command +standalone (in the "change" subdirectory). Note that this is collectively +about 10 times as large as the multiplexer version, both in disk space and +runtime memory. (Even more when statically linked.)

    + +

    Q: How do I cross compile toybox?

    + +

    A: You need a compiler "toolchain" capable of producing binaries that +run on your target. A toolchain is an +integrated suite of compiler, assembler, and linker, plus the standard +headers and +libraries necessary to build C programs. (And a few miscellaneous binaries like +nm and objdump.)

    -

    A: toybox is tested against three C libraries (bionic, musl, glibc) -with 2 compilers (llvm, gcc). The easy way to get coverage (if not every -combination) is:

    +

    Toybox is tested against two compilers (llvm, gcc) and three C libraries +(bionic, musl, glibc) in the following combinations:

    -

    1) gcc+glibc = host toolchain

    +

    1) gcc+glibc = host toolchain

    -

    Most Linux distros come with that as a host compiler, just build normally +

    Most Linux distros come with that as a host compiler, which is used by +default when you build normally (make distclean defconfig toybox, or make menuconfig followed by make).

    @@ -285,17 +543,19 @@ zillion linker warnings because one of its previous maintainers
    was insane (which meant at the time he refused to fix obvious bugs), plus it uses dlopen() at runtime to implement basic things like -DNS lookup (which is impossible +DNS lookup (which is almost impossible to support properly from a static binary because you wind up with two instances of malloc() managing two heaps which corrupt as soon as a malloc() -from one is free()d into the other), although glibc added +from one is free()d into the other, although glibc added improper support which still requires the shared libraries to be installed on the system alongside the static binary: -in brief, avoid.) -These days glibc is maintained by a committee instead of a single -maintainer, if you consider that an improvement.

    +in brief, avoid). +These days glibc is maintained +by a committee instead of a single +maintainer, if that's an improvement. (As with Windows and +Cobol, most people deal with it and get on with their lives.)

    -

    2) gcc+musl = musl-cross-make +

    2) gcc+musl = musl-cross-make

    The cross compilers I test this with are built from the musl-libc maintainer's @@ -329,13 +589,16 @@ export "PATH=~/musl-cross-make/ccc/m68k-linux-musl-cross/bin:$PATH"
    LDFLAGS=--static make distclean defconfig toybox CROSS=m68k-linux-musl-

    -

    Note: a non-static build won't run unless you install musl on your host. +

    Note: these examples use static linking because a dynamic musl binary +won't run on your host unless you install musl's libc.so into the system +libraries (which is an accident waiting to happen adding a second C library +to most glibc linux distributions) or play with $LD_LIBRARY_PATH. In theory you could "make root" a dynamic root filesystem with musl by copying the shared libraries out of the toolchain, but I haven't bothered implementing -that yet because a static linked musl hello world is 10k on x86 +that in mkroot yet because a static linked musl hello world is 10k on x86 (5k if stripped).

    -

    3) llvm+bionic = Android NDK

    +

    3) llvm+bionic = Android NDK

    The Android Native Development Kit provides an llvm toolchain with the bionic @@ -353,96 +616,110 @@ make distclean make LDFLAGS=--static CROSS_COMPILE=x86_64-linux-android- defconfig toybox -

    Again, you need a static link unless you want to install bionic on your +

    Again, you need to static link unless you want to install bionic on your host. Binaries statically linked against bionic are almost as big as with -glibc, but at least it doesn't have the dlopen() issues.

    - -

    Unfortunately, although the resulting toybox will run a bionic-based -chroot will not, because even "hello world" statically linked -against bionic will segfault before calling main() if /dev/null isn't -present, and the init script written by mkroot.sh has to run a shell linked -against bionic in order to mount devtmpfs on /dev to provide /dev/null.

    - - -

    Q: Where does toybox fit into the Linux/Android ecosystem?
    -(I.E. What part -of the operating system does toybox provide, and what does it depend on?)

    - -

    A: Toybox is a set of standard Linux command line -utilities, so that three packages (a Linux kernel, C library, and toybox) -provide a complete bootable unix-style command line system. Toybox provides a command -shell and over a hundred different commands to call from that command shell.

    - -

    Toybox is not a complete operating system, it's a program that runs under -an operating system. Booting a simple system to a shell prompt requires -an kernel (such as Linux, or BSD with a Linux emulation layer) -to drive the hardware, one or more programs for the system to run (toybox), -and a C library ("libc") to connect them together (toybox has been tested with -musl, uClibc, glibc, and bionic).

    - -

    The C library is delivered as part of a "toolchain", which is an integrated -suite of compiler, assembler, and linker, plus the standard headers and -libraries necessary to build C programs. (And miscellaneous binaries like -nm and objdump.)

    +glibc, but at least it doesn't have the dlopen() issues. (You still can't +sanely use dlopen() from a static binary, but bionic doesn't use dlopen() +internally to implement basic features.)

    + +

    Note: although the resulting toybox will run in a standard +Linux system, even "hello world" +statically linked against bionic segfaults before calling main() +when /dev/null isn't present. This presents mkroot with a chicken and +egg problem for both chroot and qemu cases, because mkroot's init script +has to mount devtmpfs on /dev to provide /dev/null before the shell binary +can run mkroot's init script. +Since mkroot runs as a normal user, we can't "mknod dev/null" at build +time to create a "null" device in the filesystem we're packaging up so +initramfs doesn't start with an empty /dev, and the +kernel +developers +repeatedly +rejected a patch to +make the Linux kernel honor DEVTMPFS_MOUNT in initramfs. Teaching toybox +cpio to accept synthetic filesystem metadata, +presumably in get_init_cpio format, remains a todo item.

    + +

    Q: What part of Linux/Android does toybox provide?

    -

    Static linking (with the --static option) copies the shared library contents -into the program, resulting in larger but more portable programs, which -can run even if they're the only file in the filesystem. Otherwise, -the "dynamically" linked programs require the shared library files to be -present on the target system, either copied from the toolchain or built -again from source (with potential version skew if they don't match the toolchain -versions exactly). See -"man ldd", -"man ld.so", -and "man libc" for details.

    +

    A: +Toybox is one of three packages (linux, libc, command line) which together provide a bootable unix-style command line operating system. +Toybox provides the "command line" part, with a +bash compatible +command line interpreter +and over two hundred commands +to call from it, as documented in +posix, +the Linux Standard Base, and the +Linux Manual +Pages.

    + +

    Toybox is not by itself a complete operating system, it's a set of standard command line utilities that run in an operating system. +Booting a simple system to a shell prompt requires a kernel to drive the hardware (such as Linux, or BSD with a Linux emulation layer), programs for the system to run (such as toybox's commands), and a C library ("libc") to connect them together.

    Toybox has a policy of requiring no external dependencies other than the -C library for defconfig builds. You can optionally enable support for +kernel and C library (at least for defconfig builds). You can optionally enable support for additional libraries in menuconfig (such as openssl, zlib, or selinux), but toybox either provides its own built-in versions of such functionality (which the libraries provide larger, more complex, often assembly optimized alternatives to), or allows things like selinux support to cleanly drop out.

    -

    Most embedded systems will add a fourth package to the kernel/libc/cmdline -above containing dedicated "application" that the embedded system exists to +

    Static linking (with the --static option) copies library contents +into the resulting binary, creating larger but more portable programs which +can run even if they're the only file in the filesystem. Otherwise, +the "dynamically" linked programs require each shared library file to be +present on the target system, either copied out of the toolchain or built +again from source (with potential version skew if they don't match the toolchain +versions exactly), plus a dynamic linker executable installed at a specific +absolute path. See the +ldd, +ld.so, +and libc +man pages for details.

    + +

    Most embedded systems will add another package to the kernel/libc/cmdline +above containing the dedicated "application" that the embedded system exists to run, plus any other packages that application depends on. -Build systems will add a native version of the toolchain packages so -they can build additional software on the resulting system. Desktop systems -will add a GUI and additional application packages like web browsers -and video players. A linux distro like Debian would add hundreds of packages. +Build systems add a native version of the toolchain packages so +they can compile additional software on the resulting system. Desktop systems +add a GUI and additional application packages like web browsers +and video players. A linux distro like Debian adds hundreds of packages. Android adds around a thousand.

    But all of these systems conceptually sit on a common three-package -"kernel/libc/cmdline" base, and toybox aims to provide a simple, reproducible, +"kernel/libc/cmdline" base (often inefficiently implemented and broken up +into more packages), and toybox aims to provide a simple, reproducible, auditable version of the cmdline portion of that base.

    - -

    Q: How do you build a working Linux system with toybox?

    +

    Q: How do you build a working Linux system with toybox?

    -

    A: Toybox has a built-in system builder, which has a Makefile target. -To build a native root filesystem you can chroot into, -"make root" then "sudo chroot -root/host/fs /init" to enter it. Type "exit" to get back out.

    +

    A: Toybox has a built-in system builder, with the Makefile target "make +root". To enter the resulting root filesystem, "sudo chroot +root/host/fs /init". Type "exit" to get back out.

    -

    You can also cross compile simple three package (toybox+libc+linux) -systems that boot to a shell prompt under qemu, +

    You can cross compile simple three package (toybox+libc+linux) +systems configured to boot to a shell prompt under the emulator +qemu by specifying a target type with CROSS= -(or setting CROSS_COMPILE= to a cross compiler prefix with optional absolute +(or by setting CROSS_COMPILE= to a cross compiler prefix with optional absolute path), and pointing the build at a Linux kernel source directory, ala:

    make root CROSS=sh4 LINUX=~/linux

    -

    Then you can cd root/sh4; ./qemu-sh4.sh to launch the emulator -(you need the appropriate qemu-system-* binary installed, it'll complain -if it can't find it). Type "exit" -when done and it should shut down the emulator on the way out.

    +

    Then you can cd root/sh4; ./qemu-sh4.sh to launch the emulator. +(You'll need the appropriate qemu-system-* emulator binary installed.) +Type "exit" when done and it should shut down the emulator on the way out, +similar to exiting the chroot version. (Except this is more like you ssh'd +to a remote machine: the emulator created its own CPU with its own memory +and I/O devices, and booted a kernel in it.)

    The build finds the three packages needed to produce this system because 1) you're in a toybox source directory, 2) your cross compiler has a libc built into it, 3) you tell it where to find a Linux kernel -source directory with LINUX= on the command line. (If you don't say LINUX=, -it skips that part of the build and just produces a root filesystem directory.)

    +source directory with LINUX= on the command line. If you don't say LINUX=, +it skips that part of the build and just produces a root filesystem directory +ala the first example in this FAQ answer.
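    To illustrate what a CROSS_COMPILE= prefix "with optional absolute path" means, here is a minimal sketch of the usual convention (not toybox's actual build code): the prefix is glued onto the front of each tool name, so a prefix that includes a path selects the toolchain without any change to $PATH. The cross_tool helper and the /opt/ccc location are hypothetical; the m68k-linux-musl- prefix is borrowed from the musl example earlier in this FAQ.

```shell
#!/bin/sh
# Hypothetical helper (not part of toybox): print the binary that a given
# CROSS_COMPILE prefix would select for a tool.
# $1 = prefix, optionally starting with an absolute path
# $2 = tool name (cc, ld, nm, objdump...)
cross_tool()
{
  printf '%s%s\n' "$1" "$2"
}

# Bare prefix: the resulting name is looked up through $PATH.
cross_tool m68k-linux-musl- cc

# Prefix with an absolute path (assumed install location): no $PATH
# change is needed to find the compiler.
cross_tool /opt/ccc/m68k-linux-musl-cross/bin/m68k-linux-musl- cc
```

    The trailing dash is part of the prefix, which is why the examples in this FAQ end in "-".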

    The CROSS= shortcut expects a "ccc" symlink in the toybox source directory pointing at a directory full of cross compilers. The ones I test this with are built from the musl-libc @@ -450,7 +727,7 @@ maintainer's musl-cross-make project, built by running toybox's scripts/mcm-buildall.sh in that directory, and then symlink the resulting "ccc" subdirectory into toybox where CROSS= -can find them, ala:

    +can find them:

     cd ~
    @@ -461,7 +738,7 @@ cd musl-cross-make
     ln -s $(realpath ccc) ../toybox/ccc
     
    -

    If you don't want to do that, you can download prebuilt binary versions (from Zach van Rijn's site) and +

    If you don't want to do that, you can download prebuilt binary versions from Zach van Rijn's site and just extract them into a "ccc" subdirectory under the toybox source.

    Once you've installed the cross compilers, "make root CROSS=help" @@ -472,8 +749,7 @@ something like:

    aarch64 armv4l armv5l armv7l armv7m armv7r i486 i686 m68k microblaze mips mips64 mipsel powerpc powerpc64 powerpc64le s390x sh2eb sh4 x32 x86_64

    - -(A long time ago I +

    (A long time ago I tried to explain what some of these architectures were.)
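    If you just want a rough idea of which toolchains are sitting in a "ccc" directory before running the build, something like the following sketch may help. It is a naive approximation, not what "make root CROSS=help" actually does: the list_targets helper is hypothetical, it assumes each toolchain directory is named like TARGET-linux-musl-cross (as in the m68k example earlier), and it will not reproduce names that don't follow that pattern.

```shell
#!/bin/sh
# Hypothetical helper (not part of toybox): guess target names from the
# cross compiler directories under a "ccc" directory.
# Assumes directory names of the form TARGET-...-cross.
list_targets()
{
  for dir in "$1"/*-cross
  do
    [ -d "$dir" ] || continue   # skip an unmatched glob or stray files
    name=${dir##*/}             # strip the leading path
    echo "${name%%-*}"          # keep the text before the first dash
  done | sort -u
}

# Default to ./ccc, the symlink location described above.
list_targets "${1:-ccc}"
```

    Directories ending in something other than "-cross" (such as native compilers) are ignored by the glob.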
