Frequently Asked Questions

General Questions

Using toybox

Q: Do you capitalize toybox?

A: Only at the start of a sentence. The command name is all lower case so it seems silly to capitalize the project name, but not capitalizing the start of sentences is awkward, so... compromise. (It is _not_ "ToyBox".)

Q: "Why is there toybox? What was wrong with busybox?"

A: Toybox started back in 2006 when I handed off BusyBox maintainership and started over from scratch on a new codebase after a protracted licensing argument took all the fun out of working on BusyBox.

Toybox was just a personal project until it got relaunched in November 2011 with a new goal to make Android self-hosting. This involved me relicensing my own code, which made people who had never used or participated in the project loudly angry. The switch came after a lot of thinking about licenses and the transition to smartphones, which led to a 2013 talk laying out a strategy to make Android self-hosting using toybox. This helped bring it to Android's attention, and they merged it into Android M.

The answer to the second question is "licensing". BusyBox predates Android by almost a decade but Android still doesn't ship with it because GPLv3 came out around the same time Android did and caused many people to throw out the GPLv2 baby with the GPLv3 bathwater. Android explicitly discourages use of GPL and LGPL licenses in its products, and has gradually reimplemented historical GPL components such as its bluetooth stack under the Apache license. Similarly, Apple froze xcode at the last GPLv2 releases (GCC 4.2.1 with binutils 2.17) for over 5 years while it sponsored the development of new projects (clang/llvm/lld) to replace them, implemented its SMB server from scratch to replace samba, and so on. Toybox itself exists because somebody with in a legacy position just wouldn't shut up about GPLv3, otherwise I would probably still happily be maintaining BusyBox. (For more on how I wound up working on busybox in the first place, see here.)

Q: Why a 7 year support horizon?

A: Our longstanding rule of thumb is to try to run and build on hardware and distributions released up to 7 years ago, and feel ok dropping support for stuff older than that. (This is a little longer than Ubuntu's Long Term Support, but not by much.)

My original theory was "4 to 5 18-month cycles of moore's law should cover the vast majority of the installed base of PC hardware", loosely based on some research I did back in 2003 and updated in 2006 which said that low end systems were 2 iterations of moore's law below the high end systems, and that another 2-3 iterations should cover the useful lifetime of most systems no longer being sold but still in use and potentially being upgraded to new software releases.

It turns out I missed industry changes in the 1990's that stretched the gap from low end to high end from 2 cycles to 4 cycles, and _that_ analysis ignored the switch from PC to smartphone cutting off the R&D air supply of the laptop market. Meanwhile the Moore's Law s-curve started bending down in 2000 and these days is pretty flat because the drive for faster clock speeds stumbled then died, and the subsequent drive to go wide maxed out around 4x SMP with ~2 megabyte caches for most applications. These days the switch from exponential to linear growth in hardware capabilities is common knowledge.

But the 7 year rule of thumb stuck around anyway: if a kernel or libc feature is less than 7 years old, I try to have a build-time configure test for it and let the functionality cleanly drop out. I also keep old Ubuntu images around in VMs and perform the occasional defconfig build there to see what breaks. (I'm not perfect about this, but I accept bug reports.)

Q: Why time based releases?

A: Toybox targets quarterly releases (a similar schedule to the Linux kernel) because Martin Michlmayr's talk on the subject was convincing.

Releases provide synchronization points where the developers certify "it worked for me". Each release is a known version with predictable behavior, and right or wrong at least everyone should be seeing similar results where you might be able to google an unexpected outcome. Releases focus end-user testing on specific versions where issues can be reproduced, diagnosed, and fixed. Releases also force the developers to do periodic tidying, packaging, documentation review, finish up partially implemented features languishing in their private trees, and give regular checkpoints to measure progress.

Over time feature sets change, data formats change, control knobs change... For example toybox's switch from "ls -q" to "ls -b" as the default output format wasn't exactly a bug, it was a design improvement... but the difference is academic if the change breaks somebody's script. Releases give you the option to schedule upgrades later, and not to rock the boat just now: just use a known working release version.

The counter-argument is that "continuous integration" can be made robust with sufficient automated testing. But like the waterfall method, this places insufficent emphasis on end-user feedback and learning from real world experience. Developer testing is either testing that the code does what the developers expect given expected inputs running in an expected environment, or it's regression testing against bugs previously found in the field. No plan survives contact with the enemy, and technology always breaks once it leaves the lab and encounters real world data and use cases, not just at runtime but in different build environments.

The best way to give new users a reasonable first experience is to point them at specific stable versions where development quiesced and extra testing occurred. There will still be teething troubles, but multiple people experiencing the _same_ teething troubles can potentially help each other out.

As for why releases on a schedule are better than releases "when it's ready", watch the video.

Q: Where do I start understanding the source code?

A: Toybox is written in C. There are longer writeups of the design ideas and a code walkthrough, and the about page summarizes what we're trying to accomplish, but here's a quick start:

Toybox uses the standard three stage configure/make/install build, in this case "make defconfig; make; make install". Type "make help" to see available make targets.

The configure stage is copied from the Linux kernel (in the "kconfig" directory), and saves your selections in the file ".config" at the top level. The "defconfig" target selects the maximum sane configuration (enabling all the commands and features that aren't unfinished, only intended as examples, debug code, etc) and is probably what you want. You can use "make menuconfig" to manually select specific commands to include, through an interactive menu (cursor up and down, enter to descend into a sub-menu, space to select an entry, ? to see an entry's help text, esc to exit). The menuconfig help text is the same as the command's --help output.

The "make" stage creates a toybox binary (which is stripped, look in generated/unstripped for the debug versions), and "install" adds a bunch of symlinks to toybox under the various command names. Toybox determines which command to run based on the filename, or you can use the "toybox" name in which case the first argument is the command to run (ala "toybox ls -l"). You can also build individual commands as standalone executables, ala "make sed cat ls". (The "make change" target builds all of them, as in "change for a $20".)

The main() function is in main.c at the top level, along with setup plumbing and selecting which command to run this time. The function toybox_main() in the same file implements the "toybox" multiplexer command that lists and selects the other commands.

The individual command implementations are under "toys", and are grouped into categories (mostly based on which standard they come from, posix, lsb, android...) The "pending" directory contains unfinished commands, and the "examples" directory contains example code that aren't really useful commands. Commands in those two directories are _not_ selected by defconfig. (Most of the files in the pending directory are third party submissions that have not yet undergone proper code review.)

Common infrastructure shared between commands is under "lib". Most commands call lib/args.c to parse their command line arguments before calling the command's own main() function, which uses the option string in the command's NEWTOY() macro. This is similar to the libc function getopt(), but more powerful, and is documented at the top of lib/args.c. A NULL option string prevents this code from being called for that command.

Most of the actual build/install infrastructure is shell scripts under "scripts" (starting with scripts/make.sh and scripts/install.sh). These populate the "generated" directory with headers created from other files, which are described in the code walkthrough. All the build's temporary files live under generated, including the .o files built from the .c files (in generated/obj). The "make clean" target deletes that directory. ("make distclean" also deletes your .config and deletes the kconfig binaries that process .config.)

Each command's .c file contains all the information for that command, so adding a command to toybox means adding a single file under "toys". Usually you start a new command by copying an existing command file to a new filename (toys/examples/hello.c, toys/examples/skeleton.c, toys/posix/cat.c, and toys/posix/true.c have all been used for this purpose) and then replacing all instances of its old name with the new name (which should match the new filename), and modifying the help text, argument string, and what the code does. You might have to "make distclean" before your new command shows up in defconfig or menuconfig.

The toybox test suite lives in the "tests" directory, and is driven by scripts/test.sh and scripts/runtest.sh. From the top level you can "make tests" to test everything, or "make test_sed" to test a single command's standalone version (which should behave identically, but that's why we test). You can set TEST_HOST=1 to test the host versionn instead of the toybox version (in theory they should work the same), and VERBOSE=1 to see diffs of the expected and actual output when a test fails (VERBOSE=fail to stop at the first such failure)

Q: How do I cross compile toybox?

A: toybox is tested against three C libraries (bionic, musl, glibc) with 2 compilers (llvm, gcc). The easy way to get coverage (if not every combination) is:

1) gcc+glibc = host toolchain

Most Linux distros come with that as a host compiler, just build normally (make distclean defconfig toybox, or make menuconfig followed by make).

You can use LDFLAGS=--static if you want static binaries, but static glibc is hugely inefficient ("hello world" is 810k on x86-64) and throws a zillion linker warnings because one of its previous maintainers was insane (which meant at the time he refused to fix obvious bugs), plus it uses dlopen() at runtime to implement basic things like DNS lookup (which is impossible to support properly from a static binary because you wind up with two instances of malloc() managing two heaps which corrupt as soon as a malloc() from one is free()d into the other), although glibc added improper support which still requires the shared libraries to be installed on the system alongside the static binary: in brief, avoid.) These days glibc is maintained by a committee instead of a single maintainer, if you consider that an improvement.

2) gcc+musl = musl-cross-make

The cross compilers I test this with are built from the musl-libc maintainer's musl-cross-make project, built by running toybox's scripts/mcm-buildall.sh in that directory, and then symlink the resulting "ccc" subdirectory into toybox where "make root CROSS=" can find them, ala:

cd ~
git clone https://github.com/landley/toybox
git clone https://github.com/richfelker/musl-cross-make
cd musl-cross-make
../toybox/scripts/mcm-buildall.sh # this takes a while
ln -s $(realpath ccc) ../toybox/ccc

Instead of symlinking ccc, you can specify a CROSS_COMPILE= prefix in the same format the Linux kernel build uses. You can either provide a full path in the CROSS_COMPILE string, or add the appropriate bin directory to your $PATH. I.E:

make LDFLAGS=--static CROSS_COMPILE=~/musl-cross-make/ccc/m68k-linux-musl-cross/bin/m68k-linux-musl- distclean defconfig toybox

Is equivalent to:

export "PATH=~/musl-cross-make/ccc/m68k-linux-musl-cross/bin:$PATH"
LDFLAGS=--static make distclean defconfig toybox CROSS=m68k-linux-musl-

Note: a non-static build won't run unless you install musl on your host. In theory you could "make root" a dynamic root filesystem with musl by copying the shared libraries out of the toolchain, but I haven't bothered implementing that yet because a static linked musl hello world is 10k on x86 (5k if stripped).

3) llvm+bionic = Android NDK

The Android Native Development Kit provides an llvm toolchain with the bionic libc used by Android. To turn it into something toybox can use, you just have to add an appropriately prefixed "cc" symlink to the other prefixed tools, ala:

unzip android-ndk-r21b-linux-x86_64.zip
cd android-ndk-21b/toolchains/llvm/prebuilt/linux-x86_64/bin
ln -s x86_64-linux-android29-clang x86_64-linux-android-cc
PATH="$PWD:$PATH"
cd ~/toybox
make distclean
make LDFLAGS=--static CROSS_COMPILE=x86_64-linux-android- defconfig toybox

Again, you need a static link unless you want to install bionic on your host. Binaries statically linked against bionic are almost as big as with glibc, but at least it doesn't have the dlopen() issues.

Unfortunately, although the resulting toybox will run a bionic-based chroot will not, because even "hello world" statically linked against bionic will segfault before calling main() if /dev/null isn't present, and the init script written by mkroot.sh has to run a shell linked against bionic in order to mount devtmpfs on /dev to provide /dev/null.

Q: Where does toybox fit into the Linux/Android ecosystem?
(I.E. What part of the operating system does toybox provide, and what does it depend on?)

A: Toybox is a set of standard Linux command line utilities, so that three packages (a Linux kernel, C library, and toybox) provide a complete bootable unix-style command line system. Toybox provides a command shell and over a hundred different commands to call from that command shell.

Toybox is not a complete operating system, it's a program that runs under an operating system. Booting a simple system to a shell prompt requires an kernel (such as Linux, or BSD with a Linux emulation layer) to drive the hardware, one or more programs for the system to run (toybox), and a C library ("libc") to connect them together (toybox has been tested with musl, uClibc, glibc, and bionic).

The C library is delivered as part of a "toolchain", which is an integrated suite of compiler, assembler, and linker, plus the standard headers and libraries necessary to build C programs. (And miscellaneous binaries like nm and objdump.)

Static linking (with the --static option) copies the shared library contents into the program, resulting in larger but more portable programs, which can run even if they're the only file in the filesystem. Otherwise, the "dynamically" linked programs require the shared library files to be present on the target system, either copied from the toolchain or built again from source (with potential version skew if they don't match the toolchain versions exactly). See "man ldd", "man ld.so", and "man libc" for details.

Toybox has a policy of requiring no external dependencies other than the C library for defconfig builds. You can optionally enable support for additional libraries in menuconfig (such as openssl, zlib, or selinux), but toybox either provides its own built-in versions of such functionality (which the libraries provide larger, more complex, often assembly optimized alternatives to), or allows things like selinux support to cleanly drop out.

Most embedded systems will add a fourth package to the kernel/libc/cmdline above containing dedicated "application" that the embedded system exists to run, plus any other packages that application depends on. Build systems will add a native version of the toolchain packages so they can build additional software on the resulting system. Desktop systems will add a GUI and additional application packages like web browsers and video players. A linux distro like Debian would add hundreds of packages. Android adds around a thousand.

But all of these systems conceptually sit on a common three-package "kernel/libc/cmdline" base, and toybox aims to provide a simple, reproducible, auditable version of the cmdline portion of that base.

Q: How do you build a working Linux system with toybox?

A: Toybox has a built-in system builder, which has a Makefile target. To build a native root filesystem you can chroot into, "make root" then "sudo chroot root/host/fs /init" to enter it. Type "exit" to get back out.

You can also cross compile simple three package (toybox+libc+linux) systems that boot to a shell prompt under qemu, by specifying a target type with CROSS= (or setting CROSS_COMPILE= to a cross compiler prefix with optional absolute path), and pointing the build at a Linux kernel source directory, ala:

make root CROSS=sh4 LINUX=~/linux

Then you can cd root/sh4; ./qemu-sh4.sh to launch the emulator (you need the appropriate qemu-system-* binary installed, it'll complain if it can't find it). Type "exit" when done and it should shut down the emulator on the way out.

The build finds the three packages needed to produce this system because 1) you're in a toybox source directory, 2) your cross compiler has a libc built into it, 3) you tell it where to find a Linux kernel source directory with LINUX= on the command line. (If you don't say LINUX=, it skips that part of the build and just produces a root filesystem directory.)

The CROSS= shortcut expects a "ccc" symlink in the toybox source directory pointing at a directory full of cross compilers. The ones I test this with are built from the musl-libc maintainer's musl-cross-make project, built by running toybox's scripts/mcm-buildall.sh in that directory, and then symlink the resulting "ccc" subdirectory into toybox where CROSS= can find them, ala:

cd ~
git clone https://github.com/landley/toybox
git clone https://github.com/richfelker/musl-cross-make
cd musl-cross-make
../toybox/scripts/mcm-buildall.sh # this takes a while
ln -s $(realpath ccc) ../toybox/ccc

If you don't want to do that, you can download prebuilt binary versions (from Zach van Rijn's site) and just extract them into a "ccc" subdirectory under the toybox source.

Once you've installed the cross compilers, "make root CROSS=help" should list all the available cross compilers it recognizes under ccc, something like:

aarch64 armv4l armv5l armv7l armv7m armv7r i486 i686 m68k microblaze mips mips64 mipsel powerpc powerpc64 powerpc64le s390x sh2eb sh4 x32 x86_64

(A long time ago I tried to explain what some of these architectures were.)

You can build all the targets at once, and can add additonal packages to the build, by calling the script directly and listing packages on the command line:

scripts/mkroot.sh CROSS=all LINUX=~/linux dropbear

An example package build script (building the dropbear ssh server, adding a port forward from 127.0.0.1:2222 to the qemu command line, and providing a ssh2dropbear.sh convenience script to the output directory) is provided in the scripts/root directory. If you add your own scripts elsewhere, just give a path to them on the command line. (No, I'm not merging more package build scripts, I learned that lesson long ago. But if you want to write your own, feel free.)

(Note: currently mkroot.sh cheats. If you don't have a .config it'll make defconfig and add CONFIG_SH and CONFIG_ROUTE to it, because the new root filesystem kinda needs those commands to function properly. If you already have a .config that _doesn't_ have CONFIG_SH in it, you won't get a shell prompt or be able to run the init script without a shell. This is currently a problem because sh and route are still in pending and thus not in defconfig, so "make root" cheats and adds them. I'm working on it. tl;dr if make root doesn't work "rm .config" and run it again, and all this should be fixed up in future when those two commands are promoted out of pending so "make defconfig" would have what you need anyway. It's designed to let yout tweak your config, which is why it uses the .config that's there when there is one, but the default is currently wrong because it's not quite finished yet. All this should be cleaned up in a future release, before 1.0.)