Goals and use cases

We have several potential use cases for a new set of command line utilities, and are using those to determine which commands to implement for Toybox's 1.0 release. (Most of these have their own section in the status page.)

The most interesting publicly available standards are POSIX-2008 (also known as the Single Unix Specification version 4) and the Linux Standard Base (version 4.1). The main test harness is including toybox in Aboriginal Linux and, if that can build itself, using the result to build Linux From Scratch (version 6.8). We also aim to replace Android's Toolbox.

At a secondary level we'd like to meet other use cases. We've analyzed the commands provided by similar projects (klibc, sash, sbase, embutils, nash, and beastiebox), along with various vendor configurations of busybox, and some end user requests.

Finally, we'd like to provide a good replacement for the Bash shell, which was the first program Linux ever ran and remains the standard shell of Linux no matter what Ubuntu says. This doesn't mean including the full set of Bash 4.x functionality, but does involve {various,features} <(beyond) posix.

See the status page for the categorized command list and progress towards implementing it. There's also a historical todo list from the project's 2011 relaunch.


Use case: standards compliance.

POSIX-2008/SUSv4

The best standards describe reality rather than attempting to impose a new one. A good standard should document, not legislate. Standards which document existing reality tend to be approved by more than one standards body, such as ANSI and ISO both approving C99. That's why the IEEE POSIX committee's 2008 standard, the Single Unix Specification version 4, and the Open Group Base Specification edition 7 are all the same standard from three sources, but most people just call it "posix" (portable operating system derived from unix). It's available online in full, and may be downloaded as a tarball... with a caveat.

Although previous versions of Posix have their own stable URLs (where you can still find SUSv3 and SUSv2), the 2008 release of SUSv4 was replaced by a 2013 release also claiming to be SUSv4, then again by a 2018 release still at the same URL. Similarly, the other version numbers claim not to have changed, but instead adopted some sort of "Windows 95" naming scheme ("The Open Group Base Specifications Issue 7, 2018 edition"). Since a moving target isn't a standard, we've stuck with the 2008 version and ignored whatever changes they make until they stop this forced-upgrade-behind-your-back nonsense. Luckily you can still find the original content here. (We haven't changed the URLs in each command to the longer version yet, but can if conflicts arise.)

Why not just use posix for everything?

Unfortunately posix describes an incomplete subset of reality, lacking any mention of commands such as init or mount required to actually boot a system. It describes logname but not login. It provides ipcrm and ipcs, but not ipcmk, so you can use System V IPC resources but not create them. And widely used real-world commands such as tar and cpio (the basis of initramfs and RPM) which were present in earlier versions of the standard have been removed, while obsolete commands like cksum, compress, sccs and uucp remain with no mention of modern counterparts like crc32/sha1sum, gzip/xz, svn/git or scp/rsync. Meanwhile the commands themselves are missing dozens of features and specify silly things like ebcdic support in dd or that wc should use %d (not %lld) for byte counts. So we have to extensively filter posix to get a useful set of recommendations.

Starting with the full "utilities" list, we first remove generally obsolete commands (compress ed ex pr uncompress uucp uustat uux), commands for the pre-CVS "SCCS" source control system (admin delta get prs rmdel sact sccs unget val what), fortran support (asa fort77), and batch processing support (batch qalter qdel qhold qmove qmsg qrerun qrls qselect qsig qstat qsub).

Some commands are for a compiler toolchain (ar c99 cflow ctags cxref gencat iconv lex m4 make nm strings strip tsort yacc), which is outside of toybox's mandate and should be supplied externally. (Again, some of these may be revisited later, but not for toybox 1.0.)

Some commands are part of a command shell, and can't be implemented as separate executables (alias bg cd command fc fg getopts hash jobs kill read type ulimit umask unalias wait). These may be revisited as part of a built-in toybox shell, but are not exported into $PATH via symlinks. (If you fork a child process and have it "cd" then exit, you've accomplished nothing.) Again, what posix provides is incomplete: a shell also needs exit, if, while, for, case, export, set, unset, trap, exec... (And for bash compatibility: function, source...)
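The point about "cd" in a child process is easy to check from any POSIX shell. A minimal sketch, where /tmp is just an arbitrary directory assumed to exist and the variable names are made up for the example:

```shell
#!/bin/sh
# A child process that changes directory does not affect its parent.
# /tmp is an arbitrary example directory assumed to exist.

start_dir=$(pwd)

# The parentheses fork a subshell; the cd happens in the child, which exits.
( cd /tmp && echo "child is in: $(pwd)" )

after_dir=$(pwd)
echo "parent is still in: $after_dir"
```

A standalone cd executable would be in the same position as the subshell here, which is why such commands only make sense as shell builtins.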

A few other commands are judgement calls, providing command-line internationalization support (iconv locale localedef), System V inter-process communication (ipcrm ipcs), and cross-tty communication from the minicomputer days (talk mesg write). The "pax" utility failed to replace tar, "mailx" is a command line email client, and "lp" submits files for printing to... what exactly? (cups?) The standard defines crontab but not crond. What is pathchk supposed to be portable _to_? (Linux accepts 255 byte path components with any char except NUL or / and no max length on the total path, and EXPLICITLY doesn't care if it's an invalid utf8 sequence.)

Removing all of that leaves the following commands, which toybox should implement:

at awk basename bc cal cat chgrp chmod chown cksum cmp comm cp csplit cut date dd df diff dirname du echo env expand expr false file find fold fuser getconf grep head id join kill link ln logger logname ls man mkdir mkfifo more mv newgrp nice nl nohup od paste patch printf ps pwd renice rm rmdir sed sh sleep sort split stty tabs tail tee test time touch tput tr true tty uname unexpand uniq unlink uudecode uuencode vi wc who xargs zcat

Linux Standard Base

One attempt to supplement POSIX towards an actual usable system was the Linux Standard Base. Unfortunately, the quality of this "standard" is fairly low.

POSIX allowed its standards process to be compromised by leaving things out, thus allowing IBM mainframes and Windows NT to drive a truck through the holes and declare themselves compliant. But it means what they DID standardize tends to be respected (if sometimes obsolete).

The Linux Standard Base's failure mode is different: they respond to pressure by including anything their members pay them enough to promote, such as allowing Red Hat to push RPM into the standard even though all sorts of distros (Debian, Slackware, Arch, Gentoo) don't use it and never will. This means anything in the LSB is at best a suggestion: arbitrary portions of this standard are widely ignored.

The community perception seems to be that the Linux Standard Base is the best standard money can buy, I.E. the Linux Foundation is supported by financial donations from large companies and the LSB represents the interests of those donors more than technical merit. (The Linux Foundation, which maintains the LSB, isn't a 501c3. It's a 501c6, the same kind of legal entity as the Tobacco Institute and Microsoft's old "Don't Copy That Floppy" program.) Debian officially washed its hands of LSB when 5.0 came out in 2015, and no longer even pretends to support it (which may affect Debian derivatives like Ubuntu and Knoppix). Toybox hasn't moved to 5.0 for similar reasons.

That said, Posix by itself isn't enough, and this is the next most comprehensive standards effort for Linux so far, so we salvage what we can.

The LSB specifies a list of command line utilities:

ar at awk batch bc chfn chsh col cpio crontab df dmesg du echo egrep fgrep file fuser gettext grep groupadd groupdel groupmod groups gunzip gzip hostname install install_initd ipcrm ipcs killall lpr ls lsb_release m4 md5sum mknod mktemp more mount msgfmt newgrp od passwd patch pidof remove_initd renice sed sendmail seq sh shutdown su sync tar umount useradd userdel usermod xargs zcat

Where posix specifies one of those commands, LSB's deltas tend to be accommodations for broken tool versions which aren't up to date with the standard yet. (See more and xargs for examples.)

Since we've already committed to using our own judgement to skip bits of POSIX, and LSB's "judgement" in this regard is purely bug workarounds to declare various legacy tool implementations "compliant", this means we're mostly interested in the set of LSB tools that aren't mentioned in posix.

Of these, gettext and msgfmt are internationalization, install_initd and remove_initd weren't present in Ubuntu 10.04, lpr is out of scope, lsb_release just reports information in /etc/os-release, and sendmail's turned into a pile of cryptographic verification and DNS shenanigans due to spammers.

This leaves:

chfn chsh dmesg egrep fgrep groupadd groupdel groupmod groups gunzip gzip hostname install killall md5sum mknod mktemp mount passwd pidof seq shutdown su sync tar umount useradd userdel usermod zcat
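The filtering in this section is essentially set subtraction: start from one command list and drop everything already covered (or ruled out) elsewhere. A rough sketch of that step using sort and comm follows. The function name list_minus and the shortened sample lists are illustrative only and not part of toybox:

```shell
#!/bin/sh
# list_minus "A words" "B words": print the words of A that are not in B,
# sorted, on a single line. Hypothetical helper for illustration.
list_minus() {
  a=$(mktemp) && b=$(mktemp) || return 1
  # One word per line, sorted in the C locale so sort and comm agree.
  printf '%s\n' "$1" | tr -s ' ' '\n' | LC_ALL=C sort -u > "$a"
  printf '%s\n' "$2" | tr -s ' ' '\n' | LC_ALL=C sort -u > "$b"
  # comm -23 keeps only the lines unique to the first file.
  LC_ALL=C comm -23 "$a" "$b" | paste -s -d ' ' -
  rm -f "$a" "$b"
}

# Shortened sample lists, not the full ones from the text.
lsb="ar at awk batch bc chfn chsh dmesg"
posix="ar at awk batch bc"
list_minus "$lsb" "$posix"
```

The manual exclusions (internationalization, out of scope, and so on) can be handled the same way by appending them to the second list.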

IETF RFCs

Discussion of standards wouldn't be complete without the Internet Engineering Task Force's "Request For Comments" collection.

These are more about protocols than commands. The noise level is extremely high: there's thousands of RFCs, many describing a proposed idea that never took off, and less than 1% of the resulting documents are currently relevant to toybox. And the documents are numbered based on the order they were received, with no real attempt at coherently indexing the result. As with man pages they can be long and complicated or terse and impenetrable, have developed a certain amount of bureaucracy over the years, and often the easiest way to understand what they document is to find an earlier version to read first.

That said, RFC documents can be useful (especially for networking protocols), and the three URL templates provided by the recommended starting files for new commands (toys/example/skeleton.c or toys/example/hello.c depending on how much plumbing you want to start with) point to posix, lsb, and rfc pages.


Use case: provide a self-hosting development environment

The following commands were enough to build the Aboriginal Linux development environment, boot it to a shell prompt, and build Linux From Scratch 6.8 under it.

bzcat cat cp dirname echo env patch rmdir sha1sum sleep sort sync true uname wc which yes zcat awk basename chmod chown cmp cut date dd diff egrep expr fdisk find grep gzip head hostname id install ln ls mkdir mktemp mv od readlink rm sed sh tail tar touch tr uniq wget whoami xargs chgrp comm gunzip less logname split tee test time bunzip2 chgrp chroot comm cpio dmesg dnsdomainname ftpd ftpget ftpput gunzip ifconfig init less logname losetup mdev mount mountpoint nc pgrep pkill pwd route split stat switch_root tac umount vi resize2fs tune2fs fsck.ext2 genext2fs mke2fs xzcat

This use case includes running init scripts and other shell scripts, running configure, make, and install in each package, and providing basic command line facilities such as a text editor. (It does not include a compiler toolchain or C library, those are outside the scope of the toybox project, although mkroot has a potential follow-up project. For now we use distro toolchains, musl-cross-make, and the Android NDK for build testing.) That build system also installed bash 2.05b as #!/bin/sh and its scripts required bash extensions not present in shells such as busybox ash. To replace that toysh needs to supply several bash extensions _and_ work when called under the name "bash".

The development methodology used a command logging wrapper that intercepted each command called out of the $PATH and appended the command line to a log file. We then analyzed the result to create a list of commands, and created a directory of symlinks pointing to those commands out of the host $PATH. The new implementation can then replace these commands one at a time, checking the results and the log output to spot any behavior changes.
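A minimal sketch of what such a logging wrapper could look like in shell. This is not the Aboriginal Linux implementation: the names log_and_run, logged_commands, CMD_LOG, and REAL_PATH are hypothetical, as are the default paths.

```shell
#!/bin/sh
# Sketch of a command-logging wrapper (all names here are hypothetical).

# log_and_run NAME ARGS...: append the command line to $CMD_LOG, then run
# NAME out of $REAL_PATH (the original search path, minus the wrapper dir).
log_and_run() {
  name=$1
  shift
  printf '%s\n' "$name $*" >> "${CMD_LOG:-/tmp/cmdlog.txt}"
  # The subshell keeps the PATH change and the exec away from the caller.
  ( PATH=${REAL_PATH:-/usr/bin:/bin}; export PATH; exec "$name" "$@" )
}

# Installed as a directory of symlinks at the front of $PATH, each link's
# name would select the command to run:
#   log_and_run "$(basename "$0")" "$@"

# Afterwards, list the unique command names seen in the log.
logged_commands() {
  awk '{print $1}' "${CMD_LOG:-/tmp/cmdlog.txt}" | sort -u
}
```

The output of logged_commands is the kind of list that can then be turned into a directory of symlinks, one per command.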

Stages and moving targets

This use case has two stages: 1) building a bootable system that can rebuild itself from source, and 2) a build environment capable of bootstrapping up to arbitrary complexity (as exemplified by building Linux From Scratch and Beyond Linux From Scratch under the resulting system). To accomplish just the first goal, the old build still needs the following busybox commands for which toybox does not yet supply adequate replacements:

awk dd diff expr fdisk ftpd gzip less route sh sha512sum tr unxz vi wget xzcat

All of those except awk, ftpd, and less have partial implementations in "pending".

In 2017 Aboriginal Linux development ended, replaced by the mkroot project designed to use an existing cross+native toolchain (such as musl-cross-make or the Android NDK) instead of building its own. In 2019 the still-incomplete mkroot was merged into toybox as the "make root" target. This is intended as a simpler way of providing essentially the same build environment, and doesn't significantly affect the rest of this analysis (although the "rebuild itself from source" test now includes building musl-cross-make under either mkroot or toybox's "make airlock" host environment).

Building Linux From Scratch is not the same as building the Android Open Source Project, but after toybox 1.0 we plan to try modifying the AOSP build to reduce dependencies. (It's fairly likely we'll have to add at least a read-only git utility so repo can download the build's source code, but that's actually not that hard. We'll probably also need our own "make" at some point after 1.0, which is its own moving target thanks to cmake and ninja and so on.) The ongoing Android hermetic build work is already advancing this goal.


Use case: Replacing Android Toolbox

Android has a policy against GPL in userspace, so even though BusyBox predates Android by many years, they couldn't use it. Instead they grabbed an old version of ash (later replaced by mksh) and implemented their own command line utility set called "toolbox" (which toybox has already mostly replaced).

Toolbox doesn't have its own repository; instead it's part of Android's system/core git repository. Android's Native Development Kit (their standalone downloadable toolchain) has its own roadmap, and each version has release notes.

Toolbox commands:

According to system/core/toolbox/Android.bp the toolbox directory builds the following commands:

getevent getprop modprobe setprop start

getprop/setprop/start were in toybox and moved back because they're so tied to non-public system interfaces. modprobe shares the implementation used in init. getevent is a board bringup tool built with a python script that pulls all the constants from the latest kernel headers.

Other Android /system/bin commands

Other than the toolbox links, the currently interesting binaries in /system/bin are:

The names in parentheses are the upstream source of the command.

Analysis

For reference, combining everything listed above that's still "fair game" for toybox, we get:

arping blkid e2fsck dd fsck.f2fs fsck_msdos gzip ip iptables ip6tables iw logwrapper make_ext4fs make_f2fs modprobe newfs_msdos ping ping6 reboot resize2fs sh ss tc tracepath tracepath6 traceroute traceroute6

We may eventually implement all of that, but for toybox 1.0 we need to focus a bit. If Android has an acceptable external package, and the command isn't needed for system bootstrapping, replacing the external package is not a priority.

However, several commands toybox plans to implement anyway could potentially replace existing Android versions, so we should take into account Android's use cases when doing so. This includes:

dd getevent gzip modprobe newfs_msdos sh

Update: external/toybox/Android.bp has symlinks for the following toys out of "pending". (The toybox modprobe is also built for the device, but it isn't actually used and is only there for sanity checking against the libmodprobe-based implementation.) These should be a priority for cleanup:

bc dd diff expr getfattr lsof more stty tr traceroute

Android wishlist:

mtools genvfatfs mke2fs gene2fs

Use case: Building AOSP

The list of external tools used to build AOSP was here, but as they're switched over to toybox they disappear and reappear here.

awk basename bash bc bzip2 cat chmod cmp comm cp cut date dd diff dirname du echo egrep env expr find fuser getconf getopt git grep gzip head hexdump hostname id jar java javap ln ls lsof m4 make md5sum mkdir mktemp mv od openssl paste patch pgrep pkill ps pstree pwd python python2.7 python3 readlink realpath rm rmdir rsync sed setsid sh sha1sum sha256sum sha512sum sleep sort stat tar tail tee todos touch tr true uname uniq unix2dos unzip wc which whoami xargs xxd xz zip zipinfo

The following are already in the tree and will be used directly:

awk bzip2 jar java javap m4 make python python2.7 python3 xz

Subtracting what's already in toybox (including the following toybox toys that are still in pending: bc dd diff expr gzip lsof tar tr), that leaves:

bash fuser getopt git hexdump openssl pstree rsync sh todos unzip zip zipinfo

For AOSP, zip/zipinfo/unzip are likely to be libziparchive based. The todos callers will use unix2dos instead if it's available. git/openssl seem like they should just be brought in to the tree. rsync is used to work around a Mac cp -Rf bug with broken symbolic links. That leaves:

bash fuser getopt hexdump pstree

(Why are fuser and pstree used during the AOSP build? They're used for diagnostics if something goes wrong. So it's really just bash, getopt, and hexdump that are actually used to build.)


Use case: Tizen Core

The Tizen project has expressed a desire to eliminate GPLv3 software from its core system, and is installing toybox as part of this process.

They have a fairly long list of new commands they'd like to see in toybox:

arch base64 users unexpand shred join csplit hostid nproc runcon sha224sum sha256sum sha384sum sha512sum sha3sum mkfs.vfat fsck.vfat dosfslabel uname stdbuf pinky diff3 sdiff zcmp zdiff zegrep zfgrep zless zmore

In addition, they'd like to use several commands currently in pending:

tar diff printf wget rsync fdisk vi less tr test stty fold expr dd

Also, tizen uses a different Linux Security Module called SMACK, so many of the SELinux options ala ls -Z need smack alternatives in an if/else setup.


buildroot:

The mandatory packages section of the buildroot manual lists:

which sed make bash patch gzip bzip2 tar cpio unzip rsync file bc wget

(It also lists binutils gcc g++ perl python, and for debian it wants build-essential. And it wants file to be in /usr/bin because libtool breaks otherwise.)

Buildroot does not support a cross toolchain that lives in "/usr/bin" with a prefix of "" (if you try, and chop out the test for a blank prefix, it dies trying to run "/usr/bin/-gcc"). But you can patch your way to making it work if you try.


klibc:

Long ago some kernel developers came up with a project called klibc. After a decade of development it still has no web page or HOWTO, and nobody's quite sure if the license is BSD or GPL. It inexplicably requires perl to build, and seems like an ideal candidate for replacement.

In addition to a C library even less capable than bionic (obsoleted by musl), klibc builds a random assortment of executables to run init scripts with. There's no multiplexer command, these are individual executables:

cat chroot cpio dd dmesg false fixdep fstype gunzip gzip halt ipconfig kill kinit ln losetup ls minips mkdir mkfifo mknodes mksyntax mount mv nfsmount nuke pivot_root poweroff readlink reboot resume run-init sh sha1hash sleep sync true umount uname zcat

To get that list, build klibc according to the instructions (I looked at version 2.0.2 and did cd klibc-*; ln -s /output/of/kernel/make/headers_install linux; make) then echo $(for i in $(find . -type f); do file $i | grep -q executable && basename $i; done | grep -v '[.]g$' | sort -u) to find executables, then eliminate the *.so files and *.shared duplicates.

Some of those binaries are build-time tools that don't get installed, which removes mknodes, mksyntax, sha1hash, and fixdep from the list. (And sha1hash is just an unpolished sha1sum anyway.)

The run-init command is more commonly called switch_root, nuke is just "rm -rf -- $@", and minips is more commonly called "ps". I'm not doing aliases for the oddball names.

Yet more stale forks of dash and gzip sucked in here (see "dubious license terms" above), adding nothing to the other projects we've looked at. But we still need sh, gunzip, gzip, and zcat to replace this package.

At the time I did the initial analysis toybox already had cat, chroot, dmesg, false, kill, ln, losetup, ls, mkdir, mkfifo, readlink, rm, switch_root, sleep, sync, true, and uname.

The low hanging fruit is cpio, dd, ps, mv, and pivot_root.

The "kinit" command is another gratuitous rename, it's init running as PID 1. The halt, poweroff, and reboot commands work with it.

I've got mount and umount queued up already, fstype and nfsmount go with those. (And probably smbmount and p9mount, but this hasn't got one. Those are all about querying for login credentials, probably workable into the base mount command.)

The ipconfig command here has a built in dhcp client, so it's ifconfig and dhcpcd and maybe some other stuff.

The resume command is... weird. It finds a swap partition and reads data from it into a /proc file, something the kernel is capable of doing itself. (Even though the klibc author attempted to remove that capability from the kernel, current kernel/power/hibernate.c still parses "resume=" on the command line). And yet various distros seem to make use of klibc for this. Given the history of swsusp/hibernate (and TuxOnIce and kexec jump) I've lost track of the current state of the art here. Ah, Documentation/power/userland-swsusp.txt has the API docs, and here's a better tool...

So the list of things actually in klibc is:

cat chroot dmesg false kill ln losetup ls mkdir mkfifo readlink rm switch_root sleep sync true uname cpio dd ps mv pivot_root mount nfsmount fstype umount sh gunzip gzip zcat kinit halt poweroff reboot ipconfig resume

glibc

Rather a lot of command line utilities come bundled with glibc:

catchsegv getconf getent iconv iconvconfig ldconfig ldd locale localedef mtrace nscd rpcent rpcinfo tzselect zdump zic

Of those, musl libc only implements ldd.

catchsegv is a rudimentary debugger, probably out of scope for toybox.

iconv has been previously discussed.

iconvconfig is only relevant if iconv is user-configurable; musl uses a non-configurable iconv.

getconf is a posix utility which displays several variables from unistd.h; it probably belongs in the development toolchain.

getent handles retrieving entries from passwd-style databases (in a rather lame way) and is trivially replaceable by grep.

locale was discussed under posix. localedef compiles locale definitions, which musl currently does not use.

mtrace is a perl script to use the malloc debugging that glibc has built-in; this is not relevant for musl, and would necessarily vary with libc.

nscd is a name service caching daemon, which is not yet relevant for musl. rpcinfo and rpcent are related to rpc, which musl does not include.

The remaining commands involve glibc's bundled timezone database, which seems to be derived from the IANA timezone database. Unless we want to maintain our own fork of the standards body's database like glibc does, these are of no interest, but for completeness:

tzselect outputs a TZ variable corresponding to user input. The documentation does not indicate how to use it in a script, but it seems that Debian may have done so. zdump prints current time in each of several timezones, optionally outputting a great deal of extra information about each timezone. zic converts a description of a timezone to a file in tz format.

None of glibc's bundled commands are currently of interest to toybox.


Stand-Alone Shell

Wikipedia has a good summary of sash, with links. The original Stand-Alone Shell project reached a stopping point, and then "sash plus patches" extended it a bit further. The result is a megabyte executable that provides 40 commands.

Sash is a shell with built-in commands. It doesn't have a multiplexer command, meaning "sash ls -l" doesn't work (you have to go "sash -c 'ls -l'").

The list of commands can be obtained via building it and doing "echo help | ./sash | awk '{print $1}' | sed 's/^-//' | xargs echo", which gives us:

alias aliasall ar cd chattr chgrp chmod chown cmp cp chroot dd echo ed exec exit file find grep gunzip gzip help kill losetup losetup ln ls lsattr mkdir mknod more mount mv pivot_root printenv prompt pwd quit rm rmdir setenv source sum sync tar touch umask umount unalias where
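The pipeline used to get that list relies on each line of the help output starting with a command name, some of them prefixed with a dash. The exact sash help format isn't reproduced here, so the sample input below is invented purely to show what each stage does (help_to_list is a hypothetical name):

```shell
#!/bin/sh
# Reduce "help"-style output to a one-line list of command names.
# The sample text fed in below is invented; real sash output may differ.
help_to_list() {
  awk '{print $1}' |   # keep the first word of each line
    sed 's/^-//' |     # strip a leading dash, if any
    xargs echo         # join the names onto a single line
}

printf '%s\n' 'alias [name [command]]' '-ar file' 'cd [dirName]' |
  help_to_list
```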

Plus sh because it's a shell. A dozen or so commands can only sanely be implemented as shell builtins (alias aliasall cd exec exit prompt quit setenv source umask unalias), where is an alias for which, and at triage time toybox already has chgrp, chmod, chown, cmp, cp, chroot, echo, help, kill, losetup, ln, ls, mkdir, mknod, printenv, pwd, rm, rmdir, sync, and touch.

This leaves:

ar chattr dd ed file find grep gunzip gzip lsattr more mount mv pivot_root sh tar umount

(For once, this project doesn't include a fork of gzip, instead it sucks in -lz from the host.)


sbase:

It's on suckless in two parts. As of November 2015 it's implemented the following (renaming "cron" to "crond" for consistency, and yanking "sponge", "mesg", "pagesize", "respawn", and "vtallow"):

basename cal cat chgrp chmod chown chroot cksum cmp comm cp crond cut date dirname du echo env expand expr false find flock fold getconf grep head hostname join kill link ln logger logname ls md5sum mkdir mkfifo mktemp mv nice nl nohup od paste printenv printf pwd readlink renice rm rmdir sed seq setsid sha1sum sha256sum sha512sum sleep sort split strings sync tail tar tee test tftp time touch tr true tty uname unexpand uniq unlink uudecode uuencode wc which xargs yes

and

chvt clear dd df dmesg eject fallocate free id login mknod mountpoint passwd pidof ps stat su truncate unshare uptime watch who


nash:

Red Hat's nash was part of its "mkinitrd" package, a replacement for a shell and utilities on the boot floppy back in the 1990's (the same general idea as BusyBox, developed independently). Red Hat discontinued nash development in 2010, replacing it with dracut (which collects together existing packages, including busybox).

I couldn't figure out how to beat source code out of Fedora's current git repository. The last release version that used it was Fedora Core 12 which has a source rpm that can be unwound with "rpm2cpio mkinitrd.src.rpm | cpio -i -d -H newc --no-absolute-filenames" and in there is a mkinitrd-6.0.93.tar.bz2 which has the source.

In addition to being a bit like a command shell, the nash man page lists the following commands:

access echo find losetup mkdevices mkdir mknod mkdmnod mkrootdev mount pivot_root readlink raidautorun setquiet showlabels sleep switchroot umount

Oddly, the only occurrence of the string pivot_root in the nash source code is in the man page, the command isn't there. (It seems to have been removed when the underscoreless switchroot went in.)

A more complete list seems to be the handlers[] array in nash.c:

access buildEnv cat cond cp daemonize dm echo exec exit find kernelopt loadDrivers loadpolicy mkchardevs mkblktab mkblkdevs mkdir mkdmnod mknod mkrootdev mount netname network null plymouth hotplug killplug losetup ln ls raidautorun readlink resume resolveDevice rmparts setDeviceEnv setquiet setuproot showelfinterp showlabels sleep stabilized status switchroot umount waitdev

This list is nuts: "plymouth" is an alias for "null" which is basically "true" (which the above list doesn't have). Things like buildEnv and loadDrivers are bespoke Red Hat behavior that might as well be hardwired in to nash's main() without being called.

Instead of eliminating items from the list with an explanation for each, I'm just going to cherry pick a few: the device mapper (dm, raidautorun) is probably interesting, hotplug (may be obsolete due to kernel changes that now load firmware directly), and another "resume" ala klibc.

But mostly: I don't care about this one. And neither does Red Hat anymore.

Verdict: ignore


Beastiebox

Back in 2008, the BSD guys vented some busybox-envy on sourceforge. Then stopped. Their repository is still in CVS, hasn't been touched in years, it's a giant hairball of existing code sucked together. (The web page says the author is aware of crunchgen, but decided to do this by hand anyway. This is not a collection of new code, it's a katamari of existing code rolled up in a ball.)

Combining the set of commands listed on the web page with the set of man pages in the source gives us:

[ cat chmod cp csh date df disklabel dmesg echo ex fdisk fsck fsck_ffs getty halt hostname ifconfig init kill less lesskey ln login ls lv mksh more mount mount_ffs mv pfctl ping poweroff ps reboot rm route sed sh stty sysctl tar test traceroute umount vi wiconfig

Apparently lv is the missing link between ed and vi, copyright 1982-1997 (do not want), ex is another obsolete vi mode, lesskey is "used to specify a set of key bindings to be used with less", and csh is a shell they sucked in (even though they have mksh?), [ is an alias for test. Several more bsd-isms that don't have Linux equivalents (even in the ubuntu "install this package" search) are disklabel, fsck_ffs, mount_ffs, and pfctl. And wiconfig is a wavelan interface network card driver utility. Subtracting all that and the commands toybox already implements at triage time, we get:

fdisk fsck getty halt ifconfig init kill less more mount mv ping poweroff ps reboot route sed sh stty sysctl tar test traceroute umount vi

Not a hugely interesting list, but eh.

Verdict: ignore


BsdBox

Somebody decided to do a multicall binary for freebsd.

They based it on crunchgen, a tool that glues existing programs together into an archive and uses the name to execute the right one. It has no simplification or code sharing benefits whatsoever, it's basically an archiver that produces executables.

That's about where I stopped reading.

Verdict: ignore.


OpenSolaris Busybox

Somebody wrote a wiki page saying that Busybox for OpenSolaris would be a good idea.

The corresponding "files" tab is an auto-generated stub. The project never even got as far as suggesting commands to include before Oracle discontinued OpenSolaris.

Verdict: ignore.


uClinux

Long ago a hardware developer named Jeff Dionne put together a nommu Linux distribution, which involved rewriting a lot of command line utilities that relied on features unavailable on nommu hardware.

In 2003 Jeff moved to Japan and handed the project off to people who allowed it to roll to a stop. The website turned into a mess of 404 links, the navigation indexes stopped being updated over a decade ago, and the project's CVS repository suffered a hard drive failure for which there were no backups. The project continued to put out "releases" through 2014 (you have to scroll down in the "news" section to find them, the "HTTP download" section in the nav bar on the left hasn't been updated in over a decade), which were hand-updated tarball snapshots mostly consisting of software from the 1990's. For example the 2014 release still contained ipfwadm, the package which predated ipchains, which predated iptables, which is in the process of being replaced by nftables.

Nevertheless, people still try to use this because (at least until the launch of nommu.org) the project was viewed as the place to discuss, develop, and learn about nommu Linux. The role of uclinux.org as an educational resource kept people coming to it long after it had collapsed as a Linux distro.

Starting around 0.6.0 toybox began to address nommu support with the goal of putting uClinux out of its misery.

An analysis of uClinux-dist-20140504 found 312 package subdirectories under "user".

Taking out the trash

A bunch of packages (inotify-tools, input-event-demon, ipsec-tools, netifd, keepalived, mobile-broadband-provider-info, nuttp, readline, snort, snort-barnyard, socat, sqlite, sysklogd, sysstat, tcl, ubus, uci, udev, unionfs, uqmi, usb_modeswitch, usbutils, util-linux) are hard to evaluate because uclinux has directories for them, but their source isn't actually in the uclinux tree. In some of these the makefiles download a git repo during the build, so I'm assuming you can build the external package if you really care. (Even when I know what these packages do, I'm skipping them because uclinux doesn't actually contain them, and any given snapshot of the build system will bitrot as external web links change over time.)

Other packages are orphaned, meaning they're not mentioned from any Kconfig or Makefiles outside of their directory, so uclinux can't actually build them: mbus is an orphaned i2c test program expecting to run in some sort of hardwired hardware context, mkeccbin is an orphaned "ECC annotated binary file" generator (meaning it's half of a flash writer), wsc_upnp is a "Ralink WPS" driver (some sort of stale wifi chip)...

The majority of the remaining packages are probably not of interest to toybox due to being so obsolete or special purpose they may not actually be of interest to anybody anymore. (This list also includes a lot of special-purpose network back-end stuff that's hard for anybody but datacenter admins to evaluate the current relevance of.)

arj asterisk boottools bpalogin br2684ctl camserv can4linux cgi_generic cgihtml clamav clamsmtp conntrack-tools cramfs crypto-tools cxxtest ddns3-client de2ts-cal debug demo diald discard dnsmasq dnsmasq2 ethattach expat-examples ez-ipupdate fakeidentd fconfig ferret flatfs flthdr freeradius freeswan frob-led frox fswcert game gettyd gnugk haserl horch hostap hping httptunnel ifattach ipchains ipfwadm ipmasqadm ipportfw ipredir ipset iso_client jamvm jffs-tools jpegview jquery-ui kendin-config kismet klaxon kmod l2tpd lcd ledcmd ledcon lha lilo lirc lissa load loattach lpr lrpstat lrzsz mail mbus mgetty microwin ModemManager msntp musicbox nooom null openswan openvpn palmbot pam_* pcmcia-cs playrt plugdaemon pop3proxy potrace qspitest quagga radauth ramimage readprofile rdate routed rrdtool rtc-ds1302 sendip ser sethdlc setmac setserial sgutool sigs siproxd slattach smtpclient snmpd net-snmp snortrules speedtouch squashfs scep sslwrap stp stunnel tcpblast tcpdump tcpwrappers threaddemos tinylogin tinyproxy tpt tripwire unrar unzoo version vpnled w3cam xl2tpd zebra

This stuff is all over the place: arj, lha, rar, and zoo are DOS archivers, ethattach describes itself as just "a network tool", mail is a textmode smtp mailer literally described as "Some kind of mail proggy" in uclinux's kconfig (as opposed to clamsmtp and smtpclient and so on), this gettyd isn't a generic version but specifically a hardwired ppp dialin utility, mgetty isn't a generic version but is combined with "sendfax", hostap is an intersil prism driver, wlan-ng is also an intersil prism driver, null is a program to intentionally dereference a null pointer (in case you needed one), iso_client is a "Demo Application for the USB Device Driver", kendin-config is "for configuring the Micrel Kendin KS8995M over QSPI", speedtouch configures a specific brand of adsl modem, portmap is part of Anfs, ferret, linux-igd, and miniupnp are all upnp packages, lanbypass "can be used to control the LAN bypass switches on the Advantech x86 based hardware platforms", lcd is "test of lcddma device driver" (an out-of-tree Coldfire driver apparently lost to history, the uclinux linux-2.4.x directory has a config symbol for it, but nothing in the code actually _uses_ it...), qspitest is another coldfire thing, mii-tool-fec is "strictly for the FEC Ethernet driver as implemented (and modified) for the uCdimm5272", rtc-ds1302 and rtc-m41t11 are usermode drivers for specific clock chips, stunnel is basically "openssl s_client -quiet -connect", potrace is a bitmap to vector graphic converter, radauth performs command line authentication against a radius server, clamav, klaxon, ferret, l7-protocols, and nessus are very old network security software (it's got a stale snapshot of nmap too), xl2tpd is a PPP over UDP tunnel (rfc 2661), zebra is the package quagga replaced, lilo is the x86-only bootloader that predated grub (and recently discontinued development), lissa is a "framebuffer graphics demo" from 1998, the squashfs package here is the out of tree patches for 2.4 kernels and such before the filesystem was merged upstream (as opposed to the squashfs-new package which is a snapshot of the userspace tool from 2011), load is basically "dd file /dev/spi", version is basically "cat /proc/version", microwin is a port of the WinCE graphics API to Linux, scep is a 2003 implementation of an IETF draft abandoned in 2010, tpt depends on Andrew Morton's 15 year old unmerged "timepegs" kernel patch using the pentium cycle counter, vpnled controls a light that reboots systems (what?), w3cam is a video4linux 1.0 client (v4l2 showed up during 2.5 and support for the old v4l1 was removed in 2.6.38 back in 2011), busybox ate tinylogin over a decade ago, lrpstat is a java network monitor from 2001, lrzsz is xmodem/ymodem/zmodem, msntp and stp implement rfc2030 meaning it overflows in 2036 (the package was last updated in 2000), rdate is rfc 868 meaning it also overflows in 2036 (which is why ntp was invented a few decades back), reiserfsprogs development stopped abruptly after Hans Reiser was convicted of murdering his wife Nina (denying it on the stand and then leading them to the body as part of his plea bargain during sentencing)...
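The 2036 overflow mentioned above falls out of simple arithmetic: both rfc 868 and the rfc 2030 timestamp carry seconds since 1900 in an unsigned 32 bit field. A rough sketch of the math follows; the helper names are made up for illustration, and the conversion assumes the counter has never wrapped (which is exactly the assumption that breaks in 2036).

```c
#include <stdint.h>

/* Seconds from 1900-01-01 to 1970-01-01: 70 years containing 17 leap days,
 * (70*365 + 17) * 86400 = 2208988800. */
#define EPOCH_OFFSET 2208988800LL

/* Convert a 32 bit seconds-since-1900 value to Unix time.
 * Assumes the counter has not wrapped yet. */
int64_t since1900_to_unix(uint32_t secs)
{
  return (int64_t)secs - EPOCH_OFFSET;
}

/* Unix time at which the 32 bit counter wraps back around to zero. */
int64_t rollover_unix_time(void)
{
  return ((int64_t)1 << 32) - EPOCH_OFFSET;
}
```

rollover_unix_time() works out to 2085978496, which is February 7, 2036 at 06:28:16 UTC. A client that does nothing about the wrap will see the clock jump back to 1900 at that moment.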

Seriously, there's a lot of crap in there. It's hard to analyze most of it far enough to prove it _doesn't_ do anything.

Non-toybox programs

The following software may actually still do something intelligible (although the package versions tend to be years out of date), but it's not a direction toybox has chosen to go in.

There are several programming languages (bash, lua, jamvm, tinytcl, perl, python) in there. Maybe someone somewhere wants a 2008 release of a java virtual machine tested to work on nommu systems (jamvm), but it's out of scope for toybox.

A bunch of benchmark programs: cpu, dhrystone, mathtest, nbench, netperf, netpipe, and whetstone.

A bunch of web servers: appWeb, boa, fnord (via tcpserver), goahead, httpd, mini_httpd, and thttpd.

A bunch of shells: msh is a clever (i.e. obfuscated) little shell, nwsh is "new shell" (that's what it called itself in 1999 anyway), sash is another shell with a bunch of builtins (ls, ps, df, cp, date, reboot, and shutdown, this roadmap analyzes it elsewhere), sh is a very old minix shell fork, and tcsh is also a shell.

Also in this category, we have:

dropbear jffs-tools jpegview kexec-tools bind ctorrent iperf iproute2 ip-sentinel iptables kexec nmap oggplay openssl oprofile p7zip pppd pptp play vplay hdparm mp3play at clock mtd-utils mysql logrotate brcfg bridge-utils flashw ebtables etherwake ethtool expect gdb gdbserver hostapd lm_sensors load netflash netstat-nat radvd recover rootloader resolveip rp-pppoe rsyslog rsyslogd samba smbmount squashfs-new squid ssh strace tip uboot-envtools ulogd usbhubctrl vconfig vixie-cron watchdogd wireless_tools wpa_supplicant

An awful lot of those are borderline: play and vplay are wav file audio players, there's oprofile _and_ readprofile (which just reads kernel profiling data from /proc/profile), radvd is a "router advertisement daemon" (ipv6 stateless autoconf), ctorrent is a bittorrent client, lm_sensors is hardware (heat?) monitoring, resolveip is dig only less so, rp-pppoe is ppp over ethernet, ebtables is an ethernet version of iptables (for bridging), their dropbear is from 2012, and that ssh version is from 2011 (which means it's about nine months too _old_ to have the heartbleed bug). There's both ulogd and ulogd2 (no idea why), and pppd is version 2.4 but there's a pppd-2.3 directory also.

Lots of flash stuff: flashw is a flash writer, load is an spi flash loader, netflash writes to flash via tftp, recover is also a reflash daemon intended to come up when the system can't boot, rootloader seems to be another reflash daemon but without dhcp.

Already in roadmap

The following packages contain commands already in the toybox roadmap:

agetty cal cksum cron dhcpcd dhcpcd-new dhcpd dhcp-isc dosfstools e2fsprogs elvis-tiny levee fdisk fileutils ftp ftpd grep hd hwclock inetd init ntp iputils login module-init-tools netcat shutils ntpdate lspci ping procps proftpd rsync shadow stty sysutils telnet telnetd tftp tftpd traceroute unzip wget mawk net-tools

There are some duplicates in there: levee is a tiny vi implementation like elvis-tiny, ntp and ntpdate overlap, etc.

Verdict: We don't really need to do a whole lot special for nommu systems, just get the existing toybox roadmap working on nommu and we're good. The uClinux project can rest in peace.


Requests:

The following additional commands have been requested (and often submitted) by various users. I _really_ need to clean up this section.

Also:

dig freeramdisk getty halt hexdump hwclock klogd modprobe ping ping6 pivot_root poweroff readahead rev sfdisk sudo syslogd taskset telnet telnetd tracepath traceroute unzip usleep vconfig zip free login modinfo unshare netcat help w iwconfig iwlist rdate dos2unix unix2dos catv clear pmap realpath setsid timeout truncate mkswap swapon swapoff count oneit fstype acpi blkid eject pwdx sulogin rfkill bootchartd arp makedevs sysctl killall5 crond crontab deluser last mkpasswd watch blockdev rpm2cpio arping brctl dumpleases fsck tcpsvd tftpd factor fallocate fsfreeze inotifyd lspci nbd-client partprobe strings base64 mix reset hexedit nsenter shred fsync insmod ionice lsmod lsusb rmmod vmstat xxd top iotop lsof ionice compress dhcp dhcpd addgroup delgroup host iconv ip ipcrm ipcs netstat openvt deallocvt iorenice udpsvd adduser microcom tunctl chrt getfattr setfattr kexec ascii crc32 devmem fmt i2cdetect i2cdump i2cget i2cset mcookie prlimit sntp ulimit uuidgen dhcp6 ipaddr iplink iproute iprule iptunnel cd exit toysh bash traceroute6 blkdiscard rtcwake watchdog

Other packages

System administrators have asked what other Linux packages toybox commands replace, so they can annotate alternatives in their package management system.

This section uses the package definitions from Chapter 6 of Linux From Scratch 9.0. Each package lists what we currently replace, pending commands [in square brackets], and what we DON'T plan to implement.

Each "see also" note means the listed package also installs the listed shared libraries. (While toybox contains equivalent functionality to a lot of these shared libraries in its lib/ directory, it does not currently provide a shared library interface.)

Packages toybox plans to provide complete-ish replacements for:

Commentary: toybox init doesn't do runlevels, man and vim are just the relevant commands without the piles of strange overgrowth, and if you want to call a toybox binary by another name you can create a symlink to a symlink. If somebody really wants to argue for "gzexe" or similar, be my guest, but there's a lot of obsolete crap in shadow, coreutils, util-linux...

No idea why LFS is installing inetutils instead of net-tools (which contains arp route ifconfig mii-tool nameif netstat and rarp that toybox does or might implement, and plipconfig slattach that it probably won't.)

Packages toybox plans to provide partial replacements for:

Toybox provides replacements for some binaries from these packages, but there are other useful binaries which this package provides that toybox currently considers out of scope for the project:

Toybox provides several decompressors but compresses to a single format (deflate, a la gzip/zlib). Our e2fsprogs doesn't currently plan to support ext4 or defrag. The "qcc" reference is to a possible successor project to toybox: an external project gluing QEMU's Tiny Code Generator to Fabrice Bellard's old Tiny C Compiler, making a multicall binary that does cc/ld/as for all the targets QEMU supports (then using the LLVM C Backend to compile LLVM itself to C for use as a modern replacement for cfront to bootstrap C++ code). Until then things like objdump -d (requiring target-specific disassembly for an unbounded number of architectures) are out of scope for toybox. (This means drawing the line somewhere between architecture-specific support in file and strace, and including a full assembler for each architecture.)

Packages from LFS ch6 toybox does NOT plan to replace:

That said, we do implement our own zlib and readline replacements, and presumably _could_ export them as library bindings. Plus we provide our own version of a bunch of the section 1 man pages (as command help). Possibly libcap and acl are interesting?

Misc

The kbd package has over a dozen commands, we only implement chvt. The iproute2 package implements over a dozen commands, there's an "ip" in pending but I'm not a fan (ifconfig and route and such should be extended to work properly). We don't implement eudev, but toybox's maintainer created busybox mdev way back when (which replaces it) and plans to do a new one for toybox as soon as we work out what subset is still needed now that devtmpfs is available.