From 5ffa45661c50be1e6d384156be9b83fbed1356b3 Mon Sep 17 00:00:00 2001 From: Rob Landley Date: Mon, 24 Apr 2017 19:01:33 -0500 Subject: Another FAQ, and some related tweaks as long as I was there. --- www/design.html | 17 +++++++----- www/faq.html | 86 +++++++++++++++++++++++++++++++++++++++++++++++++++++---- 2 files changed, 90 insertions(+), 13 deletions(-) diff --git a/www/design.html b/www/design.html index 37838be9..707596bc 100644 --- a/www/design.html +++ b/www/design.html @@ -303,8 +303,8 @@ effort on them.

We depend on C99 and posix-2008 libc features such as the openat() family of functions. We also assume certain "modern" linux kernel behavior such -as large environment sizes (linux commit b6a2fea39318, went into 2.6.22 -released July 2007). In theory this shouldn't prevent us from working on +as large environment sizes (linux commit b6a2fea39318, went into 2.6.22 +released July 2007). In theory this shouldn't prevent us from working on older kernels or other implementations (ala BSD), but we don't police their corner cases.

@@ -316,9 +316,8 @@ in embedded devices for several more years.

Toybox relies on the fact that on any Unix-like platform, pointer and long are always the same size (on both 32 and 64 bit). Pointer and int are _not_ -the same size on 64 bit systems, but pointer and long are.

- -

This is guaranteed by the LP64 memory model, a Unix standard (which Linux +the same size on 64 bit systems, but pointer and long are. +This is guaranteed by the LP64 memory model, a Unix standard (which Linux and MacOS X both implement, and which modern 64 bit processors such as x86-64 were designed for). See the LP64 standard and @@ -335,7 +334,8 @@ platforms like arm, char defaults to signed. This difference can lead to subtle portability bugs, and to avoid them we specify which one we want by feeding the compiler -funsigned-char.

-

The reason to pick "unsigned" is that way we're 8-bit clean by default.

+

The reason to pick "unsigned" is that way char strings are 8-bit clean by +default, which makes UTF-8 support easier.
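A minimal sketch of why the signedness of plain char matters for 8-bit clean string handling. The function names are hypothetical and not from the toybox source; the naive test shows what goes wrong when char is signed and a byte has its high bit set (as every byte of a multi-byte UTF-8 sequence does).

```c
#include <limits.h>

/* Hypothetical helper: 1 if plain char is signed with this compiler and
 * these flags, 0 if it is unsigned (e.g. built with -funsigned-char). */
int plain_char_is_signed(void)
{
  return CHAR_MIN < 0;
}

/* Naive high-bit test. Only correct when plain char is unsigned: with a
 * signed char, c promotes to a negative int for bytes 0x80-0xff, so the
 * comparison is false for every possible byte. */
int high_bit_naive(char c)
{
  return c >= 0x80;
}

/* Portable version: the cast makes the result independent of the
 * platform's default signedness. */
int high_bit_portable(char c)
{
  return (unsigned char)c >= 0x80;
}
```

When char is unsigned everywhere, the naive form behaves like the portable one and the casts become unnecessary.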

Error messages and internationalization:

@@ -373,6 +373,9 @@ of it.)

Locale support isn't currently a goal; that's a presentation layer issue (I.E. a GUI problem).

+

Someday we should probably have translated --help text, but that's a +post-1.0 issue.

+

Shared Libraries

Toybox's policy on shared libraries is that they should never be @@ -486,7 +489,7 @@ varargs), "if (function() != NULL)" is the same as "if (function())",

The goal is to be concise, not cryptic: if you're worried about the code being hard to understand, splitting it to multiple steps on multiple lines is -better than a NOP operation like "!= NULL". A common sign of trying to +better than a NOP operation like "!= NULL". A common sign of trying too hard is nesting ? : three levels deep, sometimes if/else and a temporary variable is just plain easier to read. If you think you need a comment, you may be right.

diff --git a/www/faq.html b/www/faq.html index 6cc7cdb8..c5709eff 100755 --- a/www/faq.html +++ b/www/faq.html @@ -5,6 +5,7 @@
  • Do you capitalize toybox?

  • Why toybox? (What was wrong with busybox?)

  • Why a 7 year support horizon?

  • +
  • Where do I start understanding the toybox source code?

  • @@ -80,18 +81,19 @@ law below the high end systems, and that another 2-3 iterations should cover the useful lifetime of most systems no longer being sold but still in use and potentially being upgraded to new software releases.

    -

    It turns out I missed industry changes in the 1990's that stretched the gap -from low end to high end from 2 cycles to 4 cycles (here's my writeup on that; and _that_ analysis +

    It turns out I missed +industry changes in the 1990's that stretched the gap +from low end to high end from 2 cycles to 4 cycles, and _that_ analysis ignored the switch from PC to smartphone cutting off the R&D air supply of the laptop market. Meanwhile the Moore's Law s-curve started bending -down in 2000 and these days is pretty flat: the drive for faster clock speeds -stumbled -then died, +down in 2000 and these days is pretty flat because the drive for faster clock +speeds stumbled +then died, and the subsequent drive to go wide maxed out around 4x SMP with ~2 megabyte caches for most applications. These days the switch from exponential to linear growth in hardware capabilities is common -knowledge.)

    +knowledge.

    But the 7 year rule of thumb stuck around anyway: if a kernel or libc feature is less than 7 years old, I try to have a build-time configure test @@ -99,4 +101,76 @@ for it and let the functionality cleanly drop out. I also keep old Ubuntu images around in VMs and perform the occasional defconfig build there to see what breaks.

    +

    Where do I start understanding the source code?

    + +

    Toybox is written in C. There are longer writeups of the +design ideas and a code walkthrough, +and the about page summarizes what we're trying to +accomplish, but here's a quick start:

    + +

    Toybox uses the standard three stage configure/make/install +build, in this case "make defconfig; +make; make install". Type "make help" to +see available make targets.

    + +

    The configure stage is copied from the Linux kernel (in the "kconfig" +directory), and saves your selections in the file ".config" at the top +level. The "defconfig" target selects the +maximum sane configuration (enabling all the commands and features that +aren't unfinished, only intended as examples, debug code, etc) and is +probably what you want. You can use "make menuconfig" to manually select +specific commands to include, through an interactive menu (cursor up and +down, enter to descend into a sub-menu, space to select an entry, ? to see +an entry's help text, esc to exit). The menuconfig help text is the +same as the command's --help output.

    + +

    The "make" stage creates a toybox binary (which is stripped, look in +generated/unstripped for the debug versions), and "install" adds a bunch of +symlinks to toybox under the various command names. Toybox determines which +command to run based on the filename, or you can use the "toybox" name in which case the first +argument is the command to run (ala "toybox ls -l"). You can also build +individual commands as standalone executables, ala "make sed cat ls".

    + +

    The main() function is in main.c at the top level, +along with setup plumbing and selecting which command to run this time. +The function toybox_main() implements the "toybox" multiplexer command.
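The name-based dispatch described above can be sketched roughly as follows. This is an illustration only, not the code in main.c: the command table, the stub commands, and the names base_name(), find_cmd(), and dispatch() are all assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical command table entry (not toybox's real structure). */
struct cmd {
  const char *name;
  int (*run)(int argc, char **argv);
};

/* Stub commands that just return distinct codes for demonstration. */
static int cat_cmd(int argc, char **argv) { (void)argc; (void)argv; return 20; }
static int ls_cmd(int argc, char **argv) { (void)argc; (void)argv; return 10; }

static struct cmd cmds[] = {{"cat", cat_cmd}, {"ls", ls_cmd}};

/* Strip leading directories so "/usr/bin/ls" becomes "ls". */
static const char *base_name(const char *path)
{
  const char *s = strrchr(path, '/');

  return s ? s+1 : path;
}

/* Linear search of the table; returns NULL if the name is unknown. */
static struct cmd *find_cmd(const char *name)
{
  size_t i;

  for (i = 0; i < sizeof(cmds)/sizeof(*cmds); i++)
    if (!strcmp(cmds[i].name, name)) return cmds+i;

  return NULL;
}

/* Pick a command from the name we were run as. Returns the command's
 * result, or -1 if nothing matched. */
int dispatch(int argc, char **argv)
{
  const char *name;
  struct cmd *c;

  if (argc < 1) return -1;
  name = base_name(argv[0]);

  /* Invoked as "toybox ls -l": shift so the first argument is the name. */
  if (!strcmp(name, "toybox")) {
    if (argc < 2) return -1;
    argc--;
    argv++;
    name = base_name(argv[0]);
  }

  c = find_cmd(name);

  return c ? c->run(argc, argv) : -1;
}
```

With this shape, a symlink named "ls" pointing at the binary and the invocation "toybox ls" both reach the same function with the same argument layout.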

    + +

    The individual command implementations are under "toys", and are grouped +into categories (mostly based on which standard they come from, posix, lsb, +android...) The "pending" directory contains unfinished commands, and the +"examples" directory contains examples. Commands in those two directories +are _not_ selected by defconfig. (These days the pending directory is mostly +third party submissions that have not yet undergone proper code review.)

    + +

    Common infrastructure shared between commands is under "lib". Most +commands call lib/args.c to parse their command line arguments before calling +the command's own main() function, which uses the option string in +the command's NEWTOY() macro. This is similar to the libc function getopt(), +but more powerful, and is documented at the top of lib/args.c.

    + +

    Most of the actual build/install infrastructure is shell scripts under +"scripts". These populate the "generated" directory with headers +created from other files, which are described +in the code walkthrough. All the +build's temporary files live under generated, including the .o files built +from the .c files (in generated/obj). The "make clean" target deletes that +directory. ("make distclean" also deletes your .config and deletes the +kconfig binaries that process .config.)

    + +

    Each command's file contains all the information for that command, so +adding a command to toybox means adding a single file under "toys". +Usually you start a new command by copying an +existing command file to a new filename +(toys/examples/hello.c, toys/examples/skeleton.c, toys/posix/cat.c, +and toys/posix/true.c have all been used for this purpose) and then replacing +all instances of its old name with the new name (which should match the +new filename), and modifying the help text, argument string, and what the +code does. You might have to "make distclean" before your new command +shows up in defconfig or menuconfig.

    + +

    The toybox test suite lives in the "tests" directory. From the top +level you can "make tests" to test everything, or "make test_sed" to test a +single command's standalone version (which should behave identically, +but that's why we test).

    + -- cgit v1.2.3