-rw-r--r-- | www/design.html | 17 |
-rwxr-xr-x | www/faq.html | 86 |
2 files changed, 90 insertions, 13 deletions
diff --git a/www/design.html b/www/design.html index 37838be9..707596bc 100644 --- a/www/design.html +++ b/www/design.html @@ -303,8 +303,8 @@ effort on them.</p> <p>We depend on C99 and posix-2008 libc features such as the openat() family of functions. We also assume certain "modern" linux kernel behavior such -as large environment sizes (linux commit b6a2fea39318, went into 2.6.22 -released July 2007). In theory this shouldn't prevent us from working on +as large environment sizes (<a href=https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b6a2fea39318>linux commit b6a2fea39318</a>, went into 2.6.22 +released <a href=faq.html#support_horizon>July 2007</a>). In theory this shouldn't prevent us from working on older kernels or other implementations (ala BSD), but we don't police their corner cases.</p> @@ -316,9 +316,8 @@ in embedded devices for several more years.</p> <p>Toybox relies on the fact that on any Unix-like platform, pointer and long are always the same size (on both 32 and 64 bit). Pointer and int are _not_ -the same size on 64 bit systems, but pointer and long are.</p> - -<p>This is guaranteed by the LP64 memory model, a Unix standard (which Linux +the same size on 64 bit systems, but pointer and long are. +This is guaranteed by the LP64 memory model, a Unix standard (which Linux and MacOS X both implement, and which modern 64 bit processors such as x86-64 were <a href=http://www.pagetable.com/?p=6>designed for</a>). See <a href=http://www.unix.org/whitepapers/64bit.html>the LP64 standard</a> and @@ -335,7 +334,8 @@ platforms like arm, char defaults to signed. 
This difference can lead to subtle portability bugs, and to avoid them we specify which one we want by feeding the compiler -funsigned-char.</p> -<p>The reason to pick "unsigned" is that way we're 8-bit clean by default.</p> +<p>The reason to pick "unsigned" is that way char strings are 8-bit clean by +default, which makes UTF-8 support easier.</p> <p><h3>Error messages and internationalization:</h3></p> @@ -373,6 +373,9 @@ of it.)</p> <p>Locale support isn't currently a goal; that's a presentation layer issue (I.E. a GUI problem).</p> +<p>Someday we should probably have translated --help text, but that's a +post-1.0 issue.</p> + <p><h3>Shared Libraries</h3></p> <p>Toybox's policy on shared libraries is that they should never be @@ -486,7 +489,7 @@ varargs), "if (function() != NULL)" is the same as "if (function())", <p>The goal is to be concise, not cryptic: if you're worried about the code being hard to understand, splitting it to multiple steps on multiple lines is -better than a NOP operation like "!= NULL". A common sign of trying to +better than a NOP operation like "!= NULL". A common sign of trying too hard is nesting ? : three levels deep, sometimes if/else and a temporary variable is just plain easier to read. If you think you need a comment, you may be right.</p> diff --git a/www/faq.html b/www/faq.html index 6cc7cdb8..c5709eff 100755 --- a/www/faq.html +++ b/www/faq.html @@ -5,6 +5,7 @@ <li><h2><a href="#capitalize">Do you capitalize toybox?</a></h2></li> <li><h2><a href="#why_toybox">Why toybox? 
(What was wrong with busybox?)</a></h2></li> <li><h2><a href="#support_horizon">Why a 7 year support horizon?</a></h2></li> +<li><h2><a href="#code">Where do I start understanding the toybox source code?</a></h2></li> </ul> <a name="capitalize" /> @@ -80,18 +81,19 @@ law below the high end systems, and that another 2-3 iterations should cover the useful lifetime of most systems no longer being sold but still in use and potentially being upgraded to new software releases.</p> -<p>It turns out I missed industry changes in the 1990's that stretched the gap -from low end to high end from 2 cycles to 4 cycles (<a href=http://landley.net/notes-2011.html#26-06-2011>here's my writeup on that</a>; and _that_ analysis +<p>It turns out <a href=http://landley.net/notes-2011.html#26-06-2011>I missed +industry changes</a> in the 1990's that stretched the gap +from low end to high end from 2 cycles to 4 cycles, and _that_ analysis ignored the switch from PC to smartphone cutting off the R&D air supply of the laptop market. Meanwhile the Moore's Law s-curve started bending -down in 2000 and these days is pretty flat: the drive for faster clock speeds -<a href=http://www.anandtech.com/show/613>stumbled</a> -then <a href=http://www.pcworld.com/article/118603/article.html>died</a>, +down in 2000 and these days is pretty flat because the drive for faster clock +speeds <a href=http://www.anandtech.com/show/613>stumbled</a> +then <a href=http://www.pcworld.com/article/118603/article.html>died</a>, and the subsequent drive to go wide maxed out around 4x SMP with ~2 megabyte caches for most applications. 
These days the switch from exponential to linear growth in hardware capabilities is <a href=https://www.cnet.com/news/end-of-moores-law-its-not-just-about-physics/>common</a> -<a href=http://www.acm.org/articles/people-of-acm/2016/david-patterson>knowledge</a>.)</p> +<a href=http://www.acm.org/articles/people-of-acm/2016/david-patterson>knowledge</a>.</p> <p>But the 7 year rule of thumb stuck around anyway: if a kernel or libc feature is less than 7 years old, I try to have a build-time configure test @@ -99,4 +101,76 @@ for it and let the functionality cleanly drop out. I also keep old Ubuntu images around in VMs and perform the occasional defconfig build there to see what breaks.</p> +<h2><a name="code" />Where do I start understanding the source code?</h2> + +<p>Toybox is written in C. There are longer writeups of the +<a href=design.html>design ideas</a> and a <a href=code.html>code walkthrough</a>, +and the <a href=about.html>about page</a> summarizes what we're trying to +accomplish, but here's a quick start:</p> + +<p>Toybox uses the standard three stage configure/make/install +<a href=code.html#building>build</a>, in this case "<b>make defconfig; +make; make install</b>". Type "<b>make help</b>" to +see available make targets.</p> + +<p><b>The configure stage is copied from the Linux kernel</b> (in the "kconfig" +directory), and saves your selections in the file ".config" at the top +level. The "defconfig" target selects the +maximum sane configuration (enabling all the commands and features that +aren't unfinished, only intended as examples, debug code, etc) and is +probably what you want. You can use "make menuconfig" to manually select +specific commands to include, through an interactive menu (cursor up and +down, enter to descend into a sub-menu, space to select an entry, ? to see +an entry's help text, esc to exit). 
The menuconfig help text is the +same as the command's --help output.</p> + +<p><b>The "make" stage creates a toybox binary</b> (which is stripped, look in +generated/unstripped for the debug versions), and "install" adds a bunch of +symlinks to toybox under the various command names. Toybox determines which +command to run based on the filename, or you can use the "toybox" name in which case the first +argument is the command to run (ala "toybox ls -l"). <b>You can also build +individual commands as standalone executables</b>, ala "make sed cat ls".</p> + +<p><b>The main() function is in main.c at the top level</b>, +along with setup plumbing and selecting which command to run this time. +The function toybox_main() implements the "toybox" multiplexer command.</p> + +<p><b>The individual command implementations are under "toys"</b>, and are grouped +into categories (mostly based on which standard they come from, posix, lsb, +android...) The "pending" directory contains unfinished commands, and the +"examples" directory contains examples. Commands in those two directories +are _not_ selected by defconfig. (These days pending directory is mostly +third party submissions that have not yet undergone proper code review.)</p> + +<p><b>Common infrastructure shared between commands is under "lib"</b>. Most +commands call lib/args.c to parse their command line arguments before calling +the command's own main() function, which uses the option string in +the command's NEWTOY() macro. This is similar to the libc function getopt(), +but more powerful, and is documented at the top of lib/args.c.</p> + +<p>Most of the actual <b>build/install infrastructure is shell scripts under +"scripts"</b>. <b>These populate the "generated" directory</b> with headers +created from other files, which are <a href=code.html#generated>described</a> +in the code walkthrough. All the +build's temporary files live under generated, including the .o files built +from the .c files (in generated/obj). 
The "make clean" target deletes that +directory. ("make distclean" also deletes your .config and deletes the +kconfig binaries that process .config.)</p> + +<p>Each command's file contains all the information for that command, so +<b>adding a command to toybox means adding a single file under "toys"</b>. +Usually you <a href=code.html#adding>start a new command</a> by copying an +existing command file to a new filename +(toys/examples/hello.c, toys/examples/skeleton.c, toys/posix/cat.c, +and toys/posix/true.c have all been used for this purpose) and then replacing +all instances of its old name with the new name (which should match the +new filename), and modifying the help text, argument string, and what the +code does. You might have to "make distclean" before you new command +shows up in defconfig or menuconfig.</p> + +<p><b>The toybox test suite lives in the "tests" directory</b>. From the top +level you can "make tests" to test everything, or "make test_sed" test a +single command's standalone version (which should behave identically) +but that's why we test.</p> + <!--#include file="footer.html" --> |
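The design.html hunks above lean on two C-level assumptions: that long and pointer are the same size (the LP64 guarantee), and that building with -funsigned-char keeps char strings 8-bit clean. The following is a minimal sketch of how those assumptions could be checked. It is not code from the toybox tree, and the names `pointer_survives_long`, `byte_value`, and `char_is_unsigned` are hypothetical, chosen only for illustration.

```c
#include <limits.h>

/* Compile-time check (C99 compatible): the array size is -1, and the build
 * fails, if long and pointer differ in size. Hypothetical typedef name. */
typedef char long_is_pointer_sized[sizeof(long) == sizeof(void *) ? 1 : -1];

/* Round-trip a pointer through long. Returns 1 if the value survives,
 * which is what the same-size assumption is relied on for. */
int pointer_survives_long(void *p)
{
  long l = (long)p;

  return (void *)l == p;
}

/* Read the first byte of a string as 0..255 no matter how plain char is
 * signed. With -funsigned-char the cast is redundant; without it, a byte
 * such as 0xc3 from a UTF-8 sequence would otherwise read as negative. */
int byte_value(const char *s)
{
  return (unsigned char)*s;
}

/* Returns 1 when plain char is unsigned in this build (for example when
 * compiled with -funsigned-char), 0 when it is signed. */
int char_is_unsigned(void)
{
  return CHAR_MIN == 0;
}
```

Calling `byte_value("\xc3\xa9")` yields 0xc3 on any build, while `char_is_unsigned()` reports which default the compiler actually used, so a quick comparison with and without -funsigned-char shows the difference the patch describes.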