2 files changed, 90 insertions, 13 deletions
diff --git a/www/design.html b/www/design.html
index 37838be9..707596bc 100644
--- a/www/design.html
+++ b/www/design.html
@@ -303,8 +303,8 @@ effort on them.</p>
 
 <p>We depend on C99 and posix-2008 libc features such as the openat() family of
 functions. We also assume certain "modern" linux kernel behavior such
-as large environment sizes (linux commit b6a2fea39318, went into 2.6.22
-released July 2007). In theory this shouldn't prevent us from working on
+as large environment sizes (<a href=https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b6a2fea39318>linux commit b6a2fea39318</a>, went into 2.6.22
+released <a href=faq.html#support_horizon>July 2007</a>). In theory this shouldn't prevent us from working on
 older kernels or other implementations (ala BSD), but we don't police their
 corner cases.</p>
 
@@ -316,9 +316,8 @@ in embedded devices for several more years.</p>
 
 <p>Toybox relies on the fact that on any Unix-like platform, pointer and long
 are always the same size (on both 32 and 64 bit). Pointer and int are _not_
-the same size on 64 bit systems, but pointer and long are.</p>
-
-<p>This is guaranteed by the LP64 memory model, a Unix standard (which Linux
+the same size on 64 bit systems, but pointer and long are.
+This is guaranteed by the LP64 memory model, a Unix standard (which Linux
 and MacOS X both implement, and which modern 64 bit processors such as
 x86-64 were <a href=http://www.pagetable.com/?p=6>designed for</a>).  See
 <a href=http://www.unix.org/whitepapers/64bit.html>the LP64 standard</a> and
@@ -335,7 +334,8 @@ platforms like arm, char defaults to signed.  This difference can lead to
 subtle portability bugs, and to avoid them we specify which one we want by
 feeding the compiler -funsigned-char.</p>
 
-<p>The reason to pick "unsigned" is that way we're 8-bit clean by default.</p>
+<p>The reason to pick "unsigned" is that this way char strings are 8-bit clean by
+default, which makes UTF-8 support easier.</p>
 
 <p><h3>Error messages and internationalization:</h3></p>
 
@@ -373,6 +373,9 @@ of it.)</p>
 <p>Locale support isn't currently a goal; that's a presentation layer issue
 (I.E. a GUI problem).</p>
 
+<p>Someday we should probably have translated --help text, but that's a
+post-1.0 issue.</p>
+
 <p><h3>Shared Libraries</h3></p>
 
 <p>Toybox's policy on shared libraries is that they should never be
@@ -486,7 +489,7 @@ varargs), "if (function() != NULL)" is the same as "if (function())",
 <p>The goal is to be
 concise, not cryptic: if you're worried about the code being hard to
 understand, splitting it to multiple steps on multiple lines is
-better than a NOP operation like "!= NULL". A common sign of trying to
+better than a NOP operation like "!= NULL". A common sign of trying too
 hard is nesting ? : three levels deep, sometimes if/else and a temporary
 variable is just plain easier to read. If you think you need a comment,
 you may be right.</p>
diff --git a/www/faq.html b/www/faq.html
index 6cc7cdb8..c5709eff 100755
--- a/www/faq.html
+++ b/www/faq.html
@@ -5,6 +5,7 @@
 <li><h2><a href="#capitalize">Do you capitalize toybox?</a></h2></li>
 <li><h2><a href="#why_toybox">Why toybox? (What was wrong with busybox?)</a></h2></li>
 <li><h2><a href="#support_horizon">Why a 7 year support horizon?</a></h2></li>
+<li><h2><a href="#code">Where do I start understanding the toybox source code?</a></h2></li>
 </ul>
 
 <a name="capitalize" />
@@ -80,18 +81,19 @@ law below the high end systems, and that another 2-3 iterations should cover
 the useful lifetime of most systems no longer being sold but still in use and
 potentially being upgraded to new software releases.</p>
 
-<p>It turns out I missed industry changes in the 1990's that stretched the gap
-from low end to high end from 2 cycles to 4 cycles (<a href=http://landley.net/notes-2011.html#26-06-2011>here's my writeup on that</a>; and _that_ analysis
+<p>It turns out <a href=http://landley.net/notes-2011.html#26-06-2011>I missed
+industry changes</a> in the 1990's that stretched the gap
+from low end to high end from 2 cycles to 4 cycles, and _that_ analysis
 ignored the switch from PC to smartphone cutting off the R&D air supply of the
 laptop market.  Meanwhile the Moore's Law s-curve started bending
-down in 2000 and these days is pretty flat: the drive for faster clock speeds
-<a href=http://www.anandtech.com/show/613>stumbled</a>
-then <a href=http://www.pcworld.com/article/118603/article.html>died</a>,
+down in 2000 and these days is pretty flat because the drive for faster clock
+speeds <a href=http://www.anandtech.com/show/613>stumbled</a>
+then <a href=http://www.pcworld.com/article/118603/article.html>died</a>, and
 the subsequent drive to go wide maxed out around 4x SMP with ~2 megabyte
 caches for most applications. These days the switch from exponential to
 linear growth in hardware capabilities is
 <a href=https://www.cnet.com/news/end-of-moores-law-its-not-just-about-physics/>common</a>
-<a href=http://www.acm.org/articles/people-of-acm/2016/david-patterson>knowledge</a>.)</p>
+<a href=http://www.acm.org/articles/people-of-acm/2016/david-patterson>knowledge</a>.</p>
 
 <p>But the 7 year rule of thumb stuck around anyway: if a kernel or libc
 feature is less than 7 years old, I try to have a build-time configure test
@@ -99,4 +101,76 @@ for it and let the functionality cleanly drop out. I also keep old Ubuntu
 images around in VMs and perform the occasional defconfig build there to
 see what breaks.</p>
 
+<h2><a name="code" />Where do I start understanding the source code?</h2>
+
+<p>Toybox is written in C. There are longer writeups of the
+<a href=design.html>design ideas</a> and a <a href=code.html>code walkthrough</a>,
+and the <a href=about.html>about page</a> summarizes what we're trying to
+accomplish, but here's a quick start:</p>
+
+<p>Toybox uses the standard three stage configure/make/install
+<a href=code.html#building>build</a>, in this case "<b>make defconfig;
+make; make install</b>". Type "<b>make help</b>" to
+see available make targets.</p>
+
+<p><b>The configure stage is copied from the Linux kernel</b> (in the "kconfig"
+directory), and saves your selections in the file ".config" at the top
+level. The "defconfig" target selects the
+maximum sane configuration (enabling all the commands and features that
+aren't unfinished, only intended as examples, debug code, etc) and is
+probably what you want. You can use "make menuconfig" to manually select
+specific commands to include, through an interactive menu (cursor up and
+down, enter to descend into a sub-menu, space to select an entry, ? to see
+an entry's help text, esc to exit). The menuconfig help text is the
+same as the command's --help output.</p>
+
+<p><b>The "make" stage creates a toybox binary</b> (which is stripped, look in
+generated/unstripped for the debug versions), and "install" adds a bunch of
+symlinks to toybox under the various command names. Toybox determines which
+command to run based on the filename, or you can use the "toybox" name in which case the first
+argument is the command to run (ala "toybox ls -l"). <b>You can also build
+individual commands as standalone executables</b>, ala "make sed cat ls".</p>
+
+<p><b>The main() function is in main.c at the top level</b>,
+along with setup plumbing and selecting which command to run this time.
+The function toybox_main() implements the "toybox" multiplexer command.</p>
+
+<p><b>The individual command implementations are under "toys"</b>, and are grouped
+into categories (mostly based on which standard they come from: posix, lsb,
+android...). The "pending" directory contains unfinished commands, and the
+"examples" directory contains examples. Commands in those two directories
+are _not_ selected by defconfig. (These days the pending directory is mostly
+third party submissions that have not yet undergone proper code review.)</p>
+
+<p><b>Common infrastructure shared between commands is under "lib"</b>. Most
+commands have lib/args.c parse their command line arguments before the
+command's own main() function is called, using the option string in
+the command's NEWTOY() macro. This is similar to the libc function getopt(),
+but more powerful, and is documented at the top of lib/args.c.</p>
+
+<p>Most of the actual <b>build/install infrastructure is shell scripts under
+"scripts"</b>. <b>These populate the "generated" directory</b> with headers
+created from other files, which are <a href=code.html#generated>described</a>
+in the code walkthrough. All the
+build's temporary files live under generated, including the .o files built
+from the .c files (in generated/obj). The "make clean" target deletes that
+directory. ("make distclean" also deletes your .config and deletes the
+kconfig binaries that process .config.)</p>
+
+<p>Each command's file contains all the information for that command, so
+<b>adding a command to toybox means adding a single file under "toys"</b>.
+Usually you <a href=code.html#adding>start a new command</a> by copying an
+existing command file to a new filename
+(toys/examples/hello.c, toys/examples/skeleton.c, toys/posix/cat.c,
+and toys/posix/true.c have all been used for this purpose) and then replacing
+all instances of its old name with the new name (which should match the
+new filename), and modifying the help text, argument string, and what the
+code does. You might have to "make distclean" before your new command
+shows up in defconfig or menuconfig.</p>
+
+<p><b>The toybox test suite lives in the "tests" directory</b>. From the top
+level you can "make tests" to test everything, or "make test_sed" to test a
+single command's standalone version (which should behave identically,
+but that's why we test).</p>
+
 <!--#include file="footer.html" -->