-rwxr-xr-x | www/code.html | 183 |
1 files changed, 107 insertions, 76 deletions
diff --git a/www/code.html b/www/code.html index 64ee98f5..a5ffa155 100755 --- a/www/code.html +++ b/www/code.html @@ -129,7 +129,7 @@ files generated from other parts of the source code.</li> </ul> <a name="adding" /> -<p><h1>Adding a new command</h1></p> +<p><h1><a href="#adding">Adding a new command</a></h1></p> <p>To add a new command to toybox, add a C file implementing that command under the toys directory. No other files need to be modified; the build extracts all the information it needs (such as command line arguments) from specially @@ -225,7 +225,7 @@ as appropriate by the time this function is called. (See <a href="#lib_args">get_optflags()</a> for details.</p></li> </ul> -<a name="headers" /><h2>Headers.</h2> +<a name="headers" /><h2><a href="#headers">Headers.</a></h2> <p>Commands generally don't have their own headers. If it's common code it can live in lib/, if it isn't put it in the command's .c file. (The line @@ -252,7 +252,7 @@ that haven't changed since the 1990's, it's ok to #define them yourself or just use the constant inline with a comment explaining what it is. (A #define that's only used once isn't really helping.)</p> -<p><a name="top" /><h2>Top level directory.</h2></p> +<p><a name="top" /><h1><a href="#top">Top level directory.</a></h1></p> <p>This directory contains global infrastructure.</p> @@ -460,13 +460,14 @@ install path prepended.</p></li> to build into toybox (thus generating a .config file), and by scripts/config2help.py to create generated/help.h.</p> -<h3>Temporary files:</h3> +<a name="generated" /> +<h1><a href="#generated">Temporary files:</a></h1> <p>There is one temporary file in the top level source directory:</p> <ul> <li><p><b>.config</b> - Configuration file generated by kconfig, indicating which commands (and options to commands) are currently enabled. 
Used -to make generated/config.h and determine which toys/*.c files to build.</p> +to make generated/config.h and determine which toys/*/*.c files to build.</p> <p>You can create a human readable "miniconfig" version of this file using <a href=http://landley.net/aboriginal/new_platform.html#miniconfig>these @@ -474,93 +475,121 @@ instructions</a>.</p> </li> </ul> -<a name="generated" /> -<p>The "generated/" directory contains files generated from other source code -in toybox. All of these files can be recreated by the build system, although -some (such as generated/help.h) are shipped in release versions to reduce -environmental dependencies (I.E. so you don't need python on your build -system).</p> +<p><h2>Directory generated/</h2></p> + +<p>The remaining temporary files live in the "generated/" directory, +which is for files generated at build time from other source files.</p> <ul> +<li><p><b>generated/Config.in</b> - Included from the top level Config.in, +contains one or more configuration entries for each command.</p> + +<p>Each command has a configuration entry with an upper case version of +the command name. Options to commands start with the command +name followed by an underscore and the option name. Global options are attached +to the "toybox" command, and thus use the prefix "TOYBOX_". This organization +is used by scripts/cfg2files to select which toys/*/*.c files to compile for a +given .config.</p> + +<p>A command with multiple names (or multiple similar commands implemented in +the same .c file) should have config symbols prefixed with the name of their +C file. I.E. config symbol prefixes are NEWTOY() names. If OLDTOY() names +have config symbols they must be options (symbols with an underscore and +suffix) to the NEWTOY() name. 
(See generated/toylist.h)</p> +</li> + <li><p><b>generated/config.h</b> - list of CFG_SYMBOL and USE_SYMBOL() macros, generated from .config by a sed invocation in the top level Makefile.</p> <p>CFG_SYMBOL is a comple time constant set to 1 for enabled symbols and 0 for -disabled symbols. This allows the use of normal if() statements to remove +disabled symbols. This allows the use of normal if() statements to remove code at compile time via the optimizer's dead code elimination (which removes -from the binary any code that cannot be reached). This saves space without +from the binary any code that cannot be reached). This saves space without cluttering the code with #ifdefs or leading to configuration dependent build -breaks. (See the 1992 Usenix paper +breaks. (See the 1992 Usenix paper <a href=http://doc.cat-v.org/henry_spencer/ifdef_considered_harmful.pdf>#ifdef Considered Harmful</a> for more information.)</p> <p>USE_SYMBOL(code) evaluates to the code in parentheses when the symbol -is enabled, and nothing when the symbol is disabled. This can be used +is enabled, and nothing when the symbol is disabled. This can be used for things like varargs or variable declarations which can't always be -eliminated by a simple test on CFG_SYMBOL. Note that +eliminated by a simple test on CFG_SYMBOL. Note that (unlike CFG_SYMBOL) this is really just a variant of #ifdef, and can -still result in configuration dependent build breaks. Use with caution.</p> +still result in configuration dependent build breaks. Use with caution.</p> </li> -</ul> - -<p><h2>Directory toys/</h2></p> - -<h3>toys/Config.in</h3> - -<p>Included from the top level Config.in, contains one or more -configuration entries for each command.</p> - -<p>Each command has a configuration entry matching the command name (although -configuration symbols are uppercase and command names are lower case). -Options to commands start with the command name followed by an underscore and -the option name. 
Global options are attached to the "toybox" command, -and thus use the prefix "TOYBOX_". This organization is used by -scripts/cfg2files to select which toys/*.c files to compile for a given -.config.</p> -<p>A command with multiple names (or multiple similar commands implemented in -the same .c file) should have config symbols prefixed with the name of their -C file. I.E. config symbol prefixes are NEWTOY() names. If OLDTOY() names -have config symbols they're options (symbols with an underscore and suffix) -to the NEWTOY() name. (See toys/toylist.h)</p> - -<h3>toys/toylist.h</h3> -<p>The first half of this file prototypes all the structures to hold -global variables for each command, and puts them in toy_union. These -prototypes are only included if the macro NEWTOY isn't defined (in which -case NEWTOY is defined to a default value that produces function -prototypes).</p> - -<p>The second half of this file lists all the commands in alphabetical -order, along with their command line arguments and install location. -Each command has an appropriate configuration guard so only the commands that -are enabled wind up in the list.</p> - -<p>The first time this header is #included, it defines structures and -produces function prototypes for the commands in the toys directory.</p> +<li><p><b>generated/flags.h</b> - FLAG_? macros indicating which command +line options were seen. The option parsing in lib/args.c sets bits in +toys.optflags, which can be tested by anding with the appropriate FLAG_ +macro. (Bare longopts, which have no corresponding short option, will +have the longopt name after FLAG_. All others use the single letter short +option.)</p> + +<p>To get the appropriate macros for your command, #define FOR_commandname +before #including toys.h. To switch macro sets (because you have an OLDTOY() +with different options in the same .c file), #define CLEANUP_oldcommand +and also #define FOR_newcommand, then #include "generated/flags.h" to switch. 
+</p> +</li> +<li><p><b>generated/globals.h</b> - +Declares structures to hold the contents of each command's GLOBALS(), +and combines them into "global_union this". (Yes, the name was +chosen to piss off C++ developers who think that C +is merely a subset of C++, not a language in its own right.)</p> -<p>The first time it's included, it defines structures and produces function -prototypes. - This -is used to initialize toy_list in main.c, and later in that file to initialize -NEED_OPTIONS (to figure out whether the command like parsing logic is needed), -and to put the help entries in the right order in toys/help.c.</p> +<p>The union reuses the same memory for each command's global struct: +since only one command's globals are in use at any given time, collapsing +them together saves space. The headers #define TT to the appropriate +"this.commandname", so you can refer to the current command's global +variables out of "this" as TT.variablename.</p> -<h3>toys/help.h</h3> +<p>The globals start zeroed, and the first few are filled out by the +lib/args.c argument parsing code called from main.c.</p> +</li> -<p>#defines two help text strings for each command: a single line +<li><p><b>toys/help.h</b> - +#defines two help text strings for each command: a single line command_help and an additinal command_help_long. This is used by help_main() in toys/help.c to display help for commands.</p> -<p>Although this file is generated from Config.in help entries by -scripts/config2help.py, it's shipped in release tarballs so you don't need -python on the build system. 
(If you check code out of source control, or -modify Config.in, then you'll need python installed to rebuild it.)</p> +<p>This file is created by scripts/make.sh, which compiles scripts/config2help.c +into the binary generated/config2help, and then runs it against the top +level .config and Config.in files to extract the help text from each config +entry and collate together dependent options.</p> + +<p>This file contains help text for all commands, regardless of current +configuration, but only the ones currently enabled in the .config file +wind up in the help_data[] array, and only the enabled dependent options +have their help text added to the command they depend on.</p> +</li> -<p>This file contains help for all commands, regardless of current -configuration, but only the currently enabled ones are entered into help_data[] -in toys/help.c.</p> +<li><p><b>generated/newtoys.h</b> - +All the NEWTOY() and OLDTOY() macros in alphabetical order, +each of which should be inside the appropriate USE_ macro. (Ok, not _quite_ +alphabetical orer: the "toybox" multiplexer is always the first entry.)</p> + +<p>By #definining NEWTOY() to various things before #including this file, +it may be used to create function prototypes (in toys.h), initialize the +toy_list array (in main.c, the alphabetical order lets toy_find() do a +binary search), initialize the help_data array (in lib/help.c), and so on. +(It's even used to initialize the NEED_OPTIONS macro, which is has a 1 or 0 +for each command using command line option parsing, ORed together. 
+This allows compile-time dead code elimination to remove the whole of +lib/args.c if nothing currently enabled is using it.)<p> + +<p>Each NEWTOY and OLDTOY macro contains the command name, command line +option string (telling lib/args.c how to parse command line options for +this command), recommended install location, and miscelaneous data such +as whether this command should retain root permissions if installed suid.</p> +</li> + +<li><p><b>toys/oldtoys.h</b> - Macros with the command line option parsing +string for each NEWTOY. This allows an OLDTOY that's just an alias for an +existing command to refer to the existing option string instead of +having to repeat it.</p> +</li> +</ul> <a name="lib"> <h2>Directory lib/</h2> @@ -648,14 +677,16 @@ struct double_list *new) - append existing struct double_list to list, does not allocate anything.</p></li></ul> </ul> -<b>Trivia questions:</b> +<b>List code trivia questions:</b> <ul> <li><p><b>Why do arg_list and double_list contain a char * payload instead of a void *?</b> - Because you always have to typecast a void * to use it, and -typecasting a char * does no harm. Thus having it default to the most common -pointer type saves a few typecasts (strings are the most common payload), -and doesn't hurt anything otherwise.</p> +typecasting a char * does no harm. 
Since strings are the most common +payload, and doing math on the pointer ala +"(type *)(ptr+sizeof(thing)+sizeof(otherthing))" requires ptr to be char * +anyway (at least according to the C standard), defaulting to char * saves +a typecast.</p> </li> <li><p><b>Why do the names ->str, ->arg, and ->data differ?</b> - To force @@ -664,10 +695,10 @@ be bad, and _failing_ to free(node->arg) leaks memory.</p></li> <li><p><b>Why does llist_pop() take a void * instead of void **?</b> - because the stupid compiler complains about "type punned pointers" when -you typecast and dereference ont he same line, +you typecast and dereference on the same line, due to insane FSF developers hardwiring limitations of their optimizer into gcc's warning system. Since C automatically typecasts any other -pointer _down_ to a void *, the current code works fine. It's sad that it +pointer type to and from void *, the current code works fine. It's sad that it won't warn you if you forget the &, but the code crashes pretty quickly in that case.</p></li> @@ -1077,8 +1108,8 @@ from elsewhere in the program. This gives ls -lR manual control of traversal order, which is neither depth first nor breadth first but instead a sort of FIFO order requried by the ls standard.</p> -<a name="#toys"> -<h2>Directory toys/</h2> +<a name="toys"> +<h1><a href="#toys">Directory toys/</a></h1> <p>This directory contains command implementations. Each command is a single self-contained file. Adding a new command involves adding a single |