changeset 1247:d630b3be752f draft

Document some of the new temporary files in generated/, add anchor tags.
author Rob Landley <rob@landley.net>
date Wed, 09 Apr 2014 08:30:09 -0500
parents 04c69f3c7c9e
children 407357afa07f
files www/code.html
diffstat 1 files changed, 106 insertions(+), 75 deletions(-)
--- a/www/code.html	Wed Apr 09 07:57:08 2014 -0500
+++ b/www/code.html	Wed Apr 09 08:30:09 2014 -0500
@@ -129,7 +129,7 @@
 </ul>
 
 <a name="adding" />
-<p><h1>Adding a new command</h1></p>
+<p><h1><a href="#adding">Adding a new command</a></h1></p>
 <p>To add a new command to toybox, add a C file implementing that command under
 the toys directory.  No other files need to be modified; the build extracts
 all the information it needs (such as command line arguments) from specially
@@ -225,7 +225,7 @@
 <a href="#lib_args">get_optflags()</a> for details.</p></li>
 </ul>
 
-<a name="headers" /><h2>Headers.</h2>
+<a name="headers" /><h2><a href="#headers">Headers.</a></h2>
 
 <p>Commands generally don't have their own headers. If it's common code
 it can live in lib/, if it isn't put it in the command's .c file. (The line
@@ -252,7 +252,7 @@
 just use the constant inline with a comment explaining what it is. (A
 #define that's only used once isn't really helping.)</p>
 
-<p><a name="top" /><h2>Top level directory.</h2></p>
+<p><a name="top" /><h1><a href="#top">Top level directory.</a></h1></p>
 
 <p>This directory contains global infrastructure.</p>
 
@@ -460,13 +460,14 @@
 to build into toybox (thus generating a .config file), and by
 scripts/config2help.py to create generated/help.h.</p>
 
-<h3>Temporary files:</h3>
+<a name="generated" />
+<h1><a href="#generated">Temporary files:</a></h1>
 
 <p>There is one temporary file in the top level source directory:</p>
 <ul>
 <li><p><b>.config</b> - Configuration file generated by kconfig, indicating
 which commands (and options to commands) are currently enabled.  Used
-to make generated/config.h and determine which toys/*.c files to build.</p>
+to make generated/config.h and determine which toys/*/*.c files to build.</p>
 
 <p>You can create a human readable "miniconfig" version of this file using
 <a href=http://landley.net/aboriginal/new_platform.html#miniconfig>these
@@ -474,93 +475,121 @@
 </li>
 </ul>
 
-<a name="generated" />
-<p>The "generated/" directory contains files generated from other source code
-in toybox.  All of these files can be recreated by the build system, although
-some (such as generated/help.h) are shipped in release versions to reduce
-environmental dependencies (I.E. so you don't need python on your build
-system).</p>
+<p><h2>Directory generated/</h2></p>
+
+<p>The remaining temporary files live in the "generated/" directory,
+which is for files generated at build time from other source files.</p>
 
 <ul>
+<li><p><b>generated/Config.in</b> - Included from the top level Config.in,
+contains one or more configuration entries for each command.</p>
+
+<p>Each command has a configuration entry with an upper case version of
+the command name. Options to commands start with the command
+name followed by an underscore and the option name. Global options are attached
+to the "toybox" command, and thus use the prefix "TOYBOX_".  This organization
+is used by scripts/cfg2files to select which toys/*/*.c files to compile for a
+given .config.</p>
+
+<p>A command with multiple names (or multiple similar commands implemented in
+the same .c file) should have config symbols prefixed with the name of their
+C file. I.E. config symbol prefixes are NEWTOY() names. If OLDTOY() names
+have config symbols they must be options (symbols with an underscore and
+suffix) to the NEWTOY() name. (See generated/toylist.h)</p>
+</li>
+
 <li><p><b>generated/config.h</b> - list of CFG_SYMBOL and USE_SYMBOL() macros,
 generated from .config by a sed invocation in the top level Makefile.</p>
 
 <p>CFG_SYMBOL is a compile time constant set to 1 for enabled symbols and 0 for
-disabled symbols.  This allows the use of normal if() statements to remove
+disabled symbols. This allows the use of normal if() statements to remove
 code at compile time via the optimizer's dead code elimination (which removes
-from the binary any code that cannot be reached).  This saves space without
+from the binary any code that cannot be reached). This saves space without
 cluttering the code with #ifdefs or leading to configuration dependent build
-breaks.  (See the 1992 Usenix paper
+breaks. (See the 1992 Usenix paper
 <a href=http://doc.cat-v.org/henry_spencer/ifdef_considered_harmful.pdf>#ifdef
 Considered Harmful</a> for more information.)</p>
 
 <p>USE_SYMBOL(code) evaluates to the code in parentheses when the symbol
-is enabled, and nothing when the symbol is disabled.  This can be used
+is enabled, and nothing when the symbol is disabled. This can be used
 for things like varargs or variable declarations which can't always be
-eliminated by a simple test on CFG_SYMBOL.  Note that
+eliminated by a simple test on CFG_SYMBOL. Note that
 (unlike CFG_SYMBOL) this is really just a variant of #ifdef, and can
-still result in configuration dependent build breaks.  Use with caution.</p>
+still result in configuration dependent build breaks. Use with caution.</p>
 </li>
-</ul>
-
-<p><h2>Directory toys/</h2></p>
 
-<h3>toys/Config.in</h3>
-
-<p>Included from the top level Config.in, contains one or more
-configuration entries for each command.</p>
-
-<p>Each command has a configuration entry matching the command name (although
-configuration symbols are uppercase and command names are lower case).
-Options to commands start with the command name followed by an underscore and
-the option name.  Global options are attached to the "toybox" command,
-and thus use the prefix "TOYBOX_".  This organization is used by
-scripts/cfg2files to select which toys/*.c files to compile for a given
-.config.</p>
+<li><p><b>generated/flags.h</b> - FLAG_? macros indicating which command
+line options were seen. The option parsing in lib/args.c sets bits in
+toys.optflags, which can be tested by anding with the appropriate FLAG_
+macro. (Bare longopts, which have no corresponding short option, will
+have the longopt name after FLAG_. All others use the single letter short
+option.)</p>
 
-<p>A command with multiple names (or multiple similar commands implemented in
-the same .c file) should have config symbols prefixed with the name of their
-C file.  I.E. config symbol prefixes are NEWTOY() names.  If OLDTOY() names
-have config symbols they're options (symbols with an underscore and suffix)
-to the NEWTOY() name.  (See toys/toylist.h)</p>
+<p>To get the appropriate macros for your command, #define FOR_commandname
+before #including toys.h. To switch macro sets (because you have an OLDTOY()
+with different options in the same .c file), #define CLEANUP_oldcommand
+and also #define FOR_newcommand, then #include "generated/flags.h" to switch.
+</p>
+</li>
 
-<h3>toys/toylist.h</h3>
-<p>The first half of this file prototypes all the structures to hold
-global variables for each command, and puts them in toy_union.  These
-prototypes are only included if the macro NEWTOY isn't defined (in which
-case NEWTOY is defined to a default value that produces function
-prototypes).</p>
+<li><p><b>generated/globals.h</b> -
+Declares structures to hold the contents of each command's GLOBALS(),
+and combines them into "global_union this". (Yes, the name was
+chosen to piss off C++ developers who think that C
+is merely a subset of C++, not a language in its own right.)</p>
 
-<p>The second half of this file lists all the commands in alphabetical
-order, along with their command line arguments and install location.
-Each command has an appropriate configuration guard so only the commands that
-are enabled wind up in the list.</p>
-
-<p>The first time this header is #included, it defines structures and
-produces function prototypes for the commands in the toys directory.</p>
-
+<p>The union reuses the same memory for each command's global struct:
+since only one command's globals are in use at any given time, collapsing
+them together saves space. The headers #define TT to the appropriate
+"this.commandname", so you can refer to the current command's global
+variables out of "this" as TT.variablename.</p>
 
-<p>The first time it's included, it defines structures and produces function
-prototypes.
-  This
-is used to initialize toy_list in main.c, and later in that file to initialize
-NEED_OPTIONS (to figure out whether the command like parsing logic is needed),
-and to put the help entries in the right order in toys/help.c.</p>
+<p>The globals start zeroed, and the first few are filled out by the 
+lib/args.c argument parsing code called from main.c.</p>
+</li>
 
-<h3>toys/help.h</h3>
-
-<p>#defines two help text strings for each command: a single line
+<li><p><b>generated/help.h</b> -
+#defines two help text strings for each command: a single line
 command_help and an additional command_help_long.  This is used by help_main()
 in toys/help.c to display help for commands.</p>
 
-<p>Although this file is generated from Config.in help entries by
-scripts/config2help.py, it's shipped in release tarballs so you don't need
-python on the build system.  (If you check code out of source control, or
-modify Config.in, then you'll need python installed to rebuild it.)</p>
+<p>This file is created by scripts/make.sh, which compiles scripts/config2help.c
+into the binary generated/config2help, and then runs it against the top
+level .config and Config.in files to extract the help text from each config
+entry and collate together dependent options.</p>
+
+<p>This file contains help text for all commands, regardless of current
+configuration, but only the ones currently enabled in the .config file
+wind up in the help_data[] array, and only the enabled dependent options
+have their help text added to the command they depend on.</p>
+</li>
+
+<li><p><b>generated/newtoys.h</b> - 
+All the NEWTOY() and OLDTOY() macros in alphabetical order,
+each of which should be inside the appropriate USE_ macro. (Ok, not _quite_
+alphabetical order: the "toybox" multiplexer is always the first entry.)</p>
 
-<p>This file contains help for all commands, regardless of current
-configuration, but only the currently enabled ones are entered into help_data[]
-in toys/help.c.</p>
+<p>By #defining NEWTOY() to various things before #including this file,
+it may be used to create function prototypes (in toys.h), initialize the
+toy_list array (in main.c, the alphabetical order lets toy_find() do a
+binary search), initialize the help_data array (in lib/help.c), and so on.
+(It's even used to initialize the NEED_OPTIONS macro, which has a 1 or 0
+for each command using command line option parsing, ORed together.
+This allows compile-time dead code elimination to remove the whole of
+lib/args.c if nothing currently enabled is using it.)</p>
+
+<p>Each NEWTOY and OLDTOY macro contains the command name, command line
+option string (telling lib/args.c how to parse command line options for
+this command), recommended install location, and miscellaneous data such
+as whether this command should retain root permissions if installed suid.</p>
+</li>
+
+<li><p><b>generated/oldtoys.h</b> - Macros with the command line option parsing
+string for each NEWTOY. This allows an OLDTOY that's just an alias for an
+existing command to refer to the existing option string instead of
+having to repeat it.</p>
+</li>
+</ul>
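<p>The difference between CFG_SYMBOL and USE_SYMBOL() described under
generated/config.h can be sketched as follows. This is an illustration only:
the DEMO_VERBOSE and DEMO_COLOR symbols, struct demo_state, and demo_step()
are hypothetical names, and the four #defines stand in for what a real
generated/config.h would supply.</p>

```c
#include <stdio.h>

// Hypothetical stand-ins for lines generated/config.h would contain:
// one enabled symbol (DEMO_VERBOSE) and one disabled symbol (DEMO_COLOR).
#define CFG_DEMO_VERBOSE 1
#define USE_DEMO_VERBOSE(...) __VA_ARGS__
#define CFG_DEMO_COLOR 0
#define USE_DEMO_COLOR(...)

// A declaration can't be removed by if(), so it needs USE_SYMBOL().
struct demo_state {
  int count;
  USE_DEMO_VERBOSE(int lines_printed;)
  USE_DEMO_COLOR(int color;)
};

int demo_step(struct demo_state *state)
{
  state->count++;

  // Both if() bodies are parsed and type checked in every configuration;
  // the optimizer then drops the branch whose condition is constant 0.
  if (CFG_DEMO_VERBOSE) {
    printf("count=%d\n", state->count);
    // This member only exists when the symbol is enabled, so the access
    // has to be wrapped as well. That coupling is the "use with caution".
    USE_DEMO_VERBOSE(state->lines_printed++;)
  }
  if (CFG_DEMO_COLOR) printf("color output enabled\n");

  return state->count;
}
```

<p>With DEMO_COLOR disabled, the color member is never declared, while the
if (CFG_DEMO_COLOR) branch is still compiled and then discarded.</p>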
 
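<p>The redefine-then-include trick described under generated/newtoys.h can be
sketched in a single file. Assumptions: the DEMO_NEWTOYS macro stands in for
the contents of the generated header (a real build would #include the file
instead), NEWTOY() takes only two arguments here, and the command names and
option strings are illustrative rather than copied from toybox.</p>

```c
#include <string.h>

// Stand-in for generated/newtoys.h: "toybox" first, the rest alphabetical.
#define DEMO_NEWTOYS \
  NEWTOY(toybox, NULL) \
  NEWTOY(cat, "u") \
  NEWTOY(echo, "en") \
  NEWTOY(true, NULL)

struct demo_toy {
  const char *name, *options;
};

// First expansion: build the command table.
#define NEWTOY(name, opts) { #name, opts },
static const struct demo_toy demo_list[] = { DEMO_NEWTOYS };
#undef NEWTOY

// Second expansion: OR together "does this command have an option string?"
#define NEWTOY(name, opts) (opts) ||
int demo_need_options(void)
{
  return DEMO_NEWTOYS 0;
}
#undef NEWTOY

// Entry 0 is checked by hand; the sorted remainder is binary searched.
const struct demo_toy *demo_find(const char *name)
{
  int bottom = 1, top = (int)(sizeof(demo_list) / sizeof(*demo_list)) - 1;

  if (!strcmp(name, demo_list[0].name)) return demo_list;
  while (bottom <= top) {
    int middle = bottom + (top - bottom) / 2;
    int result = strcmp(name, demo_list[middle].name);

    if (!result) return demo_list + middle;
    if (result < 0) top = middle - 1;
    else bottom = middle + 1;
  }

  return 0;
}
```

<p>Adding a command to the list updates the table and the option check
together, because both are expansions of the same list.</p>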
 <a name="lib">
 <h2>Directory lib/</h2>
@@ -648,14 +677,16 @@
 list, does not allocate anything.</p></li></ul>
 </ul>
 
-<b>Trivia questions:</b>
+<b>List code trivia questions:</b>
 
 <ul>
 <li><p><b>Why do arg_list and double_list contain a char * payload instead of
 a void *?</b> - Because you always have to typecast a void * to use it, and
-typecasting a char * does no harm. Thus having it default to the most common
-pointer type saves a few typecasts (strings are the most common payload),
-and doesn't hurt anything otherwise.</p>
+typecasting a char * does no harm. Since strings are the most common
+payload, and doing math on the pointer ala
+"(type *)(ptr+sizeof(thing)+sizeof(otherthing))" requires ptr to be char *
+anyway (at least according to the C standard), defaulting to char * saves
+a typecast.</p>
 </li>
 
 <li><p><b>Why do the names ->str, ->arg, and ->data differ?</b> - To force
@@ -664,10 +695,10 @@
 
 <li><p><b>Why does llist_pop() take a void * instead of void **?</b> -
 because the stupid compiler complains about "type punned pointers" when
-you typecast and dereference ont he same line,
+you typecast and dereference on the same line,
 due to insane FSF developers hardwiring limitations of their optimizer
 into gcc's warning system. Since C automatically typecasts any other
-pointer _down_ to a void *, the current code works fine. It's sad that it
+pointer type to and from void *, the current code works fine. It's sad that it
 won't warn you if you forget the &, but the code crashes pretty quickly in
 that case.</p></li>
 
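<p>The pointer math mentioned in the char * payload answer might look like the
following sketch. Every name here (demo_node, demo_head, demo_tag, demo_tail,
demo_pack(), demo_checksum()) is hypothetical and not part of lib/llist.c; the
point is only that byte offsets can be added to a char * payload directly.</p>

```c
#include <stdlib.h>
#include <string.h>

// Hypothetical list node with a char * payload, modeled on arg_list.
struct demo_node {
  struct demo_node *next;
  char *arg;
};

// Hypothetical records packed back to back in the payload.
struct demo_head { unsigned length, kind; };
struct demo_tag { unsigned id; };
struct demo_tail { unsigned checksum; };

// Build a payload: demo_head, then demo_tag, then demo_tail.
char *demo_pack(unsigned kind, unsigned id, unsigned checksum)
{
  struct demo_head head = {
    (unsigned)(sizeof(struct demo_tag) + sizeof(struct demo_tail)), kind
  };
  struct demo_tag tag = { id };
  struct demo_tail tail = { checksum };
  char *buf = malloc(sizeof(head) + sizeof(tag) + sizeof(tail));

  if (!buf) return 0;
  memcpy(buf, &head, sizeof(head));
  memcpy(buf + sizeof(head), &tag, sizeof(tag));
  memcpy(buf + sizeof(head) + sizeof(tag), &tail, sizeof(tail));

  return buf;
}

// Because ->arg is already char *, adding byte offsets needs no extra cast;
// only the final conversion to the record type does.
unsigned demo_checksum(struct demo_node *node)
{
  struct demo_tail *tail = (struct demo_tail *)(node->arg
    + sizeof(struct demo_head) + sizeof(struct demo_tag));

  return tail->checksum;
}
```

<p>Had the payload been a void *, the addition itself would first need a cast
to char *, since standard C does not define arithmetic on void *.</p>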
@@ -1077,8 +1108,8 @@
 of traversal order, which is neither depth first nor breadth first but
 instead a sort of FIFO order required by the ls standard.</p>
 
-<a name="#toys">
-<h2>Directory toys/</h2>
+<a name="toys">
+<h1><a href="#toys">Directory toys/</a></h1>
 
 <p>This directory contains command implementations. Each command is a single
 self-contained file. Adding a new command involves adding a single