changeset 1537:a6ef79b31829 draft

Fill out rest of help text for sed.
author Rob Landley <rob@landley.net>
date Tue, 28 Oct 2014 23:43:58 -0500
parents 1d996b0a11c0
children 8bc715741481
files toys/pending/sed.c
diffstat 1 files changed, 112 insertions(+), 15 deletions(-)
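
The help text added in the patch below describes addresses, the "s" command with its flags, and the remembered-line (hold) buffer. As a rough illustration only, the sketch here runs a few of those commands through whatever `sed` is on the PATH. It assumes that binary behaves as the help text describes (GNU sed does for these cases). The sample input and the variable names are made up for the example and are not part of the commit.

```shell
#!/bin/sh
# Sketch only: exercises a few commands from the help text using the "sed"
# found on PATH (assumed to behave as described; not necessarily toybox sed).

# Hypothetical four-line sample input.
input='one
two
three
four'

# Address range: with -n (no default output), print only lines 2 through 3.
range=$(printf '%s\n' "$input" | sed -n '2,3p')

# "$" address: delete the last line of input.
nolast=$(printf '%s\n' "$input" | sed '$d')

# s/S/R/F with "&" (the matched text) and the g flag: bracket every "o".
subst=$(printf '%s\n' "$input" | sed 's/o/[&]/g')

# Remembered line: h on line 1 stores it, G on the last line appends it
# (separated by a newline), and p prints the combined result.
held=$(printf '%s\n' "$input" | sed -n '1h;$G;$p')

printf '%s\n' "$range" "$nolast" "$subst" "$held"
```

The last command relies on semicolons separating commands within one SCRIPT, which the help text lists among the deviations from posix.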
--- a/toys/pending/sed.c	Mon Oct 27 19:20:31 2014 -0500
+++ b/toys/pending/sed.c	Tue Oct 28 23:43:58 2014 -0500
@@ -1,4 +1,4 @@
-/* sed.c - stream editor
+/* sed.c - stream editor. Thing that does s/// and other stuff.
  *
  * Copyright 2014 Rob Landley <rob@landley.net>
  *
@@ -26,14 +26,15 @@
   help
     usage: sed [-inr] [-e SCRIPT]...|SCRIPT [-f SCRIPT_FILE]... [FILE...]
 
-    Stream editor. Apply one or more editing SCRIPTs to each line of each line
-    of input (from FILE or stdin) producing output (by default to stdout).
+    Stream editor. Apply one or more editing SCRIPTs to each line of input
+    (from FILE or stdin) producing output (by default to stdout).
 
     -e	add SCRIPT to list
     -f	add contents of SCRIPT_FILE to list
     -i	Edit each file in place.
     -n	No default output. (Use the p command to output matched lines.)
     -r	Use extended regular expression syntax.
+    -s  Treat input files separately (implied by -i)
 
     A SCRIPT is a series of one or more COMMANDs separated by newlines or
     semicolons. All -e SCRIPTs are concatenated together as if separated
@@ -41,16 +42,19 @@
     If no -e or -f SCRIPTs are specified, the first argument is the SCRIPT.
 
     Each COMMAND may be preceded by an address which limits the command to
-    run only on the specified lines:
+    apply only to the specified line(s). Commands without an address apply to
+    every line. Addresses are of the form:
 
-    [ADDRESS[,ADDRESS]]COMMAND
+      [ADDRESS[,ADDRESS]]COMMAND
 
     The ADDRESS may be a decimal line number (starting at 1), a /regular
     expression/ within a pair of forward slashes, or the character "$" which
-    matches the last line of input. A single address matches one line, a pair
-    of comma separated addresses match everything from the first address to
-    the second address (inclusive). If both addresses are regular expressions,
-    more than one range of lines in each file can match.
+    matches the last line of input. (In -s or -i mode this matches the last
+    line of each file, otherwise just the last line of the last file.) A single
+    address matches one line, a pair of comma separated addresses match
+    everything from the first address to the second address (inclusive). If
+    both addresses are regular expressions, more than one range of lines in
+    each file can match.
 
     REGULAR EXPRESSIONS in sed are started and ended by the same character
     (traditionally / but anything except a backslash or a newline works).
@@ -60,17 +64,110 @@
     (above) require the first delimeter to be escaped with a backslash when
     it isn't a forward slash (to distinguish it from the COMMANDs below).
 
-    Each COMMAND starts with a single character, which may be followed by
-    additional data depending on the COMMAND:
+    Sed mostly operates on individual lines one at a time. It reads each line,
+    processes it, and either writes it to the output or discards it before
+    reading the next line. Sed can remember one additional line in a separate
+    buffer (using the h, H, g, G, and x commands), and can read the next line
+    of input early (using the n and N command), but other than that command
+    scripts operate on individual lines of text.
+
+    Each COMMAND starts with a single character. The following commands take
+    no arguments:
+
+      {  Start a new command block, continuing until a corresponding "}".
+         Command blocks may nest. If the block has an address, commands within
+         the block are only run for lines within the block's address range.
+
+      }  End command block (this command cannot have an address)
+
+      d  Delete this line and move on to the next one
+         (ignores remaining COMMANDs)
+
+      D  Delete one line of input and restart command SCRIPT (same as "d"
+         unless you've glued lines together with "N" or similar)
+
+      g  Get remembered line (overwriting current line)
+
+      G  Get remembered line (appending to current line)
+
+      h  Remember this line (overwriting remembered line)
+
+      H  Remember this line (appending to remembered line, if any)
+
+      l  Print this line, escaping \abfrtv (but leaving \n as a newline),
+         using octal escapes for other nonprintable characters, and
+         wrapping lines to terminal width with a backslash and newline
+
+      n  Print default output and read next line, replacing current line
+         (If no next line available, quit processing script)
+
+      N  Append next line of input to this line, separated by a newline
+         (This advances the line counter for address matching and "=", if no
+         next line available quit processing script without default output)
+
+      p  Print this line
+
+      P  Print this line up to first newline (from "N")
+
+      q  Quit (print default output, no more commands processed or lines read)
+
+      x  Exchange this line with remembered line (overwrite in both directions)
 
-    rwbrstwy:{
+      =  Print the current line number (followed by a newline)
+
+    The following commands (may) take an argument. ("b", "s", "t", "T", "y"
+    and ":" may be ended with semicolons, the rest eat at least one line.)
+
+      a [text]   Append text to output before attempting to read next line,
+                 if text ends with unescaped "\" append next line of script
+
+      b [label]  Branch, jumps to :label (or with no label, to end of SCRIPT)
+
+      c [text]   Delete current address range and print text instead,
+                 if text ends with unescaped "\" append next line of script
+
+      i [text]   Print text, if text ends with unescaped "\" append next
+                 line of script
+
+      r [file]   Append contents of file to output before attempting to read
+                 next line.
+
+      s/S/R/F    Search for regex S, replace matched text with R using flags F.
+                 The first character after the "s" (anything but newline or
+                 backslash) is the delimiter, escape with \ to use normally.
 
-    s  search and replace
+                 The replacement text may contain "&" to substitute the matched
+                 text (escape it with backslash for a literal &), or \1 through
+                 \9 to substitute a parenthetical subexpression in the regex.
+                 You can also use the normal backslash escapes such as \n and
+                 a backslash at the end of the line appends the next line.
+
+                 The flags are:
+
+                 [0-9]    A number, substitute only that occurrence of pattern
+                 g        Global, substitute all occurrences of pattern
+                 p        Print the line if match was found and replaced
+                 w [file] Write (append) line to file if match replaced
 
-    The search and replace syntax
+      t [label]  Test, jump to :label only if an "s" command found a match in
+                 this line since last test (replacing with same text counts)
+
+      T [label]  Test false, jump only if "s" hasn't found a match.
+
+      w [file]   Write (append) line to file
+
+      y/old/new/ Change each character in 'old' to corresponding character
+                 in 'new' (with standard backslash escapes, delimiter can be
+                 any repeated character except \ or \n)
+
+      : [label]  Labeled target for jump commands
+
+      #  Comment, ignore rest of this line of SCRIPT
 
     Deviations from posix: we allow extended regular expressions with -r,
-    editing in place with -i, printf escapes in text, semicolons after.
+    editing in place with -i, separate with -s, printf escapes in text, line
+    continuations, semicolons after all commands, 2-address anywhere an
+    address is allowed, "T" command.
 */
 
 #define FOR_sed