aboutsummaryrefslogtreecommitdiff
path: root/toys/pending/sed.c
diff options
context:
space:
mode:
Diffstat (limited to 'toys/pending/sed.c')
-rw-r--r--toys/pending/sed.c127
1 files changed, 112 insertions, 15 deletions
diff --git a/toys/pending/sed.c b/toys/pending/sed.c
index 00ea69c4..2feeaa66 100644
--- a/toys/pending/sed.c
+++ b/toys/pending/sed.c
@@ -1,4 +1,4 @@
-/* sed.c - stream editor
+/* sed.c - stream editor. Thing that does s/// and other stuff.
*
* Copyright 2014 Rob Landley <rob@landley.net>
*
@@ -26,14 +26,15 @@ config SED
help
usage: sed [-inr] [-e SCRIPT]...|SCRIPT [-f SCRIPT_FILE]... [FILE...]
- Stream editor. Apply one or more editing SCRIPTs to each line of each line
- of input (from FILE or stdin) producing output (by default to stdout).
+ Stream editor. Apply one or more editing SCRIPTs to each line of input
+ (from FILE or stdin) producing output (by default to stdout).
-e add SCRIPT to list
-f add contents of SCRIPT_FILE to list
-i Edit each file in place.
-n No default output. (Use the p command to output matched lines.)
-r Use extended regular expression syntax.
+ -s Treat input files separately (implied by -i)
A SCRIPT is a series of one or more COMMANDs separated by newlines or
semicolons. All -e SCRIPTs are concatenated together as if separated
@@ -41,16 +42,19 @@ config SED
If no -e or -f SCRIPTs are specified, the first argument is the SCRIPT.
Each COMMAND may be preceded by an address which limits the command to
- run only on the specified lines:
+ apply only to the specified line(s). Commands without an address apply to
+ every line. Addresses are of the form:
- [ADDRESS[,ADDRESS]]COMMAND
+ [ADDRESS[,ADDRESS]]COMMAND
The ADDRESS may be a decimal line number (starting at 1), a /regular
expression/ within a pair of forward slashes, or the character "$" which
- matches the last line of input. A single address matches one line, a pair
- of comma separated addresses match everything from the first address to
- the second address (inclusive). If both addresses are regular expressions,
- more than one range of lines in each file can match.
+ matches the last line of input. (In -s or -i mode this matches the last
+ line of each file, otherwise just the last line of the last file.) A single
+ address matches one line, a pair of comma separated addresses match
+ everything from the first address to the second address (inclusive). If
+ both addresses are regular expressions, more than one range of lines in
+ each file can match.
REGULAR EXPRESSIONS in sed are started and ended by the same character
(traditionally / but anything except a backslash or a newline works).
@@ -60,17 +64,110 @@ config SED
(above) require the first delimeter to be escaped with a backslash when
it isn't a forward slash (to distinguish it from the COMMANDs below).
- Each COMMAND starts with a single character, which may be followed by
- additional data depending on the COMMAND:
+ Sed mostly operates on individual lines one at a time. It reads each line,
+ processes it, and either writes it to the output or discards it before
+ reading the next line. Sed can remember one additional line in a separate
+ buffer (using the h, H, g, G, and x commands), and can read the next line
+ of input early (using the n and N command), but other than that command
+ scripts operate on individual lines of text.
- rwbrstwy:{
+ Each COMMAND starts with a single character. The following commands take
+ no arguments:
- s search and replace
+ { Start a new command block, continuing until a corresponding "}".
+ Command blocks may nest. If the block has an address, commands within
+ the block are only run for lines within the block's address range.
- The search and replace syntax
+ } End command block (this command cannot have an address)
+
+ d Delete this line and move on to the next one
+ (ignores remaining COMMANDs)
+
+ D Delete one line of input and restart command SCRIPT (same as "d"
+ unless you've glued lines together with "N" or similar)
+
+ g Get remembered line (overwriting current line)
+
+ G Get remembered line (appending to current line)
+
+ h Remember this line (overwriting remembered line)
+
+ H Remember this line (appending to remembered line, if any)
+
+ l Print this line, escaping \abfrtv (but leaving \n as a newline),
+ using octal escapes for other nonprintable characters, and
+ wrapping lines to terminal width with a backslash and newline
+
+ n Print default output and read next line, replacing current line
+ (If no next line available, quit processing script)
+
+ N Append next line of input to this line, separated by a newline
+ (This advances the line counter for address matching and "=", if no
+ next line available quit processing script without default output)
+
+ p Print this line
+
+ P Print this line up to first newline (from "N")
+
+ q Quit (print default output, no more commands processed or lines read)
+
+ x Exchange this line with remembered line (overwrite in both directions)
+
+ = Print the current line number (followed by a newline)
+
+ The following commands (may) take an argument. ("b", "s", "t", "T", "y"
+ and ":" may be ended with semicolons, the rest eat at least one line.)
+
+ a [text] Append text to output before attempting to read next line,
+ if text ends with unescaped "\" append next line of script
+
+ b [label] Branch, jumps to :label (or with no label, to end of SCRIPT)
+
+ c [text] Delete current address range and print text instead,
+ if text ends with unescaped "\" append next line of script
+
+ i [text] Print text, if text ends with unescaped "\" append next
+ line of script
+
+ r [file] Append contents of file to output before attempting to read
+ next line.
+
+ s/S/R/F Search for regex S, replace matched text with R using flags F.
+ The first character after the "s" (anything but newline or
+ backslash) is the delimiter, escape with \ to use normally.
+
+ The replacement text may contain "&" to substitute the matched
+ text (escape it with backslash for a literal &), or \1 through
+ \9 to substitute a parenthetical subexpression in the regex.
+ You can also use the normal backslash escapes such as \n and
+ a backslash at the end of the line appends the next line.
+
+ The flags are:
+
+ [0-9] A number, substitute only that occurrence of pattern
+ g Global, substitute all occurrences of pattern
+ p Print the line if match was found and replaced
+ w [file] Write (append) line to file if match replaced
+
+ t [label] Test, jump to :label only if an "s" command found a match in
+ this line since last test (replacing with same text counts)
+
+ T [label] Test false, jump only if "s" hasn't found a match.
+
+ w [file] Write (append) line to file
+
+ y/old/new/ Change each character in 'old' to corresponding character
+ in 'new' (with standard backslash escapes, delimiter can be
+ any repeated character except \ or \n)
+
+ : [label] Labeled target for jump commands
+
+ # Comment, ignore rest of this line of SCRIPT
Deviations from posix: we allow extended regular expressions with -r,
- editing in place with -i, printf escapes in text, semicolons after.
+ editing in place with -i, separate with -s, printf escapes in text, line
+ continuations, semicolons after all commands, 2-address anywhere an
+ address is allowed, "T" command.
*/
#define FOR_sed