From 21374adab1bef2321a7bc58554c021b703b786e9 Mon Sep 17 00:00:00 2001 From: Rob Landley Date: Tue, 28 Oct 2014 23:43:58 -0500 Subject: Fill out rest of help text for sed. --- toys/pending/sed.c | 127 ++++++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 112 insertions(+), 15 deletions(-) diff --git a/toys/pending/sed.c b/toys/pending/sed.c index 00ea69c4..2feeaa66 100644 --- a/toys/pending/sed.c +++ b/toys/pending/sed.c @@ -1,4 +1,4 @@ -/* sed.c - stream editor +/* sed.c - stream editor. Thing that does s/// and other stuff. * * Copyright 2014 Rob Landley * @@ -26,14 +26,15 @@ config SED help usage: sed [-inr] [-e SCRIPT]...|SCRIPT [-f SCRIPT_FILE]... [FILE...] - Stream editor. Apply one or more editing SCRIPTs to each line of each line - of input (from FILE or stdin) producing output (by default to stdout). + Stream editor. Apply one or more editing SCRIPTs to each line of input + (from FILE or stdin) producing output (by default to stdout). -e add SCRIPT to list -f add contents of SCRIPT_FILE to list -i Edit each file in place. -n No default output. (Use the p command to output matched lines.) -r Use extended regular expression syntax. + -s Treat input files separately (implied by -i) A SCRIPT is a series of one or more COMMANDs separated by newlines or semicolons. All -e SCRIPTs are concatenated together as if separated @@ -41,16 +42,19 @@ config SED If no -e or -f SCRIPTs are specified, the first argument is the SCRIPT. Each COMMAND may be preceded by an address which limits the command to - run only on the specified lines: + apply only to the specified line(s). Commands without an address apply to + every line. Addresses are of the form: - [ADDRESS[,ADDRESS]]COMMAND + [ADDRESS[,ADDRESS]]COMMAND The ADDRESS may be a decimal line number (starting at 1), a /regular expression/ within a pair of forward slashes, or the character "$" which - matches the last line of input. A single address matches one line, a pair - of comma separated addresses match everything from the first address to - the second address (inclusive). If both addresses are regular expressions, - more than one range of lines in each file can match. + matches the last line of input. (In -s or -i mode this matches the last + line of each file, otherwise just the last line of the last file.) A single + address matches one line, a pair of comma separated addresses match + everything from the first address to the second address (inclusive). If + both addresses are regular expressions, more than one range of lines in + each file can match. REGULAR EXPRESSIONS in sed are started and ended by the same character (traditionally / but anything except a backslash or a newline works). @@ -60,17 +64,110 @@ config SED (above) require the first delimeter to be escaped with a backslash when it isn't a forward slash (to distinguish it from the COMMANDs below). - Each COMMAND starts with a single character, which may be followed by - additional data depending on the COMMAND: + Sed mostly operates on individual lines one at a time. It reads each line, + processes it, and either writes it to the output or discards it before + reading the next line. Sed can remember one additional line in a separate + buffer (using the h, H, g, G, and x commands), and can read the next line + of input early (using the n and N command), but other than that command + scripts operate on individual lines of text. - rwbrstwy:{ + Each COMMAND starts with a single character. The following commands take + no arguments: - s search and replace + { Start a new command block, continuing until a corresponding "}". + Command blocks may nest. If the block has an address, commands within + the block are only run for lines within the block's address range. - The search and replace syntax + } End command block (this command cannot have an address) + + d Delete this line and move on to the next one + (ignores remaining COMMANDs) + + D Delete one line of input and restart command SCRIPT (same as "d" + unless you've glued lines together with "N" or similar) + + g Get remembered line (overwriting current line) + + G Get remembered line (appending to current line) + + h Remember this line (overwriting remembered line) + + H Remember this line (appending to remembered line, if any) + + l Print this line, escaping \abfrtv (but leaving \n as a newline), + using octal escapes for other nonprintable characters, and + wrapping lines to terminal width with a backslash and newline + + n Print default output and read next line, replacing current line + (If no next line available, quit processing script) + + N Append next line of input to this line, separated by a newline + (This advances the line counter for address matching and "=", if no + next line available quit processing script without default output) + + p Print this line + + P Print this line up to first newline (from "N") + + q Quit (print default output, no more commands processed or lines read) + + x Exchange this line with remembered line (overwrite in both directions) + + = Print the current line number (followed by a newline) + + The following commands (may) take an argument. ("b", "s", "t", "T", "y" + and ":" may be ended with semicolons, the rest eat at least one line.) + + a [text] Append text to output before attempting to read next line, + if text ends with unescaped "\" append next line of script + + b [label] Branch, jumps to :label (or with no label, to end of SCRIPT) + + c [text] Delete current address range and print text instead, + if text ends with unescaped "\" append next line of script + + i [text] Print text, if text ends with unescaped "\" append next + line of script + + r [file] Append contents of file to output before attempting to read + next line. + + s/S/R/F Search for regex S, replace matched text with R using flags F. + The first character after the "s" (anything but newline or + backslash) is the delimiter, escape with \ to use normally. + + The replacement text may contain "&" to substitute the matched + text (escape it with backslash for a literal &), or \1 through + \9 to substitute a parenthetical subexpression in the regex. + You can also use the normal backslash escapes such as \n and + a backslash at the end of the line appends the next line. + + The flags are: + + [0-9] A number, substitute only that occurrence of pattern + g Global, substitute all occurrences of pattern + p Print the line if match was found and replaced + w [file] Write (append) line to file if match replaced + + t [label] Test, jump to :label only if an "s" command found a match in + this line since last test (replacing with same text counts) + + T [label] Test false, jump only if "s" hasn't found a match. + + w [file] Write (append) line to file + + y/old/new/ Change each character in 'old' to corresponding character + in 'new' (with standard backslash escapes, delimiter can be + any repeated character except \ or \n) + + : [label] Labeled target for jump commands + + # Comment, ignore rest of this line of SCRIPT Deviations from posix: we allow extended regular expressions with -r, - editing in place with -i, printf escapes in text, semicolons after. + editing in place with -i, separate with -s, printf escapes in text, line + continuations, semicolons after all commands, 2-address anywhere an + address is allowed, "T" command. */ #define FOR_sed -- cgit v1.2.3