What is the actual meaning of "-n" in sed? - regex

According to http://linux.about.com/od/commands/l/blcmdl1_sed.htm
suppress automatic printing of pattern space
I've tested with and without -n, and sed produces the same result.
I don't understand what "space" means here.

Sed has two places to store text: pattern space and hold space. Pattern space is where each line is put to be processed by sed commands; hold space is an auxiliary place to put some text you may want to use later. You probably will use only pattern space.
Before sed processes a line, it puts the line in the pattern space. Then sed applies all commands (such as s///) to the pattern space and, by default, prints the resulting text from the pattern space. Suppose we have a file myfile with a line like:
The quick brown fox jumps over the lazy dog.
We run the following command:
sed 's/fox/coati/;s/dog/dingo/' myfile
Sed will apply s/fox/coati/ and then s/dog/dingo/ to each line of the file; in this case, the only one we showed above. When sed reads that line, it puts it in the pattern space, which will then have the following content:
The quick brown fox jumps over the lazy dog.
Then, sed will run the first command. After sed runs the command s/fox/coati/, the content of the pattern space will be:
The quick brown coati jumps over the lazy dog.
Then sed will apply the second command, s/dog/dingo/. After that, the content of the pattern space will be:
The quick brown coati jumps over the lazy dingo.
Note that this only happens in memory; nothing has been printed yet.
After all commands have been applied to the current line, sed will, by default, take the content of the pattern space and print it to standard output. However, when you give -n as an option to sed, you ask sed not to execute this last step unless it is explicitly requested. So, if you run
sed -n 's/fox/coati/;s/dog/dingo/' myfile
nothing will be printed.
But how could you explicitly request sed to print the pattern space? Well, you can use the p command. When sed finds this command, it will print the content of the pattern space immediately. For example, in the command below we request sed to print the content of the pattern space just after the first command:
sed -n 's/fox/coati/;p;s/dog/dingo/' myfile
The result will be
$ sed -n 's/fox/coati/;p;s/dog/dingo/' myfile
The quick brown coati jumps over the lazy dog.
Note that only fox is replaced. It happens because the second command was not executed before the pattern space was printed. If we want to print the pattern space after both commands, we just put p after the second one:
sed -n 's/fox/coati/;s/dog/dingo/;p' myfile
Another option, if you are using the s/// command, is to pass the p flag to s///:
sed -n 's/fox/coati/;s/dog/dingo/p' myfile
In this case, the line will only be printed if the flagged replacement was executed. It may be very useful!
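The variants above can be tried side by side. This is only a minimal sketch: it writes the sample sentence to a temporary file (the mktemp file stands in for myfile) and captures what each command prints.

```shell
# Sketch: the same one-line file used in the answer above.
myfile=$(mktemp)
printf '%s\n' 'The quick brown fox jumps over the lazy dog.' > "$myfile"

# Default: the pattern space is printed automatically at the end of the cycle.
default_out=$(sed 's/fox/coati/;s/dog/dingo/' "$myfile")

# -n: automatic printing is off and nothing asks for output.
quiet_out=$(sed -n 's/fox/coati/;s/dog/dingo/' "$myfile")

# -n with p between the commands: printed after the first substitution only.
mid_out=$(sed -n 's/fox/coati/;p;s/dog/dingo/' "$myfile")

# -n with the p flag on the second s///: printed only if that substitution happened.
flag_out=$(sed -n 's/fox/coati/;s/dog/dingo/p' "$myfile")

printf 'default: %s\nquiet:   [%s]\nmid:     %s\nflag:    %s\n' \
  "$default_out" "$quiet_out" "$mid_out" "$flag_out"
rm -f "$myfile"
```

The quiet line should show empty brackets, since nothing in that script requests output.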

Just try a sed do-nothing:
sed '' file
and
sed -n '' file
The first will print the whole file, but the second will NOT print anything.

This puts sed into quiet mode, where sed will suppress all output except for when explicitly stated by a p command:
-n
--quiet
--silent
By default, sed will print out the pattern space at
the end of each cycle through the script. These
options disable this automatic printing, and sed
will only produce output when explicitly told to
via the p command.
An example of this would be if you wanted to use sed to simulate the actions of grep:
$ echo -e "a\nb\nc" | sed -n '/[ab]/ p'
a
b
Without the -n, you would also get c (and two occurrences each of a and b).
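A small sketch of that comparison, using printf rather than echo -e (an assumption made here for portability, not something the answer requires):

```shell
# With -n, only lines matched by /[ab]/ are printed, once each (grep-like).
with_n=$(printf 'a\nb\nc\n' | sed -n '/[ab]/p')

# Without -n, p prints matching lines and the automatic print repeats every line.
without_n=$(printf 'a\nb\nc\n' | sed '/[ab]/p')

printf 'with -n:\n%s\n\nwithout -n:\n%s\n' "$with_n" "$without_n"
```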

$ echo "a b c d" | sed "s/a/apple/"
apple b c d
The pattern space is printed implicitly.
$ echo "a b c d" | sed -n "s/a/apple/"
No output.
$ echo "a b c d" | sed -n "s/a/apple/p"
apple b c d
Explicitly print the pattern space.

Related

Make matching example from sed manual working

I found an example in info sed stating the following:
'^\(.*\)\n\1$'
This matches a string consisting of two equal substrings separated
by a newline.
Trying to implement it in this ways didn't
return any matching lines:
echo -e "test\ntest" | sed -n '/^\(.*\)\n\1$/p'
echo -e "test\ntest" | sed -n 's/^\(.*\)\n\1$/\0/p'
sed version I use is 4.2.2.
Please suggest the way this example can be tested.
This might work for you (GNU sed and bash):
<<<$'test\ntest' sed -En 'N;s/^(.*)\n\1$/\1 == \1/p;s/^(.*)\n(.*)$/\1 != \2/p'
Append the second line of the input to the first and if the two lines are the same, replace them by line1 == line2 otherwise replace them by line1 != line2.
N.B. Both substitutions try to match at least a newline, and if the first substitution succeeds the second cannot. Likewise, if the first substitution did not happen, the second must.
To make the example work, I have to use N, which reads one more line into the pattern space and so allows \n to be matched.
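As a rough check of the point about N, here is a sketch that keeps the regex from the manual unchanged and only adds N in front of it. The second input pair ("test" / "other") is made up for contrast.

```shell
# N appends the next input line to the pattern space, separated by a newline,
# so the regex finally has a \n to match.
same=$(printf 'test\ntest\n' | sed -n 'N;/^\(.*\)\n\1$/p')

# Two different lines: the back-reference \1 cannot match, so nothing is printed.
diff_lines=$(printf 'test\nother\n' | sed -n 'N;/^\(.*\)\n\1$/p')

printf 'equal pair:\n%s\n\ndifferent pair: [%s]\n' "$same" "$diff_lines"
```

Without N, each cycle sees a single line with no embedded newline, which is why the original attempts printed nothing.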

sed: delete everything that starts with $p, but not just $p

I'm trying to find a sed command that will delete every instance of a word that begins with another word, but not the word itself. So if I have
aardvark
aardvarky
aardvarkiest
I want to delete aardvarky and aardvarkiest, but not aardvark.
I tried
sed -n "/^$p.*/ d"
hoping to do some kind of regex that meant starting with $p and then some characters *, but it didn't seem to work.
This deletes all lines that start with $p and have at least one more character:
$ sed "/^$p./d" file
aardvark
To change the file in place, use the -i option. With GNU sed:
sed -i "/^$p./d" file
With BSD (OSX) sed:
sed -i "" "/^$p./d" file
Discussion
Consider:
sed -n "/^$p.*/ d"
This command will print nothing: -n means print nothing unless explicitly asked to and there is no command with an explicit print (p).
Further, * means zero or more of the preceding character. Thus, $p.* matches $p also.
We could use:
$ sed "/^$p.\+/d" file
aardvark
\+ means one or more of the preceding character. However, the \+ is not useful because any line that matches ^$p.\+ also matches the simpler ^$p. (and vice versa).
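The difference between . and .* can be seen with the three sample words. This is just a sketch with p set to aardvark and the input fed through a pipe instead of a file:

```shell
p=aardvark
input=$(printf 'aardvark\naardvarky\naardvarkiest\n')

# "." requires at least one character after $p, so the bare word survives.
kept=$(printf '%s\n' "$input" | sed "/^$p./d")

# ".*" also matches zero characters, so every line starting with $p is deleted.
kept_star=$(printf '%s\n' "$input" | sed "/^$p.*/d")

printf 'with ".":  [%s]\nwith ".*": [%s]\n' "$kept" "$kept_star"
```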
Warning
The use of shell variables in sed commands is potentially dangerous. As an example, the following writes a file to the current directory:
p=$'a/w hi.there\n/'; sed "/^$p.\+/d" file
A shell variable should not be used in a sed command unless the shell variable is created by code that is trusted.
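One possible mitigation, offered only as a sketch and not as part of the answer above, is to backslash-escape the characters that are special in a basic regular expression (plus the / delimiter) before interpolating the variable. The value a.b and the sample lines are made up, and a value with an embedded newline is not handled by this sketch.

```shell
# Hypothetical value containing a regex metacharacter.
p='a.b'

# Backslash-escape ] [ \ . * ^ $ and / so sed treats them literally.
# The s command uses | as its delimiter so that / can sit in the bracket list.
escaped=$(printf '%s' "$p" | sed 's|[][\.*^$/]|\\&|g')

# Unescaped: the "." in $p matches any character, so axbc is deleted too.
raw_out=$(printf 'a.b\na.bc\naxbc\n' | sed "/^$p./d")

# Escaped: only lines that literally start with a.b plus one more character go.
safe_out=$(printf 'a.b\na.bc\naxbc\n' | sed "/^$escaped./d")

printf 'escaped pattern: %s\nraw:  [%s]\nsafe: [%s]\n' \
  "$escaped" "$(printf '%s' "$raw_out" | tr '\n' ' ')" "$(printf '%s' "$safe_out" | tr '\n' ' ')"
```

Escaping narrows the problem but does not replace the advice above: values from untrusted sources are still best kept out of sed scripts.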
Use grep as below to keep all lines except those containing aardvark as a whole word:
grep -v -w 'aardvark' file
If you want to delete everything except aardvark:
grep -w aardvark file
This might work for you (GNU sed):
sed -i '/^'"$word"'\B/d' file
This deletes any line that begins with $word when $word is immediately followed by another word character (\B matches where there is no word boundary).

SED command to delete empty lines till the first occurrence of sentence

My input file will be
[emptyline]
[emptyline]
aaa
bbb
[emptyline]
cc
dd
Here [emptyline] indicates blank lines.
And I need a sed command to change this into
aaa
bbb
[emptyline]
cc
dd
That is, I need to delete only the blank lines at the top.
I need a sed-only command, since I need to use it in a bash script.
Additional info: this is on Mac OS X.
You can do it with branching in sed:
sed '/^ *$/d; :a; n; ba' file
A more efficient solution would be to use a range expression, see user2719058's answer for how to do this.
It is even more efficient if you can reduce the need for sed, see gniourf_gniourf's answer for alternatives.
This can be expressed in awk elegantly like this:
awk 'NF {f=1} f' file
Output in both cases:
aaa
bbb
[emptyline]
cc
dd
Explanation
Both alternatives work by looking for the first non-empty line.
With sed the pattern /^ *$/d will delete all empty lines in the beginning of the file. What follows is a loop that prints the rest of the file.
awk updates NF for every line; when the line is empty, NF is zero. This is exploited to set the print flag (f).
If the lines are really empty (no whitespace), I would suggest
sed -n '/./,$p', otherwise sed -n $'/[^ \t]/,$p'. (The $'..' syntax makes bash expand the \t, so you don't need a sed that understands it.)
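To compare the range expression with the awk version, here is a small sketch that rebuilds the sample input in a temporary file (the file name comes from mktemp and is not part of the question) and checks that both approaches keep the inner blank line:

```shell
input=$(mktemp)
printf '\n\naaa\nbbb\n\ncc\ndd\n' > "$input"

# Range from the first non-empty line to the last line ($).
sed_out=$(sed -n '/./,$p' "$input")

# awk: NF is 0 on blank lines; once a line has fields, f stays set and lines print.
awk_out=$(awk 'NF {f=1} f' "$input")

# What both should produce: leading blanks gone, inner blank line preserved.
expected=$(printf 'aaa\nbbb\n\ncc\ndd')

printf '%s\n' "$sed_out"
rm -f "$input"
```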
One funny possibility:
{ sed -n '/./{p;q}' && cat; } < file
And it's really efficient too! (try to benchmark it against the other methods). If you might have some spaces in your first lines, you could do:
{ sed -n '/[^[:space:]]/{p;q}' && cat; } < file
sed prints nothing until it reads a non-empty line; at that point it prints the line and exits. Then cat outputs the rest of the file; since there's no more sed filtering, the data flows much faster through cat!
The same with grep:
{ grep -v -m 1 '^$' && cat; } < file
or discarding leading lines with possible spaces:
{ grep -v -m 1 '^[[:space:]]*$' && cat; } < file
A simple one is: sed '1,/^$/d' file
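It deletes from line 1 through the next blank line after it, preserving the later blank lines as desired by OP. This fits the sample input, which has exactly two leading blank lines; if the file starts with a non-blank line or with a single blank line, real content is deleted too, and with more than two leading blank lines some of them remain.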
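How the 1,/^$/ range behaves depends on how many blank lines the input starts with. The following sketch feeds it three made-up inputs (two, one, and zero leading blank lines) so the results can be compared:

```shell
# Two leading blank lines: line 1 opens the range, line 2 closes it.
two_blank=$(printf '\n\naaa\nbbb\n\ncc\n' | sed '1,/^$/d')

# One leading blank line: the range runs on to the blank line after bbb.
one_blank=$(printf '\naaa\nbbb\n\ncc\n' | sed '1,/^$/d')

# No leading blank line: the range starts at aaa and again ends after bbb.
no_blank=$(printf 'aaa\nbbb\n\ncc\n' | sed '1,/^$/d')

printf 'two: [%s]\none: [%s]\nnone: [%s]\n' \
  "$(printf '%s' "$two_blank" | tr '\n' ' ')" "$one_blank" "$no_blank"
```

The end address of a range is only tested on lines after the one that opened it, which is why a blank line 1 does not close the range by itself.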
Here is another way of deleting all blank lines at file start using pure BASH way without involving any external utility like awk/sed:
[[ "$(<file)" =~ ^[[:space:]]+(.*)$ ]] && echo "${BASH_REMATCH[1]}"
aaa
bbb
[emptyline]
cc
dd
sed -n "H;$ {x;s/^\n*//p;}"
This deletes all leading \n and takes into account that the first line may not be empty (1,/^$/ does not work in that case).

sed: display lines before a match

I am using sed to pick out the last of the lines that match a pattern:
echo -e "aaa\nbbb\nccc\naaa\nccc\naaa\nbbb\nccc" | sed '/aaa/!d' | sed '$!d' # the order and number of aaa, bbb, ccc, ..., nnn lines is arbitrary
The example above works well. The second method:
echo -e "aaa\nbbb\nccc\naaa\nccc\naaa\nbbb\nccc" | sed -e '/aaa/!d' -e '$!d'
or:
echo -e "aaa\nbbb\nccc\naaa\nccc\naaa\nbbb\nccc" | sed -e '/aaa/!d;$!d'
The second method does not work for me. Someone wrote on Wikipedia that sed commands can be combined, but for me it does not work. What am I doing wrong, and what am I misunderstanding? What should it look like?
This might work for you (GNU sed):
sed '/aaa/h;$!d;x' file
To catch the last match you must store it in the hold space then retrieve it at the end of the file.
What is the desired output? The first command gives a single line aaa. The second and third commands give no output. There's a solid reason for the discrepancy in the behaviour.
In the first command, you have:
sed '/aaa/!d' | sed '$!d'
The first sed here deletes each line that is not aaa. The output (3 lines containing aaa) is then filtered so that only the last line is printed.
In the second and third commands (which are equivalent), you have:
sed -e '/aaa/!d' -e '$!d'
The first operand deletes each line that is not aaa and starts the next cycle. The second operand deletes every remaining aaa because none of them is on the last line of input (the last line in the input is ccc, which has already been deleted by virtue of not being aaa). So the output you see is exactly what you should expect.
If you want just one aaa, consider using:
grep '^aaa$' | uniq
Though that's a long-winded way of writing:
echo aaa
Presumably, though, this is a simplified version of the real situation (which is a good thing).
$ in the address means the last line. It does not change even if the last line is not being printed because of a previous command. In the pipeline, though, only the printed lines get to the second invocation of sed, and $ again means the last line - now only from the lines printed by the previous sed invocation.
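The three behaviours discussed here (two piped seds, one combined sed, and the hold-space version) can be compared on a short input. This is a sketch with made-up lines, "aaa first" and "aaa last", so that it is visible which match survives:

```shell
input=$(printf 'aaa first\nbbb\naaa last\nccc\n')

# Pipeline: the second sed sees only the aaa lines, so its $ is the last aaa line.
piped=$(printf '%s\n' "$input" | sed '/aaa/!d' | sed '$!d')

# One sed: $ is the last input line (ccc), which the first command already deleted.
combined=$(printf '%s\n' "$input" | sed -e '/aaa/!d' -e '$!d')

# Hold space: remember each aaa line, and swap the last one in at end of input.
held=$(printf '%s\n' "$input" | sed '/aaa/h;$!d;x')

printf 'piped:    [%s]\ncombined: [%s]\nheld:     [%s]\n' "$piped" "$combined" "$held"
```

Note that the hold-space version prints an empty line when no input line matches, because the hold space is still empty at the final exchange.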

regexp (sed) suppress "no match" output

I'm stuck on that and can't wrap my head around it: How can I tell sed to return the value found, and otherwise shut up?
It's really beyond me: why would sed return the whole string if it found nothing? Do I have to run another test on the returned string to verify it? I tried using "-n" from the (very short) man page, but it effectively suppresses all output, including matched strings.
This is what I have now :
echo plop-02-plop | sed -e 's/^.*\(.\)\([0-9][0-9]\)\1.*$/\2/'
which returns
02 (and that is fine and dandy, thank you very much), but:
echo plop-02plop | sed -e 's/^.*\(.\)\([0-9][0-9]\)\1.*$/\2/'
returns
plop-02plop (when it should return "", that is, nothing! Dang, you found nothing, so be quiet! For crying out loud!!)
I tried checking for a return value, but this failed too ! Gasp !!
$ echo plop-02-plop | sed -e 's/^.*\(.\)\([0-9][0-9]\)\1.*$/\2/' ; echo $?
02
0
$ echo plop-02plop | sed -e 's/^.*\(.\)\([0-9][0-9]\)\1.*$/\2/' ; echo $?
plop-02plop
0
$
This last one I cannot even believe. Is sed really the tool I should be using? I want to extract a needle from a haystack, and I want a needle or nothing..?
sed by default prints all lines.
What you want to do is
/patt/!d;s//repl/
In other words, delete the lines that do not match your pattern and, on the lines that do match, extract a particular element, for instance by capture group number. The empty regex in s//repl/ reuses the most recently used regex. In your case it will be:
sed -e '/^.*\(.\)\([0-9][0-9]\)\1.*$/!d;s//\2/'
You can also use the -n option to suppress the automatic printing of every line. A line is then printed only when you explicitly request it. In practice, scripts using -n are usually longer and more cumbersome to maintain. Here it will be:
sed -ne 's/^.*\(.\)\([0-9][0-9]\)\1.*$/\2/p'
There is also grep, but your example shows why sed is sometimes better.
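Both variants can be wrapped in small helper functions to confirm the "needle or nothing" behaviour. The function names extract_p and extract_d are made up for this sketch; the regex is the one from the question.

```shell
# Variant 1: -n plus the p flag prints only when the substitution succeeded.
extract_p() {
  printf '%s\n' "$1" | sed -n 's/^.*\(.\)\([0-9][0-9]\)\1.*$/\2/p'
}

# Variant 2: delete non-matching lines, then reuse the regex via the empty pattern.
extract_d() {
  printf '%s\n' "$1" | sed '/^.*\(.\)\([0-9][0-9]\)\1.*$/!d;s//\2/'
}

hit=$(extract_p plop-02-plop)
miss=$(extract_p plop-02plop)
hit_d=$(extract_d plop-02-plop)
miss_d=$(extract_d plop-02plop)

# sed exits 0 whether or not anything matched, so test the captured text instead.
if [ -n "$miss" ]; then echo "needle: $miss"; else echo "no needle"; fi
echo "needle: $hit"
```

Checking for an empty string takes the place of the exit-status test that the question found unhelpful.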
Perhaps you can use egrep -o?
input.txt:
blooody
aaaa
bbbb
odor
qqqq
E.g.
sehe#meerkat:/tmp$ egrep -o o+ input.txt
ooo
o
o
sehe#meerkat:/tmp$ egrep -no o+ input.txt
1:ooo
4:o
4:o
Of course egrep will have slightly different (better?) regex syntax for advanced constructs (back-references, non-greedy operators). I'll let you do the translation, if you like the approach.
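The transcript above can be reproduced with a sketch like the following. It uses grep -E in place of egrep, which newer GNU grep releases flag as obsolescent, and it assumes a grep that supports the -o option (GNU and BSD grep both do). The temporary file stands in for input.txt.

```shell
input=$(mktemp)
printf 'blooody\naaaa\nbbbb\nodor\nqqqq\n' > "$input"

# -o prints only the matched part of each line, one match per output line.
matches=$(grep -E -o 'o+' "$input")

# -n adds the input line number in front of every match.
numbered=$(grep -E -n -o 'o+' "$input")

printf '%s\n---\n%s\n' "$matches" "$numbered"
rm -f "$input"
```

Lines without a match produce no output at all, which is the behaviour the question was after.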