How to get the output from second value using sed - regex

I have a file with the following content, for example:
cat test.log
hello
how are you?
terminating
1
2
3
terminating
1
2
When I use the sed command below to show the output after terminating, it prints everything from the first terminating onward:
sed -n '/terminating/,$p' test.log
terminating
1
2
3
terminating
1
2
I want output as below
terminating
1
2
Can anyone help me on this please?

Code for sed:
$ sed -n '/terminating/{N;N;h};${g;p}' file
terminating
1
2
If a line matches terminating, store it and the next two lines in the hold space. On the last line ($), copy the hold space back into the pattern space and print the three lines.
Example with a sed script:
$ cat script.sed
/terminating/{
N
N
h
}
${
g
p
}
$ sed -nf script.sed file
terminating
1
2
And for all lines after the last terminating:
$ cat file
hello
how are you?
terminating
1
2
3
terminating
1
2
3
4
5
6
$ cat script.sed
H
/terminating/{
h
}
${
g
p
}
$ sed -nf script.sed file
terminating
1
2
3
4
5
6

This might work for you (GNU sed):
sed -r '/^terminating/!d;:a;$!N;/.*\n(terminating)/s//\1/;$q;ba' file
Discard every line until one begins with terminating. Then keep reading lines; whenever a later line beginning with terminating arrives, discard everything read before it. At end-of-file, print what remains.

sed -n 'H;/terminating/x;${x;p}' test.log
As annotated pseudocode:
for each line:
    append the line to the hold space      H
    if the line matches /terminating/      /terminating/
        then set the hold space to line    x
on the last line:                          $
    get the hold space                     x
    and print                              p
Note: x actually exchanges the hold and pattern spaces.
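For comparison, the same "keep everything from the last match" idea can be sketched in awk. This is only an illustration, not one of the answers above: the function name from_last_match and the variables pat, buf and seen are made up here, and the pattern is assumed to be a simple regex without backslashes.

```shell
# Hypothetical helper: print from the last line matching a regex to end of input.
from_last_match() {
  pattern=$1; shift
  awk -v pat="$pattern" '
    $0 ~ pat { buf = ""; seen = 1 }   # new match: restart the buffer at this line
    seen     { buf = buf $0 ORS }     # collect the match and every line after it
    END      { printf "%s", buf }     # what is left starts at the last match
  ' "$@"
}

printf '%s\n' hello 'how are you?' terminating 1 2 3 terminating 1 2 |
  from_last_match terminating
```

Like the hold-space scripts, this buffers everything after each match, so memory use grows with the distance between matches.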

Related

Internally handling odd number of lines in GNU sed - is my solution exploiting a bug?

I have a series of tokens like
Filename
URL
Filename
URL
...
and I wanted to group them onto the same line, then reverse the token order, so I did
$ echo -e 'Filename\nURL\nFilename\nURL' | sed 'N;s/\(.*\)\n\(.*\)/\2 \1/'
URL Filename
URL Filename
which I have no problems with.
However, the N operator in sed is quite fragile, so I wanted to ensure that wonky input like...
$ echo -e 'Filename\nURL\nFilename' | sed 'N;s/\(.*\)\n\(.*\)/\2 \1/'
URL Filename
Filename
...doesn't ruin everything. But I wanted to keep it to a oneliner, and try and use sed builtins if I can.
I accidentally discovered:
$ echo -e 'hi\nabc\ndef\nghi' | sed '$q1;N;s/\n/ | /' && echo -n even || echo -n odd; echo ' number of input lines'
hi | abc
def | ghi
even number of input lines
$ echo -e 'hi\nabc\ndef' | sed '$q1;N;s/\n/ | /' && echo -n even || echo -n odd; echo ' number of input lines'
hi | abc
def
odd number of input lines
It seems that the $ operator does not report EOF if an N munges the line immediately beforehand.
I'm guessing this is a bug and that I shouldn't depend on it...?
sed commands are matched in order. That's the key here.
Assume an input file like this:
URL 1
Filename 1
URL 2
Filename 2
When sed processes the file, it reads the first line, matches it against $ (which fails), then runs the N and s commands.
It then reads the "next" line (which is now line three, the URL 2 line), matches that against $ (which fails), and then runs the N and s commands. At that point sed tries to read the next line to prepare for the next cycle, but there is no more input, so it exits.
Now assume an input file of
URL 1
Filename 1
Filename 2
sed starts by reading the first line, matching it against $ (which fails) and then performs the N and s commands. It then reads the "next" line (again line three, this time the Filename 2 line) and matches that against $ (which succeeds) and then quits.
If you had a $-addressed command after the N command, it would trigger on a file with an even number of lines (as sed would be on the last line when that command runs).
Example:
$ printf %s\\n "U 1" "F 1" "U 2" "F 2" | sed '$q1;N;s/\n/ | /;$s/$/ - last line/' && echo even || echo odd
U 1 | F 1
U 2 | F 2 - last line
even
$ printf %s\\n "U 1" "F 1" "F 2" | sed '$q1;N;s/\n/ | /;$s/$/ - last line/' && echo even || echo odd
U 1 | F 1
F 2
odd
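If the goal is only to survive an unpaired last line, rather than to detect it, a common idiom is to guard N with $! so that N never runs on the last line. The sketch below applies that to the swap from the question; the function name swap_pairs is made up for illustration. It avoids depending on what N does at end of input, which differs between implementations (POSIX says sed quits without printing, while GNU sed prints the pattern space first unless POSIXLY_CORRECT is set).

```shell
# Swap each pair of lines; a trailing unpaired line is printed unchanged.
swap_pairs() {
  # $!N : append the next line only when the current line is not the last one
  sed '$!N;s/\(.*\)\n\(.*\)/\2 \1/' "$@"
}

printf '%s\n' Filename URL Filename URL | swap_pairs
printf '%s\n' Filename URL Filename | swap_pairs
```

With three input lines, the substitution simply does not match the lone last line, so it passes through as is.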

Deleting n lines in both directions and the match in sed?

Deleting the match and two lines before it works:
sed -i.bak -e '/match/,-2d' someCommonName.txt
Deleting the match and two lines after it works:
sed -i.bak -e '/match/,+2d' someCommonName.txt
But deleting the match, two lines after it and two lines before it does not work?
sed -i.bak -e '/match/-2,+2d' someCommonName.txt
sed: -e expression #1 unknown command: `-'
Why is that?
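sed accepts at most two addresses per command: one address, or a range addr1,addr2. Three will not parse.
/match/ is an address which matches a regex.
+2, as the second address of a range, is a GNU extension which means "through the two lines after the first address".
-2 is intended to mean two lines before, but GNU sed documents no addr1,-N form, so check that your sed really supports it.
Therefore:
/match/,+2 is a range from the line matching match through the two lines after it.
/match/-2,+2d, on the other hand, puts -2 directly after the first address, where sed expects a command, hence the unknown command: `-' error.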
To delete two lines before and after a pattern, I would recommend something like this (modified from this answer):
sed -n "1N;2N;/\npattern$/{N;N;d};P;N;D"
This keeps three lines in the buffer as it reads through the file. When the pattern is found in the last buffered line, it reads two more lines and deletes all five. Note that this will not work if the pattern is in the first two lines of the file, and because of -n any lines still buffered when N runs out of input are not printed, but it is a start.
Try this (I cannot test it here; -2 is not available in my sed version):
sed -i .bak '/match/,-2 {/match/!d;};/match/,+2d' YourFile
I don't have a complete solution but an outline: sed is a pretty simple tool which doesn't do two things at once. My approach would be to run sed once deleting the two lines after the pattern but keeping the pattern itself. The result can then be piped to sed again to remove the pattern and the two lines before.
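One way that two-pass outline might look is sketched below. It is an assumption-laden illustration, not the answerer's code: the function name delete_around_match is made up, GNU sed is assumed for the addr,+N range, and the second pass swaps in tac (GNU coreutils) to reverse the input so that "two lines before" can be deleted as "two lines after".

```shell
# Hypothetical helper: delete each line containing "match" plus the two lines
# before and the two lines after it. Assumes GNU sed and GNU tac.
delete_around_match() {
  sed '/match/{n;$!N;d;}' "$@" |   # pass 1: print the match, drop the two lines after it
    tac |                          # reverse: lines before the match now follow it
    sed '/match/,+2d' |            # pass 2: drop the match and the next two lines
    tac                            # restore the original order
}

printf '%s\n' 1 2 3 4 '5 match' 6 7 8 9 | delete_around_match
```

Because pass 1 removes lines before pass 2 looks for matches, a second match that falls inside the first match's window is deleted without getting a window of its own.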
FWIW this is how I'd really do the job (just change the b and a values to delete different numbers of lines before/after match is found):
$ cat file
1
2
3
4
5 match
6
7
8
9
$ awk -v b=2 -v a=2 'NR==FNR{if (/match/) for (i=(NR-b);i<=(NR+a);i++) skip[i]; next } !(FNR in skip)' file file
1
2
8
9
$ awk -v b=3 -v a=1 'NR==FNR{if (/match/) for (i=(NR-b);i<=(NR+a);i++) skip[i]; next } !(FNR in skip)' file file
1
7
8
9
Note that the above assumes that when two matches fall within one removal window, you want the deletions based on every match in the original input, not on what would remain after the first match's window has already deleted the second match:
$ cat file2
1
2
3
4 match
5
6 match
7
8
9
$ awk -v b=2 -v a=2 'NR==FNR{if (/match/) for (i=(NR-b);i<=(NR+a);i++) skip[i]; next } !(FNR in skip)' file2 file2
1
9
as opposed to the output being:
1
7
8
9
since deleting the 2 lines after the first match would delete the 2nd match, so the 2 lines after that one would no longer be within 2 lines of a match and would be kept.
Something else to consider:
$ diff --changed-group-format='%<' --unchanged-group-format='' file <(grep -A2 -B2 match file)
1
2
8
9
$ diff --changed-group-format='%<' --unchanged-group-format='' file2 <(grep -A2 -B2 match file2)
1
9
That uses bash and GNU diff 3.2; I don't know which other shells or diff implementations support those constructs and options.

How to use grep to extract multiple groups

Say I have this file data.txt:
a=0,b=3,c=5
a=2,b=0,c=4
a=3,b=6,c=7
I want to use grep to extract 2 columns corresponding to the values of a and c:
0 5
2 4
3 7
I know how to extract each column separately:
grep -oP 'a=\K([0-9]+)' data.txt
0
2
3
And:
grep -oP 'c=\K([0-9]+)' data.txt
5
4
7
But I can't figure how to extract the two groups. I tried the following, which didn't work:
grep -oP 'a=\K([0-9]+),.+c=\K([0-9]+)' data.txt
5
4
7
I am also curious whether grep alone can do this. \K discards everything matched before it from the reported match, and grep -o prints the overall match rather than individual capture groups. Using \K twice in one expression therefore leaves only the text after the last \K, which is why only the c values are shown. Hence, it has to be done differently.
In the meantime, I would use sed:
sed -r 's/^a=([0-9]+).*c=([0-9]+)$/\1 \2/' file
This captures the digits after a= and c= on lines that start with a= and end right after the c= digits.
For your input, it returns:
0 5
2 4
3 7
You could try the grep command below. Note, however, that grep -o prints each match on its own line, so you won't get the format shown in the question.
$ grep -oP 'a=\K([0-9]+)|c=\K([0-9]+)' file
0
5
2
4
3
7
To get that format, pipe the output of grep to paste or a similar command.
$ grep -oP 'a=\K([0-9]+)|c=\K([0-9]+)' file | paste -d' ' - -
0 5
2 4
3 7
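Use this:
awk -F'[=,]' '{print $2" "$6}' data.txt
This uses = and , as field separators, so the value of a is field 2 and the value of c is field 6. The separator is quoted so that the shell cannot expand [=,] as a glob.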
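The field numbers above depend on a, b and c always appearing in that order. A sketch that looks the values up by key instead, in case the order varies, could look like this; the function name extract_a_c and the kv array are made up for illustration, and each field is assumed to have the form key=value.

```shell
# Hypothetical helper: print the values of a and c from comma-separated key=value lines.
extract_a_c() {
  awk -F, '{
    a = ""; c = ""
    for (i = 1; i <= NF; i++) {
      split($i, kv, "=")              # kv[1] is the key, kv[2] is the value
      if (kv[1] == "a") a = kv[2]
      if (kv[1] == "c") c = kv[2]
    }
    print a, c
  }' "$@"
}

printf '%s\n' 'a=0,b=3,c=5' 'a=2,b=0,c=4' 'a=3,b=6,c=7' | extract_a_c
```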

Find the last occurrence of a search string and print the next line in ksh

How can we find the last occurrence of a search string (regex) and then print the line following it? Assume a text file with the data below:
1 absc
1 sandka
file hjk
2 asdaps
2 amsdapm
file abc
So, from this file, I have to use grep or awk to find the last occurrence of 2 and print the line that follows it.
awk is always handy for these cases:
$ awk '/2/ {p=1; next} p{a=$0; p=0} END{print a}' file
file abc
Explanation
/2/ {p=1; next} when 2 appears in the line, activate the p flag and skip the line.
p{a=$0; p=0} when the p flag is active, store the line and deactivate p.
END{print a} print the stored value, which happens to be the last one because a is always overwritten.
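Note that /2/ matches a 2 anywhere in the line. If the search string should be configurable, or anchored to the start of the line, the same logic can be wrapped as in the sketch below; the function name line_after_last_match and the variable pat are made up for illustration.

```shell
# Hypothetical helper: print the line that follows the last line matching a regex.
line_after_last_match() {
  pattern=$1; shift
  awk -v pat="$pattern" '
    $0 ~ pat { p = 1; next }     # matching line: arm the flag and skip it
    p        { a = $0; p = 0 }   # first non-matching line after a match: remember it
    END      { print a }         # the last remembered line wins
  ' "$@"
}

printf '%s\n' '1 absc' '1 sandka' 'file hjk' '2 asdaps' '2 amsdapm' 'file abc' |
  line_after_last_match '^2'
```

If nothing matches, or the last match is the final line, a stays at its previous value (possibly empty), so the result should be checked before it is used.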
Using grep
grep -A 1 '^2' displays the lines that begin with 2, plus one following line for each,
then tail -n 1 prints the final line:
grep -A 1 '^2' yourfile | tail -n 1

If the first space is 2 spaces, make it 1 in a file

I have a text file in which, on some lines, the first gap from the left is 2 spaces long, and I want it to be 1 space long. What's the script for this in bash?
123  2 5//problem
1 2 5
1 2 5
1  32 5//problem
What I want:
123 2 5
1 2 5
1 2 5
1 32 5
tr way (note that this squeezes every run of spaces on the line, not only the first):
tr -s ' ' < test.txt
Using sed:
sed 's/^\([^ ][^ ]*[ ]\)[ ]*/\1/' input
Starting from the left
^
match and capture non-space characters and a space
\([^ ][^ ]*[ ]\)
and any number of additional spaces:
[ ]* # remove the star if you only care about exactly 2 spaces
and replace these with the captured part:
\1
Edit: I realized that David's answer was almost right.
You can use sed.
sed -e 's/ \+/ /' x
This replaces the first occurrence of one or more spaces with a single space. (\+ is a GNU extension; the portable spelling is two spaces followed by *.)
But you can do it purely in bash as well:
while read -r a b ; do printf '%s %s\n' "$a" "$b" ; done < x
This splits each line at the first word, and prints the first word and the rest of the line. The result is that there is only one space between the first word and the rest of the line. The -r keeps read from interpreting backslashes in the input.
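To see the difference between collapsing only the first gap and squeezing every gap, here is a small sketch using the portable spelling of the sed command. The function name squeeze_first_gap and the sample lines are made up for illustration, and lines are assumed not to start with a space.

```shell
# Hypothetical helper: collapse only the first run of spaces on each line.
squeeze_first_gap() {
  # Two spaces followed by * means "one or more spaces"; without the g flag
  # only the first such run on each line is replaced.
  sed 's/  */ /' "$@"
}

printf '%s\n' '123  2 5' '1 2 5' '1  32  5' | squeeze_first_gap
```

On the last sample line the second gap keeps its two spaces, whereas tr -s ' ' would have squeezed it as well.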