Sed command find and replace in even lines of a file - regex

Hi, I am new to this forum. I want to use sed to replace an expression on the even lines of a file. My problem is that I cannot think of how to save the changes in the original file (i.e., how to overwrite the file with the changes). I have tried with:
sed -n 'n;p;' filename | sed 's/aaa/bbb/'
but this does not save the changes. I appreciate your help on this.

Try :
sed -i '2~2 s/aaa/bbb/' filename
The -i option tells sed to work in place: instead of writing the edited version to stdout and leaving the original file untouched, it applies the changes to the file itself. The 2~2 portion is the address of the lines to which sed should apply the command. 2~2 means edit only even lines. 1~2 would edit only odd lines. 5~6 would edit every sixth line, starting at line 5, and so on.
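A quick way to convince yourself of what a first~step address selects is to run it on a throwaway file. The following is only a sketch: it assumes GNU sed (the ~ address is a GNU extension), and the scratch file and its contents are made up for illustration.

```shell
# Build a 4-line scratch file (hypothetical sample data).
tmp=$(mktemp)
printf 'aaa 1\naaa 2\naaa 3\naaa 4\n' > "$tmp"

# Edit in place, touching only lines 2, 4, 6, ... (GNU sed only).
sed -i '2~2 s/aaa/bbb/' "$tmp"

# Show the file after the edit, then clean up.
result=$(cat "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

Only the second and fourth lines should now start with bbb.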

@Mithrandir's answer is excellent, correct and complete.
I will just add that the m~n addressing method is a GNU sed extension that may not work everywhere. For example, Macs do not ship with GNU sed by default, and *BSD systems may not have it either.
So, if you have a file like the following one:
$ cat f
1 ab
2 ad
3 ab
4 ac
5 aa
6 da
7 aa
8 ad
9 aa
...here is a more universal solution:
$ sed '2,${s/a/#A#/g;n}' f
1 ab
2 #A#d
3 ab
4 #A#c
5 aa
6 d#A#
7 aa
8 #A#d
9 aa
What does it do? The address of the command is 2,$, which means it will be applied to all lines between the second one (2) and the last one ($). The command is in fact two commands, treated as one because they are grouped by braces ({ and }). The first command is the replacement s/a/#A#/g. The second one is the n command, which prints the current pattern space and then replaces it with the next line of input. So the current iteration prints the current (edited) line followed by the next line unchanged, and the next iteration processes the line after that. Since I started at the 2nd line, the replacement is applied to each even line.
Of course, since you want to update the original file, you should call it with the -i flag. Note that some non-GNU seds require you to give an argument to the -i flag: an extension to be appended to the file name. The resulting name is that of a generated backup file holding the old content. (So if you call, for example, sed -i.bkp s/a/b/ myfile.txt, the file myfile.txt will be altered, but another file called myfile.txt.bkp will be created with the old content of myfile.txt.) Since a) it is required in some places, b) it is accepted by GNU sed, and c) it is good practice anyway (if something goes wrong, you can restore from the backup), I recommend using it:
$ ls
f
$ sed -i.bkp '2,${s/a/#A#/g;n}' f
$ ls
f f.bkp
Anyway, my answer is just a complement for some specific scenarios. I would use @Mithrandir's solution myself, since I am a Linux user :)

This might work for you:
sed -i 'n;s/aaa/bbb/' file
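Why this works without the GNU-specific ~ address: n prints the current (odd) line and loads the next (even) line into the pattern space, so the s command that follows only ever sees even lines. A small sketch on made-up input, run without -i so that no file is modified:

```shell
# Hypothetical 4-line input; every line contains aaa.
# n prints the odd line and fetches the even one, which s then edits.
result=$(printf 'aaa 1\naaa 2\naaa 3\naaa 4\n' | sed 'n;s/aaa/bbb/')
printf '%s\n' "$result"
```

Once the output looks right, the same expression can be used with -i (or -i.bkp to keep a backup).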

Use sed -i to edit the file in place.

Related

What is the difference between the two sed commands below?

Information about the environment I am working in:
$ uname -a
AIX prd231 1 6 00C6B1F74C00
$ oslevel -s
6100-03-10-1119
Code Block A
( grep schdCycCleanup $DCCS_LOG_FILE | sed 's/[~]/ \
/g' | grep 'Move(s) Exist for cycle' | sed 's/[^0-9]*//g' ) > cycleA.txt
Code Block B
( grep schdCycCleanup $DCCS_LOG_FILE | sed 's/[~]/ \n/g' | grep 'Move(s) Exist for cycle' | sed 's/[^0-9]*//g' ) > cycleB.txt
I have two code blocks (shown above) that use sed to trim the input down to 6 digits, but one of them behaves differently than I expected.
Sample of input for the two code blocks
Mar 25 14:06:16 prd231 ajbtux[33423660]: 20160325140616:~schd_cem_svr:1:0:SCHD-MSG-MOVEEXISTCYCLE:200705008:AUDIT:~schdCycCleanup - /apps/dccs/ajbtux/source/SCHD/schd_cycle_cleanup.c - line 341~ SCHD_CYCLE_CLEANUP - Move(s) Exist for cycle 389210~
I get the following output when the sample input above goes through the two code blocks.
cycleA.txt content
389210
cycleB.txt content
25140616231334236602016032514061610200705008341389210
I understand that my last piped sed command (sed 's/[^0-9]*//g') deletes all characters other than digits, so I omitted it from both code blocks and wrote the output to two additional files. I get the following output.
cycleA1.txt content
SCHD_CYCLE_CLEANUP - Move(s) Exist for cycle 389210
cycleB1.txt content
Mar 25 15:27:58 prd231 ajbtux[33423660]: 20160325152758: nschd_cem_svr:1:0:SCHD-MSG-MOVEEXISTCYCLE:200705008:AUDIT: nschdCycCleanup - /apps/dccs/ajbtux/source/SCHD/schd_cycle_cleanup.c - line 341 n SCHD_CYCLE_CLEANUP - Move(s) Exist for cycle 389210 n
I can see that the first code block removes everything other than (SCHD_CYCLE_CLEANUP - Move(s) Exist for cycle 389210) by splitting on the tilde, but the second code block just replaces the tildes with the character n. I can also see that the first code block needs a line break after this part (sed 's/[~]/ ), which is why I thought that \n would simulate a line break, but that is not the case. I think the different results come from the way the regular expressions are being used. I have tried to read up on regular expressions and searched for them on stackoverflow but did not find what I was looking for. Could someone explain how I can achieve the same result with code block B as with code block A, without having part of my code on a second line?
Thank you in advance
This is an example of the XY problem (http://xyproblem.info/). You're asking for help to implement something that is the wrong solution to your problem. Why are you changing ~s to newlines, etc when all you need given your posted sample input and expected output is:
$ sed -n 's/.*schdCycCleanup.* \([0-9]*\).*/\1/p' file
389210
or:
$ awk -F'[ ~]' '/schdCycCleanup/{print $(NF-1)}' file
389210
If that's not all you need then please edit your question to clarify your requirements for WHAT you are trying to do (as opposed to HOW you are trying to do it) as your current approach is just wrong.
Etan Reisner's helpful answer explains the problem and offers a single-line solution based on an ANSI C-quoted string ($'...'), which is appropriate, given that you originally tagged your question bash.
(Ed Morton's helpful answer shows you how to bypass your problem altogether with a different approach that is both simpler and more efficient.)
However, it sounds like your shell is actually something different - presumably ksh88, an older version of the Korn shell that is the default sh on AIX 6.1 - in which such strings are not supported[1]
(ANSI C-quoted strings were introduced in ksh93, and are also supported not only in bash, but in zsh as well).
Thus, you have the following options:
With your current shell, you must stick with a two-line solution that contains an (\-escaped) actual newline, as in your code block A.
Note that $(printf '\n') to create a newline does not work, because command substitutions invariably trim all trailing newlines, resulting in the empty string in this case.
Use a more modern shell that supports ANSI C-quoted strings, and use Etan's answer. http://www.ibm.com/support/knowledgecenter/ssw_aix_61/com.ibm.aix.cmds3/ksh.htm tells me that ksh93 is available as an alternative shell on AIX 6.1, as /usr/bin/ksh93.
If feasible: install GNU sed, which natively understands escape sequences such as \n in replacement strings.
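As a side note to the first option, the trailing-newline trimming can be worked around by appending a sentinel character and stripping it again, which keeps the sed call on one line. This is only a sketch: the variable names and the sample string are made up, and it assumes that your shell supports $(...) and ${var%pattern} (ksh88 should, but this was not tested on AIX).

```shell
# Command substitution strips trailing newlines, so this yields an empty string.
empty=$(printf '\n')

# Workaround: append a sentinel character, then strip it again.
nl=$(printf '\nx')
nl=${nl%x}

# Inside double quotes, \\ becomes a single backslash, so sed sees
# a backslash followed by an actual newline in the replacement.
result=$(printf 'foo~bar~baz\n' | sed "s/[~]/\\$nl/g")
printf '%s\n' "$result"
```

Each ~ should come out as a line break, just as with the two-line form of the command.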
[1] As for what actually happens when you try echo 'foo~bar~baz' | sed $'s/[~]/\\\n/g' in a POSIX-like shell that does not support $'...': the $ is left as-is, because what follows is not a valid variable name, and sed ends up seeing literal $s/[~]/\\\n/g, where the $ is interpreted as a context address applying to the last input line - which doesn't make a difference here, because there is only 1 line. \\ is interpreted as plain \, and \n as plain n, effectively replacing ~ instances with literal \n sequences.
GNU sed handles \n in the replacement the way you expect.
OS X (and presumably BSD) sed does not. It treats it as a normal escaped character and just unescapes it to n. (Though I don't see this in the manual anywhere at the moment.)
You can use $'' quoting to use \n as a literal newline if you want though.
echo 'foo~bar~baz' | sed $'s/[~]/\\\n/g'
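If the only goal is to turn each ~ into a line break, another single-line option is tr, which interprets \n itself, so no shell quoting feature is involved. This is an alternative suggested here rather than something taken from the answers, and the sample string is made up. Unlike Code Block A, it does not insert a space before the newline.

```shell
# tr maps every ~ to a newline; the '\n' escape is handled by tr, not by the shell.
result=$(printf 'foo~bar~baz\n' | tr '~' '\n')
printf '%s\n' "$result"
```

In the original pipeline this would take the place of the first sed call, with the grep and the final sed left as they are.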

Complex changes to a URL with sed

I am trying to parse an RSS feed on the Linux command line which involves formatting the raw output from the feed with sed.
I currently use this command:
feedstail -u http://www.heise.de/newsticker/heise-atom.xml -r -i 60 -f "{published}> {title} {link}" | sed 's/^\(.\{3\}\)\(.\{13\}\)\(.\{6\}\)\(.\{3\}\)\(.*\)/\1\3\5/'
This gives me a number of feed items per line that look like this:
Sat 20:33 GMT> WhatsApp-Ausfall: Server-Probleme blockieren Messaging-Dienst http://www.heise.de/newsticker/meldung/WhatsApp-Ausfall-Server-Probleme-blockieren-Messaging-Dienst-2121664.html/from/atom10?wt_mc=rss.ho.beitrag.atom
Notice the long URL at the end. I want to shorten this to better fit on the command line. Therefore, I want to change my sed command to produce the following:
Sat 20:33 GMT> WhatsApp-Ausfall: Server-Probleme blockieren Messaging-Dienst http://www.heise.de/-2121664
That means cutting everything out of the URL except a dash and the seven-digit number preceding the ".html/blablabla" bit.
Currently my sed command only changes stuff in the date bit. It would have to leave the title and the start of the URL alone and then cut stuff out of it until it reaches the seven-digit number. It needs to preserve that and then cut out everything after it. Oh yeah, and we need to leave a dash right in front of that number too.
I have no idea how to do that and can't find the answer after hours of googling. Help?
EDIT:
This is the raw output of a line of feedstail -u http://www.heise.de/newsticker/heise-atom.xml -r -i 60 -f "{published}> {title} {link}", in case it helps:
Sat, 22 Feb 2014 20:33:00 GMT> WhatsApp-Ausfall: Server-Probleme blockieren Messaging-Dienst http://www.heise.de/newsticker/meldung/WhatsApp-Ausfall-Server-Probleme-blockieren-Messaging-Dienst-2121664.html/from/atom10?wt_mc=rss.ho.beitrag.atom
EDIT 2:
It seems I can only pipe that output into one command. Piping it through multiple ones seems to break things. I don't understand why ATM.
Unfortunately (for me), I could only think of solving this with extended regexp syntax (either -E or -r flag on different systems):
... | sed -E 's|(://[^/]+/).*(-[0-9]+)\.html/.*|\1\2|'
UPDATE: In basic regexp syntax, the best I can do is
... | sed 's|\(://[^/]*/\).*\(-[0-9][0-9]*\)\.html/.*|\1\2|'
The key to writing this sort of regular expression is to be very careful about what the boundaries of what you expect are, so as to avoid the random gunk that you want to get rid of causing you problems. Also, you should bear in mind that you can use characters other than / as part of a s operation's delimiters.
sed 's!\(http://www\.heise\.de/\)newsticker/meldung/[^./]*\(-[0-9][0-9]*\)\.html[^ ]*!\1\2!'
Be aware that getting the RE right can be quite tricky; assume you'll need to test it! (This is a key part of the “now you have two problems” quote; REs very easily become horrendous.)
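Since testing is the point, a quick check against the sample line from the question could look like the following. It is only a sketch: it sticks to basic regular expressions, writes [0-9][0-9]* for "one or more digits" (a bare + is a literal character in POSIX basic syntax), and assumes every link starts with the http://www.heise.de/ prefix shown in the question.

```shell
# Sample line copied from the question (date part already shortened).
line='Sat 20:33 GMT> WhatsApp-Ausfall: Server-Probleme blockieren Messaging-Dienst http://www.heise.de/newsticker/meldung/WhatsApp-Ausfall-Server-Probleme-blockieren-Messaging-Dienst-2121664.html/from/atom10?wt_mc=rss.ho.beitrag.atom'

# Keep the host part (\1) and the dash plus article number (\2); drop the rest of the URL.
result=$(printf '%s\n' "$line" |
  sed 's!\(http://www\.heise\.de/\)[^ ]*\(-[0-9][0-9]*\)\.html[^ ]*!\1\2!')
printf '%s\n' "$result"
```

Using [^ ]* rather than .* keeps the match inside the URL, so the title in front of it is never touched.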
Something like this maybe?
... | awk -F'[^0-9]*' '{print "http://www.heise.de/-"$2}'
This might work for you (GNU sed):
sed 's|\(//[^/]*/\).*\(-[0-9]\{7\}\).*|\1\2|' file
You can place the first sed command so:
feedstail -u http://www.heise.de/newsticker/heise-atom.xml -r -i 60 -f "{published}> {title} {link}" |
sed 's/^\(.\{3\}\)\(.\{13\}\)\(.\{6\}\)\(.\{3\}\)\(.*\)/\1\3\5/;s|\(//[^/]*/\).*\(-[0-9]\{7\}\).*|\1\2|'

a simple sed script displaying only changed lines

How could I make a separate sed script (let's call it script.sed) that would display only the changed lines without having to use the -n option while executing it? (Sorry for my English)
I have a file called data2.txt with digits and I need to change the lines ending with ".5" and print those changed lines out in the console.
I know how to do it with a single command (sed -n 's/.5$//gp' data2.txt), however our university professor requires us to do the same using sed -f script.sed data2.txt command.
Any ideas?
The following should work for your sed script:
s/.5$//gp
d
The -n option suppresses automatic printing of the line. The other way to do that is to use the d command. From the man page:
d Delete pattern space. Start next cycle.
This works because the automatic printing of the line happens at the end of a cycle, and using the d command means you never reach the end of a cycle so no lines are printed automatically.
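To try this end to end, the sketch below builds a scratch directory with made-up numbers and the two-line script, then runs it with -f and no -n. Two small deviations from the script above, both assumptions on my part: the dot is escaped as \. so that only a literal ".5" matches (an unescaped . matches any character, so a line like 45 would be changed too), and the g flag is dropped because a pattern anchored with $ can match at most once per line.

```shell
# Scratch directory with made-up data and the two-line script.
dir=$(mktemp -d)
printf '1.5\n2.0\n37.5\n45\n' > "$dir/data2.txt"
printf '%s\n' 's/\.5$//p' 'd' > "$dir/script.sed"

# No -n needed: p prints the changed lines, d suppresses the automatic print.
result=$(sed -f "$dir/script.sed" "$dir/data2.txt")
printf '%s\n' "$result"
rm -rf "$dir"
```

Only the two lines that ended in ".5" should be printed, with that suffix removed.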
This might work for you (GNU sed):
#n
s/.5$//p
Save this to a file and run as:
sed -f file.sed file.txt

sed: display lines selected for deleting

How do I use a verbose flag in sed? E.g., if I am deleting some lines with a sed command, I want the deleted lines to be displayed on the screen. Also, let me know if this can be done through a script.
Thanks in advance
sed doesn't have a verbose flag.
You can write a sed script that separates deleted lines from other lines, though. You can look at the deleted lines later, and decide whether deleting them was a good idea.
Here's an example. I want to delete from test.dat every line that starts with a number.
$ cat test.dat
1 First line
2 Second line
3 Third line
A Keep this one
Here's the sed script that will "do" the deleting. It looks for lines that start with a number, writes them to the file "deleted.dat", and then deletes them from the pattern space.
$ cat code/sed/delete-verbose.sed
/^[0-9]/{
w /home/myusername/deleted.dat
d
}
Here's what happens when you run it.
$ sed -f code/sed/delete-verbose.sed test.dat
A Keep this one
And here's what it wrote to "deleted.dat".
$ cat deleted.dat
1 First line
2 Second line
3 Third line
When you're confident the script is going to do the right thing, redirect output to another file, or edit the file in-place (-i option).
This might work for you (GNU sed):
sed -e '/pattern_to_delete/{w /dev/stderr' -e ';d}' input_file > output_file
There is no verbose flag but by sending the lines to be deleted to stderr the effect you require can be achieved.
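A small sketch of how the two streams can be kept apart, using made-up file names in a scratch directory. Kept lines go to stdout and deleted lines to stderr, so each can be redirected separately. GNU sed treats /dev/stderr as a special file name for w; other seds rely on the operating system providing that device.

```shell
# Scratch directory with sample data (hypothetical file names).
dir=$(mktemp -d)
printf '1 First line\n2 Second line\nA Keep this one\n' > "$dir/test.dat"

# Kept lines go to stdout (kept.dat); deleted lines are written to stderr (deleted.log).
sed -e '/^[0-9]/{w /dev/stderr' -e 'd;}' "$dir/test.dat" > "$dir/kept.dat" 2> "$dir/deleted.log"

kept=$(cat "$dir/kept.dat")
deleted=$(cat "$dir/deleted.log")
printf 'kept:\n%s\ndeleted:\n%s\n' "$kept" "$deleted"
rm -rf "$dir"
```

Leaving out the 2> redirection shows the deleted lines on the terminal instead, which is the closest thing to a verbose mode.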

Replace non-unique occurrences with sed or other command

This is my first post here and I am at beginner level. Is there a way I can solve this problem with sed (or any other means)? I want to manipulate a newly created file daily and replace some IP and port occurrences.
1) I want to replace the first occurrence of "5027,5028" with A3 and the second with A4.
2) I want to replace the first occurrence of "5026" with A1 and the second with A2.
PS. I have tried to simplify the example and left in the preceding lines with version="y" or version="x", which could help to distinguish the occurrences from each other. (The first x and y version pair is a primary connection and the other two the secondary connection.)
Input file:
version="x"
commaSeparatedList="5027,5028"
version="y"
commaSeparatedList="5026"
version="x"
commaSeparatedList="5027,5028"
version="y"
commaSeparatedList="5026"
Edited file:
version="1.4.1-12"
commaSeparatedList="A3"
version="1.3.0"
commaSeparatedList="A1"
version="1.4.1-12"
commaSeparatedList="A4"
version="1.3.0"
commaSeparatedList="A2"
Sorry, I had some editing horror for a few minutes. Hope it looks easier to understand now. I am basically receiving this file on a system that is deployed nightly and I want to edit this file using a cron job before it starts to make sure a connection works.
Do not bother trying to use sed for this. It can be done, but sed is the wrong tool.
Use awk instead. To replace the first occurrence of "5027,5028" with A3 and the second with A4.
awk '/5027,5028/ && count < 2 { if( count ++ ) repl="A4"; else repl="A3";
sub( "5027,5028", repl)} 1' input
The second replacement is left as an exercise. It is basically the same thing, and you can either run awk twice or just add additional clauses to the above.
To overwrite the original file, use shell redirections:
awk ... input > tmpfile && mv tmpfile input
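For reference, one way the "additional clauses" variant could look, handling both replacements in a single pass. This is only a sketch: the counter names a and b are arbitrary, the input is the simplified sample from the question held in a shell variable, and the patterns include the surrounding double quotes to avoid matching longer numbers.

```shell
# Simplified sample input from the question.
input='version="x"
commaSeparatedList="5027,5028"
version="y"
commaSeparatedList="5026"
version="x"
commaSeparatedList="5027,5028"
version="y"
commaSeparatedList="5026"'

result=$(printf '%s\n' "$input" | awk '
  # First hit of "5027,5028" becomes A3, the second A4.
  /"5027,5028"/ && a < 2 { sub(/5027,5028/, (a++ ? "A4" : "A3")) }
  # First hit of "5026" becomes A1, the second A2.
  /"5026"/ && b < 2 { sub(/5026/, (b++ ? "A2" : "A1")) }
  { print }
')
printf '%s\n' "$result"
```

With a real file, replace the printf with the file name as awk's argument and use the tmpfile-and-mv step shown above to overwrite the original.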
This might work for you (GNU Sed):
sed '1,/5027,5028/s/5027,5028/A3/;s/5027,5028/A4/;1,/5026/s/5026/A1/;s/5026/A2/' file