How to use regular expressions in a sed command

I have some strings with this pattern in some files:
domain.com/page-10
domain.com/page-15
....
and I want to replace them with something like
domain.com/apple-10.html
domain.com/apple-15.html
I have found that I can use the sed command to replace them all at once, but because something has to be added after the numbers, I guess I have to use a regular expression to do it. I don't know how.

sed -i.bak -r 's/page-([0-9]+)/apple-\1.html/' file
sed 's/page-\([0-9][0-9]*\)/apple-\1.html/' file > t && mv t file
Besides sed, you can also use gawk's gensub()
awk '{b=gensub(/page-([0-9]+)/,"apple-\\1.html","g",$0) ;print b }' file

sed -i 's/page-\([0-9]*\)/apple-\1.html/' <filename>
The \([0-9]*\) captures a group of digits; the \1 in the replacement string references that capture and inserts it into the replacement. Note that [0-9]* also matches zero digits; use [0-9][0-9]* if at least one digit is required.
You may want to use something like -i.backup if you need to keep a copy of the file without the replacements, or omit -i and use I/O redirection instead.
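As a quick check, here is a minimal sketch that runs the substitution on the two sample URLs from the question, fed through a pipe instead of a real file. It uses [0-9][0-9]* so that at least one digit is required; the variable name out is only for illustration.

```shell
# Sample input taken from the question; no file is modified.
out=$(printf '%s\n' 'domain.com/page-10' 'domain.com/page-15' |
  sed 's/page-\([0-9][0-9]*\)/apple-\1.html/')
printf '%s\n' "$out"
```

Once the output looks right, the same expression can be used with -i.bak on the actual files.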

One more way to resolve the problem:
sed -i.bak 's/\(^.*\)\(page-\)\(.*\)/\1apple-\3.html/' Files
Here the matched groups are stored and retrieved using back-references (\1, \2, \3).

This appends .html to the end of every line (it does not change page to apple):
sed 's/$/\.html/g' file.txt

Related

removing unmatched lines with SED

I'm trying to remove everything but 3 separate lines with specific matching pattern and leave just the 3 lines I want
Here is my code;
sed -n '/matching pattern/matching pattern/matching pattern/p' > file.txt
If you have multiple commands on the same line, you need to separate the commands by a ;:
sed -n '/matching pattern/p;/matching pattern2/p;/matching pattern3/p' file
Alternatively you can put them onto separate lines:
sed -n '/matching pattern/p
/matching pattern2/p
/matching pattern3/p' file
Besides that, you can also use regex alternation:
sed -rn '/(pattern|pattern2|pattern3)/p' file
or (better) use grep:
grep -E '(pattern|pattern2|pattern3)' file
However, this might get messy if the patterns get longer and more complicated.
awk to the rescue!
awk '/pattern1/ || /pattern2/ || /pattern3/' filename
I think it's cleaner than alternatives.
Sed with Deletion
There's always more than one way to do this sort of thing, but one useful sed programming pattern is using alternation with deletion. For example:
# BSD sed
sed -E '/root|daemon|nobody/!d' /etc/passwd
# GNU sed
sed -r '/root|daemon|nobody/!d' /etc/passwd
This makes it possible to express ideas like "delete everything except for the listed terms." Even when expressions are functionally equivalent, it can be helpful to use a construct that most closely matches the idea you're trying to convey.
This might work for you (GNU sed):
sed '/pattern1/b;/pattern2/b;/pattern3/b;d' file
The normal flow of sed is to print what remains in the pattern space after processing. Therefore if the required pattern is in the pattern space let sed do its thing otherwise delete the line.
N.B. the b command is like a goto and if it has no following identifier, it means break out of any further sed commands and print (or not print if the -n option is in action) the contents of the pattern space.
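One difference between these variants is worth checking on a small sample: with several /pattern/p commands, a line that matches two patterns is printed twice, while the b form prints it once. A minimal sketch, assuming made-up input lines and the patterns alpha and beta (separate -e expressions are used so it does not depend on how a particular sed parses b;):

```shell
# Hypothetical input: the last line matches both patterns.
input=$(printf '%s\n' 'alpha one' 'beta two' 'gamma three' 'alpha beta')

# One p per pattern: 'alpha beta' is printed by both commands.
with_p=$(printf '%s\n' "$input" | sed -n -e '/alpha/p' -e '/beta/p')

# Branch to the end of the script on a match, delete everything else.
with_b=$(printf '%s\n' "$input" | sed -e '/alpha/b' -e '/beta/b' -e 'd')

printf 'p form:\n%s\n\nb form:\n%s\n' "$with_p" "$with_b"
```

If duplicates matter, the b form, the alternation form, or the !d form are probably the safer choices.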
If I understood you correctly:
sed -n '/\(pattern1\|pattern2\|pattern3\)/p' file > newfile

Selective find/replace with sed

I need to do some find and replace in C++ source code: replace all occurrences of _uvw with xyz except when _uvw is part of abc_uvw or def_uvw. For example:
abc_uvw ghi_uvw;
jkl_uvw def_uvw;
should become:
abc_uvw ghixyz;
jklxyz def_uvw;
So far I came up with the following:
find . -type f -print0 | xargs -0 sed -i '/abc_uvw/\!s/_uvw/xyz/g'
This will replace all _uvw with xyz only in the lines that don't contain abc_uvw, which (1) doesn't handle such a case: abc_uvw ghi_uvw; and (2) doesn't take into account the second exception, that is def_uvw.
So how would one do that sort of selective find and replace with sed?
This might work for you (GNU sed):
sed -r 's/(abc|def)_uvw/\1\n_uvw/g;s/([^\n])_uvw/\1xyz/g;s/\n//g' file
Insert a newline in front of the strings you do not want to change. Change the remaining strings, which do not have a newline in front of them. Then delete the newlines.
N.B. Newline is chosen as it cannot exist in an unadulterated sed buffer.
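Here is a small sketch of the newline-marker approach run on the two sample lines from the question. It assumes GNU sed, since \n in the replacement and inside the bracket expression is a GNU feature; -E is used in place of -r, and the variable name out is only for illustration.

```shell
# Sample lines from the question, piped in instead of read from a file.
# Step 1 marks protected names, step 2 replaces the rest, step 3 removes the marks.
out=$(printf '%s\n' 'abc_uvw ghi_uvw;' 'jkl_uvw def_uvw;' |
  sed -E 's/(abc|def)_uvw/\1\n_uvw/g;s/([^\n])_uvw/\1xyz/g;s/\n//g')
printf '%s\n' "$out"
```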
How about this?
$ cat file
abc_uvw ghi_uvw;
jkl_uvw def_uvw;
$ sed 's/abc_uvw/foo/g;s/def_uvw/bar/g;s/_uvw/xyz/g;s/foo/abc_uvw/g;s/bar/def_uvw/g' file
abc_uvw ghixyz;
jklxyz def_uvw;
You should use negative lookbehind. For example, in Perl:
perl -pe 's/(?<!(abc|def))_uvw/xyz/g' file.c
This performs a global substitution of any instances of _uvw that are not immediately preceded by abc or def.
Output:
abc_uvw ghixyz;
jklxyz def_uvw;
Sed is a useful tool and certainly has its place but Perl is a lot more powerful in terms of regular expressions. Using Perl, you get to specify exactly what you mean, rather than solving the problem in a more roundabout way.
This will work:
sed -e 's/abc_uvw/AAA_AAA/g; # shadow abc_uvw
s/def_uvw/DDD_DDD/g; # shadow def_uvw
s/_uvw/xyz/g; # substitute
s/AAA_AAA/abc_uvw/g; # recover abc_uvw
s/DDD_DDD/def_uvw/g # recover def_uvw
' input.cpp > output.cpp
cat output.cpp
sed 's/µ/µm/g;s/abc_uvw/µa/g;s/def_uvw/µd/g
s/_uvw/xyz/g
s/µd/def_uvw/g;s/µa/abc_uvw/g;s/µm/µ/g' YourFile
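This is the same idea as the other placeholder answers, but it first escapes any existing occurrence of the temporary marker, so the markers used for abc and def cannot collide with text already in the file. I use µ, but another character works too; just avoid characters that are special to sed, such as /, \, and &.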

Replacing a fixed-position character field using Perl or sed

I need to replace a particular range of characters in each line of a file.
I tried this
perl -i -pe 'r77,79c/XXX/g' file
I am trying to change the 77th to 79th characters to XXX using Perl, but above code is not working.
Do you want to replace the characters at positions 77 to 79 with XXX?
Try:
perl -i'orig_*' -pe 'substr($_, 76, 3) = "XXX"' file
A backup file called orig_file will be created to guard against data loss.
perl -i -pe 's/.{76}\K.../XXX/' file
You wrote:
Actually i want to search a pattern in a file and whatever lines matching that pattern needs to be replaced to 50th & 51st character to XX
Using sed:
sed -r '/pattern/s/^(.{49})..(.*)$/\1XX\2/' file
sed "/pattern/ s/^\(.\{49\}\)../\1XX/" YourFile
This version does not need to capture the rest of the line: sed leaves everything after the match untouched.
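To see the anchored-prefix technique on something short enough to read, here is a sketch that uses smaller, made-up numbers: it replaces characters 5 and 6 with XX on lines containing the word pattern. The sample lines and the offset 4 are assumptions for illustration; for the 50th and 51st characters the offset would be 49.

```shell
# Hypothetical input: only the first line contains "pattern".
out=$(printf '%s\n' 'abcdefgh pattern' 'abcdefgh other' |
  sed '/pattern/ s/^\(.\{4\}\)../\1XX/')
printf '%s\n' "$out"
```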

how to select lines containing several words using sed?

I am learning to use sed on Unix.
I have a file with many lines and I want to delete all lines except those containing certain strings, e.g. alex, eva and tom.
I think I can use
sed '/alex|eva|tom/!d' filename
However, it doesn't work: it doesn't match the lines I expect, only the literal string "alex|eva|tom".
Only
sed '/alex/!d' filename
works.
Does anyone know how to select lines containing more than one word using sed?
Also, parentheses, as in "sed '/(alex)|(eva)|(tom)/!d' file", don't work either, and I want the lines containing all three words.
sed is an excellent tool for simple substitutions on a single line, for anything else just use awk:
awk '/alex/ && /eva/ && /tom/' file
delete all lines except lines containing strings(e.g) alex, eva and tom
As worded, you're asking to preserve lines containing all of those words, but your samples preserve lines containing any of them. In case "all" wasn't a slip: a single regular expression can't conveniently express an any-order search, but sed lets you chain multiple matches:
sed -n '/alex/{/eva/{/tom/p}}'
or you could just delete them serially:
sed '/alex/!d; /eva/!d; /tom/!d'
The above works with GNU sed. With BSD-based userlands you'll have to insert newlines or pass the commands as separate expressions:
sed -n '/alex/ {
/eva/ {
/tom/ p
}
}'
or
sed -e '/alex/!d' -e '/eva/!d' -e '/tom/!d'
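To make the "all" versus "any" distinction concrete, here is a minimal sketch on three made-up lines. The input and the variable names are assumptions for illustration; the alternation form uses -E for extended regular expressions.

```shell
# Hypothetical input lines for illustration.
input=$(printf '%s\n' 'alex and eva' 'tom eva alex' 'nobody here')

# Keep lines containing ALL three words, in any order.
all=$(printf '%s\n' "$input" | sed -e '/alex/!d' -e '/eva/!d' -e '/tom/!d')

# Keep lines containing ANY of the words (extended regex alternation).
any=$(printf '%s\n' "$input" | sed -E '/alex|eva|tom/!d')

printf 'all:\n%s\n\nany:\n%s\n' "$all" "$any"
```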
You can use:
sed -r '/alex|eva|tom/!d' filename
OR on Mac:
sed -E '/alex|eva|tom/!d' filename
Use -i.bak for in-place editing with a backup:
sed -i.bak -r '/alex|eva|tom/!d' filename
In basic regular expressions you should use \| instead of |.
Edit: This is true for some variants of sed but not others. \| is a GNU extension; with -E (or -r in GNU sed) a plain | works.
This might work for you (GNU sed):
sed -nr '/alex/G;/eva/G;/tom/G;s/\n{3}//p' file
This method also lets you require only some of the values, e.g. if you want lines containing 2 or more of the listed words, use:
sed -nr '/alex/G;/eva/G;/tom/G;s/\n{2,3}//p' file
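The counting trick can be checked on a small sample. This is a sketch assuming GNU sed and made-up input lines; -E is used in place of -r, and the variable names are only for illustration.

```shell
# Hypothetical input lines for illustration. Assumes GNU sed.
input=$(printf '%s\n' 'alex only' 'alex and eva' 'tom eva alex')

# Each matching word appends one newline (G adds "\n" plus the empty hold space),
# so the number of trailing newlines counts how many words were found.
exactly3=$(printf '%s\n' "$input" | sed -nE '/alex/G;/eva/G;/tom/G;s/\n{3}//p')
atleast2=$(printf '%s\n' "$input" | sed -nE '/alex/G;/eva/G;/tom/G;s/\n{2,3}//p')

printf 'all three:\n%s\n\ntwo or more:\n%s\n' "$exactly3" "$atleast2"
```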

Regular expression in sed to replace C++ includes

I'm playing with sed, trying to change every
#include "X"
to:
#include <X>
However I can't seem to find a way to do this! - This is what I've done so far:
sed -i 's/#include ".*"/#include <.*>/g' filename
I think I need a variable to save the contents of ".*", I'm just unaware of how!
Yes, you do. sed's basic regular expressions use \( \) to save the contents of a match and \1 to retrieve it. If you use more than one set of \( \), then the second match is in \2, and so on.
sed -e 's/#include "\(.*\)"/#include <\1>/g' < filename
will do what you need.
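A quick way to try this without touching a source file is to pipe a few lines through the command. The sample lines below are made up for illustration; the second one is already in angle-bracket form and should pass through unchanged.

```shell
# Hypothetical source lines piped through the substitution.
out=$(printf '%s\n' '#include "foo.h"' '#include <vector>' 'int x = 0;' |
  sed 's/#include "\(.*\)"/#include <\1>/')
printf '%s\n' "$out"
```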
Try:
sed 's/#include "\(.*\)"/#include <\1>/' x.cpp
Try:
sed -i 's/#include "\(.*\)"/#include <\1>/g' filename