Selective find/replace with sed - regex

I need to do some find and replace in C++ source code: replace all occurrences of _uvw with xyz except when _uvw is part of abc_uvw or def_uvw. For example:
abc_uvw ghi_uvw;
jkl_uvw def_uvw;
should become:
abc_uvw ghixyz;
jklxyz def_uvw;
So far I came up with the following:
find . -type f -print0 | xargs -0 sed -i '/abc_uvw/\!s/_uvw/xyz/g'
This will replace all _uvw with xyz only in the lines that don't contain abc_uvw, which (1) doesn't handle such a case: abc_uvw ghi_uvw; and (2) doesn't take into account the second exception, that is def_uvw.
So how would one do that sort of selective find and replace with sed?

This might work for you (GNU sed):
sed -r 's/(abc|def)_uvw/\1\n_uvw/g;s/([^\n])_uvw/\1xyz/g;s/\n//g' file
Insert a newline in front of the strings you do not want to change. Change those strings which do not have a newline in front of them. Delete any newlines.
N.B. Newline is chosen as it cannot exist in an unadulterated sed buffer.
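The three steps can be traced on the sample input from the question. This is only a sketch, assuming GNU sed (for -E and for \n inside a bracket expression); the variable names are made up for illustration. The second command is a hedged variant that is not part of the answer above: ([^\n]) needs a character before _uvw, so a _uvw at the very start of a line is left alone unless it is handled separately.

```shell
input='abc_uvw ghi_uvw;
jkl_uvw def_uvw;'

# 1. Protect the exceptions by splitting them with a newline.
# 2. Replace _uvw wherever the preceding character is not that newline.
# 3. Remove the protective newlines again.
result=$(printf '%s\n' "$input" |
  sed -E 's/(abc|def)_uvw/\1\n_uvw/g; s/([^\n])_uvw/\1xyz/g; s/\n//g')
printf '%s\n' "$result"

# Variant (an assumption about what is wanted): also replace a _uvw that
# starts the line, using one extra anchored substitution.
at_start=$(printf '_uvw x_uvw\n' |
  sed -E 's/(abc|def)_uvw/\1\n_uvw/g; s/^_uvw/xyz/; s/([^\n])_uvw/\1xyz/g; s/\n//g')
printf '%s\n' "$at_start"
```

The first command prints the two lines the question asks for; the second shows that a leading _uvw is covered once the anchored substitution is added.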

How about this?
$ cat file
abc_uvw ghi_uvw;
jkl_uvw def_uvw;
$ sed 's/abc_uvw/foo/g;s/def_uvw/bar/g;s/_uvw/xyz/g;s/foo/abc_uvw/g;s/bar/def_uvw/g' file
abc_uvw ghixyz;
jklxyz def_uvw;

You should use negative lookbehind. For example, in Perl:
perl -pe 's/(?<!(abc|def))_uvw/xyz/g' file.c
This performs a global substitution of any instances of _uvw that are not immediately preceded by abc or def.
Output:
abc_uvw ghixyz;
jklxyz def_uvw;
Sed is a useful tool and certainly has its place but Perl is a lot more powerful in terms of regular expressions. Using Perl, you get to specify exactly what you mean, rather than solving the problem in a more roundabout way.

This will work:
sed -e 's/abc_uvw/AAA_AAA/g; # shadow abc_uvw
s/def_uvw/DDD_DDD/g; # shadow def_uvw
s/_uvw/xyz/g; # substitute
s/AAA_AAA/abc_uvw/g; # recover abc_uvw
s/DDD_DDD/def_uvw/g # recover def_uvw
' input.cpp > output.cpp
cat output.cpp
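To run the placeholder approach over a tree of files, as the find | xargs pipeline in the question tries to do, something like the following may work. It is only a sketch: the scratch directory, the file name sample.cpp and the *.cpp filter are made up for the demonstration, and the placeholders AAA_AAA and DDD_DDD are assumed not to occur in the real sources. -i.bak keeps a backup beside each edited file.

```shell
# Scratch directory with one sample source file (hypothetical names).
workdir=$(mktemp -d)
printf '%s\n' 'abc_uvw ghi_uvw;' 'jkl_uvw def_uvw;' > "$workdir/sample.cpp"

# Shadow the exceptions, substitute, then restore them, editing in place.
find "$workdir" -type f -name '*.cpp' -exec sed -i.bak \
  -e 's/abc_uvw/AAA_AAA/g' -e 's/def_uvw/DDD_DDD/g' \
  -e 's/_uvw/xyz/g' \
  -e 's/AAA_AAA/abc_uvw/g' -e 's/DDD_DDD/def_uvw/g' {} +

edited=$(cat "$workdir/sample.cpp")
backup=$(cat "$workdir/sample.cpp.bak")
printf '%s\n' "$edited"

# Clean up the scratch directory.
rm -rf "$workdir"
```

Because the backups are named sample.cpp.bak, the *.cpp filter does not pick them up on a second run.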

sed 's/µ/µm/g;s/abc_uvw/µa/g;s/def_uvw/µd/g
s/_uvw/xyz/g
s/µd/def_uvw/g;s/µa/abc_uvw/g;s/µm/µ/g' YourFile
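This is the same idea as the other placeholder answers, but it first "escapes" any existing occurrence of the temporary marker, so that text already in the file cannot be mistaken for a placeholder standing in for abc_uvw or def_uvw. I use µ, but any other character works as long as it is not special to sed, such as /, \ or &.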
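Why the escaping step matters can be seen with an input that already contains something that looks like a placeholder. A small sketch, using ~ as the marker instead of µ purely to keep the example ASCII (that choice, the sample line and the variable names are assumptions, not part of the answer):

```shell
# Input that already contains the text "~a", which must survive unchanged.
line='x~a abc_uvw ghi_uvw;'

# With escaping: existing ~ becomes ~m first and is restored at the end.
escaped=$(printf '%s\n' "$line" | sed 's/~/~m/g;s/abc_uvw/~a/g;s/def_uvw/~d/g
s/_uvw/xyz/g
s/~d/def_uvw/g;s/~a/abc_uvw/g;s/~m/~/g')
printf '%s\n' "$escaped"

# Without escaping: the pre-existing "~a" is mistaken for a placeholder
# and expanded to abc_uvw.
unescaped=$(printf '%s\n' "$line" | sed 's/abc_uvw/~a/g;s/def_uvw/~d/g
s/_uvw/xyz/g
s/~d/def_uvw/g;s/~a/abc_uvw/g')
printf '%s\n' "$unescaped"
```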

Related

Why does this regex work in grep but not sed?

I have two regular expressions:
$ grep -E '\-\- .*$' *.sql
$ sed -E '\-\- .*$' *.sql
(I am trying to grep lines in sql files that have comments and remove lines in sql files that have comments)
The grep command works using this regex; however, the sed returns the following error:
sed: -e expression #1, char 7: unterminated address regex
What am I doing incorrectly with sed?
(The space after the two hyphens is required for sql comments if you are unfamiliar with MySql comments of this type)
You're trying to use:
sed -E '\-\- .*$' *.sql
This sed command is not valid because it doesn't tell sed to do anything.
It should be:
sed -n '/-- /p' *.sql
and equivalent grep would be:
grep -- '-- ' *.sql
or even better with a fixed string search:
grep -F -- '-- ' *.sql
The -- marks the end of grep's options, so the pattern '-- ' that follows is not parsed as an option.
There is no need to escape - in a regex if it is outside a bracket expression (or character class), i.e. [...].
Based on comments below it seems OP's intent is to remove commented section in all *.sql files that start with 2 hyphens.
You may use this sed for that:
sed -i 's/-- .*//g' *.sql
The problem here is not the regex, the problem is that sed requires a command. The equivalent of your grep would be:
sed -n '/\-\- .*$/p'
-n suppresses the automatic printing of every line, the regex is wrapped in slashes to form an address, and p (after the last slash) prints the lines that match.
P.S.: As Anub pointed out, escaping the hyphens - inside the regex is unnecessary.
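A quick way to convince yourself that the two forms select the same lines is to run both on the same text. A minimal sketch with made-up SQL (the sample lines and variable names are assumptions):

```shell
sql='SELECT 1; -- trailing comment
SELECT 2;
-- whole-line comment'

# grep: -F takes the pattern literally, -- ends the option list so that
# the pattern "-- " is not parsed as an option.
with_grep=$(printf '%s\n' "$sql" | grep -F -- '-- ')

# sed: -n suppresses automatic printing, p prints the matching lines.
with_sed=$(printf '%s\n' "$sql" | sed -n '/-- /p')

printf '%s\n' "$with_grep"
```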
You are using sed's \cregexpc syntax: by starting the expression with \- you tell sed that the delimiter character is a dash (-), but you never terminate the regex with a closing dash, as in \-<...>-. Terminate it, and add the d command to delete those lines:
sed '\-\-\-.*$-d' infile
see man sed about that:
\cregexpc
Match lines matching the regular expression regexp. The c may be any character.
If the default delimiter / is used, none of this is required:
sed '/--.*$/d' infile
or simply:
sed '/^--/d' infile
and more accurately:
sed '/^[[:blank:]]*--/d' infile
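These variants do not delete the same lines, which is easy to miss. A sketch on made-up SQL (sample text and variable names are assumptions) shows the difference:

```shell
sql='SELECT 1; -- trailing comment
-- whole-line comment
  -- indented comment
SELECT 2;'

# Deletes every line that contains -- anywhere, including the first statement.
anywhere=$(printf '%s\n' "$sql" | sed '/--.*$/d')

# Same regex with - as the delimiter (\cregexpc form); behaves identically.
custom=$(printf '%s\n' "$sql" | sed '\-\-\-.*$-d')

# Deletes only lines that begin with --.
at_start=$(printf '%s\n' "$sql" | sed '/^--/d')

# Also deletes comment lines that are indented with blanks.
indented=$(printf '%s\n' "$sql" | sed '/^[[:blank:]]*--/d')

printf '%s\n' "$indented"
```

Only the last form keeps both statements while dropping both comment-only lines; the first form also throws away code that merely has a trailing comment.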

sed regex with alternative on Solaris doesn't work

I'm trying to use sed with a regex on Solaris, but it doesn't work.
I need to show only the lines that match my regex.
sed -n -E '/^[a-zA-Z0-9]*$|^a_[a-zA-Z0-9]*$/p'
input file:
grtad
a_pitr
_aupa
a__as
baman
12353
ai345
ki_ag
-MXx2
!!!23
+_)#*
I want to show only lines matching to above regex:
grtad
a_pitr
baman
12353
ai345
Is there another way to write the alternation? Is it possible in Perl?
Thanks for any solutions.
With Perl
perl -ne 'print if /^(a_)?[a-zA-Z0-9]*$/' input.txt
The (a_)? matches a_ one-or-zero times, so optionally. It may or may not be there.
The (a_) also captures the match, which is not needed, so you can use (?:a_)? instead. The ?: makes () only group what is inside (so that ? applies to the whole thing) without remembering it.
with grep
$ grep -xiE '(a_)?[a-z0-9]*' ip.txt
grtad
a_pitr
baman
12353
ai345
-x match whole line
-i ignore case
-E extended regex; if not available, use grep -xi '\(a_\)\?[a-z0-9]*' (note that \? is a GNU extension to basic regular expressions)
(a_)? zero or one time match a_
[a-z0-9]* zero or more letters or digits
With sed
sed -nE '/^(a_)?[a-zA-Z0-9]*$/p' ip.txt
or, with GNU sed
sed -nE '/^(a_)?[a-z0-9]*$/Ip' ip.txt
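If the sed at hand has neither -E nor -r (which appears to be the situation in the question), the alternation can be avoided altogether. A sketch that assumes only POSIX basic regular expressions: either give each alternative its own -e command, or make the a_ prefix optional with the interval \{0,1\}. The input is the list from the question; the variable names are made up.

```shell
input='grtad
a_pitr
_aupa
a__as
baman
12353
ai345
ki_ag
-MXx2
!!!23
+_)#*'

# One command per alternative; no line can match both patterns,
# so nothing is printed twice.
two_cmds=$(printf '%s\n' "$input" |
  sed -n -e '/^[a-zA-Z0-9]*$/p' -e '/^a_[a-zA-Z0-9]*$/p')

# Optional prefix written as a BRE interval instead of the ERE ? operator.
interval=$(printf '%s\n' "$input" |
  sed -n '/^\(a_\)\{0,1\}[a-zA-Z0-9]*$/p')

printf '%s\n' "$two_cmds"
```

Both commands should select the five lines listed in the question. Note that, like the original regex, they also accept an empty line.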

removing unmatched lines with SED

I'm trying to remove everything but 3 separate lines with specific matching pattern and leave just the 3 lines I want
Here is my code;
sed -n '/matching pattern/matching pattern/matching pattern/p' > file.txt
If you have multiple commands on the same line, you need to separate the commands by a ;:
sed -n '/matching pattern/p;/matching pattern2/p;/matching pattern3/p' file
Alternatively you can put them onto separate lines:
sed -n '/matching pattern/p
/matching pattern2/p
/matching pattern3/p' file
Beside that, you can also use regex alternation:
sed -rn '/(pattern|pattern2|pattern3)/p' file
or (better) use grep:
grep -E '(pattern|pattern2|pattern3)' file
However, this might get messy if the patterns get longer and more complicated.
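One difference between the two forms is worth knowing: with separate p commands, a line that matches more than one pattern is printed once per matching command, whereas alternation prints it once. A small sketch with made-up input and patterns:

```shell
input='root only
root and daemon
neither'

# Separate commands: the second line matches both patterns and is printed twice.
separate=$(printf '%s\n' "$input" | sed -n '/root/p;/daemon/p')

# Alternation (extended regex): each matching line is printed once.
alternation=$(printf '%s\n' "$input" | sed -n -E '/root|daemon/p')

printf '%s\n' "$separate"
```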
awk to the rescue!
awk '/pattern1/ || /pattern2/ || /pattern3/' filename
I think it's cleaner than alternatives.
Sed with Deletion
There's always more than one way to do this sort of thing, but one useful sed programming pattern is using alternation with deletion. For example:
# BSD sed
sed -E '/root|daemon|nobody/!d' /etc/passwd
# GNU sed
sed -r '/root|daemon|nobody/!d' /etc/passwd
This makes it possible to express ideas like "delete everything except for the listed terms." Even when expressions are functionally equivalent, it can be helpful to use a construct that most closely matches the idea you're trying to convey.
This might work for you (GNU sed):
sed '/pattern1/b;/pattern2/b;/pattern3/b;d' file
The normal flow of sed is to print what remains in the pattern space after processing. Therefore if the required pattern is in the pattern space let sed do its thing otherwise delete the line.
N.B. the b command is like a goto and if it has no following identifier, it means break out of any further sed commands and print (or not print if the -n option is in action) the contents of the pattern space.
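The branch-or-delete flow can be tried on a few made-up lines. This sketch passes each command as its own -e option, which avoids relying on ; to end the (empty) label after b; the input and variable name are assumptions.

```shell
input='keep pattern1
drop me
keep pattern2
drop me too
keep pattern3'

# Each b with no label jumps to the end of the script, where the line is
# printed automatically; lines that reach d are deleted instead.
kept=$(printf '%s\n' "$input" |
  sed -e '/pattern1/b' -e '/pattern2/b' -e '/pattern3/b' -e 'd')
printf '%s\n' "$kept"
```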
If I understood you correctly:
sed -n '/\(pattern1\|pattern2\|pattern3\)/p' file > newfile

replace number in a string

I am trying to match this string
'12.34.5.6',#### OR
'12.34.5.6', #### (Note the space after the comma)
in a series of files and replace #### with 2222.
I started small and this command successfully changed 1234 to 2222
sed -i 's/'12.34.5.6\''\,1234/'12.34.5.6\''\, 2222/g' file.txt
So I moved on to replacing 1234 with a regex. Below are some of the commands I've tried that do not work.
sed -i 's/'12.34.5.6\''\,\(\s?[0-9]{4,5}\)/'12.34.5.6\''\, 2222/g' file.txt
sed -i 's/'12.34.5.6\''\,[0-9][0-9][0-9][0-9][0-9]?/'12.34.5.6\''\, 2222/g' file.txt
Can someone help me out with this or give some pointers?
sed -r "s/('12[.]34[.]5[.]6',[ ]?)[0-9]{4}/\\12222/g"
This might do the trick:
sed -E "s/('12.34.5.6',\s?)[0-9]{4,5}/\12222/g"
Examples:
$ echo "'12.34.5.6', 2134" | sed -E "s/('12.34.5.6',\s?)[0-9]{4,5}/\12222/g"
'12.34.5.6', 2222
$ echo "'12.34.5.6',9230" | sed -E "s/('12.34.5.6',\s?)[0-9]{4,5}/\12222/g"
'12.34.5.6',2222
Explanation:
With -E we ask sed to use extended regular expressions (mainly a matter of taste here). The beginning of the regex is fairly simple: '12.34.5.6', matches that same string (strictly speaking, each unescaped . matches any character). We then add \s followed by a ? to indicate that the whitespace is optional. This first part is enclosed in parentheses so that it can be reused in the replacement pattern.
Then, we add the #'s to the pattern. I assumed you used #'s in place of digits, based on your attempts with [0-9]{4,5} and [0-9][0-9][0-9][0-9][0-9].
Finally, in the replacement pattern we refer to the first parenthesized group with \1 and append our 2's: \12222. The matched digits (the #'s) are discarded in the process because they are not enclosed in the parentheses.
PS. Next time please format your question for better readability.
PPS. I think the real issue here is not the regex but the quote escaping in your regex. Maybe take look at [this question].
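One detail the commands above gloss over is that an unescaped . matches any character, not just a literal dot. The following sketch contrasts the two spellings; the look-alike string and the variable names are made up, and a plain " ?" is used as a portable stand-in for \s?.

```shell
# A line that should match, and a look-alike that should not.
good="'12.34.5.6', 1234"
lookalike="'12x34y5z6', 1234"

# Unescaped dots match any character, so the look-alike is rewritten too.
loose=$(printf '%s\n' "$lookalike" | sed -E "s/('12.34.5.6', ?)[0-9]{4,5}/\12222/")

# Escaped dots match only a literal ".".
strict_good=$(printf '%s\n' "$good" | sed -E "s/('12\.34\.5\.6', ?)[0-9]{4,5}/\12222/")
strict_lookalike=$(printf '%s\n' "$lookalike" | sed -E "s/('12\.34\.5\.6', ?)[0-9]{4,5}/\12222/")

printf '%s\n' "$loose" "$strict_good" "$strict_lookalike"
```

Putting the sed script in double quotes is what lets the single quotes of the address appear literally, which sidesteps the quote escaping that tripped up the original attempts.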

How to use regular expression in sed command

I have some strings with this pattern in some files:
domain.com/page-10
domain.com/page-15
....
and I want to replace them with something like
domain.com/apple-10.html
domain.com/apple-15.html
I have found that I can use the sed command to replace them all at once, but because something has to be added after the numbers, I guess I have to use a regular expression to do it. I don't know how.
sed -i.bak -r 's/page-([0-9]+)/apple-\1.html/' file
sed 's/page-\([0-9][0-9]*\)/apple-\1.html/' file > t && mv t file
Besides sed, you can also use gawk's gensub()
awk '{b=gensub(/page-([0-9]+)/,"apple-\\1.html","g",$0) ;print b }' file
sed -i 's/page-\([0-9]*\)/apple-\1.html/' <filename>
The \([0-9]*\) captures a group of digits; the \1 in the replacement string references that capture and adds it as part of the replacement string.
You may want to use something like -i.backup if you need to keep a copy of the file without the replacements, or just omit the -i and use I/O redirection instead.
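Two small caveats with this command can be checked directly. [0-9]* also matches zero digits, and without the g flag only the first match on each line is replaced. A sketch with made-up input lines (the URLs beyond those in the question and the variable names are assumptions):

```shell
# [0-9]* also matches zero digits, so "page-" without a number is rewritten.
loose=$(printf '%s\n' 'domain.com/page-about' | sed 's/page-\([0-9]*\)/apple-\1.html/')

# Requiring at least one digit leaves that line alone...
strict=$(printf '%s\n' 'domain.com/page-about' | sed 's/page-\([0-9][0-9]*\)/apple-\1.html/')

# ...and the g flag handles several links on one line.
multi=$(printf '%s\n' 'domain.com/page-10 domain.com/page-15' |
  sed 's/page-\([0-9][0-9]*\)/apple-\1.html/g')

printf '%s\n' "$loose" "$strict" "$multi"
```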
One more way to resolve the problem:
sed -i.bak 's/\(^.*\)\(page-\)\(.*\)/\1apple-\3.html/' Files
Here the searching patterns are stored and retrieved using references (\1, \2, \3).
This appends .html to the end of every line (it does not change page to apple):
sed 's/$/\.html/g' file.txt