Escaping plus in sed regular expression - regex

There is a file with the following text:
CXX_FLAGS = -fPIC -Wall -Wextra -Wno-missing-braces -ffloat-store -pthread -std=gnu++17
To replace the string "-std=gnu++17" with "-std=c++17 -std=gnu++17", I tried:
sed -i -e 's/\-std\=gnu\+\+17/\-std=c\+\+17 \-std=gnu\+\+17/g' filename
That, however, does not work until I remove the \ escape from the first + sign in the search expression. So these seem to be OK:
sed -i -e 's/\-std\=gnu++17/\-std=c\+\+17 \-std=gnu\+\+17/g' filename
sed -i -e 's/\-std\=gnu+\+17/\-std=c\+\+17 \-std=gnu\+\+17/g' filename
sed -i -e 's/\-std\=gnu..17/\-std=c\+\+17 \-std=gnu\+\+17/g' filename
I understand that + must be escaped when it is not in a character class, but I thought any character could be prefixed with a backslash in a regex. Why does escaping the + sign here cause the search-replace to fail?
The OS is Ubuntu 20.04 LTS.

You used neither the -r nor the -E option, so sed parses the pattern as a POSIX BRE (basic regular expression). In GNU sed, \+ in a BRE is a quantifier that matches one or more occurrences of the preceding pattern. Run sed -e 's/\-std\=gnu\+\+17/\-std=c\+\+17 \-std=gnu\+\+17/g' <<< '-std=gnuuuu17' and the result will be -std=c++17 -std=gnu++17. To match a literal +, just use an unescaped +.
Note that you over-escaped many characters, and the command is unnecessarily long because you repeated the matched text in both the LHS and the RHS.
You may use the following POSIX BRE sed command with GNU sed:
sed -i 's/-std=gnu++17/-std=c++17 &/' filename
See the sed online demo:
s='CXX_FLAGS = -fPIC -Wall -Wextra -Wno-missing-braces -ffloat-store -pthread -std=gnu++17'
sed 's/-std=gnu++17/-std=c++17 &/' <<< "$s"
# => CXX_FLAGS = -fPIC -Wall -Wextra -Wno-missing-braces -ffloat-store -pthread -std=c++17 -std=gnu++17
Details
-std=gnu++17 - the pattern matches the string -std=gnu++17 exactly
-std=c++17 & - the replacement is -std=c++17, a space, and &, which stands for the whole match, -std=gnu++17.
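As a minimal sketch of the difference, assuming a sed that accepts -E (GNU and BSD sed both do), the same substitution can be written in BRE and in ERE. The variable names s, bre and ere are only for illustration. In BRE an unescaped + is literal; in ERE it is a quantifier, so it has to be escaped to match a literal +:

```shell
s='-std=gnu++17'

# BRE (the default): an unescaped + is an ordinary character.
bre=$(printf '%s\n' "$s" | sed 's/-std=gnu++17/-std=c++17 &/')

# ERE (-E): + is a quantifier, so a literal + must be written as \+.
ere=$(printf '%s\n' "$s" | sed -E 's/-std=gnu\+\+17/-std=c++17 &/')

# Both lines print: -std=c++17 -std=gnu++17
printf '%s\n' "$bre" "$ere"
```

In the replacement part, + has no special meaning in either mode, so it never needs escaping there.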

Related

insert string between each pair of doublequotes

I am stuck with the following situation. I have a string as shown below:
-name "B_12*" -o -name "B_21*" -o -name "B_31" -o -name "B_41"
I want to convert the above string to the following:
-name "B_12*.tar" -o -name "B_21*.tar" -o -name "B_31.tar" -o -name "B_41.tar"
I am not an expert with bash commands, but I have some idea that the problem could be solved with the sed command.
The only tricky part here is that you need to match both quotes of each pair so that the closing quote is not matched again as an opening one. With a sed that supports ERE via the -E option, the following command suffices:
sed -E 's/("[^"]*)"/\1.tar"/g' file
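A quick check of that command against the string from the question, reading from a pipe instead of a file (the variable names s and out are just for illustration):

```shell
s='-name "B_12*" -o -name "B_21*" -o -name "B_31" -o -name "B_41"'

# Capture the opening quote plus everything up to the closing quote,
# then re-emit the capture followed by .tar and the closing quote.
out=$(printf '%s\n' "$s" | sed -E 's/("[^"]*)"/\1.tar"/g')

# Prints: -name "B_12*.tar" -o -name "B_21*.tar" -o -name "B_31.tar" -o -name "B_41.tar"
printf '%s\n' "$out"
```

Because each match consumes both quotes of a pair, the next search starts after the closing quote and the pairs stay aligned.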
This pattern matches the text between the double quotes, without the quotes themselves; all you need to do is collect the matches and append .tar to each in a separate step:
\b[A-Z][^"]+
\b[A-Z] matches a word boundary followed by a character in the range [A-Z]
[^"]+ matches one or more characters up to the next "

How to use -o with grep to extract a number

For example, if I try to run the command without the -o tag:
> grep '[0-9]*' <<< "ss1578130091522"
> ss1578130091522
If I try to run with the -o tag, I get this:
> grep -o '[0-9]*' <<< "ss1578130091522"
>
Why does it return me an empty line? Shouldn't it extract the number for me?
This is what I ideally want:
> grep -o '[0-9]*' <<< "ss1578130091522"
> 1578130091522
This is using zsh on macOS Catalina.
The regex [0-9]* also matches the empty string, and the leftmost match in this input (at the start of the line, before ss) is empty, so that's what grep -o returns.
Try
grep -E -o '[0-9]+' <<<"ss1578130091522"
or
grep -o '[0-9]\+' <<<"ss1578130091522"
With -E, grep supports extended regular expression syntax.
Without -E, you have to use POSIX basic regular expression syntax, which
(insanely IMHO) requires the plus operator to be backslashed.
(The original grep from the early 1970s did not have this operator; that's why this syntax is "extended".)
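To see the alternatives side by side, here is a small sketch. The [0-9][0-9]* spelling is an addition not mentioned above: it stays within standard BRE and avoids + entirely. The variable names are illustrative:

```shell
s='ss1578130091522'

# ERE: + means "one or more".
ere=$(printf '%s\n' "$s" | grep -E -o '[0-9]+')

# BRE without any + at all: one digit followed by zero or more digits.
bre=$(printf '%s\n' "$s" | grep -o '[0-9][0-9]*')

# Both lines print: 1578130091522
printf '%s\n' "$ere" "$bre"
```

Either form requires at least one digit, so the empty match at the start of the line is no longer possible.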
You are clearly using zsh, not bash. These are two different, incompatible shells (though they share some features, and are both based on the Bourne shell sh).

Multiline regex search with ag

I'd like to do an "AND" search for text within a specific multiline range in a file, using a regex with ag (the_silver_searcher), but the regex pattern does not work.
The following regex pattern works well:
ag --multiline -G "^.*\.(md|txt)$" -C 1 -S "foo(\n|.)*baz" ./dev_note.md
(output)
40-
41:foo
42:bar
43:baz
44-
But the following regex pattern outputs nothing (no match):
ag --multiline -G "^.*\.(md|txt)$" -C 1 -S "(?=(.|\n)*(foo))(?=(.|\n)*(baz))" ./dev_note.md
I also tried: ag --multiline -G "^.*\.(md|txt)$" -C 1 -S "(?=(.|\n)*(foo))(.|\n)*(?=(.|\n)*(baz))" ./dev_note.md

Regex sed issue

My sed expression looks as below:
sed -i "s/-D CONSOLELOG /-D CONSOLELOG -fPIC /g" makefile.init
makefile.init
CFLAGS = -std=c99 -rdynamic -g -Wall -Wno-write-strings -D CONSOLELOG
Output after 1st run (as expected)
CFLAGS = -std=c99 -rdynamic -g -Wall -Wno-write-strings -D CONSOLELOG -fPIC
2nd run (notice the extra -fPIC at the end)
CFLAGS = -std=c99 -rdynamic -g -Wall -Wno-write-strings -D CONSOLELOG -fPIC -fPIC
I need to modify my sed expression so that it produces the output in (1) no matter how many times it is executed.
This might work for you (GNU sed):
sed -ri 's/-D CONSOLELOG (-fPIC )?$/&-fPIC /' file
This would insert at most 2 -fPIC options following a -D CONSOLELOG option.
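A small sketch of that behaviour, using -E instead of -r and a shortened CFLAGS line with a trailing space (both are assumptions made for the demo; step is a hypothetical helper name):

```shell
line='CFLAGS = -g -D CONSOLELOG '

# One application of the substitution, reading stdin instead of editing a file.
step() { sed -E 's/-D CONSOLELOG (-fPIC )?$/&-fPIC /'; }

r1=$(printf '%s\n' "$line" | step)   # one -fPIC appended
r2=$(printf '%s\n' "$r1" | step)     # a second -fPIC appended
r3=$(printf '%s\n' "$r2" | step)     # no match any more, line unchanged

printf '[%s]\n' "$r1" "$r2" "$r3"
```

After the second run the line ends in two -fPIC options, which the optional group (-fPIC )? can no longer match, so later runs leave it alone.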
Sample changed for illustration purposes
$ cat ip.txt
42 foo baz
ijk baz xyz
$ sed -i 's/baz $/&123/' ip.txt
$ cat ip.txt
42 foo baz 123
ijk baz xyz
$ # further runs won't change input
$ sed -i 's/baz $/&123/' ip.txt
$ cat ip.txt
42 foo baz 123
ijk baz xyz
$ is a meta character to ensure matching only at end of line
so, matches elsewhere in the line won't be changed and hence applying the command again won't result in duplication
& in replacement section is backreference to entire matched string in search section
since there can only be one match at end of line, g modifier is not needed
To replace anywhere in the line (assuming only a single match per line):
$ cat ip.txt
42 foo baz
ijk baz xyz
$ sed -i '/baz 123/! s/baz /&123/' ip.txt
$ cat ip.txt
42 foo baz 123
ijk baz 123xyz
$ # further runs won't change input
$ sed -i '/baz 123/! s/baz /&123/' ip.txt
$ cat ip.txt
42 foo baz 123
ijk baz 123xyz
sed commands can be qualified with addressing
here, /baz 123/! means lines not matching baz 123
Further reading: Difference between single and double quotes in Bash
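Applied to the makefile.init line from the question, the same addressing idea might look like the sketch below. It assumes the line has a trailing space after CONSOLELOG, as the original s command requires; add_fpic is a hypothetical helper name, and the demo pipes text through sed rather than editing the file in place:

```shell
line='CFLAGS = -std=c99 -rdynamic -g -Wall -Wno-write-strings -D CONSOLELOG '

# Only substitute on lines that do not already contain the flag.
add_fpic() { sed '/-D CONSOLELOG -fPIC/! s/-D CONSOLELOG /&-fPIC /'; }

once=$(printf '%s\n' "$line" | add_fpic)
twice=$(printf '%s\n' "$once" | add_fpic)

# Both results are identical: a single -fPIC after -D CONSOLELOG.
printf '[%s]\n' "$once" "$twice"
```

The address /-D CONSOLELOG -fPIC/! skips any line that was already modified, which is what makes repeated runs harmless.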

Why is sed's + regex operator not matching any leading whitespace, but * is?

I thought + and * are both greedy. Why does + not seem to match anything in this context, but * does?
$ cat test
    a
$ sed -i 's/^[ \t]+//g' test
$ cat test
    a
$ sed -i 's/^[ \t]*//g' test
$ cat test
a
Those are just spaces (not tabs) before a, but tabs alone, or a mix of both, give the same result.
This is on sed (GNU sed) 4.2.2.
+ is not recognised as a quantifier in Basic Regular Expressions (BRE), which is the default for sed, so an unescaped + matches the literal + character.
Use the -E option to make sed use ERE (Extended Regular Expressions), where + is a quantifier.
In GNU sed's default BRE mode, you need to escape the + with a backslash, \+:
sed -i 's/^[ \t]\+//g' test
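The behaviours can be compared in one short sketch. It uses [[:blank:]] (space or tab) instead of [ \t], since \t inside a bracket expression is an extension that not every sed supports; the variable names are illustrative:

```shell
s='    a'

# BRE: the + is literal, so this looks for a blank followed by "+". No match.
lit=$(printf '%s\n' "$s" | sed 's/^[[:blank:]]+//')

# ERE: + is a quantifier, so the leading blanks are removed.
ere=$(printf '%s\n' "$s" | sed -E 's/^[[:blank:]]+//')

# BRE with no + at all: one blank followed by zero or more blanks.
star=$(printf '%s\n' "$s" | sed 's/^[[:blank:]][[:blank:]]*//')

# Prints: [    a] [a] [a]
printf '[%s] [%s] [%s]\n' "$lit" "$ere" "$star"
```

The g flag is omitted because a pattern anchored with ^ can match at most once per line.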