Understand sed usage - regex

I use sed to automatically update the version in my doxyfile using this:
sed -i -e "s/PROJECT_NUMBER.([ ]{2,}=.*)/PROJECT_NUMBER = $$VERSION/g" ".doxygen"
with $$VERSION = 1.1.0 (for example)
and as a source :
PROJECT_NUMBER = 1.0.10
But it generates a copy of my .doxygen named .doxygen-e and doesn't change the line. I've tested my regex here.
I don't understand what's wrong, given that it works with my plist file using this:
sed -i -e "s/#VERSION#/$$VERSION/g" "./$${TARGET}.app/Contents/Info.plist"

There are a couple of problems here:
You need to refer to a shell variable $FOO as $$FOO in a Makefile. If you are attempting to do it in bash or any other shell, saying:
$$FOO
would result in the numeric PID of the current process concatenated with FOO, e.g. if the PID of the current process is 1234, then you'd get:
1234FOO
That said, your regex seems to be wrong on more than one count. You say:
PROJECT_NUMBER.([ ]{2,}=.*)
Since you are not using any sed option that enables Extended Regular Expressions, the pattern is read as a basic regular expression, in which (, ), { and } are literal characters. It would therefore match the string PROJECT_NUMBER, followed by any one character, a literal (, a space, the literal text {2,}, an = sign, and then everything up to the last ) in the line.
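A quick way to see the difference is to run the same pattern in both modes. This is only a sketch: the input line (with several spaces before the =) is an assumed doxyfile layout, and -E is the extended-regex switch accepted by GNU and BSD sed.

```shell
# Assumed doxyfile line; the run of spaces before "=" is made up for the demo.
line='PROJECT_NUMBER         = 1.0.10'
pattern='PROJECT_NUMBER.([ ]{2,}=.*)'

# Basic regular expressions (sed's default): ( ) { } are literal characters,
# so the line would have to contain a real "(" to match.
bre=$(printf '%s\n' "$line" | sed -n "/$pattern/p")

# Extended regular expressions (-E): ( ) group and {2,} is a repetition count.
ere=$(printf '%s\n' "$line" | sed -E -n "/$pattern/p")

echo "BRE match: [$bre]"
echo "ERE match: [$ere]"
```

With the default syntax nothing is printed for the line, while the -E run prints it.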
Since you haven't said exactly what the line in the file looks like, I'd assume that it's of the form:
PROJECT_NUMBER = 42.42
The following might work for you:
sed "s/\(PROJECT_NUMBER[ ]*=[ ]*\)[^ ]*/\1$VERSION/" filename
If invoking from within a Makefile, you'd need to double the $.
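Here is a minimal sketch of that substitution run against a throwaway file. The temporary file and its contents are stand-ins for the real .doxygen. The -i.bak spelling (suffix attached to -i) is used because both GNU and BSD sed accept it; the .doxygen-e file in the question would be consistent with a BSD/macOS sed taking the -e after -i as a backup suffix, though that is an assumption about the asker's platform.

```shell
# Hypothetical stand-in for the .doxygen file from the question.
doxyfile=$(mktemp)
printf 'PROJECT_NAME = demo\nPROJECT_NUMBER = 1.0.10\n' > "$doxyfile"

VERSION=1.1.0

# Double quotes let the shell expand $VERSION before sed sees the script.
# -i.bak edits in place and keeps the original as "$doxyfile.bak".
sed -i.bak "s/\(PROJECT_NUMBER[ ]*=[ ]*\)[^ ]*/\1$VERSION/" "$doxyfile"

result=$(grep '^PROJECT_NUMBER' "$doxyfile")
untouched=$(grep '^PROJECT_NAME' "$doxyfile")
echo "$result"

rm -f "$doxyfile" "$doxyfile.bak"
```

In a Makefile recipe the same line would be written with $$VERSION.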

Related

sed regular expression does not work as expected. Differs on pipe and file

I have a string in a text file where I want to replace the version number. The quotation marks can be either ' or ". There may or may not be spaces around the =:
$data['MODULEXXX_VERSION'] = "1.0.0";
For testing I use
echo "_VERSION'] = \"1.1.1\"" | sed "s/\(_VERSION.*\)[1-9]\.[1-9]\.[1-9]/\11.1.2/"
which works perfectly.
When I change it to search in the file (the file has the same string):
sed "s/\(_VERSION.*\)[1-9]\.[1-9]\.[1-9]/\11.1.2/" -i test.php
, it does not find anything.
After playing with the search part of the regex, I found one more odd thing:
sed "s/\(_VERSION.*\)[1-9]\./\1***/" -i test.php
works and changes the string to $data['MODULEXXX_VERSION'] = "***0.0";, but
sed "s/\(_VERSION.*\)[1-9]\.[1-9]/\1***/" -i test.php
does not find anything anymore. Why?
I am using Ubuntu 17.04 desktop.
Can anyone explain what I am doing wrong? What would be the best command for replacing version numbers in the file for the string $data['MODULEXXX_VERSION'] = "***0.0";?
The main problem is that [1-9] doesn't match the 0s in the version number. You need to use [0-9].
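The effect is easy to reproduce on the line from the question. In this sketch the only change between the two commands is the character class; the replacement version 1.1.2 is the one used in the question.

```shell
# The line from the question, read through a quoted here-document
# so that the quotes and the $ stay literal.
line=$(cat <<'EOF'
$data['MODULEXXX_VERSION'] = "1.0.0";
EOF
)

# [1-9] cannot match the zeros in 1.0.0, so nothing is replaced.
with_1_9=$(printf '%s\n' "$line" | sed "s/\(_VERSION.*\)[1-9]\.[1-9]\.[1-9]/\11.1.2/")

# [0-9] matches every digit, so the version is replaced.
with_0_9=$(printf '%s\n' "$line" | sed "s/\(_VERSION.*\)[0-9]\.[0-9]\.[0-9]/\11.1.2/")

echo "$with_1_9"
echo "$with_0_9"
```

Note that [0-9] still matches only one digit per component; a version such as 1.0.10 would need [0-9]* or similar.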
Besides that, you may use the following sed command:
sed -r 's/(.*_VERSION['\''"]]\s*=\s*).*/\1"1.0.1";/' conf.php
This doesn't look at the current value, it simply replaces everything after the =.
I've used -r, which enables POSIX extended regular expressions and makes the pattern a bit simpler to write.
Another, probably cleaner approach is to store conf.php as a template, such as conf.php.tpl, and then use a template engine to render the file. Or, if you really want to use sed, the template may look like:
$data['FOO_VERSION'] = "FOO_VERSION_TPL";
Then just use:
sed 's/FOO_VERSION_TPL/1.0.1/' conf.php.tpl > conf.php
If there are multiple values to replace:
sed \
-e 's/FOO/BAR/' \
-e 's/HELLO/WORLD/' \
conf.php.tpl > conf.php
But I recommend a template engine instead of sed. That becomes more important when the content of the variables to replace may contain characters special to regular expressions.
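As a rough sketch of the sed-based template idea, assuming a template with two placeholders (FOO_NAME_TPL and the variable names are invented for the example):

```shell
# Hypothetical template file with two placeholders.
tpl=$(mktemp)
cat > "$tpl" <<'EOF'
$data['FOO_VERSION'] = "FOO_VERSION_TPL";
$data['FOO_NAME'] = "FOO_NAME_TPL";
EOF

version=1.0.1
name=demo

# One -e per placeholder; double quotes so the shell fills in the values.
rendered=$(sed \
  -e "s/FOO_VERSION_TPL/$version/" \
  -e "s/FOO_NAME_TPL/$name/" \
  "$tpl")

echo "$rendered"
rm -f "$tpl"
```

This only works cleanly while the values contain no /, & or backslash, which is the limitation the answer warns about.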

sed replace exact match

I want to change some names in a file using sed. This is what the file looks like:
#! /bin/bash
SAMPLE="sample_name"
FULLSAMPLE="full_sample_name"
...
Now I only want to change sample_name & not full_sample_name using sed
I tried this
sed s/\<sample_name\>/sample_01/g ...
I thought \< and \> could be used to find an exact match, but when I use this, nothing is changed.
Adding single quotes around the sed command made it change only sample_name. However, there is another problem now: my situation is a bit more complicated than explained above, since my sed command is embedded in a loop:
while read SAMPLE
do
name=$SAMPLE
sed -e 's/\<sample_name\>/$SAMPLE/g' /path/coverage.sh > path/new_coverage.sh
done < $1
So sample_name should be replaced with the value of $SAMPLE. However, when I run the command, sample_name is changed to the literal text $SAMPLE and not to its value.
I believe \< and \> work with GNU sed; you just need to quote the sed command:
sed -i.bak 's/\<sample_name\>/sample_01/g' file
In GNU sed, the following command works:
sed 's/\<sample_name\>/sample_01/' file
The only difference here is that I've enclosed the command in single quotes. Even when it is not necessary to quote a sed command, I see very little disadvantage to doing so (and it helps avoid these kinds of problems).
Another way of achieving what you want more portably is by adding the quotes to the pattern and replacement:
sed 's/"sample_name"/"sample_01"/' script.sh
Alternatively, the syntax you have proposed also works in GNU awk:
awk '{sub(/\<sample_name\>/, "sample_01")}1' file
If you want to use a variable in the replacement string, you will have to use double quotes instead of single, for example:
sed "s/\<sample_name\>/$var/" file
Variables are not expanded within single quotes, which is why you are getting the name of your variable rather than its contents.
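The quoting difference can be checked side by side. This is a sketch only: the temporary file stands in for coverage.sh, the value sample_01 is made up, and \< and \> assume GNU sed.

```shell
# Hypothetical stand-in for coverage.sh.
script=$(mktemp)
cat > "$script" <<'EOF'
SAMPLE="sample_name"
FULLSAMPLE="full_sample_name"
EOF

SAMPLE=sample_01

# Single quotes: the shell passes $SAMPLE to sed as literal text.
single=$(sed 's/\<sample_name\>/$SAMPLE/g' "$script")

# Double quotes: the shell expands $SAMPLE before sed runs.
double=$(sed "s/\<sample_name\>/$SAMPLE/g" "$script")

printf '%s\n---\n%s\n' "$single" "$double"
rm -f "$script"
```

In both runs full_sample_name is left alone, because the underscore before sample counts as a word character and \< does not match there.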
@user1987607
You can do it the following way:
sed "s/\<sample_name\>/sample_01/g" file
where \< and \> anchor the match to the start and end of a word (a GNU extension), and the quotes keep the shell from interpreting the backslashes and the > character.
/g is for global replacement.
If sample_name also occurs inside a longer word, such as ifsample_name, and you do not want to replace it there, the word boundaries take care of that: only the standalone word is replaced. In the same way, a pattern such as \<the\> replaces the word "the" in a text file but not the "the" inside words like thereby.
If you are interested in replacing only the first occurrence on each line, drop the g flag:
sed "s/\<sample_name\>/sample_01/" file
Hope it helps.

How can I use `sed` to replace the single quotes enclosing a directory with double quotes

What I want to achieve:
Suppose I have a file file with the following content:
ENV_VAR='/foo/`whoami`/bar/'
sh my_script.sh 'LOL'
I want to replace, using sed, the single quotes that surround the directory names, but not the ones that surround things that do not look like a directory, for example the arguments of a script.
That is, after running the sed command, I would expect the following output:
ENV_VAR="/foo/`whoami`/bar/"
sh my_script.sh 'LOL'
The idea is to make this happen without using tr to replace ' with ", or a blanket sed command like s/'/"/g, as I don't want to change the lines that do not look like directories.
Please note that sed is running on AIX, so no GNU sed is available.
What I have tried:
If I use sed like this:
sed "s;'=.*/.*';&;g" file
... the & variable holds the text matched by the regex, that is: ='/foo/`whoami`/bar/'. However, I can't figure out how to write the replacement so that the single quotes are turned into double quotes.
I wonder if there's a way to make this work using sed only, via a one-liner.
This will do the job:
/usr/bin/sed -e "/='.*\/.*'/ s/'/\"/g" file
Basically, you just want the plain ' => " replacement, but not for all lines, just for those that match the address pattern /='.*\/.*'/. And, in the s command, you just need to escape the ".
This should work:
sed "s/'\(.*\/.*\)'/\"\1\"/g"
Captures the part between ' and uses a backreference.
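Both answers can be tried on the sample file from the question. The sketch below writes that content to a temporary file and runs each command; only basic regular expression features are used, so it should not depend on GNU sed, although it has not been verified on AIX here.

```shell
# The sample file from the question, written through a quoted here-document
# so the backticks and quotes stay literal.
input=$(mktemp)
cat > "$input" <<'EOF'
ENV_VAR='/foo/`whoami`/bar/'
sh my_script.sh 'LOL'
EOF

# Address form: only on lines containing ='...' with a slash inside,
# replace every ' with ".
by_address=$(sed -e "/='.*\/.*'/ s/'/\"/g" "$input")

# Capture form: rewrite only a single-quoted span that contains a slash.
by_capture=$(sed "s/'\(.*\/.*\)'/\"\1\"/g" "$input")

echo "$by_address"
echo "$by_capture"
rm -f "$input"
```

The two differ on lines that mix a quoted path with other quoted text: the address form converts every quote on a matching line, while the capture form converts only the outermost pair around the slash.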

Understanding a sed example

I found a solution for extracting the password from a Mac OS X Keychain item. It uses sed to get the password from the security command:
security 2>&1 >/dev/null find-generic-password -ga $USER | \
sed -En '/^password: / s,^password: "(.*)"$,\1,p'
The code is here in a comment by 'sr105'. The part before the | evaluates to password: "secret". I'm trying to figure out exactly how the sed command works. Here are some thoughts:
I understand the flags -En, but what are the commas doing in this example? In the sed docs it says a comma separates an address range, but there's 3 commas.
The first 'address' /^password: / has a trailing s; in the docs s is only mentioned as the replace command like s/pattern/replacement/. Not the case here.
The ^password: "(.*)"$ part looks like the Regex for isolating secret, but it's not delimited.
I can understand the end part where the back-reference \1 is printed out, but again, what are the commas doing there??
Note that I'm not interested in an easier alternative to this sed example. This will only be part of a larger bash script which will include some more sed parsing in an .htaccess file, so I'd really like to learn the syntax even if it is obscure.
Thanks for your help!
Here is sed command:
sed -En '/^password: / s,^password: "(.*)"$,\1,p'
Commas are used as the regex delimiter; it could just as well be another delimiter, like #:
sed -En '/^password: / s#^password: "(.*)"$#\1#p'
/^password: / finds an input line that starts with password:
s#^password: "(.*)"$#\1#p finds and captures double-quoted string after password: and replaces the entire line with the captured string \1 ( so all that remains is the password )
First, the command extracts passwords from a file (or stream) and prints them to stdout.
While you "normally" might execute a sed command on all lines of a file, sed also lets you give a regex address that selects which lines the following command is applied to.
In your case
/^password: /
is a regex, saying that the command:
s,^password: "(.*)"$,\1,p
should be executed on every line that starts with password: , such as password: "secret". The command replaces each such line with the password itself; all other lines are suppressed, because -n turns off automatic printing and only the p flag prints.
The substitute command might look unusual, but you can choose the delimiter in a sed command; it is not limited to /. In this case , was chosen.
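To see the pieces working together without touching a real keychain, the sketch below feeds sed some simulated text. The keychain line is invented; only the password line follows the format quoted in the question.

```shell
# Simulated input; the real text would come from the security command.
sample_output=$(cat <<'EOF'
keychain: "/Users/demo/Library/Keychains/login.keychain"
password: "secret"
EOF
)

# /^password: / is the address, s,...,...,p is the command with "," as delimiter.
# -n suppresses automatic printing; p prints only lines where s succeeded.
with_commas=$(printf '%s\n' "$sample_output" |
  sed -En '/^password: / s,^password: "(.*)"$,\1,p')

# The same command written with the usual / delimiter.
with_slashes=$(printf '%s\n' "$sample_output" |
  sed -En '/^password: / s/^password: "(.*)"$/\1/p')

echo "$with_commas"
echo "$with_slashes"
```

Both variables end up holding just the text between the double quotes on the password line.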

Regular Expression to parse Common Name from Distinguished Name

I am attempting to parse (with sed) just First Last from the following DN(s) returned by the DSCL command in OSX terminal bash environment...
CN=First Last,OU=PCS,OU=guests,DC=domain,DC=edu
I have tried multiple regexes from this site and others with questions very close to what I wanted, mainly this question. I have tried following the advice to the best of my ability (I don't necessarily consider myself a newbie, but I am definitely a newbie to regex).
DSCL returns a list of DNs, and I would like to only have First Last printed to a text file. I have attempted using sed, but I can't seem to get the correct function. I am open to other commands to parse the output. Every line begins with CN= and then there is a comma between Last and OU=.
Thank you very much for your help!
I think all of the regular expression answers provided so far are buggy, insofar as they do not properly handle quoted ',' characters in the common name. For example, consider a distinguishedName like:
CN=Doe\, John,CN=Users,DC=example,DC=local
Better to use a real library able to parse the components of a distinguishedName. If you're looking for something quick on the command line, try piping your DN to a command like this:
echo "CN=Doe\, John,CN=Users,DC=activedir,DC=local" | python -c 'import ldap; import sys; print(ldap.dn.explode_dn(sys.stdin.read().strip(), notypes=1)[0])'
(depends on having the python-ldap library installed). You could cook up something similar with PHP's built-in ldap_explode_dn() function.
Two cut commands is probably the simplest (although not necessarily the best):
DSCL | cut -d, -f1 | cut -d= -f2
First, split the output from DSCL on commas and print the first field ("CN=First Last"); then split that on equal signs and print the second field.
Using sed:
sed 's/^CN=\([^,]*\).*/\1/' input_file
^ matches start of line
CN= literal string match
\([^,]*\) everything until a comma
.* rest
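The following sketch runs that command on the two DNs quoted in this thread, and next to it a variant that skips over backslash-escaped characters. The variant is not from any of the answers; it is an illustrative assumption that covers the escaped-comma case only, not everything a DN parser handles.

```shell
# Two sample DNs taken from the thread.
dns=$(cat <<'EOF'
CN=First Last,OU=PCS,OU=guests,DC=domain,DC=edu
CN=Doe\, John,CN=Users,DC=example,DC=local
EOF
)

# Command from the answer: capture everything up to the first comma.
simple=$(printf '%s\n' "$dns" | sed 's/^CN=\([^,]*\).*/\1/')

# Assumed variant: each step consumes either a character that is neither a
# comma nor a backslash, or a backslash together with the character after it.
escaped_aware=$(printf '%s\n' "$dns" | sed -E 's/^CN=(([^,\\]|\\.)*).*/\1/')

echo "$simple"
echo "$escaped_aware"
```

The first command cuts the second name off at the escaped comma, while the variant keeps it whole (still including the backslash).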
http://www.gnu.org/software/gawk/manual/gawk.html#Field-Separators
awk -v RS=',' -v FS='=' '$1=="CN"{print $2}' foo.txt
I like awk too, so I print the substring from the fourth character on:
DSCL | awk -F, '{print substr($1,4)}' > filterednames.txt
This regex will parse a distinguished name, giving name and val as capture groups for each match.
When DN strings contain commas, they are meant to be quoted. This regex correctly handles both quoted and unquoted strings, and also handles escaped quotes in quoted strings:
(?:^|,\s?)(?:(?<name>[A-Z]+)=(?<val>"(?:[^"]|"")+"|[^,]+))+
Here it is nicely formatted:
(?:^|,\s?)
(?:
(?<name>[A-Z]+)=
(?<val>"(?:[^"]|"")+"|[^,]+)
)+
Here's a link so you can see it in action:
https://regex101.com/r/zfZX3f/2
If you want a regex to get only the CN, then this adapted version will do it:
(?:^|,\s?)(?:CN=(?<val>"(?:[^"]|"")+"|[^,]+))