Regex group match using shell [duplicate]

This question already has answers here:
How do I use grep to extract a specific field value from lines
(2 answers)
Closed 3 years ago.
I am trying to match a pattern and set that as a variable.
I have a file with many key="value" pairs. I want to find the value for the key "fizz".
In the file I have this string
fizz="something_cool"
I try to parse it as:
cat file | grep fizz="(.*)"
I was thinking it would give me the group output, and then I would be able to use $1 to select it.
I also played around with escaping characters, and with sed and awk, but I could not manage to get it working.

You need to enable extended regex (ERE) to use unescaped ( and ), and you need to quote the pattern properly:
grep -E 'fizz="(.*)"' file
However, awk might be a better choice here, since it does both the search and the extraction in one command.
You may just use:
awk -F= '$1 == "fizz" {gsub(/"/, "", $2); print $2}' file
something_cool
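If you want to stay with grep and capture just the quoted value into a shell variable, a minimal sketch (assuming GNU grep, whose -P option enables PCRE and \K) would be:
fizz=$(grep -oP 'fizz="\K[^"]*' file)
echo "$fizz"    # prints: something_cool
Plain grep cannot print only a capture group; -o prints the whole match, so the \K trick (or the awk approach above) is what narrows the output to the value.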


Pass variable as expression for regex lookaround [duplicate]

This question already has answers here:
Difference between single and double quotes in Bash
(7 answers)
Closed 2 years ago.
I'm trying to write a shell script that extracts a string that occurs between two other strings using a regex lookaround (though please let me know if there's a better way).
The string I'm searching through is the path /gdrive/My Drive/Github/gbks/NC_004113.1.gbk (in reality I have several of these strings) and the part that I want to extract is the NC_004113.1 (or whatever is in its place in another similar string). In other words, the part that I want to extract will always be flanked by /gdrive/My Drive/Github/gbks/ and .gbk.
I'm playing around with how to do this, and I thought that a regex lookaround might work. To complicate things slightly, the string itself is stored in a variable. I started to try the following, just to see if it would run, which it did:
input_directory="/gdrive/My Drive/Github/gbks/"
echo "/gdrive/My Drive/Github/gbks/NC_004113.1.gbk" | grep -oP "$input_directory"/.*
However, when I tried to do the same thing with a lookaround, the command failed:
input_directory="/gdrive/My Drive/Github/gbks/"
echo "/gdrive/My Drive/Github/gbks/NC_004113.1.gbk" | grep -oP '(?<="$input_directory")'
As a sanity check, I tried to pass the string directly as the expression, but it only worked when I omitted the quotation marks like so:
input_directory="/gdrive/My Drive/Github/gbks/"
echo "/gdrive/My Drive/Github/gbks/NC_004113.1.gbk" | grep -oP '(?=/gdrive/My Drive/Github/gbks/)'
This line actually gave me the output that I wanted (though I need to modify it so I'm passing the string in as a variable):
echo "/gdrive/My Drive/Github/gbks/NC_004113.1.gbk" | grep -oP '(?<=/gdrive/My Drive/Github/gbks/).*(?=.gbk)'
Ultimately, I think the code should look something like:
input_directory="/gdrive/My Drive/Github/gbks/"
echo "/gdrive/My Drive/Github/gbks/NC_004113.1.gbk" | grep -oP '(?<="$input_directory").*(?=.gbk)'
Thanks in advance!
-Rob
In grep -oP '(?<="$input_directory")', the variable input_directory won't be expanded because of the outer single quotes. You can do something like
grep -oP '(?<='"$input_directory"')'
instead.
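Putting this together with the rest of the original pattern, a minimal sketch (same variable and path as in the question, assuming GNU grep with -P) would be:
input_directory="/gdrive/My Drive/Github/gbks/"
echo "/gdrive/My Drive/Github/gbks/NC_004113.1.gbk" | grep -oP '(?<='"$input_directory"').*(?=\.gbk)'
which should print NC_004113.1. Note that PCRE lookbehinds must have a fixed length, so this only works as long as $input_directory contains no variable-length regex constructs.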

Use grep to get next word after match [duplicate]

This question already has answers here:
Using grep to get the next WORD after a match in each line
(6 answers)
Closed 9 months ago.
I want to use grep to get a number from a JSON file
For example, I want to get the 1.0872 from this:
{"base":"EUR","date":"2016-03-01","rates":{"USD":1.0872}}
Using
grep "USD" rates
prints out the whole line:
{"base":"EUR","date":"2016-03-01","rates":{"USD":1.0872}}
I just want to display 1.0872.
I tried using a regex but it doesn't work (probably an error on my part since I've never done this before):
grep -oP '(?<="USD"\:)\w+' file
For "normal" integers and float values, you may use
grep -oP '(?<="USD":)\d+(?:\.\d+)?' file
If your numbers can have no integer part and can start with a ., use
grep -oP '(?<="USD":)\d*\.?\d+' file
To also allow an optional leading -:
grep -oP '(?<="USD":)-?\d*\.?\d+' file
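To capture the value into a shell variable, a small usage sketch (assuming the JSON sits in a file named rates, as in the question, and GNU grep with -P):
usd=$(grep -oP '(?<="USD":)-?\d*\.?\d+' rates)
echo "$usd"    # prints: 1.0872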

sed escape user input string [duplicate]

This question already has answers here:
Escape a string for a sed replace pattern
(17 answers)
Closed 7 years ago.
I am using sed for string replacement in a config file.
User has to input the string salt and then I replace this salt string in the config file:
Sample config file myconfig.conf
CONFIG_SALT_VALUE=SOME_DUMMY_VALUE
I use this command to replace the dummy value with the value of salt entered by the user:
sed -i "s/^CONFIG_SALT_VALUE.*/CONFIG_SALT_VALUE=$salt/g" ./myconfig.conf
Issue: the value of $salt can contain any character, so if $salt contains / (like 12d/dfs) then my sed command above breaks.
I can change the delimiter to !, but then again, if $salt contains amgh!fhf my sed command will break.
How should I approach this problem?
You can use almost any character as the sed delimiter. However, as you mention in your question, relying on changing it is fragile.
It may be more useful to use awk instead, doing a little bit of parsing of the line:
awk 'BEGIN{repl=ARGV[1]; ARGV[1]=""; FS=OFS="="}
$1 == "CONFIG_SALT_VALUE" {$2=repl}
1' "$salt" file
As a one-liner:
awk 'BEGIN{repl=ARGV[1]; ARGV[1]=""; FS=OFS="="} $1 == "CONFIG_SALT_VALUE" {$2=repl}1' "$salt" file
This sets = as the field separator. It then checks whether a line has CONFIG_SALT_VALUE as the parameter name, and when it does, it replaces the value with the one given.
To prevent values in $salt like foo\\bar from being interpreted, as that other guy commented on my original answer, we have this trick:
awk 'BEGIN{repl=ARGV[1]; ARGV[1]=""} ...' "$var" file
This uses the answer in How to use variable including special symbol in awk? where Ed Morton says that
The way to pass a shell variable to awk without backslashes being
interpreted is to pass it in the arg list instead of populating an awk
variable outside of the script.
and then
You need to set ARGV[1]="" after populating the awk variable to
avoid the shell variable value also being treated as a file name.
Unlike any other way of passing in a variable, ALL characters used in
a variable this way are treated literally with no "special" meaning.
This does not do in-place editing, but you can redirect to another file and then replace the original:
awk '...' file > tmp_file && mv tmp_file file
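An end-to-end sketch of the whole replacement (assuming the config file and variable names from the question):
salt='12d/dfs'
awk 'BEGIN{repl=ARGV[1]; ARGV[1]=""; FS=OFS="="}
     $1 == "CONFIG_SALT_VALUE" {$2=repl}
     1' "$salt" ./myconfig.conf > myconfig.conf.tmp && mv myconfig.conf.tmp ./myconfig.conf
After this, the line reads CONFIG_SALT_VALUE=12d/dfs, even if $salt contains /, !, backslashes, or other characters that would trip up sed.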

Replace string variable with string variable using Sed [duplicate]

This question already has answers here:
"sed" special characters handling
(3 answers)
Is it possible to escape regex metacharacters reliably with sed
(4 answers)
Escape a string for a sed replace pattern
(17 answers)
Closed 5 years ago.
I have a file called ethernet containing multiple lines. I have saved one of these lines as a variable called old_line. The contents of this variable looks like this:
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="2r:11:89:89:9g:ah", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"
I have created a second variable called new_line that is similar to old_line but with some modifications in the text.
I want to substitute the contents of old_line with the contents of new_line using sed. So far I have the following, but it doesn't work:
sed -i "s/${old_line}/${new_line}/g" ethernet
You need to escape old_line so that it contains no unescaped regex special characters; luckily, this can be done with sed itself.
# escape regex metacharacters in old_line so it can be used safely as the sed pattern
old_line=$(echo "${old_line}" | sed -e 's/[]$.*[\^]/\\&/g')
sed -i -e "s/${old_line}/${new_line}/g" ethernet
Since ${old_line} contains regex metacharacters like *, ?, etc., your sed command is failing.
Use this awk command instead, which uses no regex at all:
awk -v old="$old_line" -v new="$new_line" 'p=index($0, old) {
  $0 = substr($0, 1, p-1) new substr($0, p+length(old)) } 1' ethernet
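Standard awk has no in-place option like sed -i (GNU awk 4.1+ offers -i inplace), so a usage sketch for writing the result back (assuming old_line and new_line are already set) is to redirect to a temporary file and move it over the original:
awk -v old="$old_line" -v new="$new_line" 'p=index($0, old) {
  $0 = substr($0, 1, p-1) new substr($0, p+length(old)) } 1' ethernet > ethernet.tmp && mv ethernet.tmp ethernet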

Matching pattern containing parentheses with sed [duplicate]

This question already has answers here:
Whether to escape ( and ) in regex using GNU sed
(4 answers)
Closed 4 years ago.
I need to insert '--' at the beginning of a line if the line contains the word VARCHAR(1000).
A sample of my file is:
TRIM(CAST("AP_RQ_MSG_TYPE_ID" AS NVARCHAR(1000))) AP_RQ_MSG_TYPE_ID,
TRIM(CAST("AP_RQ_PROCESSING_CD" AS NVARCHAR(1000)))
AP_RQ_PROCESSING_CD, TRIM(CAST("AP_RQ_ACQ_INST_ID" AS NVARCHAR(11)))
AP_RQ_ACQ_INST_ID, TRIM(CAST("AP_RQ_LOCAL_TXN_TIME" AS NVARCHAR(10)))
AP_RQ_LOCAL_TXN_TIME, TRIM(CAST("AP_RQ_LOCAL_TXN_DATE" AS
NVARCHAR(10))) AP_RQ_LOCAL_TXN_DATE, TRIM(CAST("AP_RQ_RETAILER" AS
NVARCHAR(11))) AP_RQ_RETAILER,
I used this command
sed 's/\(^.*VARCHAR\(1000\).*$\)/--\1/I' *.sql
But the result is not as expected.
Does anyone have an idea what I am doing wrong?
This should do it:
sed 's/.*VARCHAR(1000).*/--&/' file
The problem with your sed command is in the regex. By default sed uses BRE, which means the ( and ) wrapping the 1000 are just literal parentheses; you should not escape them, or you give them special meaning: regex grouping.
The first and last \(..\) you escaped correctly, since you want to reference the group later with \1. So your problem comes down to what to escape and what not to escape. :)
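For reference, your original command corrected along these lines, keeping the capture group and case-insensitive matching, would look something like this (a sketch assuming GNU sed, since the I flag on the s command is a GNU extension):
sed 's/^\(.*VARCHAR(1000).*$\)/--\1/I' *.sql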
Use the following sed command:
sed '/VARCHAR(1000)/ s/.*/--\0/' *.sql
The s command applies to all lines containing VARCHAR(1000). It then replaces the whole line (.*) with -- followed by the line itself (\0; the portable equivalent is &).
Through awk,
awk '/VARCHAR\(1000\)/ {sub (/^/,"--")}1' infile > outfile