How can I make my regular expression match variables in between brackets on single and multiple lines? - regex

I am trying to match all environment variables that get de-structured from process.env in my TypeScript project. This includes de-structuring written on a single line or across multiple lines.
Take the following TS code examples that de-structure from process.env; one is on a single line, the other spans multiple lines:
File1.ts:
const { OKTA_AUDIENCE_DOMAIN, OKTA_DOMAIN = '', OKTA_WEB_CLIENT_ID, OKTA_ISSUER: issuer = '', AUTH_TWO: ISSUE_TWO = '', } = process.env;
File2.ts:
const {
OKTA_AUDIENCE_DOMAIN,
OKTA_DOMAIN = '',
OKTA_WEB_CLIENT_ID,
OKTA_ISSUER: issuer = '',
AUTH_TWO: ISSUE_TWO = '',
} = process.env;
I have only been able to write a script that matches the de-structured variables on a single line. However, I want the script to match both single-line and multi-line cases.
This is the script I currently have:
grep -Ezo '\{[^}]*\} = process.env' File1.ts
If run on File1.ts, I get the following output, which is what I expect:
{ OKTA_AUDIENCE_DOMAIN, OKTA_DOMAIN = '', OKTA_WEB_CLIENT_ID, OKTA_ISSUER: issuer = '', AUTH_TWO: ISSUE_TWO = '', } = process.env
You can see here that it has correctly matched the environment variables being de-structured from process.env
Now if I were to run this same script on File2.ts, it would return nothing:
grep -Ezo '\{[^}]*\} = process.env' File2.ts
The output is empty.
How can I modify this script such that it matches the environment variables being de-structured on both single and multiple lines?

Try this:
cat File1.ts | tr '\n' ' ' | sed 's/  */ /g' | grep -Ezo '\{[^}]*\} = process.env'
cat File2.ts | tr '\n' ' ' | sed 's/  */ /g' | grep -Ezo '\{[^}]*\} = process.env'
Notes:
tr changes newlines to spaces (change to tr '\n\r' ' ' if you are on Windows)
sed gets rid of multiple spaces
grep has your original parameters

If you are okay with installing ripgrep:
rg -oU '\{[^}]*\} = process\.env'
The -U option enables multiline matching.
That said, grep -oz '{[^}]*} = process\.env' works for me for your given input samples. Also, note that with grep -z the output will be terminated with ASCII NUL characters. So, you'll need to further process the output, for example pipe to tr '\0' '\n'
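A minimal sketch of the grep -z approach, assuming GNU grep (the -z flag is not POSIX) and a hypothetical temporary file standing in for File2.ts. The whitespace classes around = are an addition, not part of the original pattern; they only make the match tolerant of varying spacing.

```shell
# Hypothetical sample reproducing the multi-line case from the question.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
const {
  OKTA_DOMAIN = '',
  OKTA_ISSUER: issuer,
} = process.env;
EOF

# -z: treat the whole file as one NUL-terminated record, so [^}]* can cross newlines.
# -o: print only the matched text; tr turns the trailing NUL into a newline.
match=$(grep -zoE '\{[^}]*\}[[:space:]]*=[[:space:]]*process\.env' "$tmp" | tr '\0' '\n')
printf '%s\n' "$match"
rm -f "$tmp"
```

The same command should work unchanged on the single-line form, since the bracket expression [^}] matches newlines and ordinary characters alike.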

Related

Grep pattern with Multiple lines with AND operation

How can I determine whether a pattern exists across multiple lines with grep? Below is the multi-line pattern I need to check for in the file:
Status: True
Type: Master
I tried the command below. It can check for multiple strings on a single line, but it fails when the strings are on separate lines:
if cat file.txt | grep -P '^(?=.*Status:.*True)(?=.*Type:.*Master)'; then echo "Present"; else echo "NOT FOUND"; fi
file.txt
Interface: vlan
Status: True
Type: Master
ID: 104
Using gnu-grep you can do this:
grep -zoP '(?m)^\s*Status:\s+True\s+Type:\s+Master\s*' file
Status: True
Type: Master
Explanation:
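-P: Enables PCRE regex mode
-z: Treats the input as NUL-separated, so the whole file is read as a single record and a match can span lines
-o: Prints only the matched text
(?m): Enables MULTILINE mode so that ^ matches at the start of each line
^: Matches the start of a line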
With your shown samples, please try following awk program. Written and tested in GNU awk.
awk -v RS='(^|\n)Status:[[:space:]]+True\nType:[[:space:]]+Master' '
RT{
sub(/^\n/,"",RT)
print RT
}
' Input_file
Explanation: Set RS (awk's record separator) to the regex (^|\n)Status:[[:space:]]+True\nType:[[:space:]]+Master. In the main block, check whether RT (the text that matched RS) is non-empty; if so, strip the leading newline from RT and print it, which gives the output the OP expects.
I did it as follows:
grep -A 1 "^.*Status:.*True" test.txt | grep -B 1 "^Type:.*Master"
The -A x means "also show the x lines After the found one".
The -B y means "also show the y lines Before the found one".
So: show the "Status" line together with the next one (the "Type" one), and then show the "Type" line together with the previous one (the "Status" one).
You could also keep track of the previous line by setting prev = $0 at the end of every line's processing, and then use a pattern that tests both the previous and the current line.
awk '
prev ~ /^[[:space:]]*Status:[[:space:]]*True$/ && $0 ~ /^[[:space:]]*Type:[[:space:]]*Master$/{
printf "%s\n%s\n", prev, $0
}
{prev = $0}
' file.txt
Output
Status: True
Type: Master
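Since the question wanted a Present / NOT FOUND check rather than the matched text, the previous-line idea can also drive an if statement through awk's exit status. This is only a sketch: has_master is a hypothetical helper name, the sample file is recreated in a temp file, and [ \t] is used in place of [[:space:]] on the assumption that some older awks lack POSIX character classes.

```shell
# Hypothetical sample matching file.txt from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Interface: vlan
Status: True
Type: Master
ID: 104
EOF

# Exit status 0 only if a "Status: True" line is immediately
# followed by a "Type: Master" line.
has_master() {
  awk '
    prev ~ /^[ \t]*Status:[ \t]*True[ \t]*$/ &&
    $0   ~ /^[ \t]*Type:[ \t]*Master[ \t]*$/ { found = 1; exit }
    { prev = $0 }
    END { exit !found }
  ' "$1"
}

if has_master "$sample"; then result="Present"; else result="NOT FOUND"; fi
echo "$result"
rm -f "$sample"
```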

How to use regex in shell file (.sh) to capture '\' and newline (linefeed)?

I am trying to capture '\' and newline in a shell file (.sh).
I've tried in the site: https://regexr.com/ and it works.
But it seems the way is not the same as in the shell file.
Here is the target, and I want to get those three match groups:
some dummy code in front of
blablabla
CE3( Match_Group_1, \(some space may right after this backslash)
Match_Group_2, \(some space may right after this backslash)
Match_Group_3, \(some space may right after this backslash)
abcabc1234, \(some space may right after this backslash)
abcd12345 )
blablabla
blablabla
My regex in https://regexr.com/:
'\s*' can capture space, tab and newline. Get those match groups by (\w+)
\s*\(\s*(\w+)\s*,\s*\\\s*(\w+)\s*,\s*\\\s*(\w+)
My regex in shell file for match then print: it failed to get those three match groups
awk_cmd="awk 'match(\$0, /(${i})\\s*\(\\s*(\\w+)\\s*,\\s*\\\\s*(\\w+)\\s*,\\s*\\\\s*(\\w+)/, g) {print FILENAME \",\" NR \",\" g[1] \",\" g[3] \",\" g[4]}'"
Could anyone help me
So many thanks
Is this what you're trying to do?
$ awk_cmd() {
awk -v RS='^$' -v OFS='","' '
match($0,/\s*\(\s*(\w+)\s*,\s*\\\s*(\w+)\s*,\s*\\\s*(\w+)/,g) {
print "\"" FILENAME, NR, g[1], g[2], g[3] "\""
}
' "$@"
}
$ awk_cmd file
"file","1","Match_Group_1","Match_Group_2","Match_Group_3"
$ cat file | awk_cmd
"-","1","Match_Group_1","Match_Group_2","Match_Group_3"
Since your regexp has to span multiple lines it's not clear what value you expect NR to have. In the above I'm treating the whole input file as a single record so NR will always just be 1. If you're trying to print the line number where the string that matches the regexp starts that'd be:
$ awk_cmd() {
awk -v RS='^$' -v OFS='","' '
match($0,/(.*)\s*\(\s*(\w+)\s*,\s*\\\s*(\w+)\s*,\s*\\\s*(\w+)/,g) {
nr = gsub(/\n/,"&",g[1]) + 1
print "\"" FILENAME, nr, g[2], g[3], g[4] "\""
}
' "$@"
}
$ awk_cmd file
"file","3","Match_Group_1","Match_Group_2","Match_Group_3"
The above uses GNU awk for multi-char RS and the 3rd arg to match() and \s and \w shorthand for [[:space:]] and [[:alnum:]_].
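For awks without those GNU extensions, the line-number part can be approximated with POSIX features only. The sketch below is an assumption-laden illustration, not a port of the answer: it slurps the input into a variable (buf, a name chosen here), finds where CE3( starts with match() and RSTART, and counts the newlines before it. It does not extract the three groups.

```shell
# Hypothetical sample input modelled on the question's target text.
src=$(mktemp)
cat > "$src" <<'EOF'
some dummy code in front of
blablabla
CE3( Match_Group_1, \
     Match_Group_2, \
     Match_Group_3, \
     abcabc1234, \
     abcd12345 )
blablabla
EOF

start_line=$(awk '
  { buf = buf $0 "\n" }                        # collect the whole input
  END {
    if (match(buf, /CE3[ \t]*\(/)) {           # RSTART = offset of the match
      head = substr(buf, 1, RSTART - 1)        # everything before the match
      print gsub(/\n/, "&", head) + 1          # newlines before it, plus one
    }
  }
' "$src")
echo "$start_line"
rm -f "$src"
```

With this sample the call starts on the third line, which agrees with the line number reported by the GNU awk version.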
This awk solution is relatively more portable, since it doesn't require gawk's custom extension to the match() function:
match(s, r [, a] )
— The optional array where gawk stores individual match groups on your behalf.
——————————————————————————————————————————————————
------------
| input :: |
------------
some dummy code in front of
blablabla
CE3( Match_Group_1, \ $ # The dollar signs <[ $ ]>
Match_Group_2, \ $ # denote location of true
Match_Group_3, \ $ # line ending in order to
abcabc1234, \ $ # showcase trailing spaces
abcd12345 ) # after backslash <[ \\ ]>
# but b4 actual newline <[ \n ]>
blablabla
blablabla
——————————————————————————————————————————————————
------------
| command :: |
------------
[mg]awk '
BEGIN {
__ = sprintf("%.*s%.*s",!+(\
RS = "^$" ),
OFS = \
(_ = "\"")(",")(_),+_,
FS = "[,]"(_="[ \t\v\f\r]*")"[\\\\]"(_)(\
__ = "["(ORS)"]")(_)"|^.*"(__)(_)"[^(]+[(]"(_))
}
FNR < NR { # This just to showcase it can handle
# spaces, unicode, and emojis
# inside the match group
gsub("Match[_]Group[_]","Match Group \360\237\244\241 ")
}
$!NF=($(!(NF-=($!!—-NF= sprintf("%s%s:%.f%s%.f", __,
FILENAME,
FNR,OFS,NR))^_))) __'
——————————————————————————————————————————————————
------------
| output :: |
------------
"testbackslash_0002.txt:1","1","Match_Group_1","Match_Group_2","Match_Group_3"
"testbackslash_0001.txt:1","2","Match Group 🤡 1","Match Group 🤡 2","Match Group 🤡 3"

Linux Replace With Variable Containing Double Quotes

I have read the following:
How Do I Use Variables In A Sed Command
How can I use variables when doing a sed?
Sed replace variable in double quotes
I have learned that I can use sed "s/STRING/$var1/g" to replace a string with the contents of a variable. However, I'm having a hard time finding out how to replace with a variable that contains double quotes, brackets and exclamation marks.
Then, hoping to escape the quotes, I tried piping my result through sed 's/\"/\\\"/g', which gave me another error: sed: -e expression #1, char 7: unknown command: `E'. I was hoping to escape the problematic characters and then do the variable replacement: sed "s/STRING/$var1/g". But I couldn't get that far either.
I figured you guys might know a better way to replace a string with a variable that contains quotes.
File1.txt:
Example test here
<tag>"Hello! [world]" This line sucks!</tag>
End example file
Variable:
var1=$(cat file1.txt)
Example:
echo "STRING" | sed "s/STRING/$var1/g"
Desired output:
Example test here
<tag>"Hello! [world]" This line sucks!</tag>
End example file
using awk
$ echo "STRING" | awk -v var="$var1" '{ gsub(/STRING/,var,$0); print $0}'
Example test here
<tag>"Hello! [world]" This line sucks!</tag>
End example file
-v var="$var1": To use shell variable in awk
gsub(/STRING/,var,$0): To globally substitute all occurrences of "STRING" in the whole record $0 with var
Special case: if var contains an &, say at the beginning of a line, it will cause problems with gsub, because & has a special meaning in the replacement and refers to the matched text.
To deal with this, escape & as follows:
$ echo "STRING" | awk -v var="$var1" '{ gsub(/&/,"\\\\&",var); gsub(/STRING/,var,$0);print $0}'
&Example test here
<tag>"Hello! [world]" This line sucks!</tag>
End example file
The problem isn't the quotes. You're missing the "s" command, leading sed to treat /STRING/ as a line address, and the value of $var1 as a command to execute on matching lines. Also, $var1 has unescaped newlines and a / character that'll cause trouble in the substitution. So add the "s", and escape the relevant characters in $var1:
var1escaped="$(echo "$var1" | sed 's#[\/&]#\\&#g; $ !s/$/\\/')"
echo "STRING" | sed "s/STRING/$var1escaped/"
...but realistically, @batMan's answer (using awk) is probably a better solution.
Here is an awk command that reads the replacement text from a file, which may contain all kinds of special characters such as & or \:
awk -v pat="STRING" 'ARGV[1] == FILENAME {
# read replacement text from first file in arguments
a = (a == "" ? "" : a RS) $0
next
}
{
# now run a loop using index function and use substr to get the replacements
s = ""
while( p = index($0, pat) ) {
s = s substr($0, 1, p-1) a
$0 = substr($0, p+length(pat))
}
$0 = s $0
} 1' File1.txt <(echo "STRING")
To handle all kinds of special characters properly, this command avoids any regex-based functions. It uses plain-text functions such as index and substr instead.
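When the replacement already lives in a shell variable rather than a file, the same index/substr idea can be combined with ENVIRON. This is a sketch under stated assumptions: REPL and repl are names invented here, and ENVIRON is used because awk -v interprets backslash escapes in the assigned value, whereas environment values are passed through untouched.

```shell
# Replacement text with characters that trouble sed and gsub: &, \, /, quotes, brackets.
repl='<tag>"a & b\n" [x]</tag>'

out=$(printf '%s\n' 'before STRING after' |
  REPL="$repl" awk -v pat='STRING' '
    {
      done = ""
      # Literal (non-regex) search and replace using index and substr.
      while ((p = index($0, pat)) > 0) {
        done = done substr($0, 1, p - 1) ENVIRON["REPL"]
        $0 = substr($0, p + length(pat))
      }
      print done $0
    }')
printf '%s\n' "$out"
```

Because neither the search nor the replacement goes through a regex or a sed/gsub replacement string, & and \ in the variable need no escaping.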

Sed substitution with single-quote (apostrophe) characters

I'm trying to implement the following substitution
sed -i 's/$config['default_host'] = '';/$config['default_host'] = 'localhost';/' /etc/roundcube/config.inc.php
but it's not working.
What i want to do to is replace $config['default_host'] = ''; with $config['default_host'] = 'localhost'; inside the file /etc/roundcube/config.inc.php
Any ideas?
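You need to escape the special characters: in a sed regex, $ is the end-of-line anchor and [ ] delimit a bracket expression, and inside double quotes the shell would also try to expand $config. Using double quotes around the script lets the single quotes appear literally: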
sed "s/\$config\['default_host'\] = '';/\$config['default_host'] = 'localhost';/" fileName
Using a capture group:
sed "s/\(\$config\['default_host'\] = \)'';/\1'localhost';/" fileName
Output:
$config['default_host'] = 'localhost';
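To try the grouped command without touching the real config, it can be run against a throwaway file first. In this sketch the temp file is a hypothetical stand-in for /etc/roundcube/config.inc.php, and -i is deliberately left out so nothing is edited in place.

```shell
# Hypothetical stand-in for the real config file.
conf=$(mktemp)
printf '%s\n' "\$config['default_host'] = '';" > "$conf"

# Double-quoted sed script: \$ keeps the shell from expanding $config,
# \[ and \] make the brackets literal, and \1 re-inserts the captured prefix.
result=$(sed "s/\(\$config\['default_host'\] = \)'';/\1'localhost';/" "$conf")
printf '%s\n' "$result"
rm -f "$conf"
```

Once the output looks right, the same script can be applied with -i to the real file.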

Matching pattern including newline with OS X (BSD) sed

I want to match the following patterns using sed on OSX:
test = {
and
test =
{
I tried a lot of different things, including the line below but I can't figure out why it doesn't work:
sed -n -E -e "s/(^[a-zA-Z_]*[ ]*=[ "'\'$'\n'"]*{.*)/var \1/p" $path
I used the extquote $'\n' to match newline and included a backslash in front of it as I read on many posts on the internet. If I don't use any ( or [ groups it does work but I want to use the groups. I keep getting the following error:
sed: 1: "s/(^[a-zA-Z_]*[ ]*=[ \
...": unbalanced brackets ([])
Can anyone help me? I'm getting quite desperate.
This should work depending on what your exact data can be
sed '/[[:alpha:]]* =/{/ *{/!N;//s/^/var /;}' file
Input
test = {
blah
test =
{
wut
Output
var test = {
blah
var test =
{
wut