Modify bash variables with sed - regex

I am trying to modify a number of environmental variables containing predefined compiler flags. To do so, I tried using a bash loop that goes over all environmental variables listed with "env".
for i in $(env | grep ipo | awk 'BEGIN {FS="="} ; { print $1 } ' )
do echo $(sed -e "s/-ipo/ / ; s/-axAVX/ /" <<< $i)
done
This is not working since the loop variable $i contains just the name of the environmental variable stored as a character string. I tried searching for a method to convert a string into a variable, but things started becoming unnecessarily complicated. The basic problem is how to properly supply the environmental variable itself to sed.
Any ideas how to properly modify my script are welcome.
Thanks,
Alex

Part I
The way you're parsing env is wrong. It breaks whenever you have spaces or wildcards. Instead use this:
while IFS= read -r line; do
# do stuff with variable line
done < <(env)
To see why your solution is broken, do:
for i in $(env); do
echo "$i"
done
and you'll very likely see a difference with the output of env.
Now the while method I gave will also break when you have newlines in your variables.
Very likely your env version has the flag -0 or --null. Use it to be 100% safe:
while IFS= read -r -d '' line; do
# do stuff with variable line
done < <(env --null)
Part II
When you have read your line, you want to split it into a key and a value. Don't use awk for that. Use Bash:
key=${line%%=*}
value=${line#*=}
Look:
while IFS= read -r -d '' line; do
key=${line%%=*}
value=${line#*=}
echo "key: $key"
echo "value: $value"
done < <(env --null)
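To see why %% and # are the right operators for the split, consider a value that itself contains an = sign. This is only a small sketch; the sample line is made up for illustration:

```bash
# Hypothetical environment entry whose value contains '=' itself.
line='CFLAGS=-O2 -DNAME=value'

key=${line%%=*}    # remove the longest suffix matching '=*': keeps what precedes the first '='
value=${line#*=}   # remove the shortest prefix matching '*=': keeps what follows the first '='

printf 'key: %s\nvalue: %s\n' "$key" "$value"
```

Using ${line%=*} or ${line##*=} instead would split at the last = and cut the value in two.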
Part III
Now I understand that you want to act only on the variables that contain the string ipo, and for these you want to substitute a space for the first occurrence of the strings -ipo and -axAVX. So:
while IFS= read -r -d '' line; do
key=${line%%=*}
value=${line#*=}
[[ $value = *ipo* ]] || continue
value=${value/-ipo/ }
value=${value/-axAVX/ }
echo "key: $key"
echo "new value: $value"
done < <(env --null)
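Note that ${value/-ipo/ } replaces only the first match, like sed's s/-ipo/ / without the g flag. If a flag might appear more than once, the double-slash form replaces every match. A minimal sketch with a made-up value:

```bash
# Made-up value in which -ipo appears twice.
value='-O2 -ipo -g -ipo'

first_only=${value/-ipo/ }    # single slash: only the first match is replaced
all=${value//-ipo/ }          # double slash: every match is replaced

printf '[%s]\n[%s]\n' "$first_only" "$all"
```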
Part IV
You want to replace the variable with this new value. You can use declare for this. (You don't need the export builtin, since your variable is already marked as exported):
while IFS= read -r -d '' line; do
key=${line%%=*}
value=${line#*=}
[[ $value = *ipo* ]] || continue
value=${value/-ipo/ }
value=${value/-axAVX/ }
declare "$key=$value"
done < <(env --null)
Part V
Finally, you'll try to put this in a script and you'll realize that it doesn't work: that's because a script is executed in a child process, and any changes made in a child process are not seen by the parent process. So you'll want to source it! To source a file named file, use:
. file
(yes, a dot, a space and the name of the file).
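The difference between running and sourcing can be checked with a throwaway file. The following is a sketch only; DEMO_FLAGS and the temporary file are names invented for the demonstration:

```bash
# Write a one-line "script" that changes a variable.
tmp=$(mktemp)
printf '%s\n' 'DEMO_FLAGS=changed' > "$tmp"

export DEMO_FLAGS=original

sh "$tmp"                  # executed in a child process
after_child=$DEMO_FLAGS    # the parent's copy is untouched

. "$tmp"                   # sourced: executed in the current shell
after_source=$DEMO_FLAGS   # now the change is visible

rm -f "$tmp"
printf 'after running:  %s\nafter sourcing: %s\n' "$after_child" "$after_source"
```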

Try with indirect expansion:
for i in $(env | grep ipo | awk 'BEGIN {FS="="} ; { print $1 } ' )
do
echo $(sed -e "s/-ipo/ / ; s/-axAVX/ /" <<< ${!i})
done
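As a small illustration of what indirect expansion does (bash only; IPOVAR and name are example names, not anything special):

```bash
# IPOVAR and name are illustrative; any variable name works.
IPOVAR='blah -ipo -axAVX end'
name=IPOVAR

direct=$name        # the string "IPOVAR"
indirect=${!name}   # the value of the variable called IPOVAR

printf '%s\n%s\n' "$direct" "$indirect"
```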

I think the bit you are missing is the ${!i} to expand the variable called whatever $i is set to. Both ${!i} and <<< are bash features, so the script needs bash rather than plain sh:
#!/bin/bash
for i in $(env | grep ipo | awk 'BEGIN {FS="="} ; { print $1 }' )
do
val=$(sed -e "s/-ipo/ / ; s/-axAVX/ /" <<< "${!i}")
export "$i=$val"
echo ${i} is now set to $val
done
... do stuff with new env variables
If you run the script, it will change the environment variable for itself and anything it spawns.
When it returns, however, you will still have the same environment you started with.
$ echo $IPOVAR
blah -ipo -axAVX end # variable starts as this
$ bash env.sh
IPOVAR is now set to blah end # It is changed!
$ echo $IPOVAR
blah -ipo -axAVX end # It's still the same.

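I believe you can do it all in awk (OFS has to be set as well, otherwise modifying $2 makes awk rebuild the line with a space in place of the =):
env | grep ipo | awk 'BEGIN { FS = OFS = "=" } { gsub("-ipo","",$2); gsub("-axAVX","",$2); print $0 }'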

Related

Find regular expression in a file matching a given value

I have some basic knowledge on using regular expressions with grep (bash).
But I want to use regular expressions the other way around.
For example I have a file containing the following entries:
line_one=[0-3]
line_two=[4-6]
line_three=[7-9]
Now I want to use bash to figure out to which line a particular number matches.
For example:
grep 8 file
should return:
line_three=[7-9]
Note: I am aware that the example of "grep 8 file" doesn't make sense, but I hope it helps to understand what I am trying to achieve.
Thanks for your help,
Marcel
As others have pointed out, awk is the right tool for this:
awk -F'=' '8~$2{print $0;}' file
... and if you want this tool to feel more like grep, a quick bash wrapper:
#!/bin/bash
awk -F'=' -v seek_value="$1" 'seek_value~$2{print $0;}' "$2"
Which would run like:
./not_exactly_grep.sh 8 file
line_three=[7-9]
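One caveat with this reversed match: the regex is not anchored, so it may match any part of the value. The sketch below, which reuses the sample data from the question and a temporary file invented for the demo, contrasts the unanchored test with an anchored variant:

```bash
# Sample data taken from the question.
data=$(mktemp)
printf '%s\n' 'line_one=[0-3]' 'line_two=[4-6]' 'line_three=[7-9]' > "$data"

# Unanchored: the regex may match any part of the value, so 18 matches twice
# (the 1 matches [0-3] and the 8 matches [7-9]).
loose=$(awk -F'=' -v seek_value=18 'seek_value ~ $2' "$data")

# Anchored: the whole value has to match the regex, so 18 matches nothing.
strict=$(awk -F'=' -v seek_value=18 'seek_value ~ ("^" $2 "$")' "$data")

rm -f "$data"
printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
```

Whether anchoring is wanted depends on what the stored patterns are meant to express, which the question does not say.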
My first impression is that this is not a task for grep, maybe for awk.
Trying to do things with grep I only see this:
for line in $(cat file); do echo 8 | grep "${line#*=}" && echo "${line%=*}" ; done
Using while for file reading (following comments):
while IFS= read -r line; do echo 8 | grep "${line#*=}" && echo "${line%=*}" ; done < file
This can be done in native bash using the syntax [[ $value =~ $regex ]] to test:
find_regex_matching() {
local value=$1
while IFS= read -r line; do # read from input line-by-line
[[ $line = *=* ]] || continue # skip lines not containing an =
regex=${line#*=} # prune everything before the = for the regex
if [[ $value =~ $regex ]]; then # test whether we match...
printf '%s\n' "$line" # ...and print if we do.
fi
done
}
...used as:
find_regex_matching 8 <file
...or, to test it with your sample input inline:
find_regex_matching 8 <<'EOF'
line_one=[0-3]
line_two=[4-6]
line_three=[7-9]
EOF
...which properly emits:
line_three=[7-9]
You could replace printf '%s\n' "$line" with printf '%s\n' "${line%%=*}" to print only the key (contents before the =), if so inclined. See the bash-hackers page on parameter expansion for a rundown on the syntax involved.
This is not built-in functionality of grep, but it's easy to do with awk, with a change in syntax:
/[0-3]/ { print "line one" }
/[4-6]/ { print "line two" }
/[7-9]/ { print "line three" }
If you really need to, you could programmatically change your input file to this syntax, if it doesn't contain any characters that need escaping (mainly / in the regex or " in the string):
sed -e 's#\(.*\)=\(.*\)#/\2/ { print "\1" }#'
As I understand it, you are looking for a range that includes some value.
You can do this in gawk:
$ cat /tmp/file
line_one=[0-3]
line_two=[4-6]
line_three=[7-9]
$ awk -v n=8 'match($0, /([0-9]+)-([0-9]+)/, a){ if (a[1]<=n && a[2]>=n) print $0 }' /tmp/file
line_three=[7-9]
Since the digits are being treated as numbers (vs a regex) it supports larger ranges:
$ cat /tmp/file
line_one=[0-3]
line_two=[4-6]
line_three=[75-95]
line_four=[55-105]
$ awk -v n=92 'match($0, /([0-9]+)-([0-9]+)/, a){ if (a[1]<=n && a[2]>=n) print $0 }' /tmp/file
line_three=[75-95]
line_four=[55-105]
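The three-argument match() used above is a gawk extension. If gawk is not available, the same numeric range check can be sketched in plain bash. The function name find_range_containing is made up here, and the check treats both bounds as inclusive, which is an assumption about the intended meaning:

```bash
find_range_containing() {
    local n=$1 line lo hi
    local re='\[([0-9]+)-([0-9]+)\]'   # captures the two bounds inside [lo-hi]
    while IFS= read -r line; do
        [[ $line =~ $re ]] || continue
        lo=${BASH_REMATCH[1]}
        hi=${BASH_REMATCH[2]}
        # 10# forces base 10, so bounds such as 08 are not parsed as octal.
        if (( 10#$lo <= 10#$n && 10#$n <= 10#$hi )); then
            printf '%s\n' "$line"
        fi
    done
}

find_range_containing 92 <<'EOF'
line_one=[0-3]
line_two=[4-6]
line_three=[75-95]
line_four=[55-105]
EOF
```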
If you are just looking to interpret the right hand side of the = as a regex, you can do:
$ awk -F= -v tgt=8 'tgt~$2' /tmp/file
You would like to do something like
grep -Ef <(cut -d= -f2 file) <(echo 8)
This will grep what you want but will not display where.
With sed you can show some message:
echo "8" | sed -n '/[7-9]/ s/.*/Found it in line_three/p'
Now you would like to transfer your regexp file into such commands:
sed 's#\(.*\)=\(.*\)#/\2/ s/.*/Found at \1/p#' file
Store these commands in a virtual command file and you will have
echo "8" | sed -nf <(sed 's#\(.*\)=\(.*\)#/\2/ s/.*/Found at \1/p#' file)

Correcting file numbers using bash

I have a bunch of file names in a folder like this:
test_07_ds.csv
test_08_ds.csv
test_09_ds.csv
test_10_ds.csv
...
I want to decrease the number of every file, so that these become:
test_01_ds.csv
test_02_ds.csv
test_03_ds.csv
test_04_ds.csv
...
Here's what I came up with:
for i in $1/*; do
n=${i//[^0-9]/};
n2=`expr $n - 6`;
if [ $n2 -lt 10 ]; then
n2="0"$n2;
fi
n3=`echo $i | sed -r "s/[0-9]+/$n2/"`
echo $n3;
cp $i "fix/$n3";
done;
Is there a cleaner way of doing this?
This might help:
shopt -s extglob
for i in test_{07..10}_ds.csv; do
IFS=_ read s m e <<<"$i"; # echo "Start=$s Middle=$m End=$e"
n=${m#+(0)} # Remove leading zeros to
# avoid interpretation as octal number.
n=$((n-6)) # Subtract 6.
n=$(printf '%02d' "$n") # Format `n` with a leading 0.
# comment out the next echo to actually execute the copy.
echo \
cp "$i" "fix/${s}_${n}_${e}";
done;
Or collapsing it all together
#!/bin/bash
shopt -s extglob
for i in "${1:-.}"/*_*_*; do # $1 defaults to the current directory `.`
base=${i##*/} # strip the directory part before splitting on _
IFS=_ read -r s m e <<<"$base"; # echo "Start=$s Middle=$m End=$e"
n=$(printf '%02d' "$((${m#+(0)}-6))")
cp "$i" "fix/${s}_${n}_${e}";
done;
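The reason for stripping the leading zero is that bash arithmetic reads numbers with a leading 0 as octal, so 08 and 09 are errors. A sketch of an alternative that forces base 10 instead of stripping (bash only; the value of m is just an example):

```bash
m=08                          # middle field of test_08_ds.csv
# $((m - 6)) would fail here: 08 is read as an (invalid) octal number.
n=$(( 10#$m - 6 ))            # 10# forces base 10
new=$(printf '%02d' "$n")     # pad back to two digits

printf 'test_%s_ds.csv\n' "$new"
```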
You can use awk for simplification:
for f in *.csv; do
mv "$f" "$(awk 'BEGIN{FS=OFS="_"} {$2 = sprintf("%02d", $2-6)} 1' <<< "$f")"
done
Could you please try the following code and let me know if it helps?
awk 'FNR==1{OLD=FILENAME;split(FILENAME, A,"_");A[2]=sprintf("%02d", A[2]-6);NEW=A[1]"_"A[2]"_"A[3];system("mv " OLD " " NEW);close(OLD)}' *.csv
I have assumed that your file numbers always start at 07, so I subtracted 6 from each of them; sprintf keeps the two-digit padding. You could also put a complete path in the mv command passed to awk's built-in system() function to move the files to another location. Let me know how it goes.

Search for substring matches in a file bash

The premise is to store a database file of colon separated values representing items.
var1:var2:var3:var4
I need to sort through this file and extract the lines where any of the values match a search string.
For example
Search for "Help"
Hey:There:You:Friends
I:Kinda:Need:Help (this line would be extracted)
I'm using a function to pass in the search string, and then passing the found lines to another function to format the output. However, I can't seem to get the format right when passing. Here is sample code I've tried, using different approaches that I've found on this site, but they don't seem to be working for me:
#Option 1, it doesn't ever find matches
function retrieveMatch {
if [ -n "$1" ]; then
while read line; do
if [[ *"$1"* =~ "$line" ]]; then
formatPrint "$line"
fi
done
fi
}
#Option 2, it gets all the matches, but then passes the value in a
#format different than a file? At least it seems to...
function retrieveMatch {
if [ -n "$1" ]; then
formatPrint `cat database.txt | grep "$1"`
fi
}
function formatPrint {
list="database.txt" #default file for printing all info
if [ -n "$1" ]; then
list="$1"
fi
IFS=':'
while read var1 var2 var3 var4; do
echo "$var1"
echo "$var2"
echo "$var3"
echo "$var4"
done < "$list"
}
I can't seem to get the first one to find any matches
The second option gets the right values, but when I try to formatPrint, it throws an error saying that the list of values passed in is not a directory.
Honestly, I'd replace the whole thing with
function retrieveMatch {
grep "$1" | tr ':' '\n'
}
To be called as
retrieveMatch Help < filename
...like the original function (Option 1) appeared to be designed. To do more complicated things with matching lines, have a look at awk:
# in the awk script, the fields in the line will be $1, $2 etc.
awk -v pattern="$1" -F : '$0 ~ pattern { for(i = 1; i <= NF; ++i) print $i }'
See this link. Awk is made to process exactly this sort of data, so if you plan to do complex things with it, it is definitely worth a look.
Answering the question more directly, there are two/three problems in your code. One is, as was pointed out in the comments to the question, that the line
if [[ *"$1"* =~ "$line" ]]; then
Will try to use "$line" as a regular expression to find a match in *"$1"*, assuming that *"$1"* does not become more than one token after pathname expansion because the * are not quoted. Assuming that the * are supposed to match anything the way they would in glob expressions (but not in regular expressions), this could be replaced with
if [[ "$line" =~ "$1" ]]; then
because =~ will report a match if the regex matches any part of the string.
The second problem is that you're divided on whether you want "$list" in formatPrint to be a file or a line. You say in retrieveMatch that it should be a line:
formatPrint "$line"
But you set it to a filename default in formatPrint:
list="database.txt" #default file for printing all info
You'll have to decide on one. If you decide that formatPrint should format lines, then the third problem is that the redirection in
while read var1 var2 var3 var4; do
echo "$var1"
echo "$var2"
echo "$var3"
echo "$var4"
done < "$list"
tries to use "$list" as a filename. This could be fixed by replacing the last line with
done <<< "$list" # using a here-string (bash-specific)
Or
done <<EOF
$list
EOF
(note: in the latter case, do not indent the code; it's a here-document that's taken verbatim). And, of course, read will only split four fields the way you wrote it.
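If formatPrint is meant to format a single line with any number of fields, one possible sketch (bash only, and not the asker's original design) reads the fields into an array and keeps the IFS change local to the function:

```bash
formatPrint() {
    local IFS=:          # local, so the caller's IFS is left untouched
    local -a fields
    local field
    read -r -a fields <<< "$1"      # split the line on ':' into an array
    for field in "${fields[@]}"; do
        printf '%s\n' "$field"
    done
}

formatPrint 'I:Kinda:Need:Help:Now'
```

The original formatPrint sets IFS=':' globally, which keeps affecting word splitting in the rest of the script after the function returns.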
I feel I must be missing something, but..
cat > foo.txt
Hey:There:You:Friends I:Kinda:Need:Help
Foo:Bar
[Give control-D]
grep -i help foo.txt
Hey:There:You:Friends I:Kinda:Need:Help
Does it fit the bill?
EDIT: To expand a little further on this thought..
cat > foo.bsh
#!/bin/bash
hits="$(grep -i help foo.txt)"
while read -r line; do
echo "${line}"
done <<< "$hits"
[Give control-D]

Get all variables in bash from text line

Suppose I have a text line like
echo -e "$text is now set for ${items[$i]} and count is ${#items[@]} and this number is $((i+1))"
I need to get all variables (for example, using sed) so that in the end I have a list containing: $text, ${items[$i]}, $i, ${#items[@]}, $((i+1)).
I am writing a script which has some complex commands, and before executing each command it shows the command to the user. So when my script shows a command like "pacman -S ${softtitles[$i]}" you can't guess what this command actually does. I just want to add, below the command, a list of the variables used in it and their values. So I decided to do it via regex and sed, but I can't do it properly :/
UPD: It can be just a string like echo "$test is 'ololo', ${items[$i]} is 'today', $i is 3"; it doesn't need to be a list at all, and it can include any temporary variables and multiple lines of code. Also, it doesn't have to be sed :)
SOLUTION:
echo $m | grep -oP '(?<!\[)\$[{(]?[^"\s\/\047.\\]+[})]?' | uniq > vars
$m - our line of code with several bash variables, like "This is $string with ${some[$i]} variables"
uniq - if the string contains the same variable several times, this removes the duplicates (note that uniq only removes adjacent duplicates)
vars - temporary file to hold all variables found in text string
Next piece of code will show all variables and its values in fancy style:
if [ ! "`cat vars`" == "" ]; then
while read -r p; do
value=`eval echo $p`
Style=`echo -e "$Style\n\t$Green$p = $value$Def"`
done < vars
fi
$Style - predefined variable with some text (title of the command)
$Green, $Def - just tput settings of color (green -> text -> default)
Green=`tput setaf 2`
Def=`tput sgr0`
$p - each line of vars file (all variables one by one) looped by while read -r p loop.
You could simply use the below grep command,
$ grep -oP '(?<!\[)(\$[^"\s]+)' file
$text
${items[$i]}
${#items[@]}
$((i+1))
I'm not sure it's perfect, but it should help you:
sed -r 's/(\$[^ "]+)/\n\1\n/g' filename | sed -n '/^\$/p'
Explanation :
(\$[^ "]+) - Match the character $ followed by any characters up to a whitespace or double quote.
\n\1\n - Put a newline before and after the matched word (so that each variable is on a separate line).
/^\$/p - Print only the lines that start with $, i.e. the variables.
A few approaches, I tested each of them on file which contains
echo -e "$text is now set for ${items[$i]} and count is ${#items[@]} and this number is $((i+1))"
grep
$ grep -oP '\$[^ "]*' file
$text
${items[$i]}
${#items[@]}
$((i+1))
perl
$ perl -ne '@f=(/\$[^ "]*/g); print "@f"' file
$text ${items[$i]} ${#items[@]} $((i+1))
or
$ perl -ne '@f=(/\$[^ "]*/g); print join "\n",@f' file
$text
${items[$i]}
${#items[@]}
$((i+1))
The idea is the same in all of them. They will collect the list of strings that start with a $ and as many subsequent characters as possible that are neither spaces nor ".
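The same idea also works without PCRE support, and duplicates can be dropped while keeping the order of first appearance. This is a sketch on a made-up input line; grep -o is not in POSIX but is available in GNU and BSD grep:

```bash
# Made-up command line in which $text appears twice.
line='echo "$text is set for ${items[$i]} and $text again"'

# -E is enough for this pattern; awk keeps only the first copy of each match.
vars=$(printf '%s\n' "$line" | grep -oE '\$[^ "]*' | awk '!seen[$0]++')

printf '%s\n' "$vars"
```

Unlike uniq, the awk filter removes duplicates even when they are not adjacent.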

Bash escaping script & sed capture group

I have a strange behavior in the bash script that I don't understand.
Basically in the code below I try to escape meta-characters...
while IFS=, read _type _content; do
if [ -z "$patternfilter" ]; then
if [ "$_type" == "rex" ]; then
patternfilter="$_content"
elif [ "$_type" == "txt" ]; then
patternfilter="`echo "$_content" | sed -re 's/([-^[{}()*+/.,;?$|#\\])/\\\1/g' -e 's/]/\\]/g'`"
fi
else
if [ "$_type" == "rex" ]; then
patternfilter="$patternfilter|$_content"
elif [ "$_type" == "txt" ]; then
patternfilter="$patternfilter|`echo "$_content" | sed -re 's/([-^[{}()*+/.,;?$|#\\])/\\\1/g' -e 's/]/\\]/g'`"
fi
fi
done < $patternfile
The output gives me the following:
blabal\1bla\1blabla\1toto\1com
Instead of :
blabal\(bla\)blabla\[toto\]\.com
If I enter the code directly in the console, it works... I am missing something, but I don't know what.
[root]# patternfilter="blabal(bla)blabla[toto].com"
[root]# echo "$patternfilter" | sed -re 's/([-^[{}()*+/.,;?$|#\\])/\\\1/g' -e 's/]/\\]/g'
blabal\(bla\)blabla\[toto\]\.com
You cannot reliably escape characters in sed as whether or not a character needs to be escaped is context sensitive. Also, the shell is an environment from which to call tools. The standard UNIX tool to manipulate text is awk. Just have your shell call awk to do everything. By the way, your use of `...` instead of $(...) will interpret double escapes and your use of read without -r will expand escapes.
Since awk can operate on strings as well as REs, you almost certainly won't have to escape anything, since the usual reason to escape chars is to try to make a tool that only works on REs work on strings, which is an impossible task.
If you tell us what you're trying to do with patternfilter along with some sample input and expected output, we can show you how to do it simply and robustly.
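As a rough illustration of the "strings instead of REs" point, and without knowing what patternfilter is really used for, a literal substring test in awk needs no escaping at all. The variable name needle and the sample lines are invented for this sketch:

```bash
# Illustrative literal containing regex metacharacters.
export needle='bla(bla)[toto].com'

# index() does a plain substring search, so nothing has to be escaped.
# ENVIRON is used instead of -v because -v would interpret backslash escapes.
hits=$(printf '%s\n' 'see bla(bla)[toto].com here' 'blaXblaYtotoZcom' |
    awk 'index($0, ENVIRON["needle"])')

printf '%s\n' "$hits"
```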
Check the next script:
while IFS=, read -r line; do
result1="`echo "$line" | sed -re 's/([-^[{}()*+/.,;?$|#\\])/\\\1/g' -e 's/]/\\]/g'`"
echo "1=$result1="
result2="$(echo "$line" | sed -re 's/([-^[{}()*+/.,;?$|#\\])/\\\1/g' -e 's/]/\\]/g')"
echo "2=$result2="
done <<'EOF'
blabal(bla)blabla[toto].com
EOF
prints:
1=blabal\1bla\1blabla\1toto]\1com=
2=blabal\(bla\)blabla\[toto\]\.com=
Instead of the backticks, use $(), as in the result2=... line (and always use read -r).
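The difference can be reproduced in isolation. Inside backticks, a backslash followed by another backslash, a $ or a backtick is processed by the shell before the command runs, even within single quotes; that is how \\\1 turns into \\1 and sed then prints a literal \1. A minimal sketch:

```bash
# Same command, two substitution syntaxes.
old=`printf '%s\n' 'a\\b'`     # backticks: \\ is reduced to \ before the command runs
new=$(printf '%s\n' 'a\\b')    # $(...): the command is taken as written

printf 'backticks: %s\ndollar-paren: %s\n' "$old" "$new"
```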
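You can escape more simply with printf "%q" (note that %q quotes for the shell, not for regular expressions, which is why the . below is left unescaped):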
while IFS=, read _type _content; do
res=$(printf "%q" "$_content")
echo "==$res=="
done <<EOF
txt,blabal(bla)blabla[toto].com
EOF
which prints
==blabal\(bla\)blabla\[toto\].com==
But read @EdMorton's answer.