sed find with a regex and replace does not work [duplicate]

I'm trying to refine my code by getting rid of unnecessary white spaces, empty lines, and having parentheses balanced with a space in between them, so:
int a = 4;
if ((a==4) || (b==5))
a++ ;
should change to:
int a = 4;
if ( (a==4) || (b==5) )
a++ ;
It does work for the brackets and empty lines. However, it forgets to reduce the multiple spaces to one space:
int a = 4;
if ( (a==4) || (b==5) )
a++ ;
Here is my script:
#!/bin/bash
# Script to refine code
#
filename=read.txt
sed 's/((/( (/g' $filename > new.txt
mv new.txt $filename
sed 's/))/) )/g' $filename > new.txt
mv new.txt $filename
sed 's/ +/ /g' $filename > new.txt
mv new.txt $filename
sed '/^$/d' $filename > new.txt
mv new.txt $filename
Also, is there a way to make this script more concise, e.g. removing or reducing the number of commands?

If you are using GNU sed then you need to use sed -r which forces sed to use extended regular expressions, including the wanted behavior of +. See man sed:
-r, --regexp-extended
use extended regular expressions in the script.
The same holds if you are using OS X sed, but then you need to use sed -E:
-E Interpret regular expressions as extended (modern) regular expressions
rather than basic regular expressions (BRE's).
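To check the extended-regex switch on a given machine, one line of input is enough. This is a minimal sketch that assumes a sed accepting -E (BSD/OS X sed and recent GNU sed releases); on an older GNU sed, substitute -r. The variable name squeezed is only illustrative.

```shell
# With extended regular expressions, an unescaped + means "one or more".
squeezed=$(printf 'int   a  =    4;\n' | sed -E 's/ +/ /g')
printf '%s\n' "$squeezed"
```

If the runs of spaces come back collapsed to single spaces, the switch is doing its job.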

You have to precede + with a \ (a GNU extension to basic regular expressions); otherwise sed tries to match the character + itself.
To make the script "smarter", you can accumulate all the expressions in one sed:
sed -e 's/((/( (/g' -e 's/))/) )/g' -e 's/ \+/ /g' -e '/^$/d' $filename > new.txt
Some implementations of sed even support the -i option that enables changing the file in place.

Sometimes, -r and -e won't work.
I'm using sed version 4.2.1 and they aren't working for me at all.
A quick hack is to use the * operator instead.
So let's say we want to replace all redundant space characters with a single space:
We'd like to do:
sed 's/ +/ /'
But we can use this instead:
sed 's/  */ /'
(note the double-space)

This may not be the cleanest solution, but if you want to avoid -E and -r to remain compatible with both versions of sed, you can repeat the character: cc* is one c followed by zero or more c's, which equals one or more c's.
Or just use the BRE interval syntax, as suggested by @cdarke, to match a specific number of repetitions: c\{1,\}. Omitting the second number after the comma means 1 or more.
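The two portable spellings can be compared side by side. This is a small sketch with made-up sample text; the variable names are hypothetical, and both commands use only basic regular expressions, so no -E or -r is involved.

```shell
input='a++    ;   done'

# Two literal spaces before the star: one space, then zero or more spaces.
with_star=$(printf '%s\n' "$input" | sed 's/  */ /g')

# BRE interval: a space repeated at least once, with no upper bound.
with_interval=$(printf '%s\n' "$input" | sed 's/ \{1,\}/ /g')

printf '%s\n%s\n' "$with_star" "$with_interval"
```

Both commands should print the same squeezed line.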

This might work for you:
sed -e '/^$/d' -e ':a' -e 's/\([()]\)\1/\1 \1/g' -e 'ta' -e 's/  */ /g' "$filename" >new.txt
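The :a ... ta loop matters when three or more identical parentheses sit next to each other, because a single s///g pass only handles non-overlapping pairs. The following sketch isolates that part of the command on a made-up input line (the variable names are illustrative):

```shell
line='if (((a==4)) || ((b==5)))'

# :a sets a label; "ta" jumps back to it whenever the substitution changed
# something, so the pass repeats until no doubled parenthesis is left.
spaced=$(printf '%s\n' "$line" | sed -e ':a' -e 's/\([()]\)\1/\1 \1/g' -e 'ta')
printf '%s\n' "$spaced"
```

Without the loop, the first pass would leave pairs such as (( and )) behind in the triple groups.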

on the bash front;
First I made a script test.sh
cat test.sh
#!/bin/bash
while IFS='' read -r line || [[ -n "$line" ]]; do
echo "Text read from file: $line"
SRC=$(echo "$line" | awk '{print $1}')
DEST=$(echo "$line" | awk '{print $2}')
echo "moving $SRC to $DEST"
# Group the failure branch: "a || b && c" would run "exit 1" even when mv succeeds
mv "$SRC" "$DEST" || { echo "move $SRC to $DEST failed"; exit 1; }
done < "$1"
then we make a data file and a test file aaa.txt
cat aaa.txt
<tag1>19</tag1>
<tag2>2</tag2>
<tag3>-12</tag3>
<tag4>37</tag4>
<tag5>-41</tag5>
then test and show results.
bash test.sh list.txt
Text read from file: aaa.txt bbb.txt
moving aaa.txt to bbb.txt

Related

SED - Regex fails

Given the following files:
input_file:
if_line1
if_line2
template_file_1:
temp_file_line1
temp_file_line2
##regex_match## <= must be replaced by input_file
temp_file_line3
template_file_2:
temp_file_line1
temp_file_line2
{my_file.global} <= must be replaced by input_file
temp_file_line3
output_file:
temp_file_line1
temp_file_line2
if_line1
if_line2
temp_file_line3
For template_file_1 the following sed command works:
sed -n -e '/##regex_match##/{r input_file' -e 'b' -e '}; p' template_file_1 > output_file
However, for template_file_2 the analog sed command fails:
sed -r -n -e '/(?<={).+\.global(?=})/{r input_file' -e 'b' -e '}; p' template_file_2 > output_file
sed complains the regular expression was invalid
The given regex is at least PCRE valid, for example grep -oP '(?<={).+\.global(?=})' template_file_2 works. Any idea how to deal with that?

perl one-liners:
perl -pe 'do {local $/; open $f, "<input_file"; $_ = <$f>; close $f} if /\{.+?\.global\}/' template_file_2
or perhaps this one, not "pure" perl
perl -ne 'if (/\{.+?\.global\}/) {system("cat","input_file")} else {print}' template_file_2
Using CPAN modules can make this really tidy:
perl -MPath::Tiny -pe '$_ = path("input_file")->slurp if /\{.+?\.global\}/' template_file_2

idk exactly what that PCRE is intended to do but taking a guess at it, this will work using any awk in any shell on every UNIX box:
$ awk 'NR==FNR{new=new s $0; s=ORS; next} /##regex_match##/{$0=new} 1' input_file template_file_1
temp_file_line1
temp_file_line2
if_line1
if_line2
temp_file_line3
$ awk 'NR==FNR{new=new s $0; s=ORS; next} /\{[^.{}]+\.global}/{$0=new} 1' input_file template_file_2
temp_file_line1
temp_file_line2
if_line1
if_line2
temp_file_line3
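For completeness, a sed-only sketch: sed's regular expressions (POSIX BRE/ERE) have no lookbehind or lookahead, which is why the PCRE from the question is rejected. Matching the braces literally and deleting the marker line after queuing the file gets the same effect. The temporary directory and the bracket expression [^{}]* are assumptions made for this illustration.

```shell
workdir=$(mktemp -d)
printf 'if_line1\nif_line2\n' > "$workdir/input_file"
printf 'temp_file_line1\ntemp_file_line2\n{my_file.global}\ntemp_file_line3\n' > "$workdir/template_file_2"

# In a BRE, { and } are literal. "r" queues input_file for output at the end
# of the cycle, and "d" then drops the marker line itself.
result=$(cd "$workdir" && sed -e '/{[^{}]*\.global}/{r input_file' -e 'd' -e '}' template_file_2)
printf '%s\n' "$result"

rm -rf "$workdir"
```

The r command must be the last thing in its -e fragment, because everything up to the end of the fragment is taken as the file name.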

Find regular expression in a file matching a given value

I have some basic knowledge on using regular expressions with grep (bash).
But I want to use regular expressions the other way around.
For example I have a file containing the following entries:
line_one=[0-3]
line_two=[4-6]
line_three=[7-9]
Now I want to use bash to figure out to which line a particular number matches.
For example:
grep 8 file
should return:
line_three=[7-9]
Note: I am aware that the example of "grep 8 file" doesn't make sense, but I hope it helps to understand what I am trying to achieve.
Thanks for your help,
Marcel

As others have pointed out, awk is the right tool for this:
awk -F'=' '8~$2{print $0;}' file
... and if you want this tool to feel more like grep, a quick bash wrapper:
#!/bin/bash
awk -F'=' -v seek_value="$1" 'seek_value~$2{print $0;}' "$2"
Which would run like:
./not_exactly_grep.sh 8 file
line_three=[7-9]

My first impression is that this is not a task for grep, maybe for awk.
Trying to do things with grep I only see this:
for line in $(cat file); do echo 8 | grep "${line#*=}" && echo "${line%=*}" ; done
Using while for file reading (following comments):
while IFS= read -r line; do echo 8 | grep "${line#*=}" && echo "${line%=*}" ; done < file

This can be done in native bash using the syntax [[ $value =~ $regex ]] to test:
find_regex_matching() {
local value=$1
while IFS= read -r line; do # read from input line-by-line
[[ $line = *=* ]] || continue # skip lines not containing an =
regex=${line#*=} # prune everything before the = for the regex
if [[ $value =~ $regex ]]; then # test whether we match...
printf '%s\n' "$line" # ...and print if we do.
fi
done
}
...used as:
find_regex_matching 8 <file
...or, to test it with your sample input inline:
find_regex_matching 8 <<'EOF'
line_one=[0-3]
line_two=[4-6]
line_three=[7-9]
EOF
...which properly emits:
line_three=[7-9]
You could replace printf '%s\n' "$line" with printf '%s\n' "${line%%=*}" to print only the key (contents before the =), if so inclined. See the bash-hackers page on parameter expansion for a rundown on the syntax involved.

This is not built-in functionality of grep, but it's easy to do with awk, with a change in syntax:
/[0-3]/ { print "line one" }
/[4-6]/ { print "line two" }
/[7-9]/ { print "line three" }
If you really need to, you could programmatically change your input file to this syntax, if it doesn't contain any characters that need escaping (mainly / in the regex or " in the string):
sed -e 's#\(.*\)=\(.*\)#/\2/ { print "\1" }#'
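As a sketch of how the generated program could then be used, the converted rules can be handed straight to awk. The variable names program and matched are hypothetical, and the sample rules are the ones from the question.

```shell
# Turn each "name=regex" line into an awk rule: /regex/ { print "name" }
program=$(printf 'line_one=[0-3]\nline_two=[4-6]\nline_three=[7-9]\n' |
    sed -e 's#\(.*\)=\(.*\)#/\2/ { print "\1" }#')

# Run the generated rules against the value to look up.
matched=$(echo 8 | awk "$program")
printf '%s\n' "$matched"
```

This prints the name of every rule whose regex matches the value, so overlapping ranges would produce several lines.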

As I understand it, you are looking for a range that includes some value.
You can do this in gawk:
$ cat /tmp/file
line_one=[0-3]
line_two=[4-6]
line_three=[7-9]
$ awk -v n=8 'match($0, /([0-9]+)-([0-9]+)/, a){ if (a[1]<n && a[2]>n) print $0 }' /tmp/file
line_three=[7-9]
Since the digits are being treated as numbers (vs a regex) it supports larger ranges:
$ cat /tmp/file
line_one=[0-3]
line_two=[4-6]
line_three=[75-95]
line_four=[55-105]
$ awk -v n=92 'match($0, /([0-9]+)-([0-9]+)/, a){ if (a[1]<n && a[2]>n) print $0 }' /tmp/file
line_three=[75-95]
line_four=[55-105]
If you are just looking to interpret the right hand side of the = as a regex, you can do:
$ awk -F= -v tgt=8 'tgt~$2' /tmp/file
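The three-argument match() used for the numeric comparison above is specific to gawk. A rough portable sketch, assuming non-negative integer bounds and, unlike the gawk version, treating both bounds as inclusive, could use split() instead (the names ranges, hits and b are illustrative):

```shell
ranges='line_one=[0-3]
line_two=[4-6]
line_three=[75-95]
line_four=[55-105]'

# Split the right-hand side on runs of non-digits: b[1] is the empty string
# before "[", b[2] the lower bound, b[3] the upper bound. Adding 0 forces a
# numeric comparison.
hits=$(printf '%s\n' "$ranges" |
    awk -F= -v n=92 '{ split($2, b, /[^0-9]+/) } b[2] + 0 <= n + 0 && n + 0 <= b[3] + 0')
printf '%s\n' "$hits"
```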

You would like to do something like
grep -Ef <(cut -d= -f2 file) <(echo 8)
This will grep what you want but will not display where.
With sed you can show some message:
echo "8" | sed -n '/[7-9]/ s/.*/Found it in line_three/p'
Now you would like to transfer your regexp file into such commands:
sed 's#\(.*\)=\(.*\)#/\2/ s/.*/Found at \1/p#' file
Store these commands in a virtual command file and you will have
echo "8" | sed -nf <(sed 's#\(.*\)=\(.*\)#/\2/ s/.*/Found at \1/p#' file)

Sed copy pattern between range only once

I am using sed to edit some sql script. I want to copy all the lines from the first "CREATE" pattern until the first "ALTER" pattern. The issue I am having is that sed copies all lines between each set of CREATE and ALTER instead of only the first occurrence (more than once).
sed -n -e '/CREATE/,/ALTER/w createTables.sql' $filename

Perl to the rescue:
perl -ne 'print if /CREATE/ .. /ALTER/ && close ARGV' -- "$filename" > createTables.sql
It closes the input when the ALTER is matched, i.e. it doesn't read any further.

Using sed
sed -n '/CREATE/,/ALTER/{p;/ALTER/q}' file > createTables.sql
or alternatively (note the newline)
sed -n '/CREATE/,/ALTER/{w createTables.sql
/ALTER/q}' file
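A quick way to convince yourself that only the first block is copied is to run the print-and-quit variant on a few made-up statements. The SQL text and variable names below are invented for this sketch.

```shell
sql='DROP TABLE a;
CREATE TABLE a (
  id INT
);
ALTER TABLE a ADD x INT;
CREATE TABLE b (id INT);
ALTER TABLE b ADD y INT;'

# Print the CREATE..ALTER range and quit as soon as its closing ALTER line
# has been printed, so the second range is never reached.
first_block=$(printf '%s\n' "$sql" | sed -n '/CREATE/,/ALTER/{p;/ALTER/q;}')
printf '%s\n' "$first_block"
```

The semicolon before the closing brace is not required by GNU sed but keeps the command acceptable to stricter implementations.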

bash regex multiple match in one line

I'm trying to process my text.
For example, I have:
asdf asdf get.this random random get.that
get.it this.no also.this.no
My desired output is:
get.this get.that
get.it
So the regexp should catch only this pattern (get.\w), but it has to do so repeatedly because of multiple occurrences in one line, so the easiest way with sed
sed 's/.*(REGEX).*/\1/'
does not work (it shows only the first occurrence).
Probably the best way is to use grep -o, but I have an old version of grep and the -o flag is not available.

This grep may give what you need:
grep -o "get[^ ]*" file

Try awk:
awk '{for(i=1;i<=NF;i++){if($i~/get\.\w+/){print $i}}}' file.txt
You might need to tweak the regex between the slashes for your specific issue. Sample output:
$ awk '{for(i=1;i<=NF;i++){if($i~/get\.\w+/){print $i}}}' file.txt
get.this
get.that
get.it
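The \w escape in the regex above is a gawk extension. A sketch that sticks to plain bracket expressions, and keeps the matches of each input line together as in the desired output, might look like this (the variable names and the exact character set are assumptions):

```shell
text='asdf asdf get.this random random get.that
get.it this.no also.this.no'

# Collect the fields of each line that look like get.<word> and print them
# space-separated; lines without a match print nothing.
matches=$(printf '%s\n' "$text" | awk '{
    out = ""
    for (i = 1; i <= NF; i++)
        if ($i ~ /^get\.[A-Za-z0-9_]+$/)
            out = out (out == "" ? "" : " ") $i
    if (out != "") print out
}')
printf '%s\n' "$matches"
```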

With awk:
awk -v patt="^get" '{
for (i=1; i<=NF; i++)
if ($i ~ patt)
printf "%s%s", $i, OFS;
print ""
}' <<< "$text"
bash
while read -a words; do
for word in "${words[@]}"; do
if [[ $word == get* ]]; then
echo -n "$word "
fi
done
echo
done <<< "$text"
perl
perl -lane 'print join " ", grep {$_ =~ /^get/} @F' <<< "$text"

This might work for you (GNU sed):
sed -r '/\bget\.\S+/{s//\n&\n/g;s/[^\n]*\n([^\n]*)\n[^\n]*/\1 /g;s/ $//}' file
or if you want one per line:
sed -r '/\n/!s/\bget\.\S+/\n&\n/g;/^get/P;D' file

find lines containing "^" and replace entire line with ""

I have a file with a string on each line... ie.
test.434
test.4343
test.4343t34
test^tests.344
test^34534/test
I want to find any line containing a "^" and replace entire line with a blank.
I was trying to use sed:
sed -e '/\^/s/*//g' test.file
This does not seem to work, any suggestions?

sed -e 's/^.*\^.*$//' test.file
For example:
$ cat test.file
test.434
test.4343
test.4343t34
test^tests.344
test^34534/test
$ sed -e 's/^.*\^.*$//' test.file
test.434
test.4343
test.4343t34
$
To delete the offending lines entirely, use
$ sed -e '/\^/d' test.file
test.434
test.4343
test.4343t34
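As for why the attempt in the question has no effect: a * at the very start of a basic regular expression is a literal asterisk, so s/*//g only removes actual * characters. Keeping the address and matching the whole line with .* is a small sketch of the minimal fix (sample lines shortened, variable name hypothetical):

```shell
# On lines containing a caret, replace the entire line with nothing.
blanked=$(printf 'test.434\ntest^tests.344\nlast\n' | sed -e '/\^/s/.*//')
printf '%s\n' "$blanked"
```

The matching line is kept as an empty line, in contrast to the d variant, which removes it.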

other ways
awk
awk '!/\^/' file
bash
while read -r line
do
case "$line" in
*"^"* ) continue;;
*) echo "$line"
esac
done <"file"
and probably the fastest
grep -v "\^" file
