perl regex doesn't match and include next line [duplicate] - regex

I'm trying to use perl (v5.14.2) via a bash shell (GNU Bash-4.2) in Kubuntu (GNU/Linux) to search and replace a string that includes a newline character, but I'm not succeeding yet.
Here's the text file I'm searching:
<!-- filename: prac1.html -->
hello
kitty
blah blah blah
When I use a text editor's (Kate's) search-and-replace functionality or when I use a regex tester (http://regexpal.com/), I can easily get this regex to work:
hello\nkitty
But when using perl in the command line, none of the following commands have worked:
perl -p -i -e 's,hello\nkitty,newtext,' prac1.html
perl -p -i -e 's,hello.kitty,newtext,s' prac1.html
perl -p -i -e 's,hello.*kitty,newtext,s' prac1.html
perl -p -i -e 's,hello[\S\s]kitty,newtext,' prac1.html
perl -p -i -e 's,hello[\S\s]*kitty,newtext,' prac1.html
Actually, I got desperate and tried many other patterns, including all of these (different permutations in the "single-line" and "multi-line" modes):
perl -p -i -e 's,hello\nkitty,newtext,' prac1.html
perl -p -i -e 's,hello.kitty,newtext,' prac1.html
perl -p -i -e 's,hello\nkitty,newtext,s' prac1.html
perl -p -i -e 's,hello.kitty,newtext,s' prac1.html
perl -p -i -e 's,hello\nkitty,newtext,m' prac1.html
perl -p -i -e 's,hello.kitty,newtext,m' prac1.html
perl -p -i -e 's,hello\nkitty,newtext,ms' prac1.html
perl -p -i -e 's,hello.kitty,newtext,ms' prac1.html
perl -p -i -e 's,hello[\S\s]kitty,newtext,' prac1.html
perl -p -i -e 's,hello[\S\s]*kitty,newtext,' prac1.html
perl -p -i -e 's,hello$[\S\s]^kitty,newtext,' prac1.html
perl -p -i -e 's,hello$[\S\s]*^kitty,newtext,' prac1.html
perl -p -i -e 's,hello[\S\s]kitty,newtext,s' prac1.html
perl -p -i -e 's,hello[\S\s]*kitty,newtext,s' prac1.html
perl -p -i -e 's,hello$[\S\s]^kitty,newtext,s' prac1.html
perl -p -i -e 's,hello$[\S\s]*^kitty,newtext,s' prac1.html
perl -p -i -e 's,hello[\S\s]kitty,newtext,m' prac1.html
perl -p -i -e 's,hello[\S\s]*kitty,newtext,m' prac1.html
perl -p -i -e 's,hello$[\S\s]^kitty,newtext,m' prac1.html
perl -p -i -e 's,hello$[\S\s]*^kitty,newtext,m' prac1.html
perl -p -i -e 's,hello[\S\s]kitty,newtext,ms' prac1.html
perl -p -i -e 's,hello[\S\s]*kitty,newtext,ms' prac1.html
perl -p -i -e 's,hello$[\S\s]^kitty,newtext,ms' prac1.html
perl -p -i -e 's,hello$[\S\s]*^kitty,newtext,ms' prac1.html
(I also tried using \r \r\n \R \f \D etc., and global mode as well.)
Can anyone spot the issue or suggest a solution?

Try this, which changes the input record separator (a newline by default) so that perl reads a paragraph at a time instead of a line at a time:
perl -i -p00e 's,hello\nkitty,newtext,' prac1.html
from perldoc perlrun :
-0[octal/hexadecimal]
specifies the input record separator ($/ ) as an octal or hexadecimal
number. If there are no digits, the null character is the separator.
Other switches may precede or follow the digits. For example, if you
have a version of find which can print filenames terminated by the
null character, you can say this:
find . -name '*.orig' -print0 | perl -n0e unlink
The special value 00 will cause Perl to slurp files in paragraph mode.
Any value 0400 or above will cause Perl to slurp files whole, but by
convention the value 0777 is the one normally used for this purpose.

The problem is that "-p" has already implicitly wrapped this loop around your "-e", and the "<>" is splitting the input into lines, so your regexp never gets a chance to see more than one line.
LINE:
while (<>) {
... # your program goes here
} continue {
print or die "-p destination: $!\n";
}
See the perlrun manpage for more information.

Related

Sed command to search by regex in file

I need to get a number of version from file. My version file looks like this:
#define MINOR_VERSION_NUMBER 1
I try to use sed command:
VERSION_MINOR=`sed -i -e 'MINOR_VERSION_NUMBER\s+\([0-9]+\).*/\1/p' $WORKSPACE/project/common/version.h`
but I get error:
sed: -e expression #1, char 2: extra characters after command
The "address" that selects matching lines needs to be enclosed in /.../ (or \X...X for any X).
sed -ne '/MINOR_VERSION_NUMBER/{ s/[^0-9]*\([0-9][0-9]*\).*/\1/;p; }'
Don't use -i, it changes the file in place and doesn't output anything.
The more common way would be to use awk to find the line and extract the wanted column:
awk '(/MINOR_VERSION_NUMBER/){print$3}'
Using grep:
grep MINOR_VERSION_NUMBER version.h | grep -o '[0-9]*$'
Demo:
$ echo "#define MINOR_VERSION_NUMBER 1" | grep -o '[0-9]*$'
1
$ echo "#define MINOR_VERSION_NUMBER 1123" | grep -o '[0-9]*$'
1123
$
Here is a correction of your attempt. Change your line:
VERSION_MINOR=`sed -i -e 'MINOR_VERSION_NUMBER\s+\([0-9]+\).*/\1/p' $WORKSPACE/project/common/version.h`
into:
VERSION_MINOR=`sed -n -e '/^#define\s\+MINOR_VERSION_NUMBER\s\+\([0-9]\+\).*/ s//\1/p' $WORKSPACE/project/common/version.h`
This can be made more readable with GNU sed's -r option:
VERSION_MINOR=`sed -n -r -e '/^#define\s+MINOR_VERSION_NUMBER\s+([0-9]+).*/ s//\1/p' $WORKSPACE/project/common/version.h`
As stated by choroba, awk would be more suited than sed for this kind of processing (see his answer).
However, here is another solution using bash's read builtin, together with GNU grep:
read x x VERSION_MINOR x < <(grep -F -w -m1 MINOR_VERSION_NUMBER $WORKSPACE/project/common/version.h)
VERSION_MINOR=$(echo "#define MINOR_VERSION_NUMBER 1" | tr -s ' ' | cut -d' ' -f3)
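The sed and awk approaches above can be compared side by side. The following is a sketch only: it uses a temporary header with made-up contents instead of the real version.h, and sticks to POSIX syntax (`[[:space:]]`, `\{1,\}`) rather than the GNU-only `\s` and `\+`:

```shell
#!/usr/bin/env bash
# Temporary stand-in for version.h (contents are illustrative).
header=$(mktemp)
printf '#define MAJOR_VERSION_NUMBER 3\n#define MINOR_VERSION_NUMBER 12\n' > "$header"

# sed: -n suppresses default output; the p flag prints only lines
# where the substitution succeeded.
minor_sed=$(sed -n 's/^#define[[:space:]]\{1,\}MINOR_VERSION_NUMBER[[:space:]]\{1,\}\([0-9]\{1,\}\).*/\1/p' "$header")

# awk: compare the second field exactly and print the third.
minor_awk=$(awk '$2 == "MINOR_VERSION_NUMBER" { print $3 }' "$header")

echo "sed: $minor_sed, awk: $minor_awk"
rm -f "$header"
```

Neither command uses -i, so the header is left untouched and the value arrives on standard output, ready for command substitution.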

sed string replace is giving some kind of warning?

I am using sed with grep to replace a string. The old string appears in 8 files in my home location and I want to replace every occurrence with the new string. I am using this:
#! /bin/bash
read oldstring
read newstring
sed -i -e 's/'Soldstring'/'$newstring'/' grep "$oldstring" /home/*
Now this command works, but I am getting a warning:
sed: can't read grep: No such file or directory
sed: can't read oldstring: No such file or directory
Any ideas?
You probably wanted
sed -i -e "s|$oldstring|$newstring|" $(grep -l "$oldstring" /home/*)
However, that form is unsafe: the unquoted command substitution splits file names that contain whitespace. Better to use xargs with NUL-separated names (GNU grep -Z, xargs -0):
grep -lZ "$oldstring" /home/* | xargs -0 sed -i -e "s|$oldstring|$newstring|"
Another option is to store the file names in an array:
readarray -t files < <(exec grep -l "$oldstring" /home/*)
sed -i -e "s|$oldstring|$newstring|" "${files[@]}"
You are not executing grep, you are giving it as a parameter to sed.
are you missing backticks?
sed -i -e 's/'Soldstring'/'$newstring'/' `grep "$oldstring" /home/*`
sed -i -e "s/$oldstring/$newstring/g" `grep -l "$oldstring" /home/*`
Just in order to clearly point out the various typos in your code:
#! /bin/bash
# ^
# extra space here (not really an error I think -- but unusual)
read oldstring
read newstring
sed -i -e 's/'Soldstring'/'$newstring'/' grep "$oldstring" /home/*
#             ^                          ^                        ^
# `S` instead of `$` here                |                        |
#                                        here        and        there
#                                        missing backticks (`)
As a side note, I suggest backticks above, but, since you are using bash, the syntax $(grep ....) is probably better than the classic Bourne Shell syntax `grep ....`. Finally, as suggested by konsolebox, "command nesting" might be unsafe, for example, in this case, if some file names contain spaces.
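The whitespace concern can be reproduced in a scratch directory. This sketch assumes GNU grep (-Z), GNU xargs (-0, -r) and GNU sed (-i without a suffix); the directory, file names and strings are invented for the demonstration:

```shell
#!/usr/bin/env bash
# Scratch directory with one matching file whose name contains a space.
dir=$(mktemp -d)
printf 'old value\n' > "$dir/with space.txt"
printf 'unrelated\n' > "$dir/other.txt"

oldstring='old'
newstring='new'

# grep -lZ prints matching file names separated by NUL bytes;
# xargs -0 passes each name to sed as one argument, spaces included.
grep -lZ -- "$oldstring" "$dir"/* | xargs -0 -r sed -i -e "s|$oldstring|$newstring|g"

changed=$(cat "$dir/with space.txt")
untouched=$(cat "$dir/other.txt")
echo "$changed"
rm -rf "$dir"
```

Note that $oldstring is still interpreted as a regular expression by both grep and sed here, so strings containing characters such as `.` or `*` would need escaping.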

Find parameters beginning with a dash in a string

Suppose one has a string containing parameters:
echo "-v foo -d --print bar-foo ba-z fOo"
How can one get parameters beginning with a dash?
-v -d --print
An alternative:
STR="-v foo -d --print bar-foo ba-z fOo"
echo "$STR" | egrep -o -e "(^| )+--?[^ ]+" | sed -e 's/ //g'
Will output:
-v
-d
--print
If you want to parse options passed to your script, you should consider using getopt.
References:
example of how to use getopts in bash
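For reference, a minimal sketch of the bash getopts builtin. The function name parse_flags and the option letters are hypothetical; note that getopts handles short options only and stops at the first non-option word, so in the string from the question it would not reach -d after foo:

```shell
#!/usr/bin/env bash
# parse_flags is a hypothetical helper: it recognises -v and -d only.
parse_flags() {
  local OPTIND=1 opt verbose=0 debug=0
  while getopts "vd" opt; do
    case $opt in
      v) verbose=1 ;;
      d) debug=1 ;;
      *) return 1 ;;   # unknown option
    esac
  done
  echo "verbose=$verbose debug=$debug"
}

parsed=$(parse_flags -v -d foo)
echo "$parsed"
```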
$ str="-v foo -d --print bar-foo ba-z"
$ for i in $str; do test ${i::1} = - && echo $i; done
-v
-d
--print
Note this is an instance where you must not quote the variable, since you want word splitting to occur. (That is, do not write for i in "$str")
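An alternative that still splits on whitespace but avoids accidental pathname expansion (a word such as `*` in the string would otherwise be expanded by the unquoted loop) is to read the words into an array first. A sketch, with the variable names chosen only for this example:

```shell
#!/usr/bin/env bash
str="-v foo -d --print bar-foo ba-z fOo"

# read -a splits the string on whitespace without globbing.
read -r -a words <<< "$str"

flags=()
for w in "${words[@]}"; do
  case $w in
    -*) flags+=("$w") ;;   # keep only words that start with a dash
  esac
done

printf '%s\n' "${flags[@]}"
```

Using printf instead of echo also avoids surprises with words such as -n or -e, which echo would treat as its own options.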

How can I replace all email addresses found in a particular folder in Linux

I have some scripts in a folder such as /var/www/sites.
I want to replace every email address hardcoded in the scripts, in all folders and subfolders, with my own email address.
How can I do that?
I can find them using
grep -rn "abc@gmail.com" /var/www/sites/
But I don't know how to use a regex to do the replacement.
Try perl (the @ must be escaped, otherwise perl interpolates @gmail as an array):
perl -p -i -e 's/abc\@gmail\.com/new\@gmail.com/g' /var/www/sites/*
Or with perl/find:
find /var/www/sites/ -type f -exec perl -p -i -e 's/abc\@gmail\.com/new\@gmail.com/g' {} \;
Open a shell, then, if you have bash 4:
oldmail="abc@gmail.com"
newmail="myemail@provider.tld"
shopt -s globstar
sed -i "/$oldmail/s/$oldmail/$newmail/g" /var/www/sites/**/*
If not:
oldmail="abc@gmail.com"
newmail="myemail@provider.tld"
find /var/www/sites -type f -exec sed -i "/$oldmail/s/$oldmail/$newmail/g" {} +
The /pattern/ address only limits which lines the substitution runs on. sed -i (like perl -i -pe) still rewrites every file it is given, so timestamps change even for files that do not contain the searched string; to leave those files untouched, select the files with grep -l first.
find /var/www/sites -type f | xargs sed --in-place 's/abc@gmail\.com/mynewemail@elsewhere.com/g'
Try sed.
grep -rl "abc@gmail.com" /var/www/sites/ | xargs sed -i 's/oldemail/newemail/g'
Edit:
Took feedback into account. Sorry about the previously wrong solution!
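Putting the pieces from these answers together, here is a hedged sketch that edits only the files that actually contain the address. It assumes GNU grep, xargs and sed, and runs on a scratch tree rather than /var/www/sites; the paths, file names and the old_re variable are invented for the demonstration:

```shell
#!/usr/bin/env bash
# Scratch tree standing in for /var/www/sites.
root=$(mktemp -d)
mkdir -p "$root/sub"
printf 'contact: abc@gmail.com\n' > "$root/sub/a.php"
printf 'no address here\n' > "$root/b.php"

old='abc@gmail.com'
new='myemail@provider.tld'

# Escape the dots so sed matches them literally instead of "any character".
old_re=${old//./\\.}

# grep -F matches the address as a fixed string; -l lists only files that
# contain it, -Z separates the names with NUL so xargs -0 handles spaces.
grep -rlZF -- "$old" "$root" | xargs -0 -r sed -i "s|$old_re|$new|g"

edited=$(cat "$root/sub/a.php")
skipped=$(cat "$root/b.php")
echo "$edited"
rm -rf "$root"
```

Because sed only receives the files grep listed, files without the address are never opened for writing.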

help with grep [[:alpha:]]* -o

file.txt contains:
##w##
##wew##
using mac 10.6, bash shell, the command:
cat file.txt | grep [[:alpha:]]* -o
outputs nothing. I'm trying to extract the text inside the hash signs. What am I doing wrong?
(Note that it is better practice in this instance to pass the filename as an argument to grep instead of piping the output of cat to grep: grep PATTERN file instead of cat file | grep PATTERN.)
What shell are you using to execute this command? I suspect that your problem is that the shell is interpreting the asterisk as a wildcard and trying to glob files.
Try quoting your pattern, e.g. grep '[[:alpha:]]*' -o file.txt.
I've noticed that this works fine with the version of grep that's on my Linux machine, but the grep on my Mac requires the command grep -E '[[:alpha:]]+' -o file.txt.
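A small sketch of the quoted form, feeding the sample lines through a pipe instead of file.txt (the variable name inner is just for the example). With `*` the pattern can match the empty string, which is a plausible reason some grep builds print nothing; `+` with -E requires at least one letter:

```shell
#!/usr/bin/env bash
# Quoting keeps the shell from treating [[:alpha:]]+ as a glob;
# -o prints only the matched part, -E enables the + quantifier.
inner=$(printf '##w##\n##wew##\n' | grep -oE '[[:alpha:]]+')
printf '%s\n' "$inner"
```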
sed 's/#//g' file.txt
/SCRIPTS [31]> cat file.txt
##w##
##wew##
/SCRIPTS [32]> sed 's/#//g' file.txt
w
wew
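If you have bash 3.2 or later (where a quoted regex is matched as a literal string, so the pattern is kept unquoted in a variable):
re='^#+(.*)##+$'
while read -r line
do
    case "$line" in
        *"#"* )
            if [[ $line =~ $re ]]; then
                echo "${BASH_REMATCH[1]}"
            fi
            ;;
    esac
done < "file.txt"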