Need to use grep to find a string within a file

I'm using a shell script to fetch a file with wget and search it for a pattern. My shell script is as follows:
#Execute commands one by one
while read line
do
    STARTTIME=$(($(date +%s%N)/1000000))
    line2=$(wget -q --post-data "$line" -O PHPFiles/test.php http://localhost:1234/XSS/XSS2/test.php)
    ENDTIME=$(($(date +%s%N)/1000000))
    GAP=$(($ENDTIME-$STARTTIME))
    DIFF=$(($DIFF+($ENDTIME-$STARTTIME)))
    echo "Time Taken "$GAP
    finalSearchLine1="${line/&name2=/ }"
    finalSearchLine2="${finalSearchLine1/name=/}"
    echo "$finalSearchLine2"
    if grep -q -F "$finalSeachLine2" -a PHPFiles/test.php;
    then
        echo found
        success=$((success+1))
    else
        echo not found
        failure=$((failure+1))
    fi
    rm PHPFiles/test.php
done < $1
echo "***************"
echo "Success "$success
echo "Failure "$failure
echo "Total Time "$DIFF
echo "Average Time "$((DIFF/(success+failure)))
However, I'm having trouble with the grep command. Sometimes the data in $finalSearchLine2 contains quotes, such as:
<script >alert("XSS"); </script>
This seems to cause trouble with the grep command. The if statement always seems to report "found", even when the file contains nothing that matches $finalSearchLine2. I don't know if it's possible to use escaped strings within the variable for grep. Can anyone suggest a possible solution for this?

Grep needs double quotes to be escaped like this \"
So as a first solution you could try:
temp_variable=$(sed 's/"/\\"/g' <<< "$temp")
if grep -q -F "$temp_variable" -a PHPFiles/test.php;
So you first escape the double quotes with sed and store the result in temp_variable. Then you use temp_variable in the grep command.
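One thing worth checking in the script from the question: the if line expands $finalSeachLine2 (no "r" in "Search"), while the variable that is assigned and echoed just above it is finalSearchLine2. An unset variable expands to an empty string, and an empty fixed-string pattern matches every line, which would explain a result that is always "found". The sketch below reproduces that behaviour with a throwaway file; the file contents and the variable names page, payload and empty_pattern are made up for illustration. It also suggests that a double-quoted variable containing double quotes can be handed to grep -F without extra escaping.

```shell
#!/bin/sh
# Throwaway file standing in for the downloaded page (contents are illustrative).
page=$(mktemp)
printf '%s\n' '<html><body>hello</body></html>' > "$page"

payload='<script >alert("XSS"); </script>'
# What an unset (for example, misspelled) variable expands to.
empty_pattern=''

# An empty fixed-string pattern matches every line, so this reports "found".
if grep -q -F -- "$empty_pattern" "$page"; then empty_result=found; else empty_result="not found"; fi

# The real payload is not in the file yet, so this reports "not found".
if grep -q -F -- "$payload" "$page"; then absent_result=found; else absent_result="not found"; fi

# Append the payload; the embedded double quotes need no escaping.
printf '%s\n' "$payload" >> "$page"
if grep -q -F -- "$payload" "$page"; then present_result=found; else present_result="not found"; fi

echo "empty pattern:   $empty_result"
echo "payload absent:  $absent_result"
echo "payload present: $present_result"
rm -f "$page"
```

The -- before the pattern is there so that a payload starting with a dash is not mistaken for an option.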

Related

How to pass regular expression matching string from a file in awk?

I have a requirement where I have to split a large file into small files. Each line of the large file containing the matching string should be put into another file with the output file name same as the matching string. For one string I can get it done via awk as shown below.
awk '/apple/{print}' large_file.txt > apple.txt
I want a script which takes the regular expression matching string from another file and puts the results into a file with the same name as the matching string. How to get it done with awk command?
Let's say the string to be matched is put into a file called matching_string.txt the contents of which would look like this:
apple
orange
mango
If the large_file.txt is something like:
apple is a great fruit
we should eat apple
orange is juicy
mango is the king of fruits
litchi is a seasonal fruit
then the resulting file should be
apple.txt:
apple is a great fruit
we should eat apple
orange.txt:
orange is juicy
mango.txt:
mango is the king of fruits
I am new to the Linux environment and beginner level at scripting. Any other solution using regular expression, sed, python etc. should be also okay.
EDIT
Working Script:
I tweaked my script a little based on the answer by @Stephen Quan; it works for the tcsh shell.
#!/bin/tcsh -f
foreach word ("`cat pattern.txt`")
    if (-r ${word}.txt) then
        rm -rf ${word}.txt
    endif
    awk "/${word}/ { print }" large.txt > ${word}.txt
end

Why use awk? Grep does the job too. Usually, awk '/pattern/{print}' can be replaced by the shorter grep -e 'pattern'.
pattern=apple
grep -e "$pattern" large.txt > "$pattern.txt"
Write a script or a shell function. For instance, a simple shell function can be defined ad-hoc and then called.
filter() { grep -e "$1" large.txt > "$1.txt"; }
for pattern in apple orange mango; do filter "$pattern"; done
As a shell script (e.g. filter.sh):
#!/bin/sh
grep -e "$1" large.txt > "$1.txt"
The script file must have the executable bit set (chmod +x filter.sh), otherwise it cannot be executed.
Assuming your pattern file (e.g. pattern.txt) contains one pattern per line:
#!/bin/sh
while IFS= read -r pattern <&3; do
filter "$pattern"
# or: ./filter.sh "$pattern"
done 3< pattern.txt
All of that can be done without script or function if you simply want a one-shot task to be done (but defining and using the function is not really more complicated than calling its body directly):
while IFS= read -r pattern <&3; do
grep -e "$pattern" large.txt > "$pattern.txt"
done 3< pattern.txt
Note that a for loop over the file contents (for pattern in $(cat pattern.txt)) should not be used here, since your program will break as soon as one of your patterns contains space or tab characters.
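To see why the while read loop is preferred, here is a small sketch with a pattern that contains a space. The temporary file and its two patterns are invented for the example, and the default IFS is assumed.

```shell
#!/bin/sh
# Hypothetical pattern file: two patterns, the first one contains a space.
patterns=$(mktemp)
printf '%s\n' 'green apple' 'mango' > "$patterns"

# Unquoted command substitution is split on whitespace, so the loop
# sees "green", "apple" and "mango" as separate items.
for_count=0
for pattern in $(cat "$patterns"); do
    for_count=$((for_count + 1))
done

# read -r with an empty IFS keeps every line intact.
while_count=0
while IFS= read -r pattern; do
    while_count=$((while_count + 1))
done < "$patterns"

echo "for loop saw $for_count items, while loop saw $while_count"
rm -f "$patterns"
```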

To do this in awk:
for word in $(cat matching_string.txt)
do
    awk "/${word}/ { print }" large_file.txt > "${word}.txt"
done
Or, reading the file line by line so that patterns containing spaces survive, and removing any stale output file first:
while IFS= read -r word
do
    if [ -f "${word}.txt" ]; then rm "${word}.txt"; fi
    awk "/${word}/ { print }" large_file.txt > "${word}.txt"
done < matching_string.txt
The pattern is a regex pattern followed by a command. Note that when you get into regex-capture groups, you may find that the implementation of awk varies from one platform to another.
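The loops in this answer start awk once per pattern. As an alternative sketch (not part of the original answer), awk can read the pattern file first and then split the large file in a single pass. The sample data is taken from the question; the scratch directory is only there so the example does not overwrite real files.

```shell
#!/bin/sh
# Work in a scratch directory so the sample files do not clobber anything.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' apple orange mango > matching_string.txt
printf '%s\n' \
    'apple is a great fruit' \
    'we should eat apple' \
    'orange is juicy' \
    'mango is the king of fruits' \
    'litchi is a seasonal fruit' > large_file.txt

# While reading the first file (NR == FNR), remember each pattern.
# For every line of the second file, write it to the output file of
# each pattern it matches.
awk '
    NR == FNR { patterns[$0]; next }
    {
        for (p in patterns)
            if ($0 ~ p) print > (p ".txt")
    }
' matching_string.txt large_file.txt

cat apple.txt
```

Two caveats: a pattern that matches nothing produces no file at all, and with a very large number of patterns awk may run into a limit on simultaneously open files.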
If it is a simplistic regex, I prefer perl because in cross-platform environments (particularly osx and git-bash on Windows), perl has a more consistent implementation for regex handling. In this case, the perl solution would be:
while IFS= read -r word
do
    if [ -f "${word}.txt" ]; then rm "${word}.txt"; fi
    perl -ne "if (/${word}/) { print }" < large_file.txt > "${word}.txt"
done < matching_string.txt
I also wanted to demonstrate capture groups. In this case it is a bit over-engineered to represent your line as 3 capture groups (prefix, word, postfix), but I do this because it serves as a template for more complex regex capture group processing. Note that the Perl variables must be written as \$1, \$2 and \$3, otherwise the shell expands them as its own positional parameters inside the double quotes:
while IFS= read -r word
do
    if [ -f "${word}.txt" ]; then rm "${word}.txt"; fi
    perl -ne "if (/(.*)(${word})(.*)/) { print \"\$1\$2\$3\n\" }" < large_file.txt > "${word}.txt"
done < matching_string.txt

Use grep -e pattern:
pattern=orange
grep -e "$pattern" large.txt > "$pattern.txt"
Then use the read command to read all the patterns and generate all the files:
filename='patternfile.txt'
while IFS= read -r pattern; do
    grep -e "$pattern" large.txt > "$pattern.txt"
done < "$filename"

Grab an argument as regex pattern inside a shell script

This is a simple script to run ls with a filter:
sh myscript.sh ".pyc"
myscript.sh :
echo "---------------------------"
for i in `ls | grep '.*\.pyc'`; do
echo "$i"
done
It runs 'ls' and only shows *.pyc files. Now I want to pass that pattern as an argument:
sh myscript.sh ".pyc"
and modify the script :
echo "---------------------------"
for i in `ls | grep '.*\$1'`; do
echo "$i"
done
But this doesn't work; it returns an empty result. How do I properly insert $1 into the regex inside the shell script?

Replace everything with this: printf '%s\n' *"$1".
Or alternatively just run one of printf '%s\n' *.pyc, ls *.pyc, ls -d *.pyc, etc.
You probably want *.pyc (a shell glob/wildcard which expands to all files ending .pyc), as opposed to using grep.
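The grep in the question returns nothing because single quotes stop the shell from expanding $1, so grep searches for the literal text "$1". The sketch below shows the glob approach wrapped in a function, next to a grep variant that uses double quotes. The function name list_matching, the variable names and the sample file names are invented for the example.

```shell
#!/bin/sh
# list_matching SUFFIX: print entries in the current directory ending in SUFFIX.
list_matching() {
    for entry in *"$1"; do
        # If nothing matches, the glob stays literal, so skip names that do not exist.
        [ -e "$entry" ] && printf '%s\n' "$entry"
    done
    return 0
}

# Scratch directory with a few sample files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch a.pyc b.pyc c.py

glob_result=$(list_matching .pyc)

# grep variant: double quotes let the shell expand the variable, and \$
# becomes the end-of-line anchor. The dot is still a regex metacharacter here.
suffix=.pyc
grep_result=$(ls | grep -e "$suffix\$")

echo "$glob_result"
cd / && rm -rf "$workdir"
```

The glob version avoids parsing ls output, so it also copes with unusual file names.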

Bash: Using quoted variable for grep within quoted expression

I'm trying to create a function within a bash script that queries a log file. Within the query function, I have something that resembles the following:
if [ -n "$(cat some-log-file.log | grep \"$1\")" ]; then
echo "There is a match."
else
echo "No lines matched the search term."
fi
If I send something I know will be in the log file as $1, like "I don't", I get the output:
$ ./y.sh query "I don't"
grep: don't": No such file or directory
No lines matched the search term.
If I try to single quote the $() expression, it sends the literal string and always evaluates true. I'm guessing it has something to do with the way grep interprets backslashes, but I can't figure it out. Maybe I'm overlooking something simple, but I've been at this for hours looking on forums and plugging in all kinds of strange combinations of quotes and escape characters. Any help or advice is appreciated.

It's actually really easy, if you realize that $() is allowed to have unescaped quotes:
if [ -n "$(cat some-log-file.log | grep "$1")" ]; then
    echo "There is a match."
else
    echo "No lines matched the search term."
fi
You can actually even skip that step, though, because grep gives an appropriate exit code:
if grep -q "$1" some-log-file.log; then
    echo "There is a match."
else
    echo "No lines matched the search term."
fi
In short, this happens for the same reason that "$1" works: Shell parameter expansion and command substitution happen before word splitting and quote removal. See more about how bash parses commands in the Shell Expansions section of the bash manual.

Grepping for a sentence from inside a bash script

I have a log file from which I want to grep for some error messages using a bash script, however I am not quite getting how to pass it the sentence and then use it in the grep call.
$./grep_sentence_script.sh "Call to server failed"
grep_sentence.sh
#!/bin/sh
sentence=$1
`grep $sentence logfile.log`
Could someone please help me with it.

Put the variable inside double quotes.
#!/bin/sh
sentence=$1
grep "$sentence" logfile.log

Just this will be sufficient:
#!/bin/bash
grep -iF "$1" logfile.log
It is important to use the -F (fixed string) option to avoid regex interpretation of metacharacters such as $ and . in the sentence. The -i option additionally makes the match case-insensitive.
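A minimal sketch of the difference -F makes, using an invented two-line log and an invented search string that contains a dot:

```shell
#!/bin/sh
# Hypothetical log file with two similar lines.
log=$(mktemp)
printf '%s\n' 'upgrade to v1.2 failed' 'upgrade to v1x2 failed' > "$log"

sentence='v1.2'

# As a regex, the dot matches any character, so both lines are counted.
regex_count=$(grep -c -- "$sentence" "$log")

# With -F the dot is literal, so only the first line is counted.
fixed_count=$(grep -c -F -- "$sentence" "$log")

echo "regex: $regex_count, fixed: $fixed_count"
rm -f "$log"
```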

Bash Script sed command not working correctly with file passed through command line

Problem
I am trying to write a script to rename a large number of files according to a regex. The command works fine when typed directly in iTerm2, but the same command fails inside the script.
Also, some of my file names include Chinese and Korean characters (I don't know whether that is the problem or not).
Code
My code takes three inputs: the old regex, the new regex, and the files that need to be renamed.
Here is the code:
#!/bin/bash
# we have less than 3 arguments. Print the help text:
if [ $# -lt 3 ] ; then
cat << HELP
ren -- renames a number of files using sed regular expressions
USAGE: ren 'regexp' 'replacement' files...
EXAMPLE: rename all *.HTM files into *.html:
ren 'HTM' 'html' *.HTM
HELP
    exit 0
fi
OLD="$1"
NEW="$2"
# The shift command removes one argument from the list of
# command line arguments.
shift
shift
# "$@" now contains all the files:
for file in "$@"; do
    if [ -f "$file" ] ; then
        newfile=`echo "$file" | sed "s/${OLD}/${NEW}/g"`
        if [ -f "$newfile" ]; then
            echo "ERROR: $newfile exists already"
        else
            echo "renaming $file to $newfile ..."
            mv "$file" "$newfile"
        fi
    fi
done
I register the bash command in the .profile as:
alias ren="bash /pathtothefile/ren.sh"
Test
The original file name is "제01과.mp3" and I want it to become "第01课.mp3".
So with my script I use:
$ ren "제\([0-9]*\)과" "第\1课" *.mp3
And it seems that the sed in the script does not work.
But the following, which should be exactly equivalent, does replace the name:
$ echo "제01과.mp3" | sed s/"제\([0-9]*\)과\.mp3"/"第\1课\.mp3"/g
Any thoughts? Thx
Print the result
I made the following change in the script so that it prints what it is doing:
newfile=`echo "$file" | sed "s/${OLD}/${NEW}/g"`
echo "The ${file} is changed to ${newfile}"
And the result for my test is:
The 제01과.mp3 is changed into 제01과.mp3
ERROR: 제01과.mp3 exists already
So there is no format problem.
Update (all done under bash 4.2.45(2), Mac OS 10.9)
Testing
When I tried to execute the commands directly in bash, I mean with the for loop, something interesting happened. I first stored all the names in a files.txt file using:
$ ls | grep mp3 > files.txt
and then ran the sed command. A single command in bash interactive mode, like:
$ file="제01과.mp3"
$ echo $file | sed s/"제\([0-9]*\)과\.mp3"/"第\1课\.mp3"/g
gives
第01课.mp3
whereas the following, also in interactive mode:
files=`cat files.txt`
for file in $files
do
echo $file | sed s/"제\([0-9]*\)과\.mp3"/"第\1课\.mp3"/g
done
makes no change at all!
And after the loop:
echo $file
gives:
$ 제30과.mp3
(There are only 30 files)
Problem Part
Then I retried the first command, which had worked before:
$ echo $file | sed s/"제\([0-9]*\)과\.mp3"/"第\1课\.mp3"/g
It makes no change:
$ 제30과.mp3
So I created a new variable, newfile, and tried again:
$ newfile="제30과.mp3"
$ echo $newfile | sed s/"제\([0-9]*\)과\.mp3"/"第\1课\.mp3"/g
And it correctly gives:
$ 第30课.mp3
WOW ORZ... Why! Why! Why! I then checked whether file and newfile are the same and, of course, they are not:
if [[ $file == "$newfile" ]]; then
    echo True
else
    echo False
fi
gives:
False
My guess
I guess there is some encoding problem, but I have found no reference. Could anyone help? Thx again.
Update 2
I now understand that there is a big difference between a string literal and a file name. To be specific, if I directly set a variable like:
file="제30과.mp3"
in the script, the sed works fine. However, if the variable comes from "$@", or is set like:
file=./*mp3
then the sed fails to work. I don't know why. By the way, Mac sed has no -r option, and on Ubuntu -r does not solve the problem described above.

Some errors combined:
To write groups with unescaped parentheses you need extended regex: -r in GNU sed, -E in BSD/macOS sed and in grep. (In basic regex, groups are written \( ... \).)
Escaping correctly is a beast :)
Example
files="제2과.mp3 제30과.mp3"
for file in $files
do
echo $file | sed -r 's/제([0-9]*)과\.mp3/第\1课.mp3/g'
done
outputs
第2课.mp3
第30课.mp3
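The symptoms in the question (two names that look identical but compare as different, and a pattern that matches a typed string but not a name read from the directory) are also consistent with a Unicode normalization mismatch. macOS file systems have traditionally returned file names in decomposed form, while text typed in a terminal is usually precomposed. This is only a hypothesis; nothing in the thread confirms it. The sketch below builds both forms of the syllable 제 from raw bytes and shows that they look the same but do not compare or match as equal. The variable names are made up for the example.

```shell
#!/bin/sh
# The syllable 제 as one precomposed code point (U+C81C): UTF-8 bytes EC A0 9C.
composed=$(printf '\354\240\234')
# The same syllable as two conjoining jamo (U+110C U+1166): bytes E1 84 8C E1 85 A6.
# This is the form a file system that decomposes names would hand back.
decomposed=$(printf '\341\204\214\341\205\246')

# Both render as the same glyph, but the strings differ byte for byte.
if [ "$composed" = "$decomposed" ]; then comparison=same; else comparison=different; fi
echo "$comparison"

# A sed pattern typed in composed form matches composed text only.
changed=$(printf '%s\n' "${composed}01" | sed "s/${composed}/X/")
unchanged=$(printf '%s\n' "${decomposed}01" | sed "s/${composed}/X/")

# Dumping the bytes shows which form a given name actually uses.
printf '%s' "$decomposed" | od -An -tx1
```

Running the od line on a file name taken from the glob (for example inside the for loop) would show whether this is what is happening; if so, the names would need to be normalized before sed sees them.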

If you are not doing this as a programming project, but want to skip ahead to the part where it just works, I found these resources listed at http://www.tldp.org/LDP/GNU-Linux-Tools-Summary/html/x4055.htm:
MMV (and MCP, MLN, ...) utilities use a specialized syntax to perform bulk file operations on paths. (http://linux.maruhn.com/sec/mmv.html)
mmv before\*after.mp3 Before\#1After.mp3
Esomaniac, a Java alternative that also works on Windows, is apparently dead (home page is parked).
rename is a Perl script you can download from CPAN: https://metacpan.org/release/File-Rename
rename 's/\.JPG$/.jpg/' *.JPG
