Grab an argument as a regex pattern inside a shell script - regex

This is a simple script that runs ls with a filter:
sh myscript.sh ".pyc"
myscript.sh :
echo "---------------------------"
for i in `ls | grep '.*\.pyc'`; do
echo "$i"
done
It runs ls and shows only *.pyc files. Now I want to pass that pattern as an argument:
sh myscript.sh ".pyc"
and modify the script :
echo "---------------------------"
for i in `ls | grep '.*\$1'`; do
echo "$i"
done
But this doesn't work; it returns an empty result. How do I properly insert $1 into the regex inside the shell script?

Replace everything with this: printf '%s\n' *"$1".
Or alternatively just run one of printf '%s\n' *.pyc, ls *.pyc, ls -d *.pyc, etc.
You probably want *.pyc (a shell glob/wildcard which expands to all files ending .pyc), as opposed to using grep.
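The original attempt fails because single quotes stop the shell from expanding $1, so grep receives the literal text .*\$1. A minimal sketch of both approaches follows. The function names list_by_suffix and list_by_regex are made up for illustration, and the grep variant treats its argument as a regex, so the caller should escape the dot.

```shell
#!/bin/sh
# list_by_suffix SUFFIX (hypothetical helper): glob version, no grep needed.
list_by_suffix() {
    for f in *"$1"; do
        [ -e "$f" ] || continue      # glob matched nothing: skip the literal pattern
        printf '%s\n' "$f"
    done
}

# list_by_regex REGEX (hypothetical helper): grep version.
# Double quotes let the shell expand $1 before grep sees the pattern.
list_by_regex() {
    ls | grep -e "$1"
}
```

Usage would look like list_by_suffix .pyc or list_by_regex '\.pyc$'. The glob version is the safer of the two because it never parses the output of ls.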

Related

Pass parameter into sed command

I am trying to write a bash script in which the user can pass a regex as a parameter. So for example you run the program this way.
./clean_txt.sh s/.*\(my_choices.*gz\)/\1p
In the program, I am using that parameter this way.
ls /home/user/ | sed -n '$1' > cleaned_file.txt
echo "sed -n '$1'"
In my echo, I see the regular expression that was passed when the program was started. But my cleaned_file.txt is empty. The only way this works is if I hardcode the regular expression into the program itself, but that defeats the purpose of what I am trying to do.
Any idea on how I can pass that parameter into the sed command?
The problem is that your variable is not being expanded, because single quotes suppress expansion. You need to wrap it in double quotes (which is what you're doing in the echo already):
ls /home/user/ | sed -n "$1" > cleaned_file.txt
Note that ls is not needed:
files=( /home/user/* )
sed -n "$1" <<<"${files[@]}" > cleaned_file.txt
Would do the same thing. This uses a glob to create an array containing all the filenames, which is used as input to sed.
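One caveat with the here-string: the array is joined with spaces into a single line, and each entry keeps its /home/user/ prefix, so sed sees different input than it gets from ls. A sketch that keeps the glob but feeds sed one base name per line, in plain POSIX sh. The helper names list_names and filter_names are hypothetical.

```shell
#!/bin/sh
# list_names DIR (hypothetical helper): print the base name of each entry
# in DIR, one per line, without calling ls.
list_names() {
    for path in "$1"/*; do
        [ -e "$path" ] || continue   # empty directory: skip the unexpanded pattern
        printf '%s\n' "${path##*/}"  # strip everything up to the last slash
    done
}

# filter_names DIR SED_SCRIPT (hypothetical helper): run the caller's sed
# script over the names. The double quotes around $2 are what make it expand.
filter_names() {
    list_names "$1" | sed -n "$2"
}
```

When calling it, the sed script should be single-quoted on the command line so the shell does not strip its backslashes, for example filter_names /home/user 's/.*\(my_choices.*gz\)/\1/p'.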

Log Extract: SED Command

I am trying to extract logs from my application between specific timestamps, so I wrote the following script:
a= echo $1 | sed 's/\//\\\//g';
b= echo $2 | sed 's/\//\\\//g';
sed -n "/$a/,/$b/p" SystemOut.log;
Here a and b are the timestamps which I pass as parameters. When I run the script, sed does not expand the variables.
But if I run the following command in a terminal, it works fine:
sed -n '/6\/30\/14 9:03/,/6\/30\/14 9:04/p' SystemOut.log
Can anyone help?
I am running the script as follows:
sh extract.sh '6/30/14 9:01' '6/30/14 9:03'
Try this way:
a=$(echo $1 | sed 's/\//\\\//g');
b=$(echo $2 | sed 's/\//\\\//g');
sed -n "/$a/,/$b/p" SystemOut.log;
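To store the output of a command in a variable, use $(). Writing a= echo ... with a space after the = runs echo with an empty a in its environment and never assigns the output.
Use double quotes "" so that the variables are expanded, like: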
sed -n "/\"$a\"/,/\"$b\"/p" SystemOut.log;
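The escaping step can also be avoided, since sed accepts another delimiter for an address when it is introduced with a backslash (\#regex# instead of /regex/). A sketch under the assumption that the timestamps never contain a # character; extract_range is a hypothetical name.

```shell
#!/bin/sh
# extract_range START END FILE (hypothetical helper)
# Prints the lines from the first match of START through the next match of END.
# \#...# is a sed address that uses # as its delimiter, so the slashes in the
# timestamps need no escaping. START and END are still treated as regexes.
extract_range() {
    sed -n "\\#$1#,\\#$2#p" "$3"
}
```

It would be called as extract_range '6/30/14 9:01' '6/30/14 9:03' SystemOut.log.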

Need to use grep to find a string within a file

I'm using a shell script to fetch a file with wget and search it for a pattern. My shell script is as follows:
#Execute commands one by one
while read line
do
STARTTIME=$(($(date +%s%N)/1000000))
line2=$(wget -q --post-data "$line" -O PHPFiles/test.php http://localhost:1234/XSS/XSS2/test.php)
ENDTIME=$(($(date +%s%N)/1000000))
GAP=$(($ENDTIME-$STARTTIME))
DIFF=$(($DIFF+($ENDTIME-$STARTTIME)))
echo "Time Taken "$GAP
finalSearchLine1="${line/&name2=/ }"
finalSearchLine2="${finalSearchLine1/name=/}"
echo "$finalSearchLine2"
if grep -q -F "$finalSeachLine2" -a PHPFiles/test.php;
then
echo found
success=$((success+1))
else
echo not found
failure=$((failure+1))
fi
rm PHPFiles/test.php
done < $1
echo "***************"
echo "Success "$success
echo "Failure "$failure
echo "Total Time "$DIFF
echo "Average Time "$((DIFF/(success+failure)))
However, I'm having trouble with the grep command. Sometimes the data in $finalSearchLine2 contains quotes, such as:
<script >alert("XSS"); </script>
This seems to cause trouble with the grep command. In the if statement, I always seem to get found as the result, even when there is no matching pattern in the $finalSearchLine2 variable. I don't know if it's possible to use escape sequences within the variable for grep. Can anyone suggest a possible solution?
Grep needs double quotes to be escaped like this \"
So as a first solution you could try:
temp_variable=$(sed 's/"/\\"/g' <<< "$temp")
if grep -q -F "$temp_variable" -a PHPFiles/test.php;
So you first escape the double quotes with sed and you store the result in temp_variable. Then you use temp_variable in grep.
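Note that the if line in the question tests $finalSeachLine2 (missing an r), not $finalSearchLine2. That variable is never set, so grep receives an empty pattern, and an empty pattern matches every line, which would explain why the result is always found. With -F and a double-quoted variable, quotes inside the string should not need any escaping. A minimal sketch, where contains_literal is a hypothetical helper name:

```shell
#!/bin/sh
# contains_literal FILE STRING (hypothetical helper)
# Succeeds if FILE contains STRING as a fixed string.
#   -F  treat the pattern as literal text, not a regex
#   -a  treat binary files as text
#   -e  keeps a pattern that starts with "-" from being read as an option
contains_literal() {
    grep -q -a -F -e "$2" -- "$1"
}
```

In the loop from the question this would read: if contains_literal PHPFiles/test.php "$finalSearchLine2"; then ...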

extract a base directory from the output of ps

I am looking to extract a basedir from the output of ps -ef | grep classpath myprog.jar
root 20925 20886 1 17:41 pts/0 00:01:07 /opt/myprog/java/jre/bin -classpath myprog.jar
java is always a sub-dir under the basedir but the install path can vary from server to server e.g.
/usr/local/myprog/java/jre/bin
/opt/test/testing/myprog/java/jre/bin
So once I have my string, how do I extract everything from the beginning of the path up to, but not including, java?
That is, /usr/local/myprog or /opt/test/testing/myprog/
Using sed:
$ echo "root 20925 20886 1 17:41 pts/0 00:01:07 /opt/myprog/java/jre/bin -classpath myprog.jar" | sed 's/.*\ \(.*\)\/java.*/\1/'
/opt/myprog
Using grep -P:
ps -ef | grep -oP '\S+(?=/java)'
/opt/myprog
If your grep doesn't support -P then use:
s='root 20925 20886 1 17:41 pts/0 00:01:07 /opt/myprog/java/jre/bin -classpath myprog.jar'
[[ "$s" =~ (/[^[:blank:]]+)/java ]] && echo "${BASH_REMATCH[1]}"
/opt/myprog
echo "root 20925 20886 1 17:41 pts/0 00:01:07 /opt/myprog/java/jre/bin -classpath myprog.jar" | awk '{split($8,a,"/java"); print a[1]}'
Use pgrep to find all of the Java processes instead of using ps -ef | grep .... This way, you don't have to worry about your grep command showing up as one of your items.
Instead of running ps -ef, you can use the -o option to only pull up the desired fields, and most ps commands take --no-header to eliminate the header fields. This way, your script doesn't have to worry about header lines.
Finally, I am using Shell Parameter Expansion which is sometimes way easier than using sed to change a variable:
$ ps -o pid,args --no-headers $(pgrep -f "java .* myprog.jar") | while read pid command arguments
do
directory=${command%/java*}
echo "The directory for Process ID $pid is $directory"
done
By the way, you could be running multiple commands, so I loop through the ps command.
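The parameter expansion used in that loop can be tried on its own. ${var%pattern} removes the shortest suffix of var that matches pattern, so %/java* cuts the path at /java. A small sketch, with basedir_of as a hypothetical helper name:

```shell
#!/bin/sh
# basedir_of PATH (hypothetical helper)
# Strips the shortest trailing match of "/java*" from PATH, leaving the
# directory that contains the java sub-directory.
basedir_of() {
    printf '%s\n' "${1%/java*}"
}

basedir_of /usr/local/myprog/java/jre/bin
basedir_of /opt/test/testing/myprog/java/jre/bin
```

If the path contains no /java, the expansion leaves it unchanged.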
ps axo args | awk '/classpath myprog.jar/{print substr($0, 0,index($0, "java")-1)}'
For example:
$ echo '/opt/myprog/java/jre/bin -classpath myprog.jar' \
| awk '/classpath myprog.jar/{print substr($0, 0,index($0, "java")-1)}'
/opt/myprog/
You can (and probably should) switch both of the $0's to $1's if you know for sure that your path will not contain spaces. Or add additional fields to the ps -o list using commas (as in, o pid,args) and use $2 rather than $1.
You can match the following regex:
'((\/\w+)+)\/java'
and the first captured group \1 or $1 will contain the wanted string
Demo: http://regex101.com/r/zU2vV4

Bash Script sed command not working correctly with file passed through command line

Problem
I am trying to write a script to rename a large number of files according to a regex. The command works when I type it in iTerm2, but the same command fails inside the script.
Also, some of my file names include Chinese and Korean characters (I don't know whether that is the problem or not).
code
My code takes three inputs: the old regex, the new regex, and the files that need to be renamed.
Here is the code:
#!/bin/bash
# we have less than 3 arguments. Print the help text:
if [ $# -lt 3 ] ; then
cat << HELP
ren -- renames a number of files using sed regular expressions
USAGE: ren 'regexp' 'replacement' files...
EXAMPLE: rename all *.HTM files into *.html:
ren 'HTM' 'html' *.HTM
HELP
exit 0
fi
OLD="$1"
NEW="$2"
# The shift command removes one argument from the list of
# command line arguments.
shift
shift
# $@ contains now all the files:
for file in "$@"; do
if [ -f "$file" ] ; then
newfile=`echo "$file" | sed "s/${OLD}/${NEW}/g"`
if [ -f "$newfile" ]; then
echo "ERROR: $newfile exists already"
else
echo "renaming $file to $newfile ..."
mv "$file" "$newfile"
fi
fi
done
I register the bash command in the .profile as:
alias ren="bash /pathtothefile/ren.sh"
Test
The original file name is "제01과.mp3" and I want it to become "第01课.mp3".
So with my script I use:
$ ren "제\([0-9]*\)과" "第\1课" *.mp3
It seems that the sed in the script does not work.
But the following, which is exactly the same, does replace the name:
$ echo "제01과.mp3" | sed s/"제\([0-9]*\)과\.mp3"/"第\1课\.mp3"/g
Any thoughts? Thx
Print the result
I made the following change in the script so that it prints what it is doing:
newfile=`echo "$file" | sed "s/${OLD}/${NEW}/g"`
echo "The ${file} is changed to ${newfile}"
And the result for my test is:
The 제01과.mp3 is changed into 제01과.mp3
ERROR: 제01과.mp3 exists already
So there is no format problem.
Update (all done under bash 4.2.45(2), Mac OS 10.9)
Testing
I then tried to run the command from bash directly, that is, with the for loop, and found something interesting. I first stored all the names in a files.txt file using:
$ ls | grep mp3 > files.txt
and then ran sed on them. A single command in interactive bash, like:
$ file="제01과.mp3"
$ echo $file | sed s/"제\([0-9]*\)과\.mp3"/"第\1课\.mp3"/g
gives
第01课.mp3
while the following, also in interactive mode:
files=`cat files.txt`
for file in $files
do
echo $file | sed s/"제\([0-9]*\)과\.mp3"/"第\1课\.mp3"/g
done
gives no changes!
At this point:
echo $file
gives:
$ 제30과.mp3
(There are only 30 files)
Problem Part
And I tried the first command which worked before:
$ echo $file | sed s/"제\([0-9]*\)과\.mp3"/"第\1课\.mp3"/g
It makes no change:
$ 제30과.mp3
So I created a new variable, newfile, and tried again:
$ newfile="제30과.mp3"
$ echo $newfile | sed s/"제\([0-9]*\)과\.mp3"/"第\1课\.mp3"/g
And it correctly gives:
$第30课.mp3
WOW ORZ... Why! Why! Why! I then checked whether file and newfile are the same and, of course, they are not:
if [[ $file == $new ]]; then
echo True
else
echo False
fi
gives:
False
My guess
I guess there is some encoding problem, but I have found no reference for it. Could anyone help? Thx again.
Update 2
I now seem to understand that there is a huge difference between a string and a file name. To be specific, if I directly use a variable like:
file="제30과.mp3"
in the script, sed works fine. However, if the variable was passed in via $@, or set like:
file=./*mp3
then sed fails to work. I don't know why. By the way, Mac sed has no -r option, and on Ubuntu -r does not solve the problem I describe above.
Some errors combined:
Groups are written \(...\) in basic regex; with extended regex (-r in GNU sed, -E in grep and BSD sed) they are written (...) without backslashes.
Escaping correctly is a beast :)
Example
files="제2과.mp3 제30과.mp3"
for file in $files
do
echo $file | sed -r 's/제([0-9]*)과\.mp3/第\1课.mp3/g'
done
outputs
第2课.mp3
第30课.mp3
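One way to test the encoding guess from the question is to compare the raw bytes of the typed string with those of the name that comes back from the glob. The sketch below assumes, without confirmation from the question, that the file system returns the Hangul in decomposed form (separate jamo) while the typed pattern is precomposed, in which case the two would look identical on screen but differ byte for byte. The helper names hex_of and same_bytes are hypothetical.

```shell
#!/bin/sh
# hex_of STRING (hypothetical helper): print the bytes of STRING as one
# run of lowercase hex digits.
hex_of() {
    printf '%s' "$1" | od -An -v -tx1 | tr -d ' \n'
}

# same_bytes A B (hypothetical helper): succeed only if A and B are
# byte-for-byte identical.
same_bytes() {
    [ "$(hex_of "$1")" = "$(hex_of "$2")" ]
}
```

For example, hex_of "제01과.mp3" could be compared with the output of for f in *.mp3; do hex_of "$f"; echo; done. If the two differ, the sed pattern would have to be written in the same form as the file names, or the names normalized first.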
If you are not doing this as a programming project, but want to skip ahead to the part where it just works, I found these resources listed at http://www.tldp.org/LDP/GNU-Linux-Tools-Summary/html/x4055.htm:
MMV (and MCP, MLN, ...) utilities use a specialized syntax to perform bulk file operations on paths. (http://linux.maruhn.com/sec/mmv.html)
mmv before\*after.mp3 Before\#1After.mp3
Esomaniac, a Java alternative that also works on Windows, is apparently dead (home page is parked).
rename is a perl script you can download from CPAN: https://metacpan.org/release/File-Rename
rename 's/\.JPG$/.jpg/' *.JPG