grep parts of string with match and not match? - regex

I have a log file which contains string errorCode:null or errorCode:404 etc. I want to find all occurrences where errorCode is not null.
I use:
grep -i "errorcode" -A 10 -B 10 app.2020-.* | grep -v -i "errorCode:null"`,
grep -v -i 'errorcode:[0-9]^4' -A 10 -B 10 app.2020-.*
but this is not the right regex match. How could this be done?

If you have gnu-grep then use a negative lookahead in regex as this with -P option:
grep -Pi 'errorcode(?!:null)' -A 10 -B 10 app.2020-.*
If you don't have gnu grep then try this awk:
awk '/errorcode/ && !/errorcode:null/' app.2020-.*
it would require more bit of code in awk to match equivalent of -A 10 -B 10 options of grep.

I have no idea why you are using -A 10 -B 10, I've just used the following command and everything is working fine:
grep -i "ErrorCode" test.txt | grep -v -i "Errorcode:null"
This the file content:
Errorcode:null
Errorcode:405
Errorcode:504
Nonsens
This is the output of the command:
Errorcode:405
Errorcode:504

Related

Sed command to search by regex in file

I need to get a number of version from file. My version file looks like this:
#define MINOR_VERSION_NUMBER 1
I try to use sed command:
VERSION_MINOR=`sed -i -e 'MINOR_VERSION_NUMBER\s+\([0-9]+\).*/\1/p' $WORKSPACE/project/common/version.h`
but I get error:
sed: -e expression #1, char 2: extra characters after command
The "address" that selects matching lines needs to be enclosed in /.../ (or \X...X for any X).
sed -ne '/MINOR_VERSION_NUMBER/{ s/.*\([0-9]\).*/\1/;p }'
Don't use -i, it changes the file in place and doesn't output anything.
The more common way would be to use awk to find the line and extract the wanted column:
awk '(/MINOR_VERSION_NUMBER/){print$3}'
using grep
grep MINOR_VERSION_NUMBER | grep -o '[0-9]*$'
Demo :
$echo "#define MINOR_VERSION_NUMBER 1" | grep -o '[0-9]*$'
1
$echo "#define MINOR_VERSION_NUMBER 1123" | grep -o '[0-9]*$'
1123
$
Here is a correction of your attempt. Change your line:
VERSION_MINOR=`sed -i -e 'MINOR_VERSION_NUMBER\s+\([0-9]+\).*/\1/p' $WORKSPACE/project/common/version.h`
into:
VERSION_MINOR=`sed -n -e '/^#define\s\+MINOR_VERSION_NUMBER\s\+\([0-9]\+\).*/ s//\1/p' $WORKSPACE/project/common/version.h`
This can be made more readable with GNU sed's -r option:
VERSION_MINOR=`sed -n -r -e '/^#define\s+MINOR_VERSION_NUMBER\s+([0-9]+).*/ s//\1/p' $WORKSPACE/project/common/version.h`
As stated by choroba, awk would be more suited than sed for this kind of processing (see his answer).
However, here is another solution using bash's read builtin, together with GNU grep:
read x x VERSION_MINOR x < <(grep -F -w -m1 MINOR_VERSION_NUMBER $WORKSPACE/project/common/version.h)
VERSION_MINOR=$(echo "#define MINOR_VERSION_NUMBER 1" | tr -s ' ' | cut -d' ' -f3)

grep with extended regex over multiple lines

I'm trying to get a pattern over multiple lines. I would like to ensure the line I'm looking for ends in \r\n and that there is specific text that comes after it at some point. The two problems I've had are I often get unmatched parenthesis in groupings or I get a positive match when there is none. Here are two simple examples.
echo -e -n "ab\r\ncd" | grep -U -c -z -E $'(\r\n)+.*TEST'
grep: Unmatched ( or \(
What exactly is unmatched there? I don't get it.
echo -e -n "ab\r\ncd" | grep -U -c -z -E $'\r\n.*TEST'
1
There is no TEST in the string, so why does this return a count of 1 for matches?
I'm using grep (GNU grep) 2.16 on Ubuntu 14. Thanks
Instead of -E you can use -P for PCRE support in gnu grep to use advanced regex like this:
echo -ne "ab\r\ncd" | ggrep -UczP '\r\n.*TEST'
0
echo -ne "ab\r\ncd" | ggrep -UczP '\r\n.*cd'
1
grep -E matches only in single line input.

extract a base directory from the output of ps

I am looking to extract a basedir from the output of ps -ef | grep classpath myprog.jar
root 20925 20886 1 17:41 pts/0 00:01:07 /opt/myprog/java/jre/bin -classpath myprog.jar
java is always a sub-dir under the basedir but the install path can vary from server to server e.g.
/usr/local/myprog/java/jre/bin
/opt/test/testing/myprog/java/jre/bin
So once i have my string how do I extract everything from before java until the beginning of the path?
That is, /usr/local/myprog or /opt/test/testing/myprog/
Using sed:
$ echo "root 20925 20886 1 17:41 pts/0 00:01:07 /opt/myprog/java/jre/bin -classpath myprog.jar" | sed 's/.*\ \(.*\)\/java.*/\1/'
/opt/myprog
Using grep -P:
ps -ef | grep -oP '\S+(?=/java)'
/opt/myprog
If your grep doesn't support -P then use:
s='root 20925 20886 1 17:41 pts/0 00:01:07 /opt/myprog/java/jre/bin -classpath myprog.jar'
[[ "$s" =~ (/[^[:blank:]]+)/java ]] && echo "${BASH_REMATCH[1]}"
/opt/myprog
echo "root 20925 20886 1 17:41 pts/0 00:01:07 /opt/myprog/java/jre/bin -classpath myprog.jar" | awk '{split($8,a,"/java"); print a[1]}'
Use pgrep to find all of the Java processes instead of using ps -ef | grep .... This way, you don't have to worry about your grep command showing up as one of your items.
Instead of running ps -ef, you can use the -o option to only pull up the desired fields, and most ps commands take --no-header to eliminate the header fields. This way, your script doesn't have to worry about header lines.
Finally, I am using Shell Parameter Expansion which is sometimes way easier than using sed to change a variable:
$ ps -o pid,args --no-headers $(pgrep -f "java .* myproj.jar") | while read pid command arguments
do
directory=${command%/java*}
echo "The directory for Process ID $pid is $directory"
done
By the way, you could be running multiple commands, so I loop through the ps command.
ps axo args | awk '/classpath myprog.jar/{print substr($0, 0,index($0, "java")-1)}'
For example:
$ echo '/opt/myprog/java/jre/bin -classpath myprog.jar' \
| awk '/classpath myprog.jar/{print substr($0, 0,index($0, "java")-1)}'
/opt/myprog/
You can (and probably should) switch both of the $0's to $1's if you know for sure that your path will not contain spaces. Or add additional fields to the ps -o list using commas (as in, o pid,args) and use $2 rather than $1.
You can match the following regex:
'((\/\w+)+)\/java'
and the first captured group \1 or $1 will contain the wanted string
Demo: http://regex101.com/r/zU2vV4

Regular expression doesn't work in grep

Why grep doesn't match "COL1,COL2,COL3," with this regexp as expected but "COL1,COL2,COL3,COL4,COL5,COL6,"? It matches correctly in text editor but not using grep, am I missing any special escaping or..? (using OS X Lion)
The text:
COL1,COL2,COL3,COL4,COL5,COL6,COL7,COL8,COL9
The command:
grep -E --color=auto '^([^,]*,){3}' file.csv
Grep version:
grep (GNU grep) 2.5.1
Your command:
grep -E --color=auto '^([^,]*,){3}' file.csv
will only color the string COL1,COL2,COL3, differently but if you want that string in output then use -o option like this:
grep -E -o '^([^,]*,){3}'

help with grep [[:alpha:]]* -o

file.txt contains:
##w##
##wew##
using mac 10.6, bash shell, the command:
cat file.txt | grep [[:alpha:]]* -o
outputs nothing. I'm trying to extract the text inside the hash signs. What am i doing wrong?
(Note that it is better practice in this instance to pass the filename as an argument to grep instead of piping the output of cat to grep: grep PATTERN file instead of cat file | grep PATTERN.)
What shell are you using to execute this command? I suspect that your problem is that the shell is interpreting the asterisk as a wildcard and trying to glob files.
Try quoting your pattern, e.g. grep '[[:alpha:]]*' -o file.txt.
I've noticed that this works fine with the version of grep that's on my Linux machine, but the grep on my Mac requires the command grep -E '[[:alpha:]]+' -o file.txt.
sed 's/#//g' file.txt
/SCRIPTS [31]> cat file.txt
##w##
##wew##
/SCRIPTS [32]> sed 's/#//g' file.txt
w
wew
if you have bash >3.1
while read -r line
do
case "$line" in
*"#"* )
if [[ $line =~ "^#+(.*)##+$" ]];then
echo ${BASH_REMATCH[1]}
fi
esac
done <"file"