Not operator in regex [duplicate]

This question already has answers here:
Negative matching using grep (match lines that do not contain foo)
(3 answers)
Closed 12 months ago.
I have a file and I wish to grep out all the lines that do not start with a timestamp. I tried using the following regex but it did not work:
cat myFile | grep '^(?!\[0-9\]$).*$'
Any other suggestions or something that I might be doing wrong here?

Why not simply use grep -v option like this to negate:
grep -v "<pattern>" file
Let's say you want to grep all the lines in a shell script that are not commented ( do not have # at start ) then you can use:
grep -v "^\s*#" file.sh
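Applied to the original question, the same idea might look like the sketch below. The timestamp layout (YYYY-MM-DD HH:MM:SS) and the sample log lines are assumptions for illustration only; adjust the pattern to whatever your file actually contains.

```shell
# Hypothetical sample log: two lines start with a timestamp, two do not.
logfile=$(mktemp)
printf '%s\n' \
  '2015-08-01 21:00:00 service started' \
  '    at com.example.Main.run(Main.java:42)' \
  '2015-08-01 21:00:05 service stopped' \
  'Caused by: something else' > "$logfile"

# -v inverts the match, so only lines that do NOT start with the
# assumed timestamp pattern are printed. Plain BRE syntax, no -P needed.
non_ts=$(grep -v '^[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\} [0-9]\{2\}:[0-9]\{2\}:[0-9]\{2\}' "$logfile")
printf '%s\n' "$non_ts"
rm -f "$logfile"
```

This prints only the two lines without a leading timestamp.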

Try this:
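cat myFile | grep -P '^\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d'
This assumes your timestamp is of the pattern dddd-dd-dd dd:dd:dd, but you can change it to match your timestamp if it's something else. The -P option (GNU grep) is needed because \d is a Perl-style escape that plain grep does not understand; without -P, use [0-9] instead.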
Note: Unless you're using some kind of cmd chaining, grep pattern file is a simpler syntax
BTW: Your use of a double-negative makes me unsure if you want the timestamp lines or you want the non-timestamp lines.

You don't need a not operator; just use grep the way it is most easily used, to find a pattern:
grep '^[0-9]' myFile

Related

grep regular expression [duplicate]

This question already has answers here:
Match empty lines in a file with 'grep'
(3 answers)
Closed 5 years ago.
Hi, I'm stuck on a script that does some text filtering.
The script counts occurrences by doing:
cat file | sort | grep -v "^$" | uniq -c | sort -nr | head -20
It is not obvious to me how grep -v "^$" works.
As I understand it, -v inverts the sense of matching, but inverting a pattern that consists only of beginning-of-line and end-of-line is not obvious to me.
I tried a few examples, but it is still not clear to me how it works (i.e. it filters spaces but not carriage returns).
It just gets rid of empty lines. "^$" matches lines that have nothing between the start and the end of the line, and -v inverts the match, so only the non-empty lines are passed on.
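A small experiment can make the behaviour concrete. The input below is made up for illustration: five lines, of which one is truly empty, one holds a single space, and one holds only a carriage return.

```shell
# Input lines: "foo", an empty line, a single space, a lone CR, "bar".
input='foo\n\n \n\r\nbar\n'

# '^$' matches only the truly empty line, so -v keeps the other four:
# the space-only and CR-only lines still contain one character each.
kept=$(printf "$input" | grep -v '^$')
kept_count=$(printf '%s\n' "$kept" | wc -l | tr -d ' ')
echo "lines kept by '^\$': $kept_count"

# To also drop whitespace-only lines (including a stray CR),
# allow any run of whitespace between the two anchors.
blank_free=$(printf "$input" | grep -v '^[[:space:]]*$')
printf '%s\n' "$blank_free"
```

The second pattern is a common alternative when the file may contain whitespace-only lines or Windows line endings.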

Exclude pattern in a Grep using extended regex [duplicate]

This question already has answers here:
How to invert a grep expression
(5 answers)
Regular expression to match a line that doesn't contain a word
(34 answers)
Closed 5 years ago.
I have a grep that is killing me.
Let's suppose I have a file (file.xml) with the two entries below:
pos_ADF_datasource-1450-jdbc.xml
datasource-1450-jdbc.xml
Now if I run the grep below:
grep -E '(ADF)' file.txt
I get the following output:
pos_ADF_datasource-1450-jdbc.xml
Now I want to exclude ADF to get the other entry. It should be easy, but I have tried everything and I'm unable to make it work:
grep -E '(?<!ADF)' file.txt
I tried many variations, but I'm sure there is something I'm not considering that keeps my expression from working...
I need and want to use -E; I know it works without extended regex!
Please, guys, enlighten me!
RESOLVED:
Thanks Wiktor for the below consideration:
POSIX ERE does not support lookarounds. Even if you use -P, a lookbehind excluding 'ADF' will just match any position that is not preceded by ADF.
You cannot check with an ERE regex whether a string does not contain a pattern, only whether it is not equal to, or does not start/end with, a pattern. You can only do it with a PCRE regex: grep -P '^(?!.*ADF)' file.txt
Then I figured it out with grep -Pe (single quotes keep an interactive bash from treating ! as history expansion):
grep -Pe '^((?!.*ADF).)*-jdbc\.xml$' file.xml
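If -E really is a requirement, one way to stay within ERE is to let -v do the negation instead of the regex. The sketch below assumes the two file names from the question, written to a temporary file for illustration.

```shell
# Sample entries from the question, stored in a throwaway file.
list=$(mktemp)
printf '%s\n' 'pos_ADF_datasource-1450-jdbc.xml' 'datasource-1450-jdbc.xml' > "$list"

# First grep keeps entries ending in -jdbc.xml (positive ERE match; the
# "--" stops grep from reading the leading "-" of the pattern as an option).
# Second grep drops any of those that contain ADF (-v inverts the match).
no_adf=$(grep -E -- '-jdbc\.xml$' "$list" | grep -Ev 'ADF')
printf '%s\n' "$no_adf"
rm -f "$list"
```

Splitting the positive and negative conditions across two greps avoids lookarounds entirely, at the cost of a second process.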

Regex Pattern matching and extraction using grep [duplicate]

This question already has answers here:
How to use sed/grep to extract text between two words?
(14 answers)
Closed 2 years ago.
I have a somewhat unusual need: I want to pattern match a line for a string and extract a value using grep. Below is the input; I want to extract only the date from the string.
Input Host-GOOGLE-production.2015-08-01-21.migrant.deploy:{R:[{A:"0b87654nuy",RC:"JAVA".....[and the line continues]
For the above input, I wanted to write a regex that matches the date and string that comes after {A:" and before ",RC:. I know I can do this through sed and awk but I wanted to perform this task only through grep.
As a first step, to extract only the date, I tried the command below, but it didn't work.
Does anyone know how to extract both of these values? Please share your thoughts. It would be nice if I get an answer/suggestion that extracts both values 2015-08-01 & 0b87654nuy in one single command using grep
$ grep -o --perl-regexp "(Host-GOOGLE-production.([0-9]+?-[0-9]+?-[0-9]+)?-.*)"
Desired output for the above command: 2015-08-01
I wanted to write a regex that matches the date and string that comes after {A:" and before ",RC:
You can use this grep:
grep -oP '(?<=A:").*?(?=",RC:)' file
0b87654nuy
It would be nice if I get an answer/suggestion that extracts both values 2015-08-01 & 0b87654nuy in one single command using grep
Use \K and alternation operator to get both outputs.
grep -oP '\bHost-GOOGLE-production\.\K[0-9]+-[0-9]+-[0-9]+(?=-)|A:"\K.[^"]*(?=",RC:)'
Example:
$ echo 'Host-GOOGLE-production.2015-08-01-21.migrant.deploy:{R:[{A:"0b87654nuy",RC:"JAVA".....[and the line continues]' | grep -oP '\bHost-GOOGLE-production\.\K[0-9]+-[0-9]+-[0-9]+(?=-)|A:"\K.[^"]*(?=",RC:)'
2015-08-01
0b87654nuy

Extract word after a known pattern in UNIX [duplicate]

This question already has answers here:
get the next word after grep matching [duplicate]
(3 answers)
Closed 7 years ago.
I have a file called in.txt which contains a whole bunch of code, however I need to extract a user ID which is guaranteed to be of the form 'EID:nmb685', potentially with content before and/or after the guaranteed format. I want to extract the 'nmb685' using a bash script. I've tried some combinations of grep and sed but nothing has worked.
If your grep doesn't support -P but supports -o, you can combine grep and awk.
grep -o 'EID:\w\+' file|awk -F':' '{print $2}'
It can be done by awk alone, but this is more straightforward.
If your grep supports -P, perl-regexp parameter, you may use this.
grep -oP 'EID:\K\w+' file
What is being output after the ID? Is there anything consistent that you can match against?
If you know the length of the userid you can use:
grep "EID:......" in.txt > out.txt
or, if you don't, maybe something like this (matches any run of letters/digits followed by a space and preceded by EID:)
grep "EID:[A-Za-z0-9]* " in.txt > out.txt
Not very elegant, but this works:
grep "EID:" in.txt | sed 's/\(.*EID:......\).*/\1/' | sed 's/^.*EID://'
Select all lines with the substring "EID:"
Remove everything after "EID:" plus 6 characters
Remove everything before (and including) "EID:"
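The three steps above can also be folded into a single sed call that does not depend on the ID length. This is only a sketch: the sample line is hypothetical, and it assumes the ID consists solely of letters and digits.

```shell
# Hypothetical line with content before and after the ID.
line='some code; EID:nmb685 more code'

# -n suppresses default output; the trailing p prints only lines where the
# substitution succeeded. The group captures the letters/digits after EID:.
eid=$(printf '%s\n' "$line" | sed -n 's/.*EID:\([[:alnum:]]*\).*/\1/p')
printf '%s\n' "$eid"
```

Because the leading .* is greedy, a line holding several IDs yields only the last one; the grep -o variants list each match on its own line instead.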

regex: find strings that do not begin with a certain prefix [duplicate]

This question already has an answer here:
Regular expression for a string that does not start with a sequence
(1 answer)
Closed 9 years ago.
I want to find a word in strings, but only if it isn't preceded by a certain prefix.
For example, I'd like to find all occurrences of APP_PERFORM_TASK, but only if they are not preceded by the prefix CMD_DO("
So:
CMD_DO("APP_PERFORM_TASK") <-- OK (I don't need to know about this)
BLAH("APP_PERFORM_TASK") <-- NOT OK, this should match my search.
I tried:
(?!CMD_DO\(")APP_PERFORM_TASK
But that doesn't produce the results I need. What am I doing wrong?
Here's a quick way:
Use the --invert-match (also known as -v) flag to ignore CMD_DO and pipe the results to a second grep that only matches BLAH:
grep -v CMD_DO dummy | grep BLAH
Try replacing NegativeLookAhead (?!) with NegativeLookBehind (?<!) in your regex
(?<!CMD_DO\(")APP_PERFORM_TASK
Based on your comment: Let's concentrate on command line tool grep
Here is grep solution without using -P switch (perl like regex):
grep 'APP_PERFORM_TASK' file | grep -v '^CMD_DO("'
Here is grep solution using -P switch and negative lookbehind:
grep -P '(?<!^CMD_DO\(")APP_PERFORM_TASK' file
Try this
(?!CMD_DO\(").*APP_PERFORM_TASK.*
To handle an input line with both the desirable and undesirable forms like:
CMD_DO("APP_PERFORM_TASK") BLAH("APP_PERFORM_TASK")
you'd need something like this in awk (using GNU awk for gensub()):
awk -v s="APP_PERFORM_TASK" 'gensub("CMD_DO\\(\"" s, "", "g") ~ s' file
i.e. get rid of all of the unwanted occurrences of the string, then test what's left.
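Where gensub() is unavailable (it is a GNU awk extension), the same idea can be sketched with the standard gsub() applied to a copy of the line. The sample lines are the two from the question plus the mixed line above; the temporary file exists only for illustration.

```shell
# Three test lines: prefixed only, unprefixed only, and both on one line.
sample=$(mktemp)
printf '%s\n' \
  'CMD_DO("APP_PERFORM_TASK")' \
  'BLAH("APP_PERFORM_TASK")' \
  'CMD_DO("APP_PERFORM_TASK") BLAH("APP_PERFORM_TASK")' > "$sample"

# Work on a copy (t) so the printed line stays intact: delete every
# prefixed occurrence, then print the line if the bare string remains.
matches=$(awk -v s='APP_PERFORM_TASK' '
  { t = $0; gsub("CMD_DO\\(\"" s, "", t) }
  t ~ s
' "$sample")
printf '%s\n' "$matches"
rm -f "$sample"
```

Only the second and third lines are printed, since the first contains nothing but the prefixed form.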
An awk version
awk '/APP_PERFORM_TASK/ && !/^CMD_DO/' file