The source text is as follows:
Time: 8/26/2015 12:12:12 AM
I want to extract both the date and the time values, so I used this pattern:
Time: (.+)
But sometimes the source looks like this:
Time: 8/26/2015 13:13:13 Fired by event
So I tried to require that the text end with AM or PM, but the following pattern does not give the result I expect:
Time: (.+) AM|PM$
Would you please help me to find the correct pattern?
Note: I don't want times like 8/26/2015 13:13:13. I only want the times that end with AM or PM.
You could use something like this:
Time: (\d+\/\d+\/\d+ \d+:\d+:\d+)
Update: For times ending with AM/PM only you'll have to add (AM|PM) group at the end:
Time: (\d+\/\d+\/\d+ \d+:\d+:\d+) (AM|PM)
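For what it's worth, the original attempt `Time: (.+) AM|PM$` fails because `|` has the lowest precedence in a regex, so it is read as "`Time: (.+) AM`" or "`PM$`". Grouping the alternatives as `(AM|PM)` fixes that. Here is a minimal Python sketch of the pattern above; the function name `extract_time` is just an illustration, not something from the question:

```python
import re

# Group 1 captures the date and clock time, group 2 the AM/PM marker.
TIME_RE = re.compile(r"Time: (\d+/\d+/\d+ \d+:\d+:\d+) (AM|PM)")

def extract_time(line):
    """Return 'date time AM/PM', or None if the line has no AM/PM time."""
    m = TIME_RE.search(line)
    return f"{m.group(1)} {m.group(2)}" if m else None

print(extract_time("Time: 8/26/2015 12:12:12 AM"))              # 8/26/2015 12:12:12 AM
print(extract_time("Time: 8/26/2015 13:13:13 Fired by event"))  # None
```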
Try this instead
Time:\s+(.+)\s+(AM|PM)$
You could use:
Time:\s(.*?)\s(.*?)\s.*
Group 1 is date
Group 2 is time
However, it may be easier to just split the string on whitespace and take the second and third items.
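A rough Python sketch of the split approach, assuming the line always starts with the `Time:` label (the helper name `date_and_time` is made up for this example). Note that it does not check for AM/PM; that would need a test on the fourth item:

```python
def date_and_time(line):
    """Split on whitespace and return (date, time), or None if the shape is wrong."""
    parts = line.split()
    # parts[0] is the "Time:" label, parts[1] the date, parts[2] the clock time.
    if len(parts) < 3 or parts[0] != "Time:":
        return None
    return parts[1], parts[2]

print(date_and_time("Time: 8/26/2015 12:12:12 AM"))  # ('8/26/2015', '12:12:12')
```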
Let's say that we have this text:
2020-09-29
2020-09-30
2020-10-01
2020-10-02
2020-10-12
2020-10-16
2020-11-12
2020-11-23
2020-11-15
2020-12-01
2020-12-11
2020-12-30
I want to do something like this:
\d\d\d\d-(NOT10)-(30)
So I want to get all dates of any year except those in the 10th month, and it is important that the day is 30.
I tried a lot to do this using negative lookahead assertions, but I did not come up with a working regex.
You can use negative lookaheads:
\d\d\d\d-(?!10)\d\d-30
The part (?!10) ensures that no 10 follows at the point where it is inserted into the regex. Notice that you still need to match the following digits afterwards, hence the \d\d part.
Generally speaking you can not (to my knowledge) negate a part that then also matches parts of the string. But with negative lookaheads you can simulate this as I did above. The generalized idea looks something like:
(?!<special-exclusion-pattern>)<general-inclusion-pattern>
Where the special-exclusion-pattern matches a subset of the general-inclusion-pattern. In the above case the general inclusion pattern is \d\d and the special exclusion pattern is 10.
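A quick way to try the lookahead, sketched in Python on the dates from the question. The extra entry 2020-10-30 is not in the question; it is added here only to show that the exclusion works:

```python
import re

dates = """2020-09-29
2020-09-30
2020-10-01
2020-10-02
2020-10-12
2020-10-16
2020-11-12
2020-11-23
2020-11-15
2020-12-01
2020-12-11
2020-12-30
2020-10-30""".split()  # last entry added for illustration only

# (?!10) rejects month 10 without consuming anything; \d\d then matches the month.
pattern = re.compile(r"\d\d\d\d-(?!10)\d\d-30")

matches = [d for d in dates if pattern.fullmatch(d)]
print(matches)  # ['2020-09-30', '2020-12-30']
```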
Try :
/20\d{2}-(?:0[1-9]|1[12])-30/
Explanation :
20\d{2} it will match 20XX
(?:0[1-9]|1[12]) it will match 0X or 11, 12
30 it will match 30
Demo: https://regex101.com/r/O2F1eV/1
It's easiest to simply convert the substring (if present) that matches /^\d{4}-10-30$/ to an empty string, then split the resulting string on one or more newlines.
If your string were
2020-10-16
2020-10-30
2020-11-12
2020-11-23
and was held by the variable str, then in Ruby, for example,
str.sub(/^\d{4}-10-30$/,'')
#=> "2020-10-16\n\n2020-11-12\n2020-11-23\n"
so
str.sub(/^\d{4}-10-30$/,'').split
#=> ["2020-10-16", "2020-11-12", "2020-11-23"]
Whatever language you are using undoubtedly has similar methods.
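For instance, a rough Python equivalent of the Ruby lines above might look like this. It assumes the same four-line string; `count=1` mirrors Ruby's `sub`, which replaces only the first match:

```python
import re

text = "2020-10-16\n2020-10-30\n2020-11-12\n2020-11-23\n"

# In Ruby, ^ and $ match at line boundaries by default; Python needs re.MULTILINE for that.
cleaned = re.sub(r"^\d{4}-10-30$", "", text, count=1, flags=re.MULTILINE)

# split() with no arguments splits on any run of whitespace and drops empty strings.
remaining = cleaned.split()
print(remaining)  # ['2020-10-16', '2020-11-12', '2020-11-23']
```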
I am reading logs from a log file that is recorded as a multiline type. While reading it, QRadar joins two records and treats them as one log.
I defined the start and end patterns of a log line when adding the log source to QRadar:
Start Pattern RegEx: ^(\d{7})\,
End Pattern RegEx: (\d{2}:\d{2}:\d{2})$
I expect to read the logs like this:
1158896,someuser,Inner User,Minor,10.6.130.11,2019-09-29 03:01:15,Security Management,Log in to the server,Network Management,Succeeded,User name: someuser,2019-09-29 03:01:15
1158897,someuser,Inner User,Minor,10.6.130.11,2019-09-29 03:03:16,Security Management,Log out the server,Network Management,Succeeded,"User name: someuserOnline duration: 0 day(s) 0 hour(s) 2 minute(s) 1 second(s)",2019-09-29 03:03:16
But I receive some of them assembled, like:
1158896,someuser,Inner User,Minor,10.6.130.11,2019-09-29 03:01:15,Security Management,Log in to the server,Network Management,Succeeded,User name: someuser,2019-09-29 03:01:151158897,someuser,Inner User,Minor,10.6.130.11,2019-09-29 03:03:16,Security Management,Log out the server,Network Management,Succeeded,"User name: someuserOnline duration: 0 day(s) 0 hour(s) 2 minute(s) 1 second(s)",2019-09-29 03:03:16
Here are the regex101.com links for my start and end patterns:
https://regex101.com/r/2IfMR7/3
https://regex101.com/r/2IfMR7/4
As you can see, the patterns work normally on regex101.com.
Why is QRadar reading the records as one?
You (or QRadar) might be using a greedy quantifier coupled with a multiline capture character.
If you're doing something like this: ^(\d{7})\,(?:\n|.)*(\d{2}:\d{2}:\d{2})$ where the central group is (?:\n|.)* or any similar phrase matching across multiple lines, the greedy operator * means it'll try to match from the very first 7 digits to the very last timestamp on the entire log page, ignoring your start and end anchors. Try using *? instead; the ? makes it non-greedy, so it'll stop at the first timestamp.
Compare: greedy vs non-greedy.
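To see the difference outside QRadar, here is a small Python sketch. The two records are shortened, hypothetical versions of the ones in the question (one timestamp each), joined without a newline; whether QRadar builds its pattern this way internally is only a guess:

```python
import re

# Two shortened records glued together, as in the assembled example.
blob = ("1158896,someuser,Succeeded,2019-09-29 03:01:15"
        "1158897,someuser,Succeeded,2019-09-29 03:03:16")

# Greedy: .* runs to the end and backtracks to the LAST timestamp.
greedy = re.findall(r"\d{7},.*\d{2}:\d{2}:\d{2}", blob)

# Non-greedy: .*? stops at the FIRST timestamp after each 7-digit id.
lazy = re.findall(r"\d{7},.*?\d{2}:\d{2}:\d{2}", blob)

print(len(greedy))  # 1
print(len(lazy))    # 2
```

Keep in mind that the real records contain the timestamp twice, so a non-greedy match would stop at the first of the two.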
I am programming a function that will extract the different times from a schedule using regular expressions in Python. Here is an example of the schedule that I got from a website using BeautifulSoup:
Interactive talk with discussion17:00-18:00 Documentary ‘Occupy Gezi’
We present to you Taksim Gezi park boycott with all ways; day and
night, with good sides and bad sides18.00 - 19:00 Poet Maria van
Daalen ‘Haitian Vodoo’, poet from Querido publishers19:00
Food20:30-22:30
As shown above, the input text has starting times with and without ending times. There is also inconsistency with using either “:” or “.” when separating the hours from the minutes.
Using regex101, I have made the following (very ugly) regular expression that seems to work on all different times: \d\d[:|.]\d\d(\s*.\s*\d\d[:|.]\d\d)?
To search the text in Python I use the following code:
import re

def extract_times(string):
    list_of_times = re.findall(r'\d\d[:|.]\d\d(\s*.\s*\d\d[:|.]\d\d)?', string)
    return list_of_times
However, when I put the example text from above in this function, it returns this:
['-18:00', ' - 19:00', '', '-22:30']
I expected something like ['17:00-18:00'], ['19:00'].
What have I done wrong?
Use this one: \d{1,2}[:.]([\d\s-]+[:.])?\d{2}
Explanation
\d{1,2} one or two digits to match 1:00 and 01:00
[:.] to match 18:00 and 18.00
[\d\s-]+ one or more digits, whitespace characters or dashes (optional)
[:.]\d{2} to match 18:00 and 18.00 (optional)
\d{2} 2 digits
In your sample text, the following will match (use full match) :
Match 1 17:00-18:00
Match 2 18.00 - 19:00
Match 3 19:00
Match 4 20:30-22:30
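As for why the original code returned only fragments: when a pattern contains a single capturing group, Python's `re.findall` returns the contents of that group instead of the full match. Making the group non-capturing with `(?:...)` gives the full matches. A sketch using an abridged copy of the schedule text:

```python
import re

# Abridged version of the schedule from the question.
schedule = (
    "Interactive talk with discussion17:00-18:00 Documentary\n"
    "night, with good sides and bad sides18.00 - 19:00 Poet Maria van\n"
    "Daalen, poet from Querido publishers19:00\n"
    "Food20:30-22:30"
)

# Same pattern as above, but with a non-capturing group so that
# findall returns whole matches rather than the group's text.
TIME_RANGE = re.compile(r"\d{1,2}[:.](?:[\d\s-]+[:.])?\d{2}")

def extract_times(text):
    return TIME_RANGE.findall(text)

print(extract_times(schedule))
# ['17:00-18:00', '18.00 - 19:00', '19:00', '20:30-22:30']
```

Alternatively, `re.finditer` with `m.group(0)` returns the full match regardless of how many groups the pattern has.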
I am using PowerShell with a regex to extract the time "01:42:35" from the line below. However, I want to ignore the time "02:42:35", and I am unsure how to do it.
2013-07-04 02:42:35 Alert 172.172.19.9 Jul 4 01:42:35 ...
Currently I am using this time regex: $time_regex = "(\d+):(\d+):(\d+)"
How can I adapt this to the above specification?
Note: the time I am trying to get is not at the end of the line. The second time always has a date next to it in the format "Jul 4 ", whereas the first time has a date next to it in the format "2013-07-04".
Thanks
$time_regex = "(?<=\w+ \d+ )(\d+):(\d+):(\d+)"
will only match a time string that's preceded by an alphanumeric "word" and a number.
If it is always at the end of the line, use:
$t = "2013-07-04 02:42:35 Alert 172.172.19.9 Jul 4 01:42:35"
[regex]::match( $t, "(\d+:){2}(\d+)$" ) | select -expa value
Edit after comment:
try this:
$time_regex = "(?<= \d+ )(\d+:){2}\d+"
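If you ever need the same thing outside .NET, be aware that the variable-length lookbehind used above is not available everywhere. Python's `re` module, for example, only accepts fixed-width lookbehind, so this sketch swaps in a different technique: it matches the "Jul 4 " prefix and captures the time in a group. The three-letter month shape is an assumption based on the sample line:

```python
import re

line = "2013-07-04 02:42:35 Alert 172.172.19.9 Jul 4 01:42:35 ..."

# Month abbreviation, day of month, then the time captured in group 1.
# Assumes the date before the wanted time always looks like "Jul 4".
SYSLOG_TIME = re.compile(r"[A-Z][a-z]{2} +\d{1,2} +(\d{1,2}:\d{2}:\d{2})")

m = SYSLOG_TIME.search(line)
second_time = m.group(1) if m else None
print(second_time)  # 01:42:35
```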
I keep getting into situations where I end up making two regular expressions to find subtle changes (such as one script for 0-9 and another for 10-99 because of the extra number)
I usually use [0-9] to find strings with just one digit and then [0-9][0-9] to find strings with multiple digits. Is there a better wildcard for this?
For example, what expression would I use to simultaneously find the strings
6:45 AM and 10:52 PM
You can specify repetition with curly braces. [0-9]{2,5} matches two to five digits. So you could use [0-9]{1,2} to match one or two.
[0-9]{1,2}:[0-9]{2} (AM|PM)
I personally prefer to use \d for digits, thus
\d{1,2}:\d{2} (AM|PM)
[0-9] 1 or 2 times followed by : followed by 2 [0-9]:
[0-9]{1,2}:[0-9]{2}\s(AM|PM)
or to be valid time:
(?:[1-9]|1[0-2]):[0-9]{2}\s(?:AM|PM)
If you are looking for a time pattern, you'd do something like:
\d{1,2}:\d{1,2} (AM|PM)
Or, for a more specific time regex:
[0-1]{0,1}[0-9]{1,2}:[0-5][0-9] (AM|PM)
Much like the other answers, except the AM/PM is not captured, which should be more efficient
\d{1,2}:\d{1,2}\s(?:AM|PM)
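To illustrate the `{1,2}` quantifier from the answers above, here is a short Python sketch. The sample sentence is made up; only the two times come from the question:

```python
import re

text = "Breakfast at 6:45 AM, lights out at 10:52 PM."

# \d{1,2} matches one or two digits, so a single pattern covers 6:45 and 10:52.
# (?:AM|PM) groups the alternatives without capturing them.
TIME_12H = re.compile(r"\d{1,2}:\d{2}\s(?:AM|PM)")

found = TIME_12H.findall(text)
print(found)  # ['6:45 AM', '10:52 PM']
```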
If I have a file containing:
1 ABC
2 123XYZ
3 6:45 AM
4 123DHD
5 ABC
6 10:52 PM
7 CDE
and run the following
$>grep -P '6:45\sAM|10:52\sPM' temp
6:45 AM
10:52 PM
$>
This should do the trick (-P tells grep to treat the pattern as a Perl-compatible regular expression).
EDIT:
Perhaps I misunderstood. The other answers are very good if you just want to find a time, but you seem to be after specific times; the others would match ANY time in HH:MM format.
Overall, I believe the items you are after are the | pipe character, which allows alternative phrases, and the {n,m} quantifier, which matches n to m times; {1,2} would match 1 to 2 times, and so on.
This checks times written as H:MM or HH:MM followed by AM or PM:
e.g. 12:05PM, 3:19AM and 04:25PM pass, while 23:52PM is rejected
my $time = "12:52AM";
if ($time =~ /^[01]?[0-9]\:[0-5][0-9](AM|PM)/) {
print "Right Time Dude...";
}
else { print "Wrong time Dude"; }
This is the regex you want.
/^[01]?[0-9]\:[0-5][0-9](AM|PM)/
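The same check, sketched in Python for comparison (the function name `looks_like_12h_time` is only illustrative). Note that the pattern has no `$` at the end, so trailing text after AM/PM is accepted, and hours such as 19 also pass:

```python
import re

# Same pattern as the Perl one above; the colon needs no escaping.
VALID_12H = re.compile(r"^[01]?[0-9]:[0-5][0-9](AM|PM)")

def looks_like_12h_time(s):
    """True if s starts with H:MM or HH:MM followed by AM or PM."""
    return VALID_12H.match(s) is not None

for sample in ["12:05PM", "3:19AM", "04:25PM", "23:52PM"]:
    print(sample, looks_like_12h_time(sample))
# 12:05PM True
# 3:19AM True
# 04:25PM True
# 23:52PM False
```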
Having this string as input:
Sat, 6 May 2017 02:08:08 +0000
I did this regEx to get combinations of one or two digits:
[0-9]*:[0-9]*:[0-9]*