regex for extracting date from string

I have the following date string - "2013-02-20T17:24:33Z"
I want to write a regex to extract just the date part "2013-02-20". How do I do that? Any help will be appreciated.
Thanks,
Murtaza

You could use a capture group for this.
/(\d{4}-\d{2}-\d{1,2}).*/
Using $1, you can get your desired part.
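The question did not name a language, so purely as an illustration, here is a minimal sketch assuming Python, where the equivalent of $1 is group(1) on the match object. The function name extract_date is made up for this example.

```python
import re

# Pattern from the answer above; group 1 holds the date part.
DATE_RE = re.compile(r"(\d{4}-\d{2}-\d{1,2}).*")


def extract_date(text):
    """Return the captured date, or None when nothing matches."""
    match = DATE_RE.search(text)
    return match.group(1) if match else None


print(extract_date("2013-02-20T17:24:33Z"))  # 2013-02-20
```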

Well, the straightforward approach would be \d\d\d\d-\d\d-\d\d, but you can also use quantifiers to make it look nicer: \d{4}-\d{2}-\d{2}.

Just search for the first T and use substring. I assume you always get a well-formatted date string.
If the date string is not guaranteed to be valid, you can use any date-related library to parse and validate the input (validation includes the calendar logic, which a regex cannot handle), and then reformat the output.
No sample code, since you didn't mention the language.
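For illustration only, and assuming Python (the question did not say which language), the two suggestions might look roughly like this. The function names are hypothetical, and the format string assumes the input always has the exact shape shown in the question.

```python
from datetime import datetime


def date_part_by_slicing(stamp):
    """Take everything before the first 'T'; assumes well-formed input."""
    return stamp[:stamp.index("T")]


def date_part_validated(stamp):
    """Parse with the standard library, which also checks calendar rules."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.date().isoformat()


print(date_part_by_slicing("2013-02-20T17:24:33Z"))
print(date_part_validated("2013-02-20T17:24:33Z"))
```

The second function raises ValueError for an impossible date such as February 30, which is the calendar check a plain regex does not provide.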

Using substring:
string date = "2013-02-20T17:24:33Z";
string h = date.Substring(0, 10);

Related

Regex Expressions for value extraction from URL param string

I have a string where I'd like to extract some values with regex.
Whole string:
event=B:Rel&time=1511879856&date=20171128-143736&ref=57b3e1ab741d5a017ab8009033350b18&dir=out&src_if=GW1&dst_if=PRI1
I would like to isolate the values between the = and the next & creating this result set for the string given above.
B:Rel
1511879856
20171128-143736
57b3e1ab741d5a017ab8009033350b18
out
GW1
PRI1
Thanks for the help!
Try this:
[^=]+=([^&]+)(?:&|$)
DEMO: https://regex101.com/r/Z3pcR4/1
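Depending on your language, you could split the input on this regex (the first element of the result will be an empty string, and the key part must not match & or it will swallow the preceding value):
&?[^=&]*=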
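A minimal sketch of both approaches, assuming Python since the question did not name a language. The variable names are made up for this example.

```python
import re

QUERY = ("event=B:Rel&time=1511879856&date=20171128-143736"
         "&ref=57b3e1ab741d5a017ab8009033350b18&dir=out&src_if=GW1&dst_if=PRI1")

# Approach 1: capture each value with the matching pattern from the answer.
values_by_match = re.findall(r"[^=]+=([^&]+)(?:&|$)", QUERY)

# Approach 2: split on the "key=" separators. The key class [^=&] must
# exclude '&'; otherwise a value and the next key would be consumed
# together as one separator. The split yields an empty string before the
# first key, which is filtered out here.
values_by_split = [v for v in re.split(r"&?[^=&]*=", QUERY) if v]

print(values_by_match)
print(values_by_split)
```

Both lists should contain the seven values listed in the question, in order.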

Regex for Date Format mm/dd/yyyy hh:mm:ss.SSS AM/PM

I am looking for a regex to match my date format in
mm/dd/yyyy hh:mm:ss.SSS AM/PM
I found the following online
^(((0[13578]|1[02])/.-/.-\s(0[0-9]|1[0-2]):(0[0-9]|[1-59]\d):(0[0-9]|[1-59]\d)\s(AM|am|PM|pm))|((0[13456789]|1[012])/.-/.-\s(0[0-9]|1[0-2]):(0[0-9]|[1-59]\d):(0[0-9]|[1-59]\d)\s(AM|am|PM|pm))|((02)/.-/.-\s(0[0-9]|1[0-2]):(0[0-9]|[1-59]\d):(0[0-9]|[1-59]\d)\s(AM|am|PM|pm))|((02)/.-/.-\s(0[0-9]|1[0-2]):(0[0-9]|[1-59]\d):(0[0-9]|[1-59]\d)\s(AM|am|PM|pm)))$
This does not match my date
06/12/2014 12:45:56.12 AM
How do I tweak the above to also accept milliseconds?
A simple and fast solution:
\d\d\/\d\d\/\d\d\d\d \d\d\:\d\d\:\d\d\.\d\d\d (AM|PM|am|pm)
If you want to use it with PCRE, you need to add a delimiter such as # at the start and end of the pattern:
#\d\d\/\d\d\/\d\d\d\d \d\d\:\d\d\:\d\d\.\d\d\d (AM|PM|am|pm)#
For more accuracy:
([0]\d|1[012])\/([012]\d|3[01])\/\d\d\d\d ([01]\d|2[0123])\:[012345]\d\:[012345]\d\.\d\d\d (AM|PM|am|pm)
Note that the first one accepts invalid values like 99/99/9999 99:99:99.999 AM. I prefer to go with the second one, because micro-optimization is not worth it 99.99% of the time :)
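To compare the two patterns, here is a small sketch assuming Python. The patterns are the ones above with the redundant escapes on / and : dropped; the names LOOSE, STRICT and is_valid are made up for this example.

```python
import re

# Simple pattern: checks only the shape, not the value ranges.
LOOSE = re.compile(r"\d\d/\d\d/\d\d\d\d \d\d:\d\d:\d\d\.\d\d\d (AM|PM|am|pm)")

# Stricter pattern: bounds month, day, hour, minute and second.
STRICT = re.compile(
    r"(0\d|1[012])/([012]\d|3[01])/\d\d\d\d "
    r"([01]\d|2[0123]):[0-5]\d:[0-5]\d\.\d\d\d (AM|PM|am|pm)"
)


def is_valid(pattern, text):
    """True when the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None


for sample in ("06/12/2014 12:45:56.123 AM", "99/99/9999 99:99:99.999 AM"):
    print(sample, is_valid(LOOSE, sample), is_valid(STRICT, sample))
```

Both patterns require exactly three fractional digits, so a value with two digits such as 06/12/2014 12:45:56.12 AM is rejected by either one.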

Extract all tokens from string using regex in Scala

I have a string like "httpx://__URL__/__STUFF__?param=value"
This sample is a url by convention...it could be anything with zero or more __X__ tokens in it.
I want to use a regex to extract a list of all the tokens, so output here would be List("__URL__","__STUFF__"). Remember, I don't know beforehand how many (if any) tokens may be in the input string.
I've been struggling but unable to come up with a regex expression that will do the trick.
Something like this did not work:
(?:.?(__[a-zA-Z0-9]+__).?)+
Scala Regex, which is just a wrapper around Java Regex, will never return multiple subgroups for repetitions.
The only way around it is to have a regex for a single token, and then find it multiple times. You pretty much already have everything you need:
"__[a-zA-Z0-9]+__".r findAllIn "httpx://__URL__/__STUFF__?param=value"
That returns an Iterator. Use .toSeq or similar to convert into a collection.
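The same "match one token, find it repeatedly" idea is not specific to Scala. As a rough sketch in Python (a different language from the question, used here only for illustration; extract_tokens is a made-up name):

```python
import re

# One token: two underscores, alphanumerics, two underscores.
TOKEN_RE = re.compile(r"__[a-zA-Z0-9]+__")


def extract_tokens(text):
    """Return every token in order of appearance; empty list if none."""
    return TOKEN_RE.findall(text)


print(extract_tokens("httpx://__URL__/__STUFF__?param=value"))
```

An input with no tokens simply gives an empty list, which covers the "zero or more" requirement.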
Greg, have you tried a simple
_+[^_]+_+
This will match all the __TOKENS__
It doesn't do any check for any __TOKENLIKE__ string after the ?params, but you have mentioned you are not only using that for urls. If you need some refinement, please let us know.
Combine a regex with split:
def urlPathComponents(s: String): Option[Array[String]] =
"""(?<=http(s?)://)[^?]+""".r findFirstIn s map (_.split("/"))

regex for repeating values

I am trying to find the correct regex (for use with Java and JavaScript) to validate an array of day-of-week and 24-hour time formats. I figured out the time format but am struggling to come up with the full solution.
The regex needs to validate patterns which include one or more of the following, separated by a comma.
{two-character day} HH:MM-HH:MM
Three examples of valid strings would be:
M 5:30-7:00
M 5:30-7:00, T 5:30-7:00, W 18:00-19:30
F 12:00-14:30, Sa 6:45-8:15, Su 6:45-8:15
This should validate a 24-hour time:
/^((M|T|W|Th|F|Sa|Su) ([01]?[0-9]|2[0-3]):[0-5][0-9]-([01]?[0-9]|2[0-3]):[0-5][0-9](, )?)+$/
Credit for the time bit goes to mkyong: http://www.mkyong.com/regular-expressions/how-to-validate-time-in-24-hours-format-with-regular-expression/
You can try this:
[A-Za-z]{1,2}[ ]\d+:\d+-\d+:\d+
You could try this: ([MTWFS][ouehra]?) ([0-9]|[1-2][0-9]):([0-6][0-9])-([0-9]|[1-2][0-9]):([0-6][0-9])
I'd go with this:
((M|T(u|h)|W|F|S(a|u)) ((1?\d)|(2[0-3])):[0-5]\d-((1?\d)|(2[0-3])):[0-5]\d(, )?)+
This should do the trick:
^(M|Tu|W|Th|F|Sa|Su) \d{1,2}:\d{2}-\d{1,2}:\d{2}(, (M|Tu|W|Th|F|Sa|Su) \d{1,2}:\d{2}-\d{1,2}:\d{2})*$
Note that you show T in your example above which is ambiguous. You might want to enforce Tu and Th as shown in my regex.
This will capture all sets in an array. The T in the short day-of-week list is debatable (Tuesday or Thursday?).
^((?:[MTWFS]|Tu|Th|Sa|Su)\s(?:[0-9]{1,2}:[0-9]{2})-(?:[0-9]{1,2}:[0-9]{2})(?:,\s)?)+$
The (?:) are non-capturing groups, so your actual matches will be (for example):
M 5:30-7:00
T 5:30-7:00
W 18:00-19:30
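But the entire line will validate. Note that in Java and JavaScript a repeated capturing group keeps only its last repetition, so listing every entry needs a separate global match on a single-entry pattern.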
Added ^ and $ for line boundaries and an explicit time-time match because some regular expression parsers may not work with the previous way that I had it.
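The question targets Java and JavaScript, but the idea can be sketched in Python: validate the whole string with one pattern, then collect the entries with a second pass. Everything here (the names DAY, TIME, ENTRY, and the choice to accept a bare T as well as Tu and Th) is an assumption made for this example, not something taken from the answers.

```python
import re

# Assumed day tokens: M, T or Tu, W, Th, F, Sa, Su.
DAY = r"(?:M|Tu?|W|Th|F|Sa|Su)"
# 24-hour time, hour with or without a leading zero.
TIME = r"(?:[01]?\d|2[0-3]):[0-5]\d"
ENTRY = DAY + " " + TIME + "-" + TIME
# One entry, then zero or more ", entry" repetitions (no trailing comma).
SCHEDULE_RE = re.compile(ENTRY + "(?:, " + ENTRY + ")*")
ENTRY_RE = re.compile(ENTRY)


def parse_schedule(text):
    """Return the list of entries, or None if the string is invalid."""
    if SCHEDULE_RE.fullmatch(text) is None:
        return None
    return ENTRY_RE.findall(text)


print(parse_schedule("M 5:30-7:00, T 5:30-7:00, W 18:00-19:30"))
print(parse_schedule("M 25:00-7:00"))
```

Splitting validation from extraction sidesteps the repeated-group limitation, since Python's re module also keeps only the last repetition of a repeated capturing group.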

MATLAB 2012 regular expression

I have a set of strings that I'd like to parse in MATLAB 2012 that all have the following format:
string-int-int-int-int-string
I'd like to pluck out the third integer (the rest are 'don't cares'), but I haven't used MATLAB in ages and need to refresh on regular expressions. I tried using the regular expression '(.*)-(.*)-(.*)-\d-(.*)' but no dice. I did check out the MATLAB regexp page, but wasn't able to figure out how to apply that information to this case.
Anyone know how I might get the desired result? If so, could you explain what the expression you're using is doing to get that result so that others might be able to apply the answer to their unique situation?
Thanks in advance!
str = 'XyzStr-1-2-1000-56789-ILoveStackExchange.txt';
[tok] = regexp(str, '^.+?-.+?-.+?-(\d+?)-.+?-.+?', 'tokens');
tok{:}
ans =
'1000'
Update
Explanation, upon request.
^ - "Anchor", or match beginning of string.
.+? - Wildcard match, one or more, non-greedy.
- - Literal dash/hyphen.
(\d+?) - Digits match, one or more, non-greedy, captured into a token.
^.*?-.*?-.*?-(\d+)-.*?-.*?$
OR
^(?:[^-]*?-){3}(\d+)(?:.*?)$
Group 1 now contains the required data.
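The question is about MATLAB, but the second pattern carries over to other regex engines. As a hedged sketch in Python (third_integer is a made-up name, and the sample string is the one from the first answer):

```python
import re

# Skip three "field-" prefixes, then capture the digits that follow.
THIRD_INT_RE = re.compile(r"^(?:[^-]*-){3}(\d+)")


def third_integer(name):
    """Return the third integer field as a string, or None if absent."""
    match = THIRD_INT_RE.match(name)
    return match.group(1) if match else None


sample = "XyzStr-1-2-1000-56789-ILoveStackExchange.txt"
print(third_integer(sample))   # 1000
print(sample.split("-")[3])    # same field, without a regex
```

When the fields themselves never contain a hyphen, as in the stated format, splitting on "-" and taking index 3 gives the same result without a regex.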