PHP regular expression not working when it should be

I am trying to extract dates from a text variable.
I have created a regex which extracts 'MOST' formats of date as follows:
$regexp = '#[0-9]{2,4}[-\/ ]{1}([A-Za-z]{3}|[0-9]{2})[-\/ ]{1}[0-9]{2,4}#';
preg_match_all($regexp, $output, $dates);
It does not, however, extract dates of the format '08 Aug 2012', and I do not know why. As far as I can tell, it should.
For now I have inserted a separate regex which works:
$regexp = '#[0-9]{2}[ ]{1}[A-Za-z]{3}[ ]{1}[0-9]{4}#';
preg_match_all($regexp, $output, $dates);
which is essentially the same.
It seems pointless, however, to have multiple regexes when one should be enough.
If anyone could explain why the first regex isn't working for such a format, it would be greatly appreciated.
Thanks

Well, your regexp is correct for the date format you presented. And as such it also works without problems: http://ideone.com/XxdKV
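To double-check the claim outside PHP, here is a minimal Python sketch of the first pattern from the question. The sample text is made up for illustration (the original $output is not shown), and the group is written as non-capturing so that findall returns whole matches.

```python
import re

# Python port of the first pattern from the question. The alternation is
# non-capturing here so that findall() returns the full matched dates.
DATE_RE = re.compile(r"[0-9]{2,4}[-/ ](?:[A-Za-z]{3}|[0-9]{2})[-/ ][0-9]{2,4}")

# Hypothetical sample text standing in for $output.
sample = "Created 08 Aug 2012, revised 2012-08-09 and 10/08/2012."

dates = DATE_RE.findall(sample)
print(dates)
```

If the pattern still misses a date in real input, one thing worth checking is whether the separator in the text is really a plain space (for example, it could be a non-breaking space or a tab). That is only a guess, since the original input is not shown.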


What is the best way to replace a portion of string with PowerShell -replace

I was given some very long CSV files and have been tasked with making the information in them readable. I want to turn timestamps like 2022-08-07, 08:00:00 into a format like 08/07/2022 8:00:00. I was looking at wildcards and was going to start by replacing the dashes with slashes, so I tried -replace "(....{-}..{-}..","\1/\2/\3/", but the string did not change. I found a similar post that mentioned a regex that helped the poster achieve something similar. If anyone can help me out or point me to some resources to learn more about this, that would be great.
You could do what you are describing this way:
$OldDate = '2022-08-07, 08:00:00'
if ($OldDate -match '(?<Year>\d{4})-(?<Month>\d\d)-(?<Day>\d\d), (?<Hour>\d\d):(?<Minute>\d\d):(?<Seconds>\d\d)') {
    Write-Host "$($Matches.Month)/$($Matches.Day)/$($Matches.Year) $($Matches.Hour):$($Matches.Minute):$($Matches.Seconds)"
}
But I think life would be a lot simpler to convert the string to a DateTime and then back to a string:
$OldDate = '2022-08-07, 08:00:00'
# HH and H are the 24-hour specifiers; hh and h are 12-hour and would mishandle afternoon times
[DateTime]::ParseExact($OldDate, "yyyy-MM-dd, HH:mm:ss", $null).ToString('MM/dd/yyyy H:mm:ss')
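The same parse-then-format idea, sketched in Python for comparison (this is not PowerShell, only an illustration of the approach). The hour is taken from the parsed value directly because strftime has no portable code for an unpadded hour.

```python
from datetime import datetime

old_date = "2022-08-07, 08:00:00"

# Parse using a 24-hour clock (%H), matching the input layout exactly.
parsed = datetime.strptime(old_date, "%Y-%m-%d, %H:%M:%S")

# Build MM/DD/YYYY followed by the hour without zero padding.
new_date = f"{parsed:%m/%d/%Y} {parsed.hour}:{parsed:%M:%S}"
print(new_date)  # 08/07/2022 8:00:00
```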

regex to determine YYYYMMDD pattern in filename

Imagine I have the following file names:
ZRD0004170011600001020190521.dat
ZRD0004170011600001020190521.pdf
ZRD0004170011600001020190521_TC.pdf
FLX0004170007100001020180630.dat
RES0004170007100001020180331.dat
RES0004170007100001020180930.dat
RES0004170007100001020181231.dat
RES0004170012200001020180930.dat
RES0004170012200001020181231.dat
ZNP0004170120190226.dat
ZNP0004170120190226.pdf
ZRD0004170012600001020190520.dat
ZRD0004170012600001020190520.pdf
ZRD0004170012600001020190520_TC.pdf
I want to detect the date pattern YYYYMMDD which is appearing in these files, which can appear immediately before "." or before "_TC".
Can someone help me here?
Thanks in advance!
I think you can use this regex:
[0-9]{8}(\.|_TC)
Is that what you want?
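A small Python sketch of the same idea, using a lookahead so that only the eight digits are captured and the "." or "_TC" is left out of the match. The helper name extract_date is hypothetical, and the pattern only checks for eight digits, not for a valid calendar date.

```python
import re

# Eight digits that are immediately followed by "." or "_TC".
# The lookahead keeps the suffix out of the captured text.
DATE_IN_NAME = re.compile(r"(\d{8})(?=\.|_TC)")

def extract_date(filename):
    """Return the YYYYMMDD part of a file name, or None if absent."""
    match = DATE_IN_NAME.search(filename)
    return match.group(1) if match else None

for name in ("ZRD0004170011600001020190521_TC.pdf",
             "ZNP0004170120190226.dat"):
    print(name, "->", extract_date(name))
```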

How to exclude delimiters inside text qualifiers using Regex?

I am trying to exclude delimiters within text qualifiers. For this, I am trying to use regex. However, I am new to regex and am not able to fully accomplish what I need. I would be very grateful if someone could help me out.
In Alteryx, I load a delimited flat text file as 'non-delimited' and say that it does not have text qualifiers. Thus, the input will look something like this:
"aabb"|ccdd|eeff|gghh
"aa|bb"|ccdd|eeff|gghh
"aa|bb"|ccdd|"ee|ff"|gghh
"aa|bb"|"cc|dd"|"ee|ff"|"gg|hh"
"aabb"|"ccdd"|"eeff"|"gghh"
"aabb"|"ccdd"|"eeff"|"gg|hh"
aabb|ccdd|eeff|gghh
"aa|bb"|ccdd|eeff|"gg|hh"
aabb|cc|dd|eeff|gghh
aabb|"cc||dd"|eeff|gghh
aabb|"c|c|dd"|eeff|gghh
"aa||bb"|ccdd|eeff|gghh
"a|a|b|b"|ccdd|eeff|gghh
"aabb"|ccdd|eeff|"g|g|hh"
"aabb"|ccdd|eeff|"gg||hh"
I want to exclude all delimiters that are in between text qualifiers.
I have tried to use Regex to replace the delimiters within text qualifiers with nothing.
So far, I have tried the following Regex code for my target:
(")(.*?[^"])\|+(.*?)(")
And I have used the following for my replace:
$1$2$3$4
However, this does not fix lines 11, 13, 14 and 15.
I wish to obtain the following results:
"aabb"|ccdd|eeff|gghh
"aabb"|ccdd|eeff|gghh
"aabb"|ccdd|"eeff"|gghh
"aabb"|"ccdd"|"eeff"|"gghh"
"aabb"|"ccdd"|"eeff"|"gghh"
"aabb"|"ccdd"|"eeff"|"gghh"
aabb|ccdd|eeff|gghh
"aabb"|ccdd|eeff|"gghh"
aabb|cc|dd|eeff|gghh
aabb|"ccdd"|eeff|gghh
aabb|"ccdd"|eeff|gghh
"aabb"|ccdd|eeff|gghh
"aabb"|ccdd|eeff|gghh
"aabb"|ccdd|eeff|"gghh"
"aabb"|ccdd|eeff|"gghh"
Thank you in advance for helping me out!
With kind regards,
Robin
I can't think of the correct syntax in REGEX unless you are putting in each pattern that could be found.
However, an easier way (maybe not as performant), would be to use a Text to Columns selecting Ignore delimiters in quotes. If you need it back together in one cell afterwards, you can transpose, then remove delimiters followed by a Summarize to concatenate each RecordID Group.
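Outside Alteryx, one way to express the requested replacement is to match each quoted field and strip the delimiter inside it with a callback. The Python sketch below assumes the qualifiers are always balanced and never escaped inside a field; the function name strip_inner_delimiters is hypothetical.

```python
import re

# One quoted field: an opening quote, any non-quote characters, a closing quote.
QUOTED = re.compile(r'"[^"]*"')

def strip_inner_delimiters(line, delimiter="|"):
    """Remove every delimiter that sits between a pair of text qualifiers."""
    return QUOTED.sub(lambda m: m.group(0).replace(delimiter, ""), line)

for line in ('"aa|bb"|ccdd|"ee|ff"|gghh',
             'aabb|"cc||dd"|eeff|gghh',
             'aabb|cc|dd|eeff|gghh'):
    print(strip_inner_delimiters(line))
```

Because the callback handles any number of delimiters within one field, lines such as "a|a|b|b" and "gg||hh" need no special casing.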

regex for extracting date from string

I have the following date string - "2013-02-20T17:24:33Z"
I want to write a regex to extract just the date part "2013-02-20". How do I do that? Any help will be appreciated.
Thanks,
Murtaza
You could use a capture group for this.
/(\d{4}-\d{2}-\d{1,2}).*/
Using $1, you can get your desired part.
A straightforward approach would be \d\d\d\d-\d\d-\d\d, but you can also use quantifiers to make it look nicer: \d{4}-\d{2}-\d{2}.
Just search for the first T and use substring. I assume you always get a well-formatted date string.
If the date string is not guaranteed to be valid, you can use any date related library to parse and validate the input (validation includes the calendar logic, which regex fails to achieve), and reformat the output.
No sample code, since you didn't mention the language.
Using substring:
string date = "2013-02-20T17:24:33Z";
string h = date.Substring(0, 10);
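Since the question does not name a language, here is a Python sketch that puts the three suggestions side by side: a capture group, splitting at the first T, and parsing with a date library. The strptime layout assumes the input always has the exact shape shown in the question.

```python
import re
from datetime import datetime

stamp = "2013-02-20T17:24:33Z"

# 1. Capture group: keep only the date part in front of the "T".
match = re.match(r"(\d{4}-\d{2}-\d{2})T", stamp)
date_by_regex = match.group(1) if match else None

# 2. Plain string handling: everything before the first "T".
date_by_split = stamp.split("T", 1)[0]

# 3. Real parsing, which also rejects impossible dates such as 2013-02-30.
parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
date_by_parse = parsed.date().isoformat()

print(date_by_regex, date_by_split, date_by_parse)
```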

Regexp. eg: <time[HH:mm]>

I'm trying to write a regexp to get the [HH:mm] out of the following, let's say:
Hello, today its <date[dd/mm/yy]> and the time is <time[HH:mm]>
The things between the [] can be different, it could be dd/mm or mm/yy for the date, HH:mm:ss or mm:ss for the time...
I tried to get something working with wikipedia or regular-expressions.info, but well, I failed, so I need you to help me!:)
Thanks in advance
/\d\d:\d\d$/g
See it working at http://refiddle.com/111
This basically looks for the HH:MM pattern at the end of the string.
If you need to match anywhere in the string, then this will work
/\b\d\d:\d\d\b/g
See it at http://refiddle.com/112.
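A short Python sketch of both readings of the question. The first pattern is the word-boundary regex from the answer, which finds actual clock values such as 14:35. The second is an assumption about what the asker may have meant: pulling the format text out of the placeholders. The placeholder pattern and the sample sentence with times in it are made up for illustration.

```python
import re

# Pattern from the answer: two digits, a colon, two digits, on word boundaries.
TIME_RE = re.compile(r"\b\d\d:\d\d\b")

# Assumed alternative reading: capture the name and the bracketed format
# from placeholders shaped like <name[format]>.
PLACEHOLDER_RE = re.compile(r"<(\w+)\[([^\]]+)\]>")

times = TIME_RE.findall("It was 09:05 then and it is 14:35 now")

template = "Hello, today its <date[dd/mm/yy]> and the time is <time[HH:mm]>"
formats = dict(PLACEHOLDER_RE.findall(template))

print(times)
print(formats)
```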