I wrote the following regex to match date strings looking like:
2019/01/02 08:20:19
the regex is val reg = "([\\d]{4})/([\\d]{2})/([\\d]{2}) ([\\d]{2}).*.r"
The Scala function is:
val dateExtraction: String => Map[String, String] = {
string: String => {
string match {
case reg(year, month, day, hour) =>
Map(YEAR -> year, MONTH -> month, DAY -> day, HOUR -> hour )
case _ => Map(YEAR -> "", MONTH -> "", DAY -> "", HOUR -> "")
}
}
}
val YEAR = "YEAR"
val MONTH = "MONTH"
val DAY = "DAY"
val HOUR= "HOUR"
I want to extract the year, month, day and hour with the regex.
But the date above is not parsed as expected and I get a null result. Any idea how to fix this?
I would use java.time for this kind of problem, for example:
import java.time.{LocalDateTime, ZoneId}
import java.time.format.DateTimeFormatter
val input = "2019/01/02 08:20:19"
val formatter = DateTimeFormatter.ofPattern("yyyy/MM/dd HH:mm:ss")
val dt = LocalDateTime.parse(input, formatter).atZone(ZoneId.systemDefault())
dt.getYear() // 2019
dt.getMonthValue() // 1
dt.getDayOfMonth() // 2
dt.getHour() // 8
In Power BI, I am trying to create a list of dates that starts from a column in my table, [COD], and ends on a set date. Right now the formula just loops through 60 months from the start date in [COD]. Can I specify an end date for it to loop until?
List.Transform({0..60}, (x) =>
Date.AddMonths(
(Date.StartOfMonth([COD])), x))
Assuming
start=Date.StartOfMonth([COD]),
end = #date(2020,4,30),
One way is to add a custom column with the formula
= { Number.From(start) .. Number.From(end) }
then expand the list and convert it to the date type.
Alternatively, you could generate the list with List.Dates and expand that:
= List.Dates(start, Number.From(end) - Number.From(start)+1, #duration(1, 0, 0, 0))
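As a rough cross-check of the count passed to List.Dates, here is the same inclusive day range sketched in Python rather than M. The function name daily_dates and the sample [COD] value are illustrative assumptions, not part of the answer above.

```python
from datetime import date, timedelta

def daily_dates(start, end):
    """Return every date from start through end, inclusive."""
    # Same count as Number.From(end) - Number.From(start) + 1 in the M formula.
    count = (end - start).days + 1
    # In this sketch, a start after the end yields an empty list.
    return [start + timedelta(days=i) for i in range(max(count, 0))]

# Hypothetical [COD] value, moved to the start of its month.
cod = date(2020, 4, 10)
dates = daily_dates(cod.replace(day=1), date(2020, 4, 30))
print(len(dates), dates[0], dates[-1])  # 30 2020-04-01 2020-04-30
```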
Assuming you want start-of-month dates through June 2023: in the example below, 2023 and 6 are hard-coded, but they could easily come from a parameter, Date.Year(DateParameter), or a column, Date.Month([EndDate]).
Get the count of months with this:
12 * (2023 - Date.Year([COD]) )
+ (6 - Date.Month([COD]) )
+ 1
Then just use this column in your formula:
List.Transform({0..[Month count]-1}, (x) =>
Date.AddMonths(Date.StartOfMonth([COD]), x)
)
You could also combine it all into one harder-to-read formula:
List.Transform(
{0..
(12 * ( Date.Year(DateParameter) - Date.Year([COD]) )
+ ( Date.Month(DateParameter) - Date.Month([COD]) )
)
}, (x) => Date.AddMonths(Date.StartOfMonth([COD]), x)
)
If there is a chance that COD could be after the end date, you would want to include error checking in the Month count formula.
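A minimal sketch of that guard, written in Python rather than M to make the arithmetic easy to check. The names month_count and month_starts are hypothetical, and returning an empty list when the start falls after the end is one possible policy (raising an error would be another).

```python
from datetime import date

def month_count(start, end):
    """Months from start's month through end's month, inclusive (0 if start is later)."""
    months = 12 * (end.year - start.year) + (end.month - start.month) + 1
    # Guard: a start month after the end month would give zero or a negative count.
    return max(months, 0)

def month_starts(start, end):
    """First day of each month from start's month through end's month."""
    base = start.year * 12 + (start.month - 1)  # months since year 0
    result = []
    for offset in range(month_count(start, end)):
        year, month_index = divmod(base + offset, 12)
        result.append(date(year, month_index + 1, 1))
    return result

print(month_starts(date(2022, 11, 3), date(2023, 1, 31)))
print(month_starts(date(2023, 7, 1), date(2023, 6, 30)))  # [] because start is after end
```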
Generate list:
let
Start = Date1
, End = Date2
, Mos = ElapsedMonths(End, Start) + 1
, Dates = List.Transform(List.Numbers(0,Mos), each Date.AddMonths(Start, _))
in
Dates
ElapsedMonths(D1, D2) function def:
(D1 as date, D2 as date) =>
let
DStart = if D1 < D2 then D1 else D2
, DEnd = if D1 < D2 then D2 else D1
, Elapsed = (12*(Date.Year(DEnd)-Date.Year(DStart))+(Date.Month(DEnd)-Date.Month(DStart)))
in
Elapsed
Of course, you can create a function rather than hard code startdate and enddate:
(StartDate as date, optional EndDate as date, optional Months as number)=>
let
Mos = if EndDate = null
then (if Months = null
then error Error.Record("Missing Parameter", "Specify either [EndDate] or [Months]", "Both are null")
else Months
)
else ElapsedMonths(StartDate, EndDate) + 1
, Dates = List.Transform(List.Numbers(0, Mos), each Date.AddMonths(StartDate, _))
in
Dates
Is there a way to extract, in one call, all the matched subgroups of a string according to a regular expression?
I have a date like this:
Thu, 07 Apr 2022 15:03:32 GMT
And I created the following regexp to extract all the parts of this date:
let re =
Str.regexp
{|\([a-zA-Z]+\), \([0-9]+\) \([a-zA-Z]+\) \([0-9]+\) \([0-9]+\):\([0-9]+\):\([0-9]+\).*|}
And to extract each part I use it like this:
let parse_date date =
let re =
Str.regexp
{|\([a-zA-Z]+\), \([0-9]+\) \([a-zA-Z]+\) \([0-9]+\) \([0-9]+\):\([0-9]+\):\([0-9]+\).*|}
in
let wday = Str.replace_first re {|\1|} date in
let day = Str.replace_first re {|\2|} date in
let mon = Str.replace_first re {|\3|} date in
let year = Str.replace_first re {|\4|} date in
let hour = Str.replace_first re {|\5|} date in
let min = Str.replace_first re {|\6|} date in
let sec = Str.replace_first re {|\7|} date in
Format.eprintf "RE DATE: %s %s %s %s %s %s %s@." wday day mon year hour min
sec
If the parts were stored in an array I could easily use it like this:
let parse_date date =
let re =
Str.regexp
{|\([a-zA-Z]+\), \([0-9]+\) \([a-zA-Z]+\) \([0-9]+\) \([0-9]+\):\([0-9]+\):\([0-9]+\).*|}
in
let parts = Str.match_groups re date in (* this function doesn't exist *)
let wday = parts.(1) in
let day = parts.(2) in
let mon = parts.(3) in
let year = parts.(4) in
let hour = parts.(5) in
let min = parts.(6) in
let sec = parts.(7) in
Format.eprintf "RE DATE: %s %s %s %s %s %s %s@." wday day mon year hour min
sec
but this function doesn't appear to exist. Is there another way to do it, or is my solution the only one available?
This isn't an XY problem: my goal really is to extract each part of a date, so if there's a solution other than Str, I'll be happy to use it.
You can use Str.matched_group to return a particular capture group's match:
let parse_date date =
let re = Str.regexp
{|\([a-zA-Z]+\), \([0-9]+\) \([a-zA-Z]+\) \([0-9]+\) \([0-9]+\):\([0-9]+\):\([0-9]+\).*|} in
if Str.string_match re date 0 then
let wday = Str.matched_group 1 date in
let day = Str.matched_group 2 date in
let mon = Str.matched_group 3 date in
let year = Str.matched_group 4 date in
let hour = Str.matched_group 5 date in
let min = Str.matched_group 6 date in
let sec = Str.matched_group 7 date in
Format.sprintf "RE DATE: %s %s %s %s %s %s %s@." wday day mon year hour min sec
else
"RE DATE: Not matched"
let _ = parse_date "Thu, 07 Apr 2022 15:03:32 GMT" |> print_endline
The Str package is pretty primitive, though. I'd suggest using a different library for regular expressions, like PCRE-Ocaml. It does have a way to get an array of matched groups:
let parse_date2 date =
let rex = Pcre.regexp
{|([a-zA-Z]+), ([0-9]+) ([a-zA-Z]+) ([0-9]+) ([0-9]+):([0-9]+):([0-9]+).*|} in
try
let parts = Pcre.exec ~rex date |> Pcre.get_substrings in
let wday = parts.(1) in
let day = parts.(2) in
let mon = parts.(3) in
let year = parts.(4) in
let hour = parts.(5) in
let min = parts.(6) in
let sec = parts.(7) in
Format.sprintf "RE DATE: %s %s %s %s %s %s %s@." wday day mon year hour min sec
with Not_found -> "RE DATE: Not matched"
let _ = parse_date2 "Thu, 07 Apr 2022 15:03:32 GMT" |> print_endline
For a simple format with a fixed number of fields and separators, Scanf might be enough:
let date s = Scanf.sscanf s "%s@, %02d %s %d %d:%d:%d %s"
(fun day_name day month year h m s timezone ->
day_name,day,month,year,h,m,s,timezone
)
let x = date "Thu, 07 Apr 2022 15:03:32 GMT"
I wrote the following code:
val reg = "([\\d]{4})-([\\d]{2})-([\\d]{2})(T)([\\d]{2}):([\\d]{2})".r
val dataExtraction: String => Map[String, String] = {
string: String => {
string match {
case reg(year, month, day, symbol, hour, minutes) =>
Map(YEAR -> year, MONTH -> month, DAY -> day, HOUR -> hour)
case _ => Map(YEAR -> "", MONTH -> "", DAY -> "", HOUR -> "")
}
}
}
val YEAR = "YEAR"
val MONTH = "MONTH"
val DAY = "DAY"
val HOUR = "HOUR"
This function is supposed to be applied to strings having the following format: 2018-08-22T19:10:53.094Z
When I call the function:
dataExtraction("2018-08-22T19:10:53.094Z")
Your pattern, for all its deficiencies, does work. You just have to unanchor it.
val reg = "([\\d]{4})-([\\d]{2})-([\\d]{2})(T)([\\d]{2}):([\\d]{2})".r.unanchored
. . .
dataExtraction("2018-08-22T19:10:53.094Z")
//res0: Map[String,String] = Map(YEAR -> 2018, MONTH -> 08, DAY -> 22, HOUR -> 19)
But the comment from @CAustin is correct: you could just let the Java LocalDateTime API handle all the heavy lifting.
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter._
val dt = LocalDateTime.parse("2018-08-22T19:10:53.094Z", ISO_DATE_TIME)
Now you have access to all the data without actually saving it to a Map.
dt.getYear //res0: Int = 2018
dt.getMonthValue //res1: Int = 8
dt.getDayOfMonth //res2: Int = 22
dt.getHour //res3: Int = 19
dt.getMinute //res4: Int = 10
dt.getSecond //res5: Int = 53
Your pattern matches only strings that look exactly like yyyy-mm-ddThh:mm, while the one you are testing against has milliseconds and a Z at the end.
You can append .* at the end of your pattern to cover strings that have additional characters at the end.
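The difference between matching the whole string and matching a prefix can be sketched with Python's re module, used here only as an analogy: re.fullmatch behaves like the default (anchored) Scala extractor, and re.search is comparable to .unanchored. The pattern below drops the unused T group, which is an illustrative simplification.

```python
import re

PATTERN = r"(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2})"
text = "2018-08-22T19:10:53.094Z"

# Whole-string match fails because ":53.094Z" is left over.
anchored = re.fullmatch(PATTERN, text)

# Appending .* lets the whole-string match consume the tail.
with_tail = re.fullmatch(PATTERN + r".*", text)

# Searching instead of full matching also finds the leading fields.
unanchored = re.search(PATTERN, text)

print(anchored)             # None
print(with_tail.groups())   # ('2018', '08', '22', '19', '10')
print(unanchored.groups())  # ('2018', '08', '22', '19', '10')
```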
In addition, let me show you a more idiomatic way of writing your code:
// Create a type for the data instead of using a map.
case class Timestamp(year: Int, month: Int, day: Int, hour: Int, minutes: Int)
// Use triple quotes to avoid extra escaping.
// Don't capture parts that you will not use.
// Add .* at the end to account for milliseconds and timezone.
val reg = """(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}).*""".r
// Instead of empty strings, use Option to represent a value that can be missing.
// Convert to Int after parsing.
def dataExtraction(str: String): Option[Timestamp] = str match {
case reg(y, m, d, h, min) => Some(Timestamp(y.toInt, m.toInt, d.toInt, h.toInt, min.toInt))
case _ => None
}
// It works!
dataExtraction("2018-08-22T19:10:53.094Z") // => Some(Timestamp(2018,8,22,19,10))
I'd like to use dates and times in my code, so I have loaded the Calendar Lib using opam. I have a simple piece of code that demonstrates the problem (example.ml):
open CalendarLib
type datefun = date -> int
let run_datefun (f : datefun) (d : date) = (f d)
let () =
let mydate = make 2016 5 23 in
printf "Day of week = %i" run_datefun days_in_month mydate
As far as I can see the Calendar days_in_month method has a type signature of date -> int.
When I try and compile this code (corebuild -pkg calendar example.byte) I get the following error:
File "example.ml", line 3, characters 15-19:
Error: Unbound type constructor date
which seems to me like the compiler is looking for a Date constructor for a date type.
What am I doing wrong?
The functions and datatypes you'd like to use are inside the Date module, so rephrasing your code we get (I've also taken the liberty of rewriting the output phrase and inserted the missing parentheses):
open CalendarLib
type datefun = Date.t -> int
let run_datefun (f : datefun) (d : Date.t) = (f d)
let () =
let mydate = Date.make 2016 5 23 in
Printf.printf "# of days in current month = %i\n" (run_datefun Date.days_in_month mydate)
A little test (by the way, you don't need corebuild for this):
$ ocamlbuild -pkg calendar example.ml example.byte
Finished, 3 targets (3 cached) in 00:00:00.
$ _build/example.byte
# of days in current month = 31
I have two Strings. The first is like this one:
{"userId":"554555454-45454-54545","start":"20141114T172252.466z","end":"20141228T172252.466z","accounts":[{"date":"20141117T172252.466z","tel":"0049999999999","dec":"a dec","user":"auser"},{"date":"20141118T172252.466z","tel":"004888888888","dec":"another dec","user":"anotheruser"}]}
the second one has the same dates but in a different format. Instead of
20141117T172252.466z
it shows
2014-11-14,17:22:52
I'm trying to extract the dates from the first String and assert that they are the same as the dates from the second String. I've tried it with regular expressions, but I'm getting an "Illegal repetition" error. How can I do this?
You can use SimpleDateFormat from Java:
import java.text.SimpleDateFormat
import java.util.Date
val s1 = "{\"userId\":\"554555454-45454-54545\",\"start\":\"20141114T172252.466z\"}"
val s2 = "{\"userId\":\"554555454-45454-54545\",\"start\":\"2014-11-14,17:22:52\"}"
val i1 = s1.indexOf("start")
val i2 = s2.indexOf("start")
val str1 = s1.replace("T", "_").substring(i1+8, i1+ 23)
val str2 = s2.substring(i2+8, i2+27)
// HH is the 24-hour clock (hh is the 12-hour clock), which fits hours such as 17.
val date1: Date = new SimpleDateFormat("yyyyMMdd_HHmmss").parse(str1)
val date2: Date = new SimpleDateFormat("yyyy-MM-dd,HH:mm:ss").parse(str2)
val result = date1==date2
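To compare every date in the strings rather than only the start field, one option is to pull out all timestamps with a regular expression and parse each one. The sketch below is in Python rather than Scala. The helper names compact_dates and dashed_dates and the shortened sample strings are assumptions for illustration. Milliseconds are dropped because the second format does not carry them.

```python
import re
from datetime import datetime

def compact_dates(text):
    """Find and parse timestamps such as 20141117T172252.466z."""
    found = re.findall(r"\d{8}T\d{6}\.\d{3}z", text)
    # Truncate to whole seconds so both formats are comparable.
    return [
        datetime.strptime(s, "%Y%m%dT%H%M%S.%fz").replace(microsecond=0)
        for s in found
    ]

def dashed_dates(text):
    """Find and parse timestamps such as 2014-11-14,17:22:52."""
    found = re.findall(r"\d{4}-\d{2}-\d{2},\d{2}:\d{2}:\d{2}", text)
    return [datetime.strptime(s, "%Y-%m-%d,%H:%M:%S") for s in found]

# Shortened, hypothetical versions of the two input strings.
s1 = '{"start":"20141114T172252.466z","end":"20141228T172252.466z"}'
s2 = '{"start":"2014-11-14,17:22:52","end":"2014-12-28,17:22:52"}'

print(compact_dates(s1) == dashed_dates(s2))  # True
```

Comparing the two lists checks both the values and their order, so a date missing from either string makes the comparison fail.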