chrono parse including timezone

I'm using Howard Hinnant's date library (now part of C++20) to parse a date string that includes a time zone abbreviation. The parse succeeds without errors, but the time zone appears to be ignored. For example:
istringstream inEst{"Fri, 25 Sep 2020 13:44:43 EST"};
std::chrono::time_point<std::chrono::system_clock, chrono::seconds> tpEst;
inEst >> date::parse("%a, %d %b %Y %T %Z", tpEst);
std::cout << chrono::duration_cast<chrono::milliseconds>( tpEst.time_since_epoch() ).count() << std::endl;
istringstream inPst{"Fri, 25 Sep 2020 13:44:43 PST"};
std::chrono::time_point<std::chrono::system_clock, chrono::seconds> tpPst;
inPst >> date::parse("%a, %d %b %Y %T %Z", tpPst);
std::cout << chrono::duration_cast<chrono::milliseconds>( tpPst.time_since_epoch() ).count() << std::endl;
istringstream inGmt{"Fri, 25 Sep 2020 13:44:43 GMT"};
std::chrono::time_point<std::chrono::system_clock, chrono::seconds> tpGmt;
inGmt >> date::parse("%a, %d %b %Y %T %Z", tpGmt);
std::cout << chrono::duration_cast<chrono::milliseconds>( tpGmt.time_since_epoch() ).count() << std::endl;
Produces the output:
1601041483000
1601041483000
1601041483000
Am I doing something wrong, or is the timezone info not used by the parser?

Unfortunately there is no way to reliably and uniquely identify a time zone given just a time zone abbreviation. Some abbreviations are used by multiple time zones, sometimes even with different UTC offsets.
So in short, the time zone abbreviation is parsed, but does not identify a UTC offset which can be used to alter the parsed timestamp.
See "Convert a time zone abbreviation into a time zone" for code that attempts to at least narrow down which time zones are using a specific abbreviation at a given time.
Alternatively, if you parse a UTC offset ("%z" or "%Ez"), then that offset will be applied to the timestamp to convert it to a sys_time.
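The effect of a numeric offset is easy to see in other languages too. The following is a minimal Python 3 sketch, not the date library itself: it parses the same wall-clock time with three different "%z" offsets and prints the resulting Unix timestamps. The helper name to_epoch_seconds is hypothetical, and the numeric offsets are assumed stand-ins for what EST, PST and GMT were probably meant to express.

```python
from datetime import datetime

# Same layout as the question, but with a numeric UTC offset instead of an abbreviation.
FMT = "%a, %d %b %Y %H:%M:%S %z"


def to_epoch_seconds(text):
    """Parse a timestamp carrying a numeric UTC offset and return Unix seconds."""
    # strptime with %z yields an aware datetime, so timestamp() honours the offset.
    return int(datetime.strptime(text, FMT).timestamp())


for text in (
    "Fri, 25 Sep 2020 13:44:43 -0500",  # assumed meaning of "EST"
    "Fri, 25 Sep 2020 13:44:43 -0800",  # assumed meaning of "PST"
    "Fri, 25 Sep 2020 13:44:43 +0000",  # GMT
):
    print(text, "->", to_epoch_seconds(text))
```

Unlike the abbreviation case, the three timestamps differ, because the offset is unambiguous.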
Fwiw, I ran each of these three examples through the find_by_abbrev overload taking local_time described here. The results are interesting in that they likely confirm the fragility of parsing time zone abbreviations:
"Fri, 25 Sep 2020 13:44:43 EST"
Could be any of these time zones:
2020-09-25 13:44:43 EST 2020-09-25 18:44:43 UTC America/Atikokan
2020-09-25 13:44:43 EST 2020-09-25 18:44:43 UTC America/Cancun
2020-09-25 13:44:43 EST 2020-09-25 18:44:43 UTC America/Jamaica
2020-09-25 13:44:43 EST 2020-09-25 18:44:43 UTC America/Panama
2020-09-25 13:44:43 EST 2020-09-25 18:44:43 UTC EST
All of these have a UTC offset of -5h. So in that sense, the UTC equivalent is unique (2020-09-25 18:44:43 UTC as shown above). However one has to wonder if America/Montreal was actually intended, which has a UTC offset of -4h and an abbreviation of EDT on this date.
"Fri, 25 Sep 2020 13:44:43 PST"
Has only one match!
2020-09-25 13:44:43 PST 2020-09-25 05:44:43 UTC Asia/Manila
This has a UTC offset of 8h. But I have to wonder if America/Vancouver was intended, which has a UTC offset of -7h and an abbreviation of PDT on this date.
If one knows the matching UTC offset for the abbreviations one will be parsing, one could parse into a local_time, parse the abbreviation, look up the UTC offset, and apply it to transform the local_time into a sys_time. This library makes it easy to parse the abbreviation along with the timestamp:
local_seconds tpEst;
std::string abbrev;
inEst >> date::parse("%a, %d %b %Y %T %Z", tpEst, abbrev);
sys_seconds tpUTC{tpEst - local_seconds{} - get_offset(abbrev)};
where get_offset(abbrev) is a custom map you've written to return offsets given a time zone abbreviation. Note that this wouldn't help if (for example) EDT (-4h) was intended but EST (-5h) was parsed.
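To show the shape of such a lookup, here is a rough Python 3 analogue of the same idea (it does not use the C++ library). ABBREV_OFFSETS and parse_with_abbrev are hypothetical names, and the offsets in the table are an assumed North American reading of each abbreviation; as shown above, PST can just as well mean Asia/Manila.

```python
from datetime import datetime, timedelta, timezone

# Assumed mapping: the caller decides what each abbreviation means.
ABBREV_OFFSETS = {
    "EST": timedelta(hours=-5),
    "EDT": timedelta(hours=-4),
    "PST": timedelta(hours=-8),
    "PDT": timedelta(hours=-7),
    "GMT": timedelta(0),
}


def parse_with_abbrev(text):
    """Parse 'Fri, 25 Sep 2020 13:44:43 EST' into an aware UTC datetime."""
    # Split the trailing abbreviation from the local timestamp.
    stamp, abbrev = text.rsplit(" ", 1)
    local = datetime.strptime(stamp, "%a, %d %b %Y %H:%M:%S")
    offset = ABBREV_OFFSETS[abbrev]  # KeyError for abbreviations not in the table
    # Attach the looked-up offset, then convert to UTC.
    return local.replace(tzinfo=timezone(offset)).astimezone(timezone.utc)


print(parse_with_abbrev("Fri, 25 Sep 2020 13:44:43 EST"))
print(parse_with_abbrev("Fri, 25 Sep 2020 13:44:43 PST"))
```

Failing loudly on an unknown abbreviation is a deliberate choice in this sketch: silently treating it as UTC would reproduce the behaviour the question complains about.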
Another possible strategy is to write a map of abbreviations to time zone names (instead of to offsets). For example: Both "EST" and "EDT" could map to "America/Toronto", and then you could do:
local_seconds tpEst;
std::string abbrev;
inEst >> date::parse("%a, %d %b %Y %T %Z", tpEst, abbrev);
zoned_seconds zt{get_tz_name(abbrev), tpEst};
sys_seconds tpUTC = zt.get_sys_time();

Related

Define and convert datetime in AWS Athena

I have a process where I need to match a UTC datetime against an EDT datetime.
As you know, Eastern time is 4 or 5 hours behind UTC, depending on daylight saving time.
How can I define one datetime to be in UTC and another to be in EDT and match the two?
Something like (datetime_A is my EDT timestamp, and datetime_B is my UTC):
Where CAST((datetime_A as EDT) to UTC)=datetime_B
Thanks!

Athena - Convert String based timestamp to ISO time

I have a timestamp column that has a value like this Fri, 12 Mar 2021 14:00:02:270
I want to convert it to timestamp format to use any timestamp-related functions.
Expected output:
2021-03-12 14:00:02
I tried this, but it seems this is not the right syntax.
cast(date_parse(recordtime,'%a, %d %b %Y %T:%i:%S:')as TIMESTAMP )

From the documentation, the error seems to be at the end of the query, because %T is the format Time, 24-hour (hh:mm:ss), so you don't need to specify %i and %S after that.
This one works:
SELECT cast(date_parse('Fri, 12 Mar 2021 14:00:02:270', '%a, %d %b %Y %T:%f') as timestamp)
You have to add %f at the end to handle the millisecond after your Time format.

How can I correctly convert this?

I tried a couple of times converting this date format
Wed, 02 April 2015 15:50:53 SAST
to this format
YYYY-MM-DD HH:MM[:ss[.uuuuuu]][TZ]
but with no luck so far.
Is there any better way to do this that I might have missed?
Here's what I attempted:
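import datetime
date = "Wed, 02 April 2015 15:50:53 SAST"
splitter = date.split(" ")
joiner = " ".join(splitter[1:len(splitter)-1])  # drop the weekday and the trailing "SAST"
date = datetime.datetime.strptime(joiner, "%d %B %Y %H:%M:%S")  # %B matches the full month name "April"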
date = datetime.datetime.strftime(date,"%A, %b %d %Y %H:%M:%S %z")
When I'm saving it to the db, I'm receiving this error:
[Wed, 02 April 2015 15:50:53 SAST for that value has an invalid format. It must be in YYYY-MM-DD HH:MM[:ss[.uuuuuu]][TZ] format."]

Have a look at strptime (str --> time) and strftime (date --> str).
EDIT:
You are trying to save a string to a DateTimeField. Just remove the string conversion (strftime).
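A minimal Python 3 sketch of that advice: parse the string once and keep the resulting datetime object instead of formatting it back into text. The name parse_sast is hypothetical, and the fixed +02:00 offset assumes that "SAST" means South Africa Standard Time.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "SAST" is South Africa Standard Time, a fixed UTC+2 offset.
SAST = timezone(timedelta(hours=2), "SAST")


def parse_sast(text):
    """Turn 'Wed, 02 April 2015 15:50:53 SAST' into an aware datetime."""
    parts = text.split(" ")
    # Drop the leading weekday and the trailing abbreviation before parsing.
    naive = datetime.strptime(" ".join(parts[1:-1]), "%d %B %Y %H:%M:%S")
    return naive.replace(tzinfo=SAST)


dt = parse_sast("Wed, 02 April 2015 15:50:53 SAST")
print(dt)  # 2015-04-02 15:50:53+02:00
```

The object dt can be assigned to the field directly; its default string form already matches the YYYY-MM-DD HH:MM[:ss[.uuuuuu]][TZ] layout from the error message.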

Finding difference between string time objects in python

I have a list of strings that I am reading from a file - Each of the strings has a time offset that was recorded while storing the data.
date1= "Mon May 05 20:00:00 EDT 2014"
date2="Mon Nov 18 19:00:00 EST 2013"
date3="Mon Nov 07 19:00:00 PST 2013"
I need to find the difference in days between each pair of strings.
from datetime import datetime
from dateutil import tz
def days_hours_minutes(td):
    return td.days, td.seconds//3600, (td.seconds//60)%60
date1='Fri Dec 05 19:00:00 2014' # it does not work with EDT, EST etc.
date2='Fri Dec 03 19:00:00 2014'
fmt = "%a %b %d %H:%M:%S %Y"
str1 = datetime.strptime(date1, fmt)
str2 = datetime.strptime(date2, fmt)
td=(str1-str2)
x=days_hours_minutes(td)
print x
#gives (2, 0, 0)
Basically, convert each string to its "my_time_obj" and then take the difference in days.
However, my actual string dates, have "EDT", "EST", "IST" etc - and on using the %Z notation, I get the ValueError: time data 'Fri Dec 05 19:00:00 EST 2014' does not match format '%a %b %d %H:%M:%S %Z %Y'
From the datetime documentation, I see that I can use %Z to convert this to a timezone notation. What am I missing?
https://docs.python.org/2/library/datetime.html

I would go with parsing the timezone using pytz and do something like this (given that you know how your date string is built):
from datetime import datetime
from dateutil import tz
from pytz import timezone
def days_hours_minutes(td):
    return td.days, td.seconds//3600, (td.seconds//60)%60
date1_str ='Fri Dec 05 19:00:00 2014 EST'
date2_str ='Fri Dec 03 19:00:00 2014 UTC'
fmt = "%a %b %d %H:%M:%S %Y"
date1_list = date1_str.split(' ')
date2_list = date2_str.split(' ')
date1_tz = timezone(date1_list[-1]) # get only the timezone without date parts for date 1
date2_tz = timezone(date2_list[-1]) # get only the timezone without date parts for date 2
date1 = date1_tz.localize(datetime.strptime(' '.join(date1_list[:-1]), fmt)) # get only the date parts without timezone for date 1
date2 = date2_tz.localize(datetime.strptime(' '.join(date2_list[:-1]), fmt)) # get only the date parts without timezone for date 2
td=(date1-date2)
x=days_hours_minutes(td)
print x

Converting time strings to POSIX timestamps and finding the differences using only stdlib:
#!/usr/bin/env python
from datetime import timedelta
from email.utils import parsedate_tz, mktime_tz
dates = [
"Mon May 05 20:00:00 EDT 2014",
"Mon Nov 18 19:00:00 EST 2013",
"Mon Nov 07 19:00:00 PST 2013",
]
ts = [mktime_tz(parsedate_tz(s)) for s in dates] # timestamps
differences = [timedelta(seconds=a - b) for a, b in zip(ts, ts[1:])]
print("\n".join(map(str, differences)))
Keep the inherent ambiguity of the input in mind. If you want a more robust solution, you have to use explicit pytz timezones such as 'America/New_York'; otherwise you rely on the hardcoded "timezone abbreviation to UTC offset" mapping in the email module, e.g., EDT -> -0400, EST -> -0500, PST -> -0800.
Output
168 days, 0:00:00
10 days, 21:00:00
differences is a list of timedelta objects. You can get the number of full days using the td.days attribute (for non-negative intervals), or get the value including fractions with:
days = td.total_seconds() / 86400
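A short Python 3 illustration of the difference between the two, using the second interval from the output above. It also shows why td.days is only suggested for non-negative intervals.

```python
from datetime import timedelta

# Second interval from the output above: 10 days, 21:00:00.
td = timedelta(days=10, hours=21)
whole_days = td.days                          # 10
fractional_days = td.total_seconds() / 86400  # 10.875

# A negative interval is normalised to days=-1 plus a positive remainder,
# so td.days alone is misleading here.
negative = timedelta(hours=-3)
negative_whole_days = negative.days                          # -1
negative_fractional = negative.total_seconds() / 86400       # -0.125

print(whole_days, fractional_days, negative_whole_days, negative_fractional)
```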

Convert gmtime() to 4 byte hex

I have the time in this format:
Fri, 19 Dec 2014 03:55:24
and I want to convert it to a 4 byte hexadecimal value. My question is similar to an earlier one, but I have a different format, and I use the gmtime() function because I want the time since the Epoch. This is what I tried so far (breaking the code from the answer to that question into smaller parts):
ut = time.strftime("%a, %d %b %Y %H:%M:%S ", time.gmtime())
time.strptime(ut, '%a, %d %b %Y %H:%M:%S')
But I get the error:
ValueError: unconverted data remains:
Could you please help me?
I know that it is a simple question, but I cannot see what is the problem.

Remove the trailing space in the format string used in
ut = time.strftime("%a, %d %b %Y %H:%M:%S ", time.gmtime())
The line should read:
ut = time.strftime("%a, %d %b %Y %H:%M:%S", time.gmtime())
Without the space after %S, it runs on my side!
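The failure can be reproduced in isolation. This is a small sketch using a fixed input string in place of the gmtime() output; the variable names are illustrative only. It also shows that stripping the string before parsing is an alternative when the format that produced it cannot be changed.

```python
import time

FMT = "%a, %d %b %Y %H:%M:%S"
padded = "Fri, 19 Dec 2014 03:55:24 "  # note the trailing space

# The trailing space is not covered by FMT, so strptime rejects the string.
try:
    time.strptime(padded, FMT)
    error = None
except ValueError as exc:
    error = str(exc)

# Stripping the whitespace first makes the same string parse cleanly.
parsed = time.strptime(padded.strip(), FMT)
print(error)
print(parsed.tm_year, parsed.tm_hour, parsed.tm_min, parsed.tm_sec)
```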

There are several steps:
Parse the RFC 5322 time string into an object that represents the broken-down time
import email.utils
time_tuple = email.utils.parsedate("Fri, 19 Dec 2014 03:55:24")
Convert the broken-down time into a "seconds since Epoch" number
import calendar
timestamp = calendar.timegm(time_tuple) # assume input time is in UTC
Print the number in hex format
print('%08X' % timestamp)
# -> 5493A1AC
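The three steps can be folded into one helper. This is a sketch for Python 3 in which to_hex4 is a hypothetical name; packing through struct makes the 4 byte limit explicit, since a timestamp that does not fit in an unsigned 32-bit integer raises struct.error instead of silently producing more digits.

```python
import calendar
import email.utils
import struct


def to_hex4(text):
    """Return the UTC timestamp in text as 8 hex digits (4 bytes, big-endian)."""
    time_tuple = email.utils.parsedate(text)
    if time_tuple is None:
        raise ValueError("not an RFC 5322 style date: %r" % text)
    timestamp = calendar.timegm(time_tuple)  # assume input time is in UTC
    # ">I" is an unsigned 32-bit big-endian integer, i.e. exactly 4 bytes.
    return struct.pack(">I", timestamp).hex().upper()


print(to_hex4("Fri, 19 Dec 2014 03:55:24"))  # 5493A1AC
```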
