Cannot parse UTC date in Athena

I have a date string of the form 2019-02-18 09:17:31.260000+00:00 and I am trying to convert it into a date in Athena.
I have tried converting it into a timestamp as suggested in existing SO answers, but that failed.
There is a discussion at https://github.com/prestodb/presto/issues/10567, but no answer for this particular date format.
I tried several formats like 'YYYY-MM-dd HH:mm:ss.SSSSSSZ', but none work; I get errors like INVALID_FUNCTION_ARGUMENT: Invalid format:..is malformed at "+00:00".
Been stuck for a while, any help is appreciated!

Athena is based on a very old version of Presto, so there is no straightforward way of doing this without a string manipulation trick. For instance, you can use regexp_replace to trim the fractional seconds down to the three digits that the engine's built-in timestamp with time zone type supports, and then cast:
SELECT cast(regexp_replace('2019-02-18 09:17:31.260000+00:00','(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\d{3}(.*)', '$1$2') AS timestamp with time zone)
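Here the regexp_replace call rewrites the input to '2019-02-18 09:17:31.260+00:00' before the cast. If you ultimately want a plain date rather than a timestamp, one possible follow-up (a sketch, not part of the original answer) is to cast the result again:
-- reduce the parsed value to a plain DATE
SELECT cast(cast(regexp_replace('2019-02-18 09:17:31.260000+00:00','(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\d{3}(.*)', '$1$2') AS timestamp with time zone) AS date)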
Recent versions of Trino (formerly known as PrestoSQL) introduced support for variable-precision temporal types with up to picosecond precision (12 decimal digits).
With that feature, you can just do:
trino> select cast('2019-02-18 09:17:31.260000+00:00' as timestamp(6) with time zone);
_col0
--------------------------------
2019-02-18 09:17:31.260000 UTC
(1 row)
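If a downstream consumer expects millisecond precision, the same cast can target timestamp(3) with time zone, which rounds the extra fractional digits away (a minimal sketch, assuming a recent Trino version; the result should be 2019-02-18 09:17:31.260 UTC):
select cast('2019-02-18 09:17:31.260000+00:00' as timestamp(3) with time zone);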

A shorter version of Martin Traverso's answer is to substring off the extra characters:
select cast(substr('2019-02-18 09:17:31.260000+00:00',1,23) as timestamp);
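If you would rather name the format explicitly than count characters, date_parse with the MySQL-style %f specifier (fractional seconds) is another option in Athena. A sketch along the same lines, assuming the offset is always +00:00 as in the question:
-- keep the first 26 characters (date, time, microseconds), drop the offset, parse
select date_parse(substr('2019-02-18 09:17:31.260000+00:00', 1, 26), '%Y-%m-%d %H:%i:%s.%f');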

Related

AWS Athena BIGINT with ddmmyyyyhhmmss to date time

I have a bigint value 10062019192751. I am told it is a datetime formatted as ddmmyyyyhhmmss (10-06-2019 19:27:51).
How can I convert or parse it to a datetime in AWS Athena?
Using from_unixtime gives me a different (incorrect) value.
Amazon Athena is based on Presto, so you can use Date and Time Functions and Operators — Presto.
The date_parse() function converts a string into a timestamp, given a description of the string's format (consult the above link for the syntax).
Here is a solution, which first converts the number into a string (varchar) and then parses it into a timestamp:
select date_parse(cast(10062019192751 as varchar),'%d%c%Y%k%i%s')
The output is:
2019-06-10 19:27:51.000
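One caveat worth adding (my assumption, not part of the original answer): if the day of month can be below 10, its leading zero is lost in the bigint, leaving 13 digits instead of 14, and the format above would then fail or misparse. Padding the string first guards against that:
-- lpad restores a leading zero lost in the bigint representation,
-- e.g. 1062019192751 for 01-06-2019 19:27:51
select date_parse(lpad(cast(1062019192751 as varchar), 14, '0'), '%d%c%Y%k%i%s')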

AWS Firehose dynamic partitioning and date parsing

I'm trying to do dynamic data partitioning by date with a Kinesis Firehose delivery stream. The payload I'm expecting is JSON, in this general format:
{
"clientId": "ASGr496mndGs80oCC97mf",
"createdAt": "2022-09-21T14:44:53.708Z",
...
}
I don't control the format of this date I'm working with.
I have my Firehose delivery stream configured with "Dynamic Partitioning" and "Inline JSON Parsing" enabled (both are apparently required per the AWS console UI).
I've got these set as "Dynamic Partitioning Keys":
year: .createdAt | strptime("%Y-%m-%dT%H:%M:%S.%fZ") | strftime("%Y")
month: .createdAt | strptime("%Y-%m-%dT%H:%M:%S.%fZ") | strftime("%m")
day: .createdAt | strptime("%Y-%m-%dT%H:%M:%S.%fZ") | strftime("%d")
hour: .createdAt | strptime("%Y-%m-%dT%H:%M:%S.%fZ") | strftime("%h")
But that gives me errors like: date "2022-09-21T18:30:04.431Z" does not match format "%Y-%m-%dT%H:%M:%S.%fZ".
It looks like strptime expects the decimal seconds to be padded out to 6 places, but I have 3, and I don't control the format of the date I'm working with. These appear to be jq expressions, but I have exactly zero experience with jq, and the AWS documentation for this leaves an awful lot to be desired.
Is there a way to get strptime to successfully parse this format, or to just ignore the minute, second, and millisecond part of the time (I only care about hours)?
Is there another way to achieve what I'm trying to do here?
You can try the following:
.createdAt | strptime("%Y-%m-%dT%H:%M:%S%Z") | strftime("%Y")
It trims the milliseconds while retaining the rest of the information in the datetime.

How to correctly select date field using libpqxx?

I am trying to select a date field from a PostgreSQL database using libpqxx and C++.
I would use this code, but I don't know if it is legal. I have searched the documentation but haven't found any documented way.
using time_point = std::chrono::steady_clock::time_point;
pqxx::work txn(c);
auto&& rst = txn.exec("SELECT date FROM table");
for (auto&& row : rst) {
    time_point date = row[0].as<time_point>();
}
Is this okay? Do you know of any better alternative?
I would like to do the same with datetime and time fields. Is there any difference?
Thank you.
--
The documentation for the field type: https://libpqxx.readthedocs.io/en/6.4/a01063.html#a3a55f6b44040b68e70382d9db7dea457
The answer on GitHub by JadeMatrix:
field.as<>() will work with any type for which a specialization of pqxx::string_traits<> exists. libpqxx comes with support for std::string, builtin numerics (int, etc.), and maybe a few others I don't remember.
Support for std::chrono:: types is missing by default, unfortunately. You can implement your own, but be warned that they will only work for TIMESTAMP WITHOUT TIME ZONE, DATE, TIME WITHOUT TIME ZONE, and INTERVAL. To correctly support … WITH TIME ZONE you will need Howard Hinnant's date library, which is what I use (there is talk of adding it to the standard library).
If you want, I can share my code, which relies on date functionality for parsing Postgres date/time strings (ISO 8601 format).

Extract Date from epoch in NiFi

I have a CSV file with an attribute having epoch values like '1517334599.906'.
I want to convert/update the epoch values into an ISO-style timestamp 'yyyy-MM-dd HH:mm:ss.SSS' via NiFi.
The conversion is needed so that Kibana recognizes the field as a timestamp. Is there a way to do this? If so, can anyone help me with the configuration?
Using NiFi's record capabilities you can use UpdateRecord with a CsvReader and CsvWriter.
See the "format" function in expression language for converting an epoch to a date string:
https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#format
In UpdateRecord you would do something like:
/eventDate = ${field.value:format("yyyy-MM-dd HH:mm:ss.SSS")}
This says: take the value of /eventDate (change this to your field name) and set that field to the result of the format function on the right.
The only thing I am not sure about is whether an epoch value can have a decimal portion as shown in your example; I would expect it to be converted to a long, which is a whole number.

PostgreSQL date format

I have a web application (written with Python/Django) that (due to a bad specification) has some web forms expecting a "YYYY-mm-dd" date format and others using "dd/mm/yy".
Is there a way to tell PostgreSQL to accept dates in both formats? For example, to try "dd/mm/yy" and, if that fails, then try "yyyy-mm-dd".
That would be awesome.
From the fine manual:
Date and time input is accepted in almost any reasonable format, including ISO 8601, SQL-compatible, traditional POSTGRES, and others. For some formats, ordering of day, month, and year in date input is ambiguous and there is support for specifying the expected ordering of these fields. Set the DateStyle parameter to MDY to select month-day-year interpretation, DMY to select day-month-year interpretation, or YMD to select year-month-day interpretation.
PostgreSQL is more flexible in handling date/time input than the SQL standard requires. See Appendix B for the exact parsing rules of date/time input and for the recognized text fields including months, days of the week, and time zones.
So PostgreSQL should be able to deal with just about any date format you throw at it. Your "dd/mm/yy" format is, however, ambiguous. But, there is the DateStyle configuration parameter to help with such ambiguity.
For example:
=> create table x (d date not null);
=> insert into x values ('2001-01-10');
=> insert into x values ('Feb 2 2980');
=> insert into x values ('01/02/03');
=> select * from x;
d
------------
2001-01-10
2980-02-02
2003-02-01
That said, I'd recommend moving everything to ISO 8601 (YYYY-MM-DD) internally and handle the conversions at the edges of the application. OTOH, there is reality to contend with so you should do whatever you have to do to make it go.
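For the ambiguous "dd/mm/yy" inputs specifically, that means setting day-month-year ordering before inserting. A minimal sketch:
-- read ambiguous numeric dates as day/month/year
SET DateStyle = 'ISO, DMY';
SELECT '01/02/03'::date;  -- 2003-02-01 under DMY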
If those are the only two formats possible, then it may be better to explicitly allow only those rather than rely on Postgres to interpret the input. For example:
with w as (select '2011-12-13' as input_date union select '13/12/2011')
select case when input_date ~ '^\d\d\d\d-\d\d-\d\d$'
            then to_date(input_date, 'yyyy-mm-dd')
            when input_date ~ '^\d\d/\d\d/\d\d\d\d$'
            then to_date(input_date, 'dd/mm/yyyy')
       end
from w;
case
------------
2011-12-13
2011-12-13
(2 rows)
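If the application side cannot be normalized, the same dispatch can be packaged as a helper function in the database. A sketch with a hypothetical function name, not a vetted implementation:
-- wrap the two accepted formats behind one function; returns NULL for anything else
create function parse_flexible_date(input_date text) returns date as $$
  select case when input_date ~ '^\d{4}-\d{2}-\d{2}$'
              then to_date(input_date, 'yyyy-mm-dd')
              when input_date ~ '^\d{2}/\d{2}/\d{4}$'
              then to_date(input_date, 'dd/mm/yyyy')
         end
$$ language sql immutable;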