Presto SQL: TO_UNIXTIME - amazon-web-services

I want to convert a readable timestamp to UNIX time.
For example: I want to convert 2018-08-24 18:42:16 to 1535136136000.
Here is my syntax:
TO_UNIXTIME('2018-08-24 06:42:16') new_year_ut
My error is:
SYNTAX_ERROR: line 1:77: Unexpected parameters (varchar(19)) for function to_unixtime. Expected: to_unixtime(timestamp) , to_unixtime(timestamp with time zone)

You need to wrap the varchar in a CAST to timestamp:
to_unixtime(CAST('2018-08-24 06:42:16' AS timestamp)) -- note: returns a double
If your timestamp value doesn't have a fractional second (or you are not interested in it), you can cast to bigint to get an integral result:
CAST(to_unixtime(CAST('2018-08-24 06:42:16' AS timestamp)) AS BIGINT)
If your readable timestamp value is a string in a different format than the above, you will need to use date_parse or parse_datetime for the conversion. See https://trino.io/docs/current/functions/datetime.html for more information.
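For example, a sketch of parsing a day/month/year string (the input value and format here are just an illustration, not from the question) before converting to UNIX time:
-- date_parse uses MySQL-style format specifiers and returns a timestamp
SELECT to_unixtime(date_parse('24/08/2018 06:42:16', '%d/%m/%Y %H:%i:%s'))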
Note: when dealing with timestamp values, please keep this issue in mind: https://github.com/trinodb/trino/issues/37

Related

Amazon Athena CREATE EXTERNAL TABLE mismatched input 'external' invalidrequestexception

I am trying to create an external table in Amazon Athena. My query is the following:
CREATE EXTERNAL TABLE priceTable (
WeekDay STRING,
MonthDay INT,
price00 FLOAT,
price01 FLOAT,
price02 FLOAT,
price03 FLOAT,
price04 FLOAT,
price05 FLOAT,
price06 FLOAT,
price07 FLOAT,
price08 FLOAT,
price09 FLOAT,
price10 FLOAT,
price11 FLOAT,
price12 FLOAT,
price13 FLOAT,
price14 FLOAT,
price15 FLOAT,
price16 FLOAT,
price17 FLOAT,
price18 FLOAT,
price19 FLOAT,
price20 FLOAT,
price21 FLOAT,
price22 FLOAT,
price23 FLOAT,
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ';'
LINES TERMINATED BY '\n'
LOCATION 's3://myquicksighttestbucket/C1_SphdemDD_CANARIAS_20190501_20190531_v2'
Where the file in S3 is just a CSV delimited by semicolons.
However, I get the following error:
line 1:8: mismatched input 'external'. expecting: 'or', 'schema', 'table', 'view' (service: amazonathena; status code: 400; error code: invalidrequestexception; request id: e524f7e6-39ca-4af7-9e39-f86a4d0a36c8; proxy: null)
Can anybody tell what I am doing wrong? Any help is much appreciated.
Oooh! I am sorry, the error was the comma after the last field!!
And, also, instead of:
FIELDS TERMINATED BY ';'
I should have used the delimiter's OCT code (073) like this:
FIELDS TERMINATED BY '073'
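Putting both fixes together, a sketch of the corrected tail of the statement (abbreviated to the last two price columns; note there is no comma after the last one):
...
price22 FLOAT,
price23 FLOAT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '073'
LINES TERMINATED BY '\n'
LOCATION 's3://myquicksighttestbucket/C1_SphdemDD_CANARIAS_20190501_20190531_v2'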
Make sure the table name does not contain "-", spaces, or any other characters not allowed in table names.
I had invalid field names which included - characters. A rather easy mistake to make when copying names like flow-direction directly from the flow log definitions.
I had the same error today, and unlike the others, I had a PARTITIONED BY clause where I didn't specify the type for the column:
CREATE EXTERNAL TABLE IF NOT EXISTS table_name(
creationtime string,
anumber bigint,
somearray array<struct<...>>,
somestring string)
PARTITIONED BY (creation_date string)
^^^^^^ <--- 'string' was missing
ROW FORMAT SERDE
'org.openx.data.jsonserde.JsonSerDe'
LOCATION
's3://location/';
Once I added the type, the error vanished and the query was successful.
Lots of answers here already, but I just wanted to summarize and say it seems like any syntax error in the statement can cause this error.
In my case I had a trailing comma after the last item of my TBLPROPERTIES
I got the same error; changing the column datatype from INTEGER to INT resolved it for me.
https://docs.aws.amazon.com/athena/latest/ug/data-types.html
int and integer – Athena uses different expressions for integer depending on the type of query.
int – In Data Definition Language (DDL) queries like CREATE TABLE, use the int data type.
integer – In DML queries like SELECT * FROM, use the integer data type. integer is represented as a 32-bit signed value in two's complement format, with a minimum value of -2^31 and a maximum value of 2^31-1.
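As a minimal sketch (the table and bucket names here are hypothetical):
-- DDL: use int
CREATE EXTERNAL TABLE example_table (some_column int)
LOCATION 's3://example-bucket/example-prefix/';
-- DML: use integer
SELECT CAST(some_column AS integer) FROM example_table;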

How to convert text field with formatted currency to numeric field type in Postgres?

I have a table that has a text field which has formatted strings that represent money.
For example, it will have values like these, but also "bad" invalid data as well:
$5.55
$100050.44
over 10,000
$550
my money
570.00
I want to convert this to a numeric field but keep the actual numbers that can be retained, and for any that can't, convert to null.
I was originally using this function, which did convert clean numbers (numbers without any formatting). The issue was that it would not convert values like $5.55, instead setting them to null.
CREATE OR REPLACE FUNCTION public.cast_text_to_numeric(
v_input text)
RETURNS numeric
LANGUAGE 'plpgsql'
COST 100
VOLATILE
AS $BODY$
declare v_output numeric default null;
begin
begin
v_output := v_input::numeric;
exception when others then return null;
end;
return v_output;
end;
$BODY$;
I then created a simple update statement which removes all non-digit characters but keeps the period.
update public.numbertesting set field_1=regexp_replace(field_1,'[^\w.]','','g')
and if I run this statement, it correctly converts the text data to numeric and maintains the number:
alter table public.numbertesting
alter column field_1 type numeric
using field_1::numeric
But I need to use the function in order to properly discard any bad data and set those values to null.
Even after I run the cleanup so the text value is, say, 5.55,
my "cast_text_to_numeric" function STILL sets it to null? I don't understand why it sets this to null, while the statement above correctly converts it to a proper number.
How can I fix my cast_text_to_numeric function to properly convert values such as 5.55, etc.?
I'm ok with discarding (setting to NULL) any values that don't end up as numbers and a period. The regular expression will strip out all other characters... and if there happen to be two numbers in the text field, with the script they would be combined into one (spaces are removed), and I'm good with that.
In the example of data above, after conversion, the end result in numeric field would be:
5.55
100050.44
null
550
null
570.00
FYI, I am on Postgres 11 right now
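A minimal sketch of one way to combine the two steps, assuming the same regexp cleanup as the update statement above, applied inside the exception-safe cast:
CREATE OR REPLACE FUNCTION public.cast_text_to_numeric(v_input text)
RETURNS numeric
LANGUAGE plpgsql
AS $BODY$
declare v_output numeric default null;
begin
begin
-- strip everything the update statement would strip, then cast;
-- anything that still isn't a clean number raises and returns null
v_output := regexp_replace(v_input, '[^\w.]', '', 'g')::numeric;
exception when others then return null;
end;
return v_output;
end;
$BODY$;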

How to read and output numeric values properly in BigQuery?

I'm trying to read the following rows out of a CSV file stored in GCS
headers: "A","B","C","D"
row1:"4000,0000000000000","15400000,000","12311918,400000","3088081,600"
row2:"5000,0000000000000","19250000,000","15389898,000000","3860102,000"
The issue here is how BigQuery is actually interpreting and thus outputting these numbers:
(screenshot: results of query number 1)
It's interpreting A as FLOAT64, and B, C and D as INT64, which is okay since I decided to use schema auto-detection. But when I try to convert them to a different type, it still outputs the numbers incorrectly.
This is the query:
SELECT
CAST(quantity AS INT64) AS A,
CAST(expenses_2 AS FLOAT64) AS B,
CAST(expenses_3 AS FLOAT64) AS C,
CAST(expenses_4 AS FLOAT64) AS D
FROM
`wide-gecko-289100.bqtest.expenses`
These are the results of query above:
(screenshot: results of query number 2)
Either way, it's misinterpreting how to read the numbers; the output should be as follows:
row1: [4000] [15400000] [12311918,4] [3088081,6]
row2: [5000] [19250000] [15389898] [3860102]
Is there a way to solve this?
This is due to BigQuery not understanding the localized format you're using for the numeric values. It expects the period (.) character as the decimal separator.
If you can't deal with this earlier, in the process that produces the CSV files, another strategy in BigQuery is to instead use a string type for the columns and then do some manipulation.
Here's a simple conversion example that shows some string manipulation and casting to get to the desired type. If you're using both commas and periods as part of the localized format, you'll need more complex string manipulation.
WITH
sample_row AS (
SELECT "4000,0000000000000" as A, "15400000,000" as B,"12311918,400000" as C,"3088081,600" as D
)
SELECT
A,
CAST(REPLACE(A,",",".") AS FLOAT64) as A_as_float64,
CAST(CAST(REPLACE(A,",",".") AS FLOAT64) AS INT64) as A_as_int64
FROM
sample_row
You could also generalize this as a user defined function (temporary or persisted) to make it easier to reuse:
CREATE TEMPORARY FUNCTION parseAsFloat(instr STRING) AS (CAST(REPLACE(instr,",",".") AS FLOAT64));
WITH
sample_row AS (
SELECT "4000,0000000000000" as A, "15400000,000" as B,"12311918,400000" as C,"3088081,600" as D
)
SELECT
CAST(parseAsFloat(A) AS INT64) as A,
parseAsFloat(B) as B,
parseAsFloat(C) as C,
parseAsFloat(D) as D
FROM
sample_row
I think this is an issue with how BigQuery interprets a comma. It seems to detect it as a thousands separator rather than a decimal.
https://issuetracker.google.com/issues/129992574
Is it possible to replace with a "." instead?

store infinity in postgres json via django

I have a list of tuples like the one below:
[(float('inf'), 1.0), (270, 0.9002), (0, 0.0)]
I am looking for a simple serializer/deserializer that helps me store this tuple in a jsonb field in PostgreSQL.
I tried using JSONEncoder().encode(a_math_function) but it didn't help.
I am facing the following error while attempting to store the above list in jsonb field -
django.db.utils.DataError: invalid input syntax for type json
LINE 1: ...", "a_math_function", "last_updated") VALUES (1, '[[Infinit...
DETAIL: Token "Infinity" is invalid.
Note: the field a_math_function is of type JSONField()
t=# select 'Infinity'::float;
float8
----------
Infinity
(1 row)
because
https://www.postgresql.org/docs/current/static/datatype-numeric.html#DATATYPE-FLOAT
In addition to ordinary numeric values, the floating-point types have
several special values:
Infinity
-Infinity
NaN
yet JSON does not have such a possible value (unless it's a string)
https://www.json.org/
value
string
number
object
array
true
false
null
thus:
t=# select '{"k":Infinity}'::json;
ERROR: invalid input syntax for type json
LINE 1: select '{"k":Infinity}'::json;
^
DETAIL: Token "Infinity" is invalid.
CONTEXT: JSON data, line 1: {"k":Infinity...
Time: 19.059 ms
so it's not a Django or Postgres limitation - Infinity is simply an invalid JSON token, yet 'Infinity' is a valid string. So
t=# select '{"k":"Infinity"}'::json;
json
------------------
{"k":"Infinity"}
(1 row)
works... But Infinity here is "just a word". Of course you can save it as a string rather than a numeric value, check every string for the value "Infinity", and if it matches, have your program logic treat it as real Infinity... But in short, you can't do it, because the JSON specification does not support it... just as you can't store, say, the colour red #ff0000 in JSON except as a string, to be caught and processed by your engine...
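If you go the string route, here is a minimal Python sketch (the helper names are hypothetical, not part of Django or the question) that swaps float('inf') for the string "Infinity" before saving to the JSONField and back when reading:
import math

def encode_value(v):
    # map values JSON cannot represent to strings before saving
    if v == math.inf:
        return "Infinity"
    if v == -math.inf:
        return "-Infinity"
    return v

def decode_value(v):
    # restore the special floats when reading the JSONField back
    if v == "Infinity":
        return math.inf
    if v == "-Infinity":
        return -math.inf
    return v

data = [(float("inf"), 1.0), (270, 0.9002), (0, 0.0)]
json_safe = [[encode_value(x) for x in pair] for pair in data]
# json_safe == [['Infinity', 1.0], [270, 0.9002], [0, 0.0]], which is valid JSON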
update:
Postgres will itself cast the float to text in to_json:
t=# select to_json(sub) from (select 'Infinity'::float) sub;
to_json
-----------------------
{"float8":"Infinity"}
(1 row)
update
https://www.postgresql.org/docs/current/static/datatype-json.html
When converting textual JSON input into jsonb, the primitive types
described by RFC 7159 are effectively mapped onto native PostgreSQL
types
...
number – mapped to numeric (NaN and infinity values are disallowed)

QDateTime::fromString returns invalid Date, what am I missing?

I have some code that reads a datetime from a SQLite database; the datetime is returned as a string. When I try to convert it to a date using QDateTime::fromString it returns an invalid date. Below is the time as returned from the database and the conversion.
Why is this failing to parse?
// this is the value returned from the DB: currentServerTime = 2012-01-17 19:20:27.0
QString format("yyyy/MM/dd hh:mm:ss");
QString qCurrentServerTime(currentServerTime);
now = QDateTime::fromString(qCurrentServerTime, format);
I'm no expert in Qt, but if QDateTime::fromString() works as one would (reasonably) expect, and according to the documentation, you're not using the correct pattern.
You indicate the string read from the SQLite database is like "2012-01-17 19:20:27.0", so your format should be like yyyy-MM-dd HH:mm:ss.z.
In detail:
Your separator should be '-', not '/' (as you show in the example)
The time seems to be in 24-hour format (19 -> 7 p.m.), so use HH instead of hh
You have one digit for milliseconds, so add .z.
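Putting that together, a sketch of the corrected parsing (reusing the variable names from the question):
// '-' separators, 24-hour HH, and .z for the single millisecond digit
QString format("yyyy-MM-dd HH:mm:ss.z");
QDateTime now = QDateTime::fromString(qCurrentServerTime, format);
// now.isValid() should now be true for "2012-01-17 19:20:27.0"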