How to do MD5 hashing of a string in Athena? - amazon-athena

The MD5 hashing function in Athena is not working for strings, even though the documentation suggests it should: https://docs.aws.amazon.com/redshift/latest/dg/r_MD5.html
Not sure what I am missing here. If I transform the varchar to varbinary, the hash that gets generated is not correct.
Getting this error :
SYNTAX_ERROR: line 1:8: Unexpected parameters (varchar(15)) for function md5. Expected: md5(varbinary)
This query ran against the "temp" database, unless qualified by the query. Please post the error message on our forum or contact customer support with Query Id: dd959e8a-7fa4-4170-8895-ce7cf58be6ea.

The md5 function in Athena/Presto takes binary input. You can convert a string to a varbinary using the to_utf8 function:
SELECT md5(to_utf8('hello world'))
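The result is a varbinary. If you want the usual 32-character hex digest as a varchar, you can wrap it in Presto's to_hex (plus lower for lowercase); a minimal sketch:
SELECT lower(to_hex(md5(to_utf8('hello world'))))
-- returns '5eb63bbbe01eeed093cb22bb8f5acdc3'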

Related

AWS DynamoDB fails with "The provided key element does not match the schema" because of CSV format

I am working on importing data from an S3 bucket to DynamoDB using Data Pipeline. The data is in CSV format. I have been struggling with this for a week now, and finally came to know the real problem.
I have some fields; the important ones are id (partitionKey) and username (sortKey).
Now one of the entries in the data has a username containing a comma, for example: {"username": "someuser,name"}. The trouble with a CSV (comma-separated) file is that when mapping to DynamoDB, the comma is taken as the start of a new column. And so it fails with the error The provided key element does not match the schema, which is of course correct.
Is there any way I can overcome this issue? Thanks in advance for your suggestions.
EDIT:
As an example, the CSV entry looks like this:
1234567,"user,name",$123$,some#email.de,2002-05-28 14:07:04.0,2013-07-19 14:17:05.0,2020-02-19 15:32:18.611,2014-02-27 14:49:19.0,,,,

Can I get the resulting BigQuery SQL from a query string and a list of positional parameters?

I'm using the Java library to connect to BigQuery and get data. I'd like to display back to the user what the final SQL was, with its positional parameters inserted. Is that possible? I've looked through the QueryJobConfiguration.Builder API and some other classes, but I couldn't find anything.
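For context, positional parameters in BigQuery SQL are ? placeholders, so the goal is to turn the first line below into something like the second (table and values here are hypothetical):
SELECT name FROM mydataset.users WHERE age > ? AND city = ?
-- desired display: SELECT name FROM mydataset.users WHERE age > 21 AND city = 'Berlin'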

“We can’t parse this SQL syntax” shows up in AWS QuickSight after applying toString()

I have a calculated field which computes a total based on a particular type:
sumIf(amount, type = "sale")
Now I'm trying to convert the result to string and then concatenate some text to it, but doing toString(sumIf(amount, type = "sale")) gives the following message:
We can’t parse this SQL syntax. If you are using custom SQL, verify the syntax and try again. Otherwise, contact support.
Is there any way to make this work?
Did you try using the correct bracket type? i.e.
toString(sumIf({amount}, {type} = "sale"))
I tried an example like this and it worked fine; QuickSight can have issues when fields are referenced without the right brackets.
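For the concatenation the question asks about, a sketch using QuickSight's concat function (the 'Total sales: ' label is just a placeholder):
concat('Total sales: ', toString(sumIf({amount}, {type} = "sale")))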

How to get length of a string column in Athena?

How to get the length of a VARCHAR or STRING column in AWS Athena? The AWS documentation doesn't mention a length function equivalent to the LEN() function in Redshift.
Presto's length() function works for getting the size of a STRING/VARCHAR column.
Usage: length(column_name)
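A minimal sketch, assuming a hypothetical table users with a varchar column username:
SELECT username, length(username) AS username_length FROM users
On a literal, SELECT length('Athena') returns 6.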

How do I insert a time value in Redshift in a time column?

Given the following query
ALTER TABLE public.alldatatypes ADD x_time time;
how do I insert a value into x_time?
Time appears to be a valid column type according to the documentation.
https://docs.aws.amazon.com/redshift/latest/dg/r_Datetime_types.html#r_Datetime_types-time
https://docs.aws.amazon.com/redshift/latest/dg/c_Supported_data_types.html
However, when I try to do an insert, I always get an error.
Insert query: insert into public.alldatatypes(x_time) values('08:00:00');
Error:
SQL Error [500310] [0A000]: Amazon Invalid operation:
Specified types or functions (one per INFO message) not supported on
Redshift tables.;
I do not want to use another column type.
I am testing all the column types defined in the documentation.
That cryptic error message is the one Redshift gives when you try to use a leader-node-only function as source data for a compute node. So I expect you aren't showing the exact code you ran to generate this error. I know it can seem like you didn't change anything important to the issue, but you likely have.
You see, select now(); works just fine, but insert into <table> select now(); will give the error you are showing. This is because now() is a leader-node-only function. However, insert into <table> select getdate(); works fine - this is because getdate() runs on the compute nodes.
Now the following SQL runs just fine for me:
create table fred (ttt time);
insert into public.fred(ttt) values('01:23:00'); -- this is more correctly written as values('01:23:00'::time)
insert into public.fred(ttt) select getdate()::time;
select * from fred;
While this throws the error you are getting:
insert into public.fred(ttt) select now()::time;
So if this doesn't help clear things up, please post a complete test case that demonstrates the error.