How to print special characters in Athena/Presto - amazon-web-services

I'm looking for a way to print tab-separated values in AWS Athena/Presto. The following query doesn't do it:
select 'field1\tfield2'
which gives (unsurprisingly)
field1\tfield2
while I would like
field1 field2
where the two fields are separated by a tab character.
EDIT: The "standard" syntax proposed by Piotr Findeisen:
SELECT U&'field1\0009field2'
Returns:
Your query has the following error(s):
Queries of this type are not supported (Service: AmazonAthena; Status Code: 400; Error Code: InvalidRequestException; Request ID: [...])

For the record, a possible solution is:
select 'field1'||chr(9)||'field2'
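With more than two fields, the concatenation above can be generated instead of typed by hand. Here is a minimal Python sketch, assuming the query text is built client-side. The helper name tab_join_sql is hypothetical and not part of any Athena client library, and it does no quoting or escaping.

```python
def tab_join_sql(expressions):
    """Join SQL expressions with a tab character, using chr(9) as in the query above.

    Each item must already be valid SQL (a quoted literal or a column name).
    """
    if not expressions:
        raise ValueError("need at least one expression")
    # chr(9) is the tab character; || is the string concatenation operator.
    return " || chr(9) || ".join(expressions)


query = "SELECT " + tab_join_sql(["'field1'", "'field2'"])
print(query)
```

Non-string columns would probably need a CAST to varchar before being concatenated this way.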

The standard SQL syntax uses a Unicode escape inside the string literal:
SELECT U&'field1\0009field2'
See more examples at https://trino.io/docs/current/language/types.html#varchar

Related

AWS Athena, Error when Len of String

I tried to show the length of a string:
SELECT length(fieldA) FROM "data_prod"."myscore" limit 10;
but I receive this error:
Your query has the following error(s):
SYNTAX_ERROR: line 1:8: Unexpected parameters (bigint) for function length. Expected: length(varchar(x)) , length(char(x)) , length(varbinary)
This query ran against the "raw_public_data_prod" database, unless qualified by the query. Please post the error message on our forum or contact customer support with Query Id: 92f6401b-108a-4671-951e-3a8e882f3b20.
This is what you are looking for.
SELECT length(cast(fieldA as varchar)) FROM "data_prod"."myscore" limit 10;
The length function doesn't accept the bigint data type, so you have to cast the column to varchar first.

AWS Athena select query to fetch error code from status column

I'm trying to run a select query in AWS Athena to fetch an error code from the status column, but I get the errors below.
The queries I tried:
select * from s3_accesslog where status = '404'
Error: SYNTAX_ERROR: line 1:78: '=' cannot be applied to integer, varchar(3)
select * from s3_accesslog where status like '%404%'
Error: SYNTAX_ERROR: line 1:71: Left side of LIKE expression must evaluate to a varchar (actual: integer)
It looks like your status codes are stored in the table as integers. If you remove the quotes, the query should work.
So try:
select * from s3_accesslog where status = 404

Trying to fetch only specific columns from a select query where the status column has an error code

I'm trying to fetch only specific columns (uri, hostheader) from an Athena query where the status column is like 404.
With the query below I get output for uri and hostheader, but I'm unable to fetch the results for status 404:
select
uri,
hostheader
from
accesslogs
where
CAST(status AS VARCHAR) like '%404%'
The solution was not to cast to varchar and instead to use the native int type:
select
uri,
hostheader
from
accesslogs
where
status = 404

"Where clause" is not working in AWS Athena

I used the AWS Glue Console to create a table from an S3 bucket in Athena. You can see the relevant part in the screenshot above. I obfuscated the column name, so assume the column name is "a test column". I would like to select the records with the value D in that column. The query I tried to run is:
SELECT
*
FROM
table
WHERE
"a test column" = "D"
Nothing is returned. I also tried to use IS instead of =, as well as to surround D with single quotes instead of double quotes within the WHERE clause:
-- Tried this
WHERE
"a test column" = 'D'
-- Tried this
WHERE
"a test column" IS "D"
-- Tried this
WHERE
"a test column" IS 'D'
Nothing works. Can someone help? Thank you.
The error message I got is
Mismatched input 'where' expecting (service: amazon athena; status code: 400; error code: invalid request exception; request id: 8f2f7c17-8832-4e34-8fb2-a78855e3c17d)
The problem is with the query syntax. Use single quotes (') when you refer to string values, because double quotes refer to a column name in your table.
SELECT
*
FROM
table
WHERE
"column_name" = 'D'
The unexpected answer (apologies if I did not say it clearly in the original post) is that I cannot put "limit 200" in front of the where clause; I have to add it at the end. Hope it helps others.
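The clause-order point can be reproduced locally. The sketch below uses Python's built-in sqlite3 module as a stand-in for Athena, which is an assumption: SQLite is a different engine, but it also requires LIMIT to come after WHERE. The table name demo is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Double quotes name the column; single quotes mark string values.
conn.execute('CREATE TABLE demo ("a test column" TEXT)')
conn.executemany("INSERT INTO demo VALUES (?)", [("D",), ("E",), ("D",)])

# WHERE first, LIMIT last: this runs.
good = """SELECT * FROM demo WHERE "a test column" = 'D' LIMIT 200"""
rows = conn.execute(good).fetchall()
print(rows)

# LIMIT before WHERE: the parser rejects it.
bad = """SELECT * FROM demo LIMIT 200 WHERE "a test column" = 'D'"""
try:
    conn.execute(bad)
    limit_first_failed = False
except sqlite3.OperationalError as exc:
    limit_first_failed = True
    print("rejected:", exc)
```

The exact error text differs between engines, but the fix is the same: move LIMIT to the end of the statement.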

substring match in redshift database

I have a Redshift table "person" in which a particular column has data like this:
[{"attributeName":"name","attributeMetadata":null,"attributeValue":"KitchenAid - 7-Speed Hand Mixer - White","attributeImageType":"PRODUCT","attributeStatusCodes":[]},
{"attributeName":"title","attributeMetadata":null,"attributeValue":"KitchenAid","attributeImageType":"PRODUCT","attributeStatusCodes":[]},
{"attributeName":"address","attributeMetadata":null,"attributeValue":"address","attributeImageType":"PRODUCT","attributeStatusCodes":[]},
{"attributeName":"PIN CODE","attributeMetadata":null,"attributeValue":"32110","attributeImageType":"IMG","attributeStatusCodes":[]}]
I would like to extract only the dictionary/json/substring containing PIN CODE (see below)
{"attributeName":"PIN CODE","attributeMetadata":null,"attributeValue":"32110","attributeImageType":"IMG","attributeStatusCodes":[]}
I tried the following query and it gives the following error:
select distinct regexp_substr(attributes,'.*({.*?"attributeName":"PIN CODE".*?}).*') from person ;
ERROR: Invalid content of repeat range
DETAIL:
-----------------------------------------------
error: Invalid content of repeat range
code: 8002
context: T_regexp_init
query: 528401
location: funcs_expr.cpp:130
process: query2_40 [pid=12603]
-----------------------------------------------
I guess the problem occurs because there are multiple attributeName entries in a single column. Is there a way to achieve the desired result?
I am not sure if I understood you correctly, but you can try to use LIKE:
select * from person where attributes LIKE '%"attributeName":"PIN CODE"%';
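Note that LIKE only filters rows; it still returns the whole column value. The regex error is likely caused by the unescaped braces, which a POSIX-style regex engine reads as a repeat range, though I have not verified that on Redshift. If the column value can be pulled into client code, parsing it as JSON avoids regular expressions entirely. Here is a minimal Python sketch of that alternative; the helper name find_attribute is hypothetical, and the sample is a shortened copy of the data in the question.

```python
import json


def find_attribute(attributes_json, name):
    """Return the first object whose attributeName equals name, or None."""
    for item in json.loads(attributes_json):
        if item.get("attributeName") == name:
            return item
    return None


sample = """
[{"attributeName":"title","attributeMetadata":null,"attributeValue":"KitchenAid","attributeImageType":"PRODUCT","attributeStatusCodes":[]},
{"attributeName":"PIN CODE","attributeMetadata":null,"attributeValue":"32110","attributeImageType":"IMG","attributeStatusCodes":[]}]
"""

pin = find_attribute(sample, "PIN CODE")
# Serialize the matching object back to compact JSON text.
print(json.dumps(pin, separators=(",", ":")))
```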