I am trying to create an external function in Athena using AWS Lambda function. I am able to do so and query successfully using Athena query editor. Code is below.
using external function s3signedurl(col1 VARCHAR) returns varchar lambda 'customudf'
select incident, pdfloc, s3signedurl(pdfloc) as s3_signed_url
from "lambdapoc"."lambdametadata"
However I want to create a View on top of this so I can query it easily like a table/view (Select * from MYVIEW) from a Reporting Tool like Tableau OR any other tool which connects to Athena.
I don't seem to find the documentation for it or any examples on the Internet.
Please note: Column 's3_signed_url' data is dynamically generated from Lambda function and it will change everytime the query is executed.
Can anyone help?
I just got to know that "Using connectors or UDFs in Views are not supported". It is mentioned in their GitHub Repo.
Closing this question
Might be worth exploring if a Create Table As Select (CTAS) table would meet your needs. I believe it is compatible with UDFs
Related
I have been working with AWS Athena for a while and need to do create a backup and version control of the views. I'm trying to build an automation for the backup to run daily and get all the views.
I tried to find a way to copy all the views created in Athena using boto3, but I couldn't find a way to do that. With Dbeaver I can see and export the views SQL script but from what I've seen only one at a time which not serve the goal.
I'm open for any way.
I try to find answer to my question in boto3 documentation and Dbeaver documentation. read thread on stack over flow and some google search did not took me so far.
Views and Tables are stored in the AWS Glue Data Catalog.
You can Query the AWS Glue Data Catalog - Amazon Athena to obtain information about tables, partitions, columns, etc.
However, if you want to obtain the DDL that was used to create the views, you will probably need to use SHOW CREATE TABLE [db_name.]table_name:
Analyzes an existing table named table_name to generate the query that created it.
Have you tried using get_query_results in boto3? get_query_results
I have a table in BigQuery containing consumers' reviews, some of them are in local languages and I need to use a translation API to translate them and create a new column to the existing table incorporating the transalted reviews. I was wondering whether I can automate this task? e.g. using Google Translate API in BigQuery....
An alter solution to achieve this if customer review has some limited review comments in response then you can create a Bigquery function to replace values.
A sample code is given over github repository.
If you want to use an external API in BigQuery, like a Language Translation API, you can use Remote Functions (a recent release).
In this GitHub repo you can see how to wrap the Azure Translator API (the same way you can use the Google Translate API) into a SQL function and use it in your queries.
Since you have created the Translation SQL function, you can write an update statement (and run it periodically - using scheduled queries) to achieve what you want.
UPDATE mytable SET translated_review_text=translation_function(review_text) WHERE translated_review_text IS NULL
Is there a bug in the information_schema.views implementation in AWS Athena?
I'm getting 0 rows returned when running the query SELECT * FROM information_schema.views in AWS Athena even though the database I'm running on has tables and views in it.
Wanted to check if anyone else is facing the same issue.
I'm trying to fetch the view definition script for ALL views in the AWS Athena database as a single result set instead of using the SHOW CREATE VIEW statement.
You should be using
SHOW TABLES IN example_database; to get information about tables in Athena database.
And the loop through use describe query to fetch details.
Hope this will help!
I've crawled a couple of XML files on S3 using AWS Glue, using a simple XML classifier:
However, when I try running any query on that data using AWS Athena, I get the following error (note that it's the simplest possible query I'm doing here):
HIVE_UNKNOWN_ERROR: Unable to create input format
Note that Athena can see my tables and it can see the columns, it just can't query them:
I noticed that there is someone with the same problem on the AWS Discussion forums: Athena XML Query Give HIVE Unknown Error but it got no love from anyone.
I know there is a similar question here about this error but the query in question targeted an RDS database, unlike an S3 bucket like I have here.
Has anyone got a solution for this?
Sadly at this time 12/2018 Athena cannot query XML input which is hard to understand when you may hear that Athena along with AWS Glue can query xml.
What output you are seeing from the AWS crawler is correct though, just not what you think its doing! For example after your crawler has run and you see the tables, but cannot execute any Athena queries. Go into your AWS Glue Catalog and at the right click tables, click your table, edit properties it will look something like this:
Notice how input format is null? If you have any other tables you can look at their properties or refer back to the input formatters documentation for Athena. This is the error you recieve.
Solutions:
convert your data to text/json/avro/other supported formats prior to upload
create a AWS glue job which converts a source to target from xml to target supported Athena format(compressed hopefully with ORC/Parquet)
I want to provide runtime values to the query in Select & Create table statements. What are the ways to parameterize Athena SQL queries?
I tried with PREPARE & EXECUTE statements from Presto however it is not working in Athena console. Do we need any external scripts like Python to call it?
PREPARE my_select1
FROM SELECT * from NATION;
EXECUTE my_select1 USING 1;
The SQL and HiveQL Reference documentation does not list PREPARE nor EXECUTE as available commands.
You would need to fully construct your SELECT statement before sending it to Amazon Athena.
You have to upgrade to Athena engine version 2 and now this seems to be supported as of 2021-03-12 but I can't find an official report:
https://docs.aws.amazon.com/athena/latest/ug/querying-with-prepared-statements.html
Athena does not support Parameterized queries. How ever you can create user-defined functions that you can call in the body of a query. Refer to this to know more about UDFs.