How to get an error of query to athena via boto3? - amazon-web-services

Does boto3 have any method which allows one to get the text of the error if the query failed? get_query_execution returns a status of the query only.

You can get the error message from the 'StateChangeReason' field of response['QueryExecution']['Status'].
As per the get_query_execution documentation:
StateChangeReason (string) --
Further detail about the status of the query.
import boto3
client = boto3.client('athena')
failed_query_id = '08adbf00-5f14-4d54-9311-fd55e2024781'
response = client.get_query_execution(QueryExecutionId=failed_query_id)
# The status lives under the top-level 'QueryExecution' key of the response
print(response['QueryExecution']['Status']['StateChangeReason'])
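A small helper can make this lookup reusable. The sketch below assumes an Athena client is passed in; the function name `get_query_error` is hypothetical, and it reads only the `State` and `StateChangeReason` fields of the get_query_execution response.

```python
def get_query_error(client, query_execution_id):
    """Return the failure reason of an Athena query, or None if it did not fail."""
    response = client.get_query_execution(QueryExecutionId=query_execution_id)
    status = response["QueryExecution"]["Status"]
    # StateChangeReason may be absent, so use .get() instead of indexing.
    if status["State"] in ("FAILED", "CANCELLED"):
        return status.get("StateChangeReason", "no reason reported")
    return None

# Usage (needs AWS credentials and a real query ID):
# import boto3
# print(get_query_error(boto3.client("athena"), "08adbf00-5f14-4d54-9311-fd55e2024781"))
```

Returning None for queries that are still running or succeeded keeps the caller from hitting a KeyError on a missing reason.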

Related

Athena query executed through boto3 python client gives smaller result compared to query executed through AWS cli

I want to execute a very simple query through Athena.
Query: select * from information_schema.tables
When I execute the query using the boto3 client with the following code:
...
def run_query(query_string):
    query_execution_context = {"Catalog": "awsdatacatalog", "Database": "information_schema"}
    response = athena_client.start_query_execution(
        QueryString=query_string, QueryExecutionContext=query_execution_context, WorkGroup="primary"
    )
    return response
query_string_get_tables = "select * from information_schema.tables"
response = run_query(query_string_get_tables)
I get back a result of 9 rows in 0.6s.
When I then go to the AWS console and rerun the same query I get back a result of 500 rows in 6s.
The result from the AWS console is correct. How can I get the same result using the boto3 client?
EDIT:
I downloaded the query history and compared the query string. As you can see they are exactly the same. I also removed the QueryExecutionContext in the boto3 client call but this doesn't change anything. Besides, I tried all combinations of single/double quotes.
Query history:
37b72ac5-3223-496f-8293-79eab8a661a0,select * from information_schema.tables,2022-12-02T18:23:09.738-08:00,SUCCEEDED,6.503 sec,39.01 KB,Athena engine version 2,'-
9d3a274a-8109-4988-aaf8-bba9c8733208,select * from information_schema.tables,2022-12-02T18:14:11.385-08:00,SUCCEEDED,520 ms,0.67 KB,Athena engine version 2,'-
As mentioned in the comments, boto3 takes some effort here: you have to call start_query_execution, wait for the query to complete, and then call get_query_results (https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/athena.html#Athena.Client.get_query_results).
To make your life easier, you can use the open-source library AWS SDK for pandas (formerly AWS Data Wrangler, package awswrangler). With this library you can get the results in a blocking manner:
import awswrangler as wr
# Retrieving the data from Amazon Athena
df = wr.athena.read_sql_query("SELECT * FROM my_table", database="my_db")
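For reference, the manual boto3 flow (start, wait, fetch) could look roughly like the sketch below. The function name `run_query_and_fetch` and the polling interval are assumptions made for illustration. It follows NextToken so that results spanning several pages are collected in full rather than stopping at the first page.

```python
import time

def run_query_and_fetch(client, query_string, database, workgroup="primary", poll_s=1.0):
    """Start an Athena query, block until it finishes, and return all result rows."""
    query_id = client.start_query_execution(
        QueryString=query_string,
        QueryExecutionContext={"Database": database},
        WorkGroup=workgroup,
    )["QueryExecutionId"]

    # Poll until the query reaches a terminal state.
    while True:
        status = client.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]["Status"]
        if status["State"] in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_s)
    if status["State"] != "SUCCEEDED":
        raise RuntimeError(status.get("StateChangeReason", status["State"]))

    # get_query_results is paginated; follow NextToken to collect every page.
    rows, token = [], None
    while True:
        kwargs = {"QueryExecutionId": query_id}
        if token:
            kwargs["NextToken"] = token
        page = client.get_query_results(**kwargs)
        rows.extend(page["ResultSet"]["Rows"])
        token = page.get("NextToken")
        if not token:
            return rows

# Usage (needs AWS credentials and a workgroup with an output location configured):
# import boto3
# rows = run_query_and_fetch(boto3.client("athena"),
#                            "select * from information_schema.tables",
#                            "information_schema")
```

For SELECT queries the first returned row typically holds the column names, so callers may want to skip it.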

AWS Athena + boto3: How am I supposed to execute named queries?

Concerning the following draft script, I would like to know: how can I execute the named query I created?
I can access the query via the browser interface, but would like to execute it via the Session.
Here the answer is to use the client.start_query_execution(...) command. But what's the point when that does not run the named query I created, but instead a non-named query with the same query_string? Or am I missing something essential in how to use this?
import boto3
sess = boto3.session.Session(
    region_name=region,
    aws_access_key_id=ACCESS_KEY,
    aws_secret_access_key=SECRET_KEY
)
athenaclient = sess.client('athena')
query_string = '''
SELECT *
FROM "ccindex"."ccindex"
WHERE crawl = 'CC-MAIN-2020-34'
AND subset = 'warc'
AND url_host_tld = 'de'
AND url_query IS NULL
AND url_path like '%impressum%'
LIMIT 20000
'''
resp = athenaclient.create_named_query(
    Name='filter-ccindex-de',
    Description='Filter *.de/impressum websites of Common Crawl index',
    Database='ccindex',
    QueryString=query_string
)
I don't think there is a direct option to pass a named query to the start_query_execution method. But this can be achieved with get_named_query, which accepts the NamedQueryId of the named query (returned by create_named_query, or discoverable via list_named_queries) and returns the QueryString in its response under the NamedQuery key.
Then you can read that QueryString from the response and pass it to start_query_execution.
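The two steps above might be combined as in the following sketch. The helper name `run_named_query` and the `output_location` parameter are assumptions for illustration; whether ResultConfiguration is needed depends on how the workgroup is set up.

```python
def run_named_query(client, named_query_id, output_location=None):
    """Look up a saved Athena query by its ID and start an execution of it."""
    named = client.get_named_query(NamedQueryId=named_query_id)["NamedQuery"]
    kwargs = {
        "QueryString": named["QueryString"],
        "QueryExecutionContext": {"Database": named["Database"]},
    }
    # Only needed when the workgroup does not already define an output location.
    if output_location:
        kwargs["ResultConfiguration"] = {"OutputLocation": output_location}
    return client.start_query_execution(**kwargs)["QueryExecutionId"]

# Usage with the objects from the question (the bucket name is hypothetical):
# query_id = run_named_query(athenaclient, resp["NamedQueryId"],
#                            "s3://my-bucket/athena-results/")
```

The named query acts purely as stored SQL text here; the execution itself is an ordinary start_query_execution call.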

Sagemaker boto3 invoke_endpoint - I keep getting type errors for payload. using Blazingtext model endpoint

Let me frame the issue. I have trained a BlazingText model and have an endpoint deployed.
Within my notebook instance I can call model.predict and get inferences from the endpoint.
I am now trying to set up a Lambda and an API Gateway for the endpoint. I am having trouble figuring out what the payload is supposed to be for invoke_endpoint(EndpointName=mymodel, Body=payload).
I keep getting invalid payload format errors.
This is what my payload looks like when testing the Lambda:
{"instances":"string of text"}
The documentation says the body takes bytes or file-like objects. I have tinkered around with io with no luck. There are no good blogs or tutorials out there for this particular issue, only a bunch of videos going over the cookie-cutter examples.
import io
import os
import boto3
import json
import csv
# grab environment variables
ENDPOINT_NAME = os.environ['ENDPOINT_NAME']
runtime = boto3.client('runtime.sagemaker')
def lambda_handler(event, context):
    print("Received event: " + json.dumps(event, indent=2))
    data = json.loads(json.dumps(event))
    payload = data["instances"]
    print(data)
    #print(payload)
    response = runtime.invoke_endpoint(EndpointName=ENDPOINT_NAME,
                                       ContentType='application/json',
                                       Body=payload.getvalue())
    #print(response)
    #result = json.loads(response['Body'].read().decode())
    #print(result)
    #pred = int(result['predictions'][0]['score'])
    #predicted_label = 'M' if pred == 1 else 'B'
    return
"errorMessage": "An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (406) from model with message \"Invalid payload format\"
If your payload is what you describe, i.e.:
payload = {"instances":"string of text"}
then you can get it as a JSON string using:
json.dumps(payload)
# which gives:
'{"instances": "string of text"}'
If you want it as a byte array, encode the string:
json.dumps(payload).encode()
# which gives:
b'{"instances": "string of text"}'
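Applied to the Lambda handler, the serialization step might look like the sketch below. The helper name `build_invoke_body` is hypothetical. Wrapping a lone string in a list is an assumption based on the BlazingText inference format, which to my knowledge expects "instances" to be a list of sentences; drop that step if your endpoint accepts a plain string.

```python
import json

def build_invoke_body(event):
    """Serialize a Lambda event into a UTF-8 JSON body for invoke_endpoint."""
    instances = event["instances"]
    # Assumption: the model wants a list of sentences, so wrap a single string.
    if isinstance(instances, str):
        instances = [instances]
    return json.dumps({"instances": instances}).encode("utf-8")

body = build_invoke_body({"instances": "string of text"})
print(body)  # b'{"instances": ["string of text"]}'

# Inside lambda_handler, the body would then be passed as:
# response = runtime.invoke_endpoint(EndpointName=ENDPOINT_NAME,
#                                    ContentType='application/json',
#                                    Body=body)
```

Passing the whole JSON document, rather than only the value of "instances", keeps the body consistent with the application/json content type.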

Presigned URL for DynamoDB put_item

There are a few examples of how to pre-sign the URL of an S3 request, but I couldn't find any working example of pre-signing for other AWS services.
I'm trying to write an item to DynamoDB using the Python SDK boto3. The SDK includes the option to generate the pre-signed URL here. I'm trying to make it work and I'm getting a URL, but the URL responds with 404 and the item does not appear in the DynamoDB table.
import json
import boto3
ddb_client = boto3.client('dynamodb')
response = ddb_client.put_item(
    TableName='mutes',
    Item={
        'email': {'S':'g#g.c'},
        'until': {'N': '123'}
    }
)
print("PutItem succeeded:")
print(json.dumps(response, indent=4))
This code is working directly. But when I try to presign it:
ddb_client = boto3.client('dynamodb')
params = {
    'TableName': 'mutes',
    'Item': {
        'email': {'S':'g#g.c'},
        'until': {'N': '1234'}
    }
}
response = ddb_client.generate_presigned_url('put_item', Params = params)
and check the URL:
import requests
r = requests.post(response)
r
I'm getting: Response [404]
Any hint on how to get it working? I checked the IAM permissions, and they are giving full access to DynamoDB.
Please note that you can sign a request to DynamoDB using Python, as you can see here: https://docs.aws.amazon.com/general/latest/gr/sigv4-signed-request-examples.html#sig-v4-examples-post . But for some reason, the implementation in the boto3 library doesn't do that. Using the boto3 library would be much easier than the code in that example, as I wouldn't need to provide the credentials to the function.
You are sending an empty POST request. You should add the data to the request:
import requests
r = requests.post(response, data = params)
I think this is the issue you are having, and that is why you are receiving a 404.
They recommend using Cognito for authentication instead of IAM for these cases.

Boto - how to delete a spot request given a request id

I make a spot request using the below:
req = conn_spot.request_spot_instances(
    price=self.spot_price,
    instance_type=self.instance_type,
    ebs_optimized=self.ebs_optimized,
    image_id=self.ami,
    placement=self.zone,
    key_name=self.keypair,
    security_groups=[self.security_group]
)
I am able to get the spot request ID using the below:
request_id = req[0].id
I can check on the status of my request ID using the below:
reqs = conn_spot.get_all_spot_instance_requests()
Now, given the request_id, I need to cancel the request, e.g. if it is taking too long. How do I do that using boto?
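Call cancel_spot_instance_requests on the same connection, passing the request IDs as a list:
conn_spot.cancel_spot_instance_requests([request_id])
The signature is cancel_spot_instance_requests(request_ids, dry_run=False).
See: boto documentation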
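For the "taking too long" case, a timeout wrapper could look like this sketch. The function name `cancel_if_not_fulfilled` and the default timings are assumptions for illustration. It relies on the legacy boto (v2) EC2 connection methods used in the question and on the `state` attribute of the returned request objects.

```python
import time

def cancel_if_not_fulfilled(conn, request_id, timeout_s=300, poll_s=15):
    """Wait for a spot request to leave the 'open' state; cancel it on timeout."""
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        req = conn.get_all_spot_instance_requests(request_ids=[request_id])[0]
        if req.state != "open":
            # Fulfilled, failed, or already cancelled: nothing left to do.
            return req.state
        time.sleep(poll_s)
    # Still open after the deadline, so cancel it. The method expects a list of IDs.
    conn.cancel_spot_instance_requests([request_id])
    return "cancelled"

# Usage with the objects from the question:
# final_state = cancel_if_not_fulfilled(conn_spot, req[0].id, timeout_s=600)
```

Note that cancelling a spot request does not terminate an instance that was already launched for it; that has to be terminated separately.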