AWS Athena: How to get the RequestID from a failed Start-Query-Execution

I am getting an empty response from a failed start-query-execution call.
I need to get the requestId of the failed query from start-query-execution.

By "failed query" I assume you mean a query that was started but ended up in a failed state. The way you discover that a query ends up in a failed state is to call Client#get_query_execution, which takes the query execution ID.
The query execution ID is the only property of the response object from the start query execution call:
response = athena.start_query_execution(…)
puts response.query_execution_id
If the call throws an error, that means the query never started.
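If it did start, you can look up the failure with that ID. Here is a minimal sketch using the Python SDK (boto3); the query string, output location, and polling interval are illustrative placeholders, not part of the original answer:
import time
import boto3

athena = boto3.client("athena")

# start_query_execution only returns the query execution ID.
response = athena.start_query_execution(
    QueryString="SELECT 1",  # placeholder query
    ResultConfiguration={"OutputLocation": "s3://example-bucket/athena-results/"},  # placeholder location
)
query_execution_id = response["QueryExecutionId"]

# Poll get_query_execution until the query reaches a terminal state.
while True:
    status = athena.get_query_execution(QueryExecutionId=query_execution_id)["QueryExecution"]["Status"]
    if status["State"] in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

print(query_execution_id, status["State"], status.get("StateChangeReason"))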

Related

Informatica powercenter power exchange PWX-00267 DBAPI error

I am executing a workflow in Informatica which is supposed to insert values into a target file.
Some of the records are getting inserted, but I get an error after a few insertions saying:
[Informatica][ODBC PWX Driver] PWX-00267 DBAPI error for file……… Write error on record 119775 Requested 370 SQLSTATE [08S01]
Is this because of file constraints on how the record can be, or due to some other reason?
I'm not sure if this is exactly the case, but searching for the error code 08S01 I found a site that lists Data Provider Error Codes. Under SQLCODE 370 (assuming this is what your error message indicates) I found:
Message: There are insufficient resources on the target system to complete the command. Contact your server administrator.
Reason: The resource limits reached reply message indicates that the command could not be completed due to insufficient server resources (e.g. memory, lock, buffer).
Action: Verify the connection and command parameters, and then re-attempt the connection and command request. Review a client network trace to determine if the server returned a SQL communications area reply data (SQLCARD) with an optional reason code or other optional diagnostic information.

What determines a "transient error" in AWS Athena query `FAILED` states?

According to https://docs.aws.amazon.com/athena/latest/APIReference/API_QueryExecutionStatus.html:
Athena automatically retries your queries in cases of certain transient errors. As a result, you may see the query state transition from RUNNING or FAILED to QUEUED.
As such, when a query execution is in a FAILED state, how can one determine (ideally from the API) if it is a transient error (and thus will transition back to RUNNING or QUEUED) or not?

Google Cloud Function: Error: memory limit exceeded. Function invocation was interrupted, but it works

I created a new Python Google Cloud Function that schedules a query in BigQuery every 10 minutes. I tested it and it works.
Deployment works fine.
Testing gives this error: Error: memory limit exceeded. Logs are not available (but I can see that the query did run as expected in BigQuery).
Using an HTTP trigger in Cloud Scheduler, I get a failure with the error message status: 503, but again, I can see in the BigQuery console that it is running as expected.
Edit: here is the code for the function:
from google.cloud import bigquery

def load(request):
    client = bigquery.Client()
    dataset_id = 'yyyyyyyy'
    table_id = 'xxxxxxxxxxx'
    job_config = bigquery.QueryJobConfig()
    job_config.use_legacy_sql = False
    table_ref = client.dataset(dataset_id).table(table_id)
    job_config.destination = table_ref
    job_config.write_disposition = bigquery.WriteDisposition.WRITE_TRUNCATE
    sql = """
    SELECT * FROM `xxxxxx.datastudio.today_view`;
    """
    query_job = client.query(sql, location='asia-northeast1', job_config=job_config)
    query_job.result()
    print("Job finished.")
The BigQuery job is asynchronous. Your Cloud Function triggers it and then waits for it to complete. If the function fails in between, it's not a problem; the two services aren't correlated.
If you do this via the API, when you create a job (a query) you immediately get a job ID. You then have to poll that job ID regularly to know its status. The client library does exactly the same thing!
Your out-of-memory issue comes from the result() call, which waits for completion and reads the results. Set a page_size or max_results to limit the data returned.
Alternatively, you can skip waiting for the end and exit immediately (drop the query_job.result() line). You will save Cloud Functions processing time (a useless wait), and thus money!
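A rough sketch of the trimmed-down function under those assumptions (the dataset, table, and query placeholders are the ones from the question):
from google.cloud import bigquery

def load(request):
    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig()
    job_config.use_legacy_sql = False
    job_config.destination = client.dataset('yyyyyyyy').table('xxxxxxxxxxx')
    job_config.write_disposition = bigquery.WriteDisposition.WRITE_TRUNCATE
    sql = "SELECT * FROM `xxxxxx.datastudio.today_view`"
    query_job = client.query(sql, location='asia-northeast1', job_config=job_config)
    # Option 1: still wait, but cap how much of the result is pulled back into the function.
    # query_job.result(max_results=10)
    # Option 2 (shown here): don't wait at all; BigQuery keeps running the job after we return.
    return "Submitted BigQuery job {}".format(query_job.job_id)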

Historic external task logs

I have a test case which starts a process using the REST API POST /process-definition/key/{key}/start with an invalid variable in the request body. From the response I took the ID of the process instance to verify the external task logs using the REST API GET /history/external-task-log.
In my testing I am expecting failureLog to be true in the historic external task log for that process ID, but the response actually has deletionLog as true.
Is this expected behavior or is it a bug?

Step Functions: How to share context between Lambdas?

I have a data processing workflow like this. The Download task creates a session ID (GUID) and passes it to the Parse task and then the Post task. If an exception occurs in any of these three tasks, the workflow jumps to the Failed task. The Failed task updates the status of the process as failed in DynamoDB. To do that, it needs the session ID.
Is there any way to pass the session ID to the Failed task?
Or, if the session ID is created outside and passed in to the workflow, is it possible to share this ID with all the tasks?
Specify the ResultPath property in the error catcher. By default it is $, which means that the output of a failed Parallel State will be only the error info. However, if you set ResultPath to, for example, $.error_info, then you will preserve the state, and the error data will be accessible under the error_info property.
For more details, see https://docs.aws.amazon.com/step-functions/latest/dg/concepts-error-handling.html (Error Handling).
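For illustration, a catcher on one of the tasks (Parse here) could look like the sketch below; the state names come from the question, the Lambda ARN is a placeholder, and error_info is an arbitrary property name:
"Parse": {
  "Type": "Task",
  "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:Parse",
  "Next": "Post",
  "Catch": [
    {
      "ErrorEquals": ["States.ALL"],
      "ResultPath": "$.error_info",
      "Next": "Failed"
    }
  ]
}
Because ResultPath is $.error_info rather than the default $, the Failed state still receives the original input (including the session ID), with the error details added under error_info.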