Boto3 start_text_translation_job TooManyRequestsException - amazon-web-services

Error in Python running Boto3 start_text_translation_job
botocore.errorfactory.TooManyRequestsException: An error occurred (TooManyRequestsException) when calling the StartTextTranslationJob operation: Request failed due to too many requests.
I wrote a Python script to kick off batch translation from EN to 48 languages. The first 10 submitted fine, but the 11th one got the above error.

At first, I thought I had to slow down and put a sleep between the calls, but that was NOT the issue.
I tried to start a job using the AWS web console, and got a similar error:
Request failed due to too many requests.
This page on AWS Translate Limitations indicates that you can only have 10 translation jobs started at the same time.
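A minimal sketch of how the submit loop could stay under that limit, assuming a Boto3 Translate client is passed in and that only jobs in the SUBMITTED or IN_PROGRESS states count toward it. The helper names, the 60-second polling interval and the default cap of 10 are illustrative assumptions, not part of the original script.

```python
import time

# Assumption: only these two states count toward the concurrent-job limit.
ACTIVE_STATUSES = ("SUBMITTED", "IN_PROGRESS")


def count_active_jobs(client):
    """Count batch translation jobs that are still queued or running."""
    total = 0
    for status in ACTIVE_STATUSES:
        kwargs = {"Filter": {"JobStatus": status}}
        while True:
            page = client.list_text_translation_jobs(**kwargs)
            total += len(page.get("TextTranslationJobPropertiesList", []))
            token = page.get("NextToken")
            if not token:
                break
            kwargs["NextToken"] = token
    return total


def submit_jobs(client, job_requests, max_active=10, poll_seconds=60,
                sleep=time.sleep):
    """Start each job, waiting whenever max_active jobs are already active.

    job_requests is a list of keyword-argument dicts for
    start_text_translation_job (JobName, InputDataConfig, and so on).
    """
    job_ids = []
    for request in job_requests:
        # Wait for a free slot instead of sleeping a fixed time between calls.
        while count_active_jobs(client) >= max_active:
            sleep(poll_seconds)
        response = client.start_text_translation_job(**request)
        job_ids.append(response["JobId"])
    return job_ids
```

With a real client this would be called as submit_jobs(boto3.client("translate"), requests), where each request dict holds the same arguments the script already passes to start_text_translation_job.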

Related

AWS CLI in WSL2: "RequestTimeTooSkewed"

I executed the command aws s3 ls and got the following error message:
An error occurred (RequestTimeTooSkewed) when calling the ListBuckets operation: The difference between the request time and the current time is too large.
Please advise.
If you're using WSL, you can run wsl --shutdown in CMD or PowerShell. This ensures the next time you start a WSL session, it cold boots and fixes the time.
https://github.com/microsoft/WSL/issues/4245
AWS API requests are 'signed' and part of the information exchanged is a timestamp. If the timestamp is more than 900 seconds old the request will be rejected.
This is done to prevent "replay attacks" where old requests are sent again.
You can fix this by correcting the Date and Time on the system where you are sending the request.
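To check whether the local clock is outside that window, one option is to compare it with the Date header of any HTTP response from the service. The sketch below covers only the comparison. The function names are hypothetical, and the 900-second threshold is the figure quoted above.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Threshold quoted in the answer above.
MAX_SKEW_SECONDS = 900


def clock_skew_seconds(server_date_header, local_now=None):
    """Return local time minus server time, in seconds.

    server_date_header is an HTTP Date header value such as
    "Wed, 21 Oct 2015 07:28:00 GMT".
    """
    server_time = parsedate_to_datetime(server_date_header)
    if local_now is None:
        local_now = datetime.now(timezone.utc)
    return (local_now - server_time).total_seconds()


def is_too_skewed(server_date_header, local_now=None):
    """True when the clocks differ by more than the allowed window."""
    skew = clock_skew_seconds(server_date_header, local_now)
    return abs(skew) > MAX_SKEW_SECONDS
```

A positive result means the local clock is ahead of the server, a negative one means it is behind.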

BigQuery unable to insert job. Workflow failed

I need to run a batch job from GCS to BigQuery via Dataflow and Beam. All my files are avro with the same schema.
I've created a dataflow java application that is successful on a smaller set of data (~1gb, about 5 files).
But when I try to run it on a bigger set of data (>500gb, >1000 files), I receive an error message:
java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException: java.lang.RuntimeException: Failed to create load job with id prefix 1b83679a4f5d48c5b45ff20b2b822728_6e48345728d4da6cb51353f0dc550c1b_00001_00000, reached max retries: 3, last failed load job: ...
After 3 retries it terminates with:
Workflow failed. Causes: S57....... A work item was attempted 4 times without success....
This step is the load to BigQuery.
Stackdriver says the processing is stuck in step ....for 10m00s... and
Request failed with code 409, performed 0 retries due to IOExceptions, performed 0 retries due to unsuccessful status codes.....
I looked up the 409 error code, which indicates that a job, dataset, or table might already exist. I removed all the tables and re-ran the application, but it still shows the same error message.
I am currently limited to 65 workers, and they are n1-standard-4 machines.
I believe there are other ways to move the data from GCS to BigQuery, but I need to demonstrate Dataflow.
"java.lang.RuntimeException: Failed to create job with prefix beam_load_csvtobigqueryxxxxxxxxxxxxxx, reached max retries: 3, last failed job: null.
at org.apache.beam.sdk.io.gcp.bigquery.BigQueryHelpers$PendingJob.runJob(BigQueryHelpers.java:198)..... "
One possible cause is a permissions issue. Ensure that the user account that interacts with BigQuery has the "bigquery.jobs.create" permission, which is part of the predefined "BigQuery User" role.
Posting the comment of #DeaconDesperado as community wiki: they experienced the same error, and it went away after they removed the restricted characters (e.g. Unicode letters, marks, numbers, connectors, dashes or spaces) from the table name.
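If the table name is built from file names or other input, it can be normalized before the load. This is only a sketch under a conservative assumption: that a name made of ASCII letters, digits and underscores is safe. The function name is hypothetical and the comment above does not say which characters were removed.

```python
import re


def to_safe_table_name(name):
    """Replace each run of characters outside [A-Za-z0-9_] with one underscore."""
    return re.sub(r"[^A-Za-z0-9_]+", "_", name)
```

For example, to_safe_table_name("sales-2019 Q1") gives "sales_2019_Q1".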
I got the same problem using "roles/bigquery.jobUser", "roles/bigquery.dataViewer", and "roles/bigquery.user". But only when granting "roles/bigquery.admin" did the issue get resolved.

Authentication with Cognito - where to find logs

We have 2 React Native apps that use AWS Cognito for authentication. We use the library react-native-aws-cognito-js in our code. The apps were working fine until two days ago. They are now experiencing intermittent "Internal Server Error" responses.
How can I find more information about this error? Is there any tool that can help us pinpoint the cause?
Update
From CloudTrail, each API call has an event "CreateNetworkInterface". Many of such API calls have error code "Client.NetworkInterfaceLimitExceeded". What is the cause and solution to this?
According to this AWS Doc (in Chinese), CloudWatch does not write a log entry when the error is due to insufficient IPs/ENIs. That explains why the number of errors increased while no logs appeared in CloudWatch.
Update 2
We found a scheduled Lambda job that may have exhausted the IP addresses, and we stopped it. But many users still cannot log in at the same time because of the "Client.NetworkInterfaceLimitExceeded" error. I noticed that there are many "CreateNetworkInterface" events and few "DeleteNetworkInterface" events. How can I clean up or reset all network interfaces in the VPC?
Short answer: CloudTrail.
Long answer with a suggestion
Assuming your application code is fine, most likely the cause of your 500 error is based on Cognito's initial limitations (e.g., number of calls per user): https://docs.aws.amazon.com/cognito/latest/developerguide/limits.html.
AWS suggests using CloudTrail for logging API calls.
However, to confirm that the limits are the cause, I would suggest first adding some logging of your own around the API call. In development you can then hit your app/API with a high number of calls, and most likely you will see the 500 error caused by the limits.
You could do the following in the terminal:
for i in `seq 1 1000`; do curl --cookie SecureCookie=TokenValueFromAWS http://localhost:desirablePort/SecuredPath; done
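On the CloudTrail side, the events can also be pulled programmatically and grouped by error code. The sketch below assumes a Boto3 CloudTrail client is passed in. The function name, the page cap and the "(no error)" label are illustrative choices.

```python
import json
from collections import Counter


def count_error_codes(cloudtrail_client, event_name, max_pages=5):
    """Tally the errorCode field of recent CloudTrail events with this name."""
    counts = Counter()
    kwargs = {
        "LookupAttributes": [
            {"AttributeKey": "EventName", "AttributeValue": event_name}
        ]
    }
    for _ in range(max_pages):
        page = cloudtrail_client.lookup_events(**kwargs)
        for event in page.get("Events", []):
            # CloudTrailEvent holds the full record as a JSON string.
            detail = json.loads(event.get("CloudTrailEvent", "{}"))
            counts[detail.get("errorCode", "(no error)")] += 1
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token
    return counts
```

Calling count_error_codes(boto3.client("cloudtrail"), "CreateNetworkInterface") would show how many of those calls failed with "Client.NetworkInterfaceLimitExceeded" compared with how many succeeded.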

When using the Admin SDK directory API to insert Org Units a dailyLimitExceeded error is returned even though that quota has not been reached

I work for a Student Information System and we're using the Admin SDK directory API to create school districts' Google Org Unit structures from within our software.
POST https://www.googleapis.com/admin/directory/v1/customer/customerId/orgunits
When generating these API requests we're consistently receiving dailyLimitExceeded errors even when the district's quota has not been reached.
This error can be worked around by ignoring it and retrying with an exponential back-off routine. But I believe this behaves much more like the quotaExceeded error is intended to behave than like dailyLimitExceeded, in that the request succeeds on the first retry.
In detail, the test I just ran successfully completed 9 of these API calls and then I received this response on the 10th:
Google.Apis.Requests.RequestError
Quota limit exceeded for the day. [403]
Errors [Message[Quota limit exceeded for the day.] Location[ - ] Reason[dailyLimitExceeded] Domain[usageLimits]
From the start of the batch of API calls it took about 10 seconds to get to the point where the error occurred.
Thanks for your help!
I would suggest slowing down your API requests. Don't send something like 10 requests in 1 second; leave some space between requests. You are right to implement exponential backoff. Also, if you can, use other accounts as well to make requests.
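A minimal sketch of exponential backoff with jitter, written in Python for illustration rather than in the client library used in the question. RetryableError is a hypothetical exception standing in for whatever the real client raises on a 403 with reason dailyLimitExceeded or quotaExceeded. The delays and attempt count are assumptions.

```python
import random
import time


class RetryableError(Exception):
    """Hypothetical stand-in for a rate-limit error from the API client."""


def call_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=32.0,
                      sleep=time.sleep):
    """Call func, retrying on RetryableError with doubling delays plus jitter."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RetryableError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            # 1s, 2s, 4s, ... capped at max_delay, plus up to 1s of jitter
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, 1))
```

Each Org Unit insert would be wrapped in its own call_with_backoff call, so a single rate-limited request is retried without restarting the whole batch.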

How to remove RequestTimeTooSkewed check from Amazon?

I have a Java 7 "agent" program running on several client machines (mostly Windows XP). My "agent" uploads client files to Amazon S3 and often I get this error:
RequestTimeTooSkewed
I know this is because the client's computer system time difference is too large compared to Amazon's. Here's my problem: I can't control the client's computer (system) time! So, I don't want Amazon to care about time differences.
I heard about jets3t, but I'm hoping not to have to resort to yet another tool (the agent's footprint must remain small).
Any ideas how to remove this check and get rid of this pesky error?
Error detail:
Status Code: 403, AWS Service: Amazon S3, AWS Request ID: 59C9614D15006F23, AWS Error Code: RequestTimeTooSkewed, AWS Error Message: The difference between the request time and the current time is too large., S3 Extended Request ID: v1pGBm3ed2J9dZ3sG/3aDrG3DUGSlt3Ac+9nduK2slih2wyaAnc1n5Jrt5TkRzlV
The error is coming from the S3 service, not from the client, so there really isn't anything you can do other than correct the clock on the client. That check is being done on the service to help detect and prevent replay attacks so it's an important part of the overall security of the service.
Trying a different client-side SDK won't help.