Speech-Api daily requests quota counter increses too much after one request

Speech-Api daily requests quota counter increses too much after one request - google-cloud-platform

As I am aware of the limitations listed here, I need to receive some clarifications on the quota limit.
I'm using the Node.js library to make a simple asynchronous speech-to-text API, using a .raw file which is stored in my bucket.
After the request is done, when checking the API Manager Traffic, the daily requests counter is increased by 50 to 100 requests.
I am not using any kind of requests libraries or other frameworks. Just the code from the gCloud docs.
var file = URL.bucket + "audio.raw"; //require from upload.
speech.startRecognition(file, config).then((data) => {
var operation = data[0];
operation.on('complete', function(transcript) {
console.log(transcript);
});
})

I believe this has to do with the operation.on call, which registers a listener for the operation to complete, but continues to poll the service until it does finally finish.
I think, based on looking at some code, that you can change the interval at which the service will poll using the longrunning.initialRetryDelayMillis setting, which should reduce the number of requests you see in your quota consumption.
Some places to look:
Speech client constructor: https://github.com/GoogleCloudPlatform/google-cloud-node/blob/master/packages/speech/src/index.js#L70
GAX Speech Client's constructor: https://github.com/GoogleCloudPlatform/google-cloud-node/blob/master/packages/speech/src/v1/speech_client.js#L67
Operations client: https://github.com/googleapis/gax-nodejs/blob/master/lib/operations_client.js#L74
GAX's Operation: https://github.com/googleapis/gax-nodejs/blob/master/lib/longrunning.js#L304

Related

AWS S3 Client throws "Operator called default onErrorDropped\njava.util.concurrent.CompletionException"

I am using the below AWS S3 AsyncClient from my Spring webflux application.As part of performance testing I had to introduce eventLoopGroup in S3 Client Config below to enable the S3 operations when the request rate was really high with 50 concurrent users triggering 20 request each having 5 documents to upload to S3.Before this I was getting the below error
reactor.core.publisher.Operators : Operator called default onErrorDropped\njava.util.concurrent.CompletionException: software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: Acquire operation took longer than the configured maximum time. This indicates that a request cannot get a connection from the pool within the specified maximum time. This can be due to high request rate.\nConsider taking any of the following actions to mitigate the issue: increase max connections, increase acquire timeout, or slowing the request rate.\nIncreasing the max connections can increase client throughput (unless the network interface is already fully utilized), but can eventually start to hit operation system limitations on the number of file descriptors used by the process
Introducing eventLoopGroup with numberOfThreads as 15 drastically reduced the error count to 6 from around a 1000. However once it fails, it didn't allow the next requests and failed with the same error.I was assuming that after sometime the used threads will be returned to the thread pool and would be available for use.But I have to restart the spring webflux application to get rid of the errors.Kindly let me know that the changes I have made below are correct.
#Bean
public S3AsyncClient s3client() {
SdkAsyncHttpClient httpClient =
NettyNioAsyncHttpClient.builder()
.readTimeout(Duration.ofSeconds(30))
.writeTimeout(Duration.ZERO)
.maxConcurrency(300)
.eventLoopGroup(SdkEventLoopGroup.builder().numberOfThreads(15).build())
.connectionAcquisitionTimeout(Duration.ofSeconds(30))
.build();
S3Configuration serviceConfiguration =
S3Configuration.builder()
.checksumValidationEnabled(false)
.chunkedEncodingEnabled(true)
.build();
S3AsyncClientBuilder s3AsyncClientBuilder =
S3AsyncClient.builder()
.httpClient(httpClient)
.region(Region.AP_SOUTHEAST_2)
.serviceConfiguration(serviceConfiguration);
return s3AsyncClientBuilder.build();
}

Latency in calls to Google Cloud APIs from Cloud Run instance

When I make the calls from Cloud Run instance to other cloud APIs for some reason there are huge delays in responses.
Everything works within 1 project.
Even from local machine the calls are much faster (couple of secs) - but deployed in the cloud it takes couple of mins for some requests to complete. As I see it is relevant for all APIs (apart from Firestore, Translate and TTS APIs as well). This is not related to cold starts for sure.
Code example (Node JS) and logs are below:
console.log('Received the request for stats');
const usersCollection = this.firestore.collection('users')
const snapshot = await this.usersCollection.get();
console.log('Fetched all users from Firestore');

Well after some further investigation I figured out what the problem was.
The thing is that all the operations I perform happen not before the response is sent but after (this is the way chatbot is architectured).
So the flow looks like this:
request to do smth - response 200 that the request is accepted
all the business logic and work
chatbot sends the message with the results
According to the docs the CPU is allocated only during the request processing by default so the only thing I had to change is to enable CPU allocation for background activities: https://cloud.google.com/run/docs/tips/general#background-activity

API Management - Response Time

We are working on setting up an API Management portal for one of our Web API. We are using eventhubs for logging the events and we are transferring the event messages to Azure Blob storage using Azure functions.
We would like to know how can we find the Time taken by API Management portal for providing the response for a message (we are capturing the time taken at the back end api layer but not from the API Management layer).
Regards,
John

The simpler solution is to enable Azure Monitor Diagnostic Logs for the Apimanagement service. You will get raw logs for each request including
durationMs - interval between receiving request line and headers from a client and writing last chunk of response body to a client. All writes and reads include network latency.
BackendTime - time spent waiting on backend response
ClientTime - time spent with client for request and response
CacheTime - time spent on fetching from cache
You can also refer this video.

Not the correct way of doing this, but still get an idea of how much time each request is taking. We can actually use the context variable to set the start time in the inbound policy node and then calculate the end time in the outbound node.

Amazon API gateway timeout

I have some issue with API gateway. I made a few API methods, sometimes they work longer than 10 seconds and Amazon returns 504 error. Here is screenshot below:
Please help! How can I increase timeout?
Thanks!

Right now the default limit for Lambda invocation or HTTP integration is 30s according to http://docs.aws.amazon.com/apigateway/latest/developerguide/limits.html and this limit is not configurable.

As of Dec/2017, the maximum value is still 29 seconds, but should be able to customize the timeout value.
https://aws.amazon.com/about-aws/whats-new/2017/11/customize-integration-timeouts-in-amazon-api-gateway/
This can be set in "Integration Request" of each method in APIGateway.

Finally in 2022 we have a workaround. Unfortunately AWS did not change the API Gateway so that's still 29 seconds but, you can use a built-in HTTPS endpoint in the lambda itself: Built-in HTTPS Endpoints for Single-Function Microservices
which is confirmed to have no timeout-so essentially you can have the full 15 minute window of lambda timeout: https://twitter.com/alex_casalboni/status/1511973229740666883
For example this is how you define a function with the http endpoint using aws-cdk and typescript:
const backendApi = new lambda.Function(this, 'backend-api', {
memorySize: 512,
timeout: cdk.Duration.seconds(40),
runtime: lambda.Runtime.NODEJS_16_X,
architecture: Architecture.ARM_64,
handler: 'lambda.handler',
code: lambda.Code.fromAsset(path.join(__dirname, '../dist')),
environment: {
...parsedDotenv
}
})
backendApi.addFunctionUrl({
authType: lambda.FunctionUrlAuthType.NONE,
cors: {
// Allow this to be called from websites on https://example.com.
// Can also be ['*'] to allow all domain.
allowedOrigins: ['*']
}
})

You can't increase the timeout, at least not now. Your endpoints must complete in 10 seconds or less. You need to work on improving the speed of your endpoints.
http://docs.aws.amazon.com/apigateway/latest/developerguide/limits.html

Lambda functions will timeout after a max. of 5 min; API Gateway requests will timeout after 29 sec. You can't change that, but you can workaround it with asynchronous execution pattern, I wrote I blog post about:
https://joarleymoraes.com/serverless-long-running-http-requests/

I wanted to comment on "joarleymoraes" post but don't have enough reputation. The only thing to add to that is that you don't HAVE to refactor to use async, it just depends on your backend and how you can split it up + your client side retries.
If you aren't seeing a high percentage of 504's and you aren't ready for async processing, you can implement client side retries with exponential backoff on them so they aren't permanent failures.
The AWS SDK automatically implements retries with backoff, so it can help to make it easier, especially since Lambda Layers will allow you to maintain the SDK for your functions without having to constantly update your deployment packages.
Once you do that it will result in less visibility into those timeouts, since they are no longer permanent failures. This can buy you some time to deal with the core problem, which is that you are seeing 504's in the first place. That certainly can mean refactoring your code to be more response, splitting up large functions into more "micro service" type concepts and reducing external network calls.
The other benefit to retries is that if you retry all 5xx responses from an application, it can cover a lot of different issues which you might see during normal execution. It is generally considered in all applications that these issues are never 100% avoidable so it's best practice to go ahead and plan for the worst!
All of that being said, you should still work on reducing the lambda execution time or going async. This will allow you to set your timeout values to a much smaller number, which allows you to fail faster. This helps a lot for reducing the impact on the front end, since it doesn't have to wait 29 seconds to retry a failed request.

Timeouts can be decreased but cannot be increased more than 29 seconds. The backend on your method should return a response before 29 seconds else API gateway will throw 504 timeout error.
Alternatively, as suggested in some answers above, you can change the backend to send status code 202 (Accepted) meaning the request has been received successfully and the backend then continues further processing. Of course, we need to consider the use case and it's requirements before implementing the workaround

Lambda functions have 15 mins of max execution time, but since APIGateway has strict 29 second timeout policy, you can do following things to over come this.
For an immediate fix, try increasing your lambda function size. Eg.: If your lambda function has 128 MB memory, you can increase it to 256 MB. More memory helps function to execute faster.
OR
You can use lambdaInvoke() function which is part of the "aws-sdk". With lambdaInvoke() instead of going through APIGateway you can directly call that function. But this is useful on server side only.
OR
The best method to tackle this is -> Make request to APIGateway -> Inside the function push the received data into an SQS Queue -> Immediately return the response -> Have a lambda function ready which triggers when data available in this SQS Queue -> Inside this triggered function do your actual time complex executions -> Save the data to a data store -> If call is comes from client side(browser/mobile app) then implement long-polling to get the final processed result from the same data store.
Now since api is immediately returning the response after pushing data to SQS, your main function execution time will be much less now, and will resolve the APIGateway timeout issue.
There are other methods like using WebSockets, Writing event driven code etc. But above methods are much simpler to implement and manage.

While you cannot increase the timeout, you can link lambda's together if the work is something that could be split up.
Using the aws sdk:
var aws = require('aws-sdk');
var lambda = new aws.Lambda({
region: 'us-west-2' //change to your region
});
lambda.invoke({
FunctionName: 'name_of_your_lambda_function',
Payload: JSON.stringify(event, null, 2) // pass params
}, function(error, data) {
if (error) {
context.done('error', error);
}
if(data.Payload){
context.succeed(data.Payload)
}
});
Source: Can an AWS Lambda function call another
AWS Documentation: http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/Lambda.html

As of May 21, 2021 This is still the same. The hard limit for the maximum time is 30 seconds. Below is the official document on quotas for API gateway.
https://docs.aws.amazon.com/apigateway/latest/developerguide/limits.html#http-api-quotas

The timeout limits cannot be increased so a response should be returned within 30 seconds. The workaround I usually do :
Send the result in an async way. The lambda function should trigger
another process and sends a response to the client saying
Successfully started process X and the other process should notify
the client in async way once it finishes (Hit an endpoint, Send a
slack notification or an email..). You can found a lot of interesting resources concerning this topic
Utilize the full potential of the multiprocessing in your lambda
function and increase the memory for a faster computing time
Eventually, if you need to return the result in a sync way and one
lambda function cannot do the job, you could integrate API gateway
directly with step function so you would have multiple lambda
function working in parallel. It may seem complicated but in fact it
is quite simple

Custom timeout between 50 and 29,000 milliseconds for WebSocket APIs and between 50 and 30,000 milliseconds for HTTP APIs. The default timeout is 29 seconds for WebSocket APIs and 30 seconds for HTTP APIs

Google App Engine - http request/response

I have a Java web app hosted on Google App Engine (GAE). The User clicks on a button and he gets a data table with 100 rows. At the bottom of the page, there is a "Make Web service calls" button. Clicking on that, the application will take one row at a time and make a third party web-service call using the URLConnection class. That part is working fine.
However, since there is a 60 second limit to the HttpRequest/Response cycle, all the 100 transactions don't go through as the timeout happens around row 50 or so.
How do I create a loop and send the Web service calls without the User having to click on the 'Make Webservice calls' more than once?
Is there a way to stop the loop before 60 seconds and then start again without committing the HttpResponse? (I don't want to use asynchronous Google backend).
Also, does GAE support file upload (to get the 100 rows from a file instead of a database)
Thank you.
Adding some code as per the comments:
URL url = new URL(urlString);
HttpURLConnection connection = (HttpURLConnection) url
.openConnection();
connection.setDoOutput(true);
connection.setRequestMethod("POST");
connection.setConnectTimeout(35000);
connection.setRequestProperty("Accept-Language", "en-US,en;q=0.5");
connection.setRequestProperty("Authorization", encodedCredentials);
// Send post request
DataOutputStream wr = new DataOutputStream(
connection.getOutputStream());
wr.writeBytes(submitRequest);

It all depends on what happens with the results of these calls.
If results are not returned to a UI, there is no need to block it. You can use Tasks API to create 100 tasks and return a response to a user. This will take a few seconds at most. The additional benefit is that you can make up to 10 calls in parallel by using tasks.
If results have to be returned to a user, you can still use up to 10 threads to process as many requests in parallel as possible. Hopefully, this will bring your time under 1 minute, but you cannot guarantee it since you depend on responses from third-party resources which maybe unavailable at the moment. You will have to implement your own retry mechanism.
Also note that users are not accustomed to waiting for several minutes for a website to respond. You may consider a different approach when a user is notified after the last request is processed without blocking your client code.
And yes, you can load data from files on App Engine.

Try using asynchronous urlfetch calls:
LinkedList<Future<HttpResponse>> futures;
// Start all the request
for (Url url : urls) {
HttpRequest request = new HttpRequest(url, HTTPMethod.POST);
request.setPayload(...)
futures.add(urlfetchservice.fetchAsync(request);
}
// Collect all the results
for (Future<HttpResponse> future : futures) {
HttpResponse response = future.get()
// Do something with future
}

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js