{
"code": 429,
"message": "Resource has been exhausted (e.g. check quota).",
"status": "RESOURCE_EXHAUSTED"
}
I don't completely understand why I get this.
As far as I can see from my quotas, I am very far from the limits.
According to the docs, I have 1000 requests per minute. I use no more than 150-200, or even fewer.
The Text-to-Speech quota page shows usage of around 5%.
I start 3-4 processes at the same time, each generating 15-20 short phrases of 100-150 characters. Each process generates its phrases one by one in a loop, so there are up to 4 requests in flight at the same time with a small gap between them, and the gap of course depends on phrase length.
I expect all phrases to be generated without any errors; however, all requests fail.
We have configured AWS for distributed load testing using https://aws.amazon.com/solutions/implementations/distributed-load-testing-on-aws/
Our requirement is to achieve 5k RPS.
Please help me understand the inputs that need to be provided here.
Assuming the system supports 5k RPS, what should the Task Count, Concurrency, Ramp Up and Hold For values be in order to achieve 5k RPS using AWS DLT?
We are also trying to achieve this using JMeter concurrent threads. We are hoping someone can suggest values and explain how they are used.
We don't know.
Have you tried reading the documentation you linked to yourself? For example, for Concurrency there is a chapter called Determine the number of users, which suggests starting with 200 and increasing or decreasing depending on resource consumption.
The same applies to the Task Count: you may go with a single container with default resources, increase the container resources, or increase the number of containers.
The number of hits per second will mostly depend on your application's response time. For example, given the recommended 200 users, if the response time is 1 second you will have 200 RPS, if it is 2 seconds you will get 100 RPS, if it is 0.5 seconds you will get 400 RPS, etc. See the What is the Relationship Between Users and Hits Per Second? article for a more comprehensive explanation if needed. The throughput can also be controlled on the JMeter side using the Concurrency Thread Group and the Throughput Shaping Timer, but again, the container(s) must have sufficient resources to produce the desired load.
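The arithmetic above can be sketched in a few lines of Python. This is only an illustration, assuming each virtual user sends requests back to back with no think time and that response time does not degrade under load; the function names and the 0.5-second figure in the last call are made up for the example.

```python
import math


def expected_rps(users: int, response_time_s: float) -> float:
    """Throughput if every user fires the next request as soon as the previous one returns."""
    return users / response_time_s


def users_needed(target_rps: float, response_time_s: float) -> int:
    """Concurrent users required to sustain target_rps at a given response time."""
    return math.ceil(target_rps * response_time_s)


# The three cases from the answer: 200 users at 1 s, 2 s and 0.5 s.
for rt in (1.0, 2.0, 0.5):
    print(f"200 users, {rt} s response time -> {expected_rps(200, rt):.0f} RPS")

# Hypothetical: how many users for 5k RPS if the response time were 0.5 s?
print(users_needed(5000, 0.5))
```

In practice the response time has to be measured first, so a small trial run usually comes before picking the concurrency.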
With regard to ramp-up, again, we don't know. Personally I tend to increase the load gradually so I can correlate the increasing load with other metrics. The JMeter documentation recommends starting with a ramp-up period in seconds equal to the number of users.
The same goes for the time to hold the load: after ramping up to the number of users required to produce a load of 5K RPS, I would recommend holding the load for the duration of the ramp-up period to see how the system behaves, whether it stabilizes when the load stops increasing, whether the response times stay static or go up, etc.
I am making an API request to Google Cloud Platform to create disks and get status code 200. But when I check whether the disk is ready, I get "error":{"code":404 ,"reason":"notFound","domain":"global"}. When I check the Google Cloud logs, I see the error below for the request: "status": { "code": 8, "message": "RATE_LIMIT_EXCEEDED" }. Can anyone suggest possible solutions, such as which exact quota limit should be increased? I have tried a retry mechanism with a pause of about 3 seconds; that reduced the probability of the error, but the real issue is still there.
You can request an increase in the quota allocation using the GCP Console -> IAM & Admin -> Quotas. Find the Compute Engine quota that shows as exceeded and click on it to drill down to the specific operation types. I believe you were hitting the "Operation read requests" limit.
You may have hit an operation read request limit.
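Alongside a quota increase, the fixed 3-second pause mentioned in the question could be swapped for exponential backoff with jitter, which spreads retries out instead of repeating them at a constant rate. The sketch below is generic Python and not Google client code: RateLimited is a hypothetical exception that the caller would raise on seeing RATE_LIMIT_EXCEEDED, and the delay values are arbitrary.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical: raised by the caller's code on a RATE_LIMIT_EXCEEDED response."""


def call_with_backoff(fn, max_attempts=6, base_delay=1.0, max_delay=32.0, sleep=time.sleep):
    """Call fn(), retrying on RateLimited with exponentially growing, jittered pauses."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts, let the caller see the error
            # Upper bound doubles each attempt: 1 s, 2 s, 4 s, ... capped at max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            # "Full jitter": sleep a random time in [0, cap] so parallel workers desynchronize.
            sleep(random.uniform(0, cap))
```

Here fn would be whatever function polls the disk status; passing a custom sleep makes the helper easy to test without waiting.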
We have an AWS Elasticsearch cluster setup. However, our Error rate alarm goes off at regular intervals. The way we are trying to calculate our error rate is:
((sum(4xx) + sum(5xx))/sum(ElasticsearchRequests)) * 100
However, if you look at the screenshot below, at 7:15 the 4xx value was 4, while the ElasticsearchRequests value was only 2. Based on the metrics information on the AWS Elasticsearch documentation page, ElasticsearchRequests should be the total number of requests, so it should clearly be greater than or equal to 4xx.
Can someone please help me understand what I am doing wrong here?
AWS definitions of these metrics are:
OpenSearchRequests (previously ElasticsearchRequests): The number of requests made to the OpenSearch cluster. Relevant statistics: Sum
2xx, 3xx, 4xx, 5xx: The number of requests to the domain that resulted in the given HTTP response code (2xx, 3xx, 4xx, 5xx). Relevant statistics: Sum
Please note the different terms used for the subjects of the metrics: cluster vs domain.
To my understanding, OpenSearchRequests only counts requests that actually reach the underlying OpenSearch/Elasticsearch cluster, and some of the requests that end in 4xx might never reach it (e.g. 403 errors), hence the difference in the metrics.
Also, AWS only recommends comparing 5xx to OpenSearchRequests:
5xx alarms >= 10% of OpenSearchRequests: One or more data nodes might be overloaded, or requests are failing to complete within the idle timeout period. Consider switching to larger instance types or adding more nodes to the cluster. Confirm that you're following best practices for shard and cluster architecture.
I know this was posted a while back, but I've also struggled with this issue and maybe I can add a few pointers.
First off, make sure your metrics are properly configured. For instance, some responses (4xx for example) take up to 5 minutes to register, while OpenSearchRequests is refreshed every minute. This makes for a very wonky graph that will definitely throw off your error rate.
In the picture above, I send a request that returns 400 every 5 seconds and a request that returns 200 every 0.5 seconds. The period in this case is 1 minute. On average this should give an error rate of around 10%. As you can see from the green line, the requests sent are summed up every minute, whereas the 4xx are summed up every 5 minutes and are 0 for the minutes in between, which makes for an error rate spike every 5 minutes (since the OpenSearch requests are not multiplied by 5).
In the next image, the period is set to 5 minutes. Notice how this time the error rate is around 10 percent.
When I look at your graph, I see metrics that look like they are based off of a different period.
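The effect of mismatched periods can be reproduced with a small simulation. This is a sketch under assumptions, not a model of how CloudWatch actually delivers datapoints: it assumes the request count includes both the 200 and the 400 requests and that the 4xx count arrives as one lump at the end of each 5-minute window.

```python
ERRORS_PER_MIN = 12                      # one 400 every 5 s
REQUESTS_PER_MIN = 120 + ERRORS_PER_MIN  # one 200 every 0.5 s, plus the 400s

MINUTES = 10
requests = [REQUESTS_PER_MIN] * MINUTES
# Assumption: 4xx datapoints are 0 for four minutes, then five minutes' worth at once.
errors = [ERRORS_PER_MIN * 5 if (m + 1) % 5 == 0 else 0 for m in range(MINUTES)]


def error_rate(errs, reqs, period):
    """Error rate in percent, with both series summed over the same period (in minutes)."""
    rates = []
    for start in range(0, len(reqs), period):
        e = sum(errs[start:start + period])
        r = sum(reqs[start:start + period])
        rates.append(100 * e / r)
    return rates


one_minute = error_rate(errors, requests, 1)
five_minute = error_rate(errors, requests, 5)
print([round(x, 1) for x in one_minute])   # mostly 0 with a spike every fifth minute
print([round(x, 1) for x in five_minute])  # steady, a little under 10
```

With a 1-minute period the expression swings between 0% and a large spike, while the 5-minute period gives a steady value close to the true rate.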
The second pointer I would add is to account for periods when no data is coming in. The alarm's behavior may vary depending on how you define the "treat missing data" parameter. In some cases, if no data comes in, your expression might keep the alarm in the alarm state when in fact there is simply no new data. Some metrics return no value when no requests are made, while others return 0. In the former case, you can use the FILL(metric, value) function to specify what to return when no value is returned. Experiment with what happens to your error rate if you divide by zero.
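The same idea can be shown in plain Python. This is not CloudWatch metric math, only an analogy: fill() mimics what FILL(metric, value) is described as doing above, and treating "zero requests" as a 0% error rate is an assumption you may want to change for your alarm.

```python
def fill(value, default):
    """Return default when a datapoint is missing (None), otherwise the datapoint itself."""
    return default if value is None else value


def error_rate(errors_4xx, errors_5xx, requests):
    """Error rate in percent that tolerates missing datapoints and zero traffic."""
    errors = fill(errors_4xx, 0) + fill(errors_5xx, 0)
    total = fill(requests, 0)
    if total == 0:
        # Assumption: no traffic means no errors to alarm on.
        return 0.0
    return 100 * errors / total


print(error_rate(None, None, None))  # no data at all
print(error_rate(1, 1, 20))          # normal case
print(error_rate(4, 0, 2))           # the 7:15 datapoint from the question
```

The last call shows why the original expression can exceed 100% when the two metrics do not count the same set of requests.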
Hope this message helps clarify a bit.
I have a Python program which queries YouTube to get video details. I use version 3 of the API. I run multiple processes (m), and each Python process has a pool of 10 processes.
from multiprocessing import Pool

songs_pool = Pool(processes=10)  # pool of 10 worker processes
return_pool = songs_pool.map(getVideo, songs_list)  # getVideo and songs_list are defined elsewhere
I get client errors (forbidden) when m is increased to more than 2 and the pool size to more than 5. When I check the number of requests in the Google analytics, it shows 250 requests per second, but according to the documentation the limit is 3000 requests per second. I don't understand why I am getting the client errors. Can you tell me if there is a way to avoid these errors and run the program more quickly?
If m = 2 and processes = 10, I get no errors, but it takes a long time to complete.
But if I increase them, I get client errors on about 5% of the total requests.
The per-user limit is 3000 requests per second from a single IP address, and as soon as you go above that in a given second you'll start getting the forbidden errors. The analytics you see in the developers console only report your average number of requests over a 5-minute period; therefore, if you had zero requests for 4 minutes and then started running your routine, the console may show only 250 requests per second (as an average) while your app is likely overrunning the limit in a given second or two.
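A quick calculation shows how an average can hide a burst. The traffic pattern below is entirely hypothetical (280 idle seconds followed by a 20-second burst) and is chosen only so that the 5-minute average comes out at the 250 requests per second seen in the question.

```python
LIMIT_RPS = 3000
WINDOW_S = 5 * 60  # the 5-minute reporting window described above

# Hypothetical per-second request counts: idle, then a short burst.
traffic = [0] * 280 + [3750] * 20
assert len(traffic) == WINDOW_S

average_rps = sum(traffic) / len(traffic)
peak_rps = max(traffic)
seconds_over_limit = sum(1 for count in traffic if count > LIMIT_RPS)

print(f"average: {average_rps:.0f} rps")
print(f"peak:    {peak_rps} rps")
print(f"seconds above the {LIMIT_RPS} rps limit: {seconds_over_limit}")
```

The console would report the average, while the forbidden errors are driven by the individual seconds that exceed the limit.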
It seems that you're handling it in the best way possible if speed is your concern; you'll want to run it fast enough to get a very small number of errors (so you know you're staying right at your limit). Another option, though, might be to look into using etags; if you find yourself requesting info on the same videos a lot, you can let etags tell you whether or not any info has changed (and if the API responds that nothing has changed, it doesn't count against either your quota or your requests/sec).
We have a critical system that is highly dependent on AppFabric Caching. The setup we use is three nodes which serve around 2000 simultaneous connections and 150-200 requests/second.
Configurations are the default ones. We receive maybe 5-10 "ErrorCode:SubStatus" errors each day, which is unacceptable.
I have added some performance counters but I can't see anything weird, except that we sometimes see values on "Total Failure Exceptions / sec" and that "Total Failure Exceptions" is increasing, but only 2-3 times a day.
I would like to see where these errors come from, but I can't find them in any logs in the Event Viewer (I enabled them all according to the documentation). Does anyone know if these errors could be logged somewhere and/or if it is possible to see them in any other way?
We receive maybe 5-10 "ErrorCode:SubStatus" errors each day, which is
unacceptable.
Between 5 and 10 errors per day, at 150 requests/sec? That's quite anecdotal. Your cache client always has to handle caching errors properly; a network failure can always occur.
5-10 "ErrorCode:SubStatus" is quite obscure. There are more than 50 error codes in AppFabric Caching. Try to get the exact error codes. See the full list here.
I would like to see where these errors come from, but I can't find them
in any logs in the Event Viewer (I enabled them all according to the
documentation). Does anyone know if these errors could be logged
somewhere and/or if it is possible to see them in any other way?
The only documentation available is here. The Event Viewer is useful for regularly monitoring the health of the cache cluster. However, when troubleshooting an error, it is possible to get an even more detailed log of the cache cluster activities. I'm not sure this will help you much, because it's sometimes too specific.