Setting up Alarms for AWS IoT Throttling Limits

I am using AWS IoT with just-in-time provisioning (JITP) for device/thing registration:
https://aws.amazon.com/blogs/iot/setting-up-just-in-time-provisioning-with-aws-iot-core/
When I run a load test, some of the things are not registered, but I don't see any errors either.
I think this might be happening because of the AWS IoT throttling limits.
I suspect that one of the following API calls is being throttled:
CreateThing
AttachPolicy
AttachThingPrincipal
How can I set an alarm to detect whether any of these API calls is being throttled?

I don't believe there's a way to do this currently.
Some alternatives:
Don't use JITP and call the APIs yourself.
The API responses (including throttling errors) are then returned directly to you, so you can log whenever throttling happens.
Connect your devices at a rate that stays within the API request limits.
Provision your Things (CreateThing, etc.) before starting the load test.
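As a rough illustration of the first alternative, here is a minimal sketch that runs the three registration calls itself and logs when one of them is throttled. It assumes `iot` is a `boto3.client("iot")` and that throttling surfaces as a client error whose code is `ThrottlingException`; the helper names `is_throttled` and `register_thing` are made up for this example.

```python
import logging

logger = logging.getLogger("provisioning")

# Assumption: throttled AWS IoT control-plane calls report this error code.
# Extend the set if you observe other codes in your own logs.
THROTTLE_CODES = {"ThrottlingException"}


def is_throttled(exc):
    """True if exc looks like a botocore ClientError carrying a throttling code."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code") in THROTTLE_CODES


def register_thing(iot, thing_name, policy_name, certificate_arn):
    """Run the three registration calls in order.

    Returns None on success, or the name of the API call that was throttled.
    Any non-throttling error is re-raised unchanged.
    """
    steps = [
        ("CreateThing",
         lambda: iot.create_thing(thingName=thing_name)),
        ("AttachPolicy",
         lambda: iot.attach_policy(policyName=policy_name, target=certificate_arn)),
        ("AttachThingPrincipal",
         lambda: iot.attach_thing_principal(thingName=thing_name, principal=certificate_arn)),
    ]
    for api_name, call in steps:
        try:
            call()
        except Exception as exc:
            if not is_throttled(exc):
                raise
            logger.warning("%s throttled for thing %s", api_name, thing_name)
            return api_name
    return None
```

Because the throttled call is written to a log line, you could count those lines (for example with a log-based metric) and alarm on the count, which is not possible when JITP makes the calls on your behalf.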

Related

How to monitor entire AWS environment?

I am looking for a way to monitor any changes that occur in my production environment, such as security group changes, EC2 create/stop/delete actions, database changes, S3 bucket changes, route table changes, subnet changes, etc. I was looking at using CloudTrail for this and monitoring all API calls. However, when I made some test changes, my subscribed SNS topic did not receive any notifications. Does anyone have a workaround for this, or am I missing something? Maybe Lambda? I am just looking for the easiest way to receive email notifications when any changes are made within my prod environment. Thank you.
If you're looking to audit the entire event history of AWS API calls, then you would use CloudTrail. Remember to create a trail and to enable the relevant options if you want to audit S3 or Lambda API calls.
By itself CloudTrail provides auditing, but it can be combined with CloudWatch/EventBridge to automate actions based on specific API calls, such as triggering a Lambda function or publishing to an SNS topic.
Regarding your own implementation so far using SNS, always make sure the subscription has been confirmed on the subscriber side first.
In addition, you can use AWS Config, which supports many AWS resource types and gives you two benefits: you can maintain a history of changes to your resources, and you can configure compliance rules and remediation actions for them.
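The CloudTrail plus EventBridge combination mentioned above could look roughly like the sketch below. It assumes `events` is a `boto3.client("events")`; the rule name, the chosen event names, and the helper names are illustrative only.

```python
import json


def build_change_pattern(event_source, event_names):
    """Event pattern matching CloudTrail-recorded API calls for one service.

    event_source is the CloudTrail eventSource, e.g. "ec2.amazonaws.com".
    """
    return {
        "source": ["aws." + event_source.split(".")[0]],
        "detail-type": ["AWS API Call via CloudTrail"],
        "detail": {
            "eventSource": [event_source],
            "eventName": list(event_names),
        },
    }


def create_notification_rule(events, rule_name, pattern, sns_topic_arn):
    """Create (or update) the rule and point it at an SNS topic."""
    events.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(pattern),
        State="ENABLED",
    )
    events.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "notify-sns", "Arn": sns_topic_arn}],
    )


# Example: notify on security group ingress changes (names are illustrative).
pattern = build_change_pattern(
    "ec2.amazonaws.com",
    ["AuthorizeSecurityGroupIngress", "RevokeSecurityGroupIngress"],
)
```

Two things are worth checking if notifications still do not arrive: the SNS topic's access policy has to allow EventBridge to publish to it, and the email subscription has to be confirmed.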

Throttling / resource limit exceeded when I try to get a credential report in AWS

I am trying to generate a credential report, and I get the following error:
aws iam generate-credential-report
An error occurred (Throttling) when calling the GenerateCredentialReport operation (reached max retries: 4): Rate exceeded
Also, I am not getting the report through the boto3 API.
Is there any way to raise or configure this limit?
I opened a support case with AWS about it, here is their response:
Thank you for contacting AWS about your GetCredentialReport issue.
According to our IAM team, we have observed an increase in the call
volume of the IAM GenerateCredentialReport API. In order to avoid any
impact that increase in call volume might have on the service and our
customers, we blocked that API. Callers will receive LimitExceeded
exception. We are actively investigating a solution that will lead to
unblocking the API.
The API seems to be working now. This is the latest response from AWS Support regarding the issue:
"We have deployed a fix to the GenerateCredentialReport API issue
which will protect the IAM service from elevated latencies and error
rates. We are going to ramp up the traffic to the API over the next
few days. In the meanwhile, clients calling the API might receive
“LimitExceed Exception”. In this case, we recommend that the clients
retry with exponential back off."
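The retry with exponential backoff that AWS Support recommends can be sketched as follows. This is a generic helper, not something taken from the support response: the function names, the default delays, and the set of error codes treated as retryable (taken from the messages quoted above) are assumptions.

```python
import random
import time


def retry_with_backoff(call, is_retryable, max_attempts=5, base_delay=1.0,
                       max_delay=30.0, sleep=time.sleep):
    """Call `call()` until it succeeds, backing off exponentially between tries.

    Re-raises the last error once max_attempts is reached, and re-raises
    immediately for errors that is_retryable() rejects.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            if last_attempt or not is_retryable(exc):
                raise
            # Delay cap doubles each attempt: base, 2*base, 4*base, ...
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Random jitter spreads out clients that were throttled together.
            sleep(random.uniform(0, delay))


def is_rate_limited(exc):
    """Assumption: rate limiting shows up as one of these error codes."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code") in {"Throttling", "LimitExceeded"}
```

With boto3 this might be used as `retry_with_backoff(lambda: iam.generate_credential_report(), is_rate_limited)`, where `iam` is `boto3.client("iam")`.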

AWS API Gateway + Lambda - how to handle 1 million requests per second

We would like to create a serverless architecture for our startup, and we would like to support up to 1 million requests per second and 50 million active users. How can we handle this use case with an AWS architecture?
According to the AWS documentation, API Gateway can handle only 10K requests/s and Lambda can process 1K invocations/s, which is unacceptable for us.
How can we overcome this limitation? Can we request this throughput from AWS support, or can we somehow connect to other AWS services (queues)?
Thanks!
Those numbers you quoted are the default account limits. Lambda and API Gateway can handle more than that, but you have to send a request to Amazon to raise your account limits. If you are truly going to receive 1 million API requests per second then you should discuss it with an AWS account rep. Are you sure most of those requests won't be handled by a cache like CloudFront?
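When asking for a Lambda limit increase, it helps to know that the account limit is expressed as concurrent executions rather than requests per second; the concurrency needed is roughly the request rate multiplied by the average invocation duration. A small worked example, assuming a hypothetical average duration of 100 ms (the question does not state one):

```python
def required_concurrency(requests_per_second, avg_duration_ms):
    """Concurrent executions needed = request rate x average duration.

    Uses integer ceiling division so the result is not affected by
    floating-point rounding.
    """
    return (requests_per_second * avg_duration_ms + 999) // 1000


# 1 million requests/s at an assumed 100 ms per invocation.
print(required_concurrency(1_000_000, 100))  # 100000
```

Under that assumption the target load needs about 100,000 concurrent executions, so the size of the increase to request depends heavily on how fast each invocation is.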
The gateway is NOT your API server. Lambdas are the bottleneck.
While the gateway can handle 100,000 messages/sec (because it goes through a message queue), Lambdas top out at around 2,200 rps even with scaling (https://amido.com/blog/azure-functions-vs-aws-lambda-vs-google-cloud-functions-javascript-scaling-face-off/)
This differs dramatically from actual API framework implementations, where throughput scales up to 3,500+ rps...
I think you should go with an Application Load Balancer.
It is limitless in terms of RPS and can potentially be even cheaper for a large number of requests. It has fewer integrations with AWS services, but in general it has everything you need for a gateway.
https://dashbird.io/blog/aws-api-gateway-vs-application-load-balancer/

Does AWS CloudWatch support metric whitelisting?

It looks like CloudWatch gives customers 10 custom metrics under the free plan, and each additional one costs $0.50. Does anyone know how to make PutMetricData accept only a fixed set of custom metrics?
I'm interested in limiting the custom metrics coming from mobile clients or possibly adding a layer of protection against abuse.
Is the only solution to implement my own service which does the validation against a whitelist?
One option you could look at is placing Amazon API Gateway in front of CloudWatch and making the calls through the API.
This example shows you how to do this for S3, but there's no reason why you couldn't do something similar for CloudWatch.
This shows you how to do it for dynamo: https://aws.amazon.com/blogs/compute/using-amazon-api-gateway-as-a-proxy-for-dynamodb/
I ended up running a simple Tomcat service which validates metrics against a whitelist (stored in S3) and publishes them to CloudWatch.
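The validation step of such a service can be quite small. The sketch below is in Python rather than the Tomcat service described above, and it is only an illustration: the metric names in `ALLOWED_METRICS` are hypothetical, the whitelist is hard-coded instead of being loaded from S3, and `cloudwatch` is assumed to be a `boto3.client("cloudwatch")`.

```python
# Hypothetical whitelist; the service above loads its list from S3 instead.
ALLOWED_METRICS = {"AppLaunch", "LoginLatency"}


def filter_allowed(metric_data, allowed=ALLOWED_METRICS):
    """Keep only the datums whose MetricName is on the whitelist."""
    return [m for m in metric_data if m.get("MetricName") in allowed]


def publish(cloudwatch, namespace, metric_data, allowed=ALLOWED_METRICS):
    """Forward whitelisted datums to CloudWatch; return how many were sent."""
    accepted = filter_allowed(metric_data, allowed)
    if accepted:
        cloudwatch.put_metric_data(Namespace=namespace, MetricData=accepted)
    return len(accepted)
```

Since each distinct combination of metric name and dimensions counts as its own metric, a stricter version would probably also validate the `Dimensions` that clients send.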

AWS Lambda using API Gateway error message

Everything was working yesterday, and I'm still only testing, so my request volume shouldn't be high to begin with, but I keep receiving these errors today:
{
Message = "We currently do not have sufficient capacity in the region you requested. Our system will be working on provisioning
additional capacity. You can avoid getting this error by temporarily
reducing your request rate.";
Type =Service;
}
What is this error message, and should I be concerned that something like this will happen when I go into production? This is a serious error because my users are required to log in using calls to API Gateway (backed by AWS Lambda).
This kind of error should not last long, as it immediately triggers a provisioning request on the AWS side.
If you are concerned about the availability of your API, consider creating a redundant Lambda function in another region and switching to it whenever this error occurs. However, calling Lambda in a remote region can add significant latency.
Another suggestion is to review the AWS limits for the API Gateway and Lambda services in your account. If your requests do exceed the limits, raise a ticket with AWS to have them increased.
Amazon API Gateway Limits
Resource                        Default Limit
Maximum APIs per AWS account    60
Maximum resources per API       300
Maximum labels per API          10
Requesting a limit increase is free of charge in AWS.
Refer: Amazon API Gateway Limits
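The regional fallback suggested in this answer could be sketched like this. It is deliberately generic: `invoke_with_fallback` is a made-up helper that takes an ordered list of per-region callables, and it falls back on any exception, which is broader than just the capacity error shown in the question.

```python
def invoke_with_fallback(invokers, payload):
    """Try each region in order and return (region, result) from the first success.

    invokers is an ordered list of (region_name, callable) pairs; each
    callable takes the payload and either returns a result or raises.
    """
    errors = []
    for region, invoke in invokers:
        try:
            return region, invoke(payload)
        except Exception as exc:  # assumption: any failure triggers fallback
            errors.append((region, exc))
    details = ", ".join(f"{region}: {exc}" for region, exc in errors)
    raise RuntimeError("all regions failed: " + details)
```

With boto3, each callable could wrap something like `boto3.client("lambda", region_name=region).invoke(FunctionName=..., Payload=payload)`, with the same function deployed in every listed region. In practice you would likely narrow the `except` clause to the errors you actually want to fail over on.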
AWS Lambda posted an event on the service health dashboard, so please follow this for further details on that specific issue.
Unfortunately, if you want to return a custom code when Lambda errors in this way you would have to write a mapping template and attach it to every integration response where you used a Lambda integration.
We recognize that this is suboptimal and is work most customers would prefer API Gateway just handle for them. With that in mind, we already have a high priority item on our backlog to make it easier to pass through the status codes from the Lambda integration. I cannot, however, commit to a timeframe as to when this would be available.