Retrieving AWS SSM Parameter taking a long time

I have a Python Lambda that forwards requests to an external API. The Lambda is in a target group behind an ALB, and it goes through surges where it has to handle hundreds of invocations per second.
Everything works well for the most part, except when we hit an odd issue where it takes upwards of 20 seconds to retrieve a SecureString parameter from Parameter Store. When that delay occurs, the system calling our ALB times out and throws an error.
I was thinking I could do the SSM parameter retrieval in an init method of the Lambda and then keep the Lambda always warm, but that seems like a waste of resources just to work around the parameter-reading issue.
Are there any suggestions on how this should be done or configured (or if perhaps I'm overlooking something that I should be doing)?

Every AWS API has a request rate limit, and under a surge of cold starts you are likely hitting throttling on SSM's GetParameter - https://aws.amazon.com/premiumsupport/knowledge-center/ssm-parameter-store-rate-exceeded/
So, yes, you should cache your parameters - see How do I cache multiple AWS Parameter Store values in an AWS Lambda?
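A minimal sketch of that caching pattern, assuming a hypothetical parameter name and a simple time-based refresh (both the boto3 client and the cache live at module scope, so they survive across invocations of a warm container):

import time
import boto3

ssm = boto3.client("ssm")  # created once per container, reused across invocations

_cache = {}          # module-level cache, survives while the container stays warm
_TTL_SECONDS = 300   # refresh a parameter at most every 5 minutes

def get_parameter(name):
    """Return a decrypted parameter, calling SSM only when the cached copy is stale."""
    entry = _cache.get(name)
    if entry and time.time() - entry["fetched_at"] < _TTL_SECONDS:
        return entry["value"]
    value = ssm.get_parameter(Name=name, WithDecryption=True)["Parameter"]["Value"]
    _cache[name] = {"value": value, "fetched_at": time.time()}
    return value

def handler(event, context):
    api_key = get_parameter("/myapp/api-key")  # hypothetical parameter name
    # ... forward the request to the external API using api_key ...
    return {"statusCode": 200}

Each warm container then makes at most one GetParameter call per TTL window instead of one per invocation, which keeps a surge of hundreds of invocations per second well under the SSM throughput limit.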

Related

Lambda and Fargate errors/timeouts

I have a Python API that I have tried on VMs, Fargate, and Lambda.
VMs - fewest errors when capacity is large enough.
Fargate - second-fewest errors when capacity is large enough, but during autoscaling I get some 500 errors; it looks like it doesn't scale out quickly enough.
Lambda - least consistent. When there are a lot of API calls, there are fewer errors, but from a cold start it may periodically fail. I do not pre-provision; when I do, I get fewer errors too.
I read in the post below that a Lambda cold start is less than 1 second? It seems like it's more. One caveat is that each Lambda function checks for an existing ".env" file; if it does not exist, it downloads it from S3. However, this happens only when the API is hit: the function responds, connects, downloads the .env file, and then processes the API call further. Fargate does the same, but again with fewer errors. Any thoughts?
I can pre-provision, but it gets kind of expensive. At that point, I might as well go back to VMs with autoscaling groups, though that's less cloud native. The VMs provide the fastest response by far but are harder to manage.
Can AWS Lambda cold start cause API Gateway timeout (30 s)?
I'm using an ALB in front of Lambda and Fargate; the VMs simply use round-robin DNS.
Questions:
Am I doing something wrong with Fargate or Lambda? Are they all right for APIs, or should I just go back to VMs?
What or who maintains the API connection while a Lambda is starting up from a cold start? Can I have it retry or hold on to the connection longer?
Thanks!
Am I doing something wrong with Fargate or Lambda? Are they all right for APIs, or should I just go back to VMs?
The one thing that strikes me is downloading the env file from S3. Wouldn't it be easier and faster to keep your env data in SSM Parameter Store? Or perhaps pass it as environment variables to the Lambda function itself (see the sketch after this answer).
What or who maintains the API connection while a Lambda is starting up from a cold start? Can I have it retry or hold on to the connection longer?
API Gateway. Sadly, you can't extend the 30-second time limit; it's a hard limit.
I'm using an ALB in front of Lambda and Fargate.
It seems to me that you have API Gateway -> ALB -> Lambda function. Why would you need the ALB in that chain? Usually there is no such need.
I can pre-provision, but it gets kind of expensive.
Sadly, this is the only way to minimize cold starts.
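On the first point, a minimal sketch of reading configuration from Lambda environment variables instead of downloading a .env file from S3 (the variable names are placeholders):

import os

# Values are injected by the Lambda configuration and read once at import time
# (i.e., during the cold start), so the handler makes no S3 call at all.
API_BASE_URL = os.environ["API_BASE_URL"]        # hypothetical variable name
LOG_LEVEL = os.environ.get("LOG_LEVEL", "INFO")  # .get() allows a default

def handler(event, context):
    # Configuration is already in memory when the first request arrives.
    return {"statusCode": 200, "body": API_BASE_URL}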

Lambda execution times out after 15 mins - what can I do?

I have a script running on Lambda. I've set the timeout to the maximum of 15 minutes, but it's still giving me a timeout error. There is not much information in the logs. How can I solve this issue and spot what is taking so much time? I tested the script locally and it's fairly quick.
Here's the error:
{
"errorMessage": "2020-09-10T18:26:53.180Z xxxxxxxxxxxxxxx Task timed out after 900.10 seconds"
}
If you're exceeding the 15-minute period, there are a few things you should check to identify the cause:
Is the Lambda connecting to resources in a VPC? If so, is it attached via the VPC config, and do the target resources allow inbound access from the Lambda?
Is the Lambda connecting to a public IP while using a VPC configuration? If so, it will need a NAT gateway attached to allow outbound access.
Are there any long-running processes as part of your Lambda?
Once you've ruled these out, consider increasing the available resources of your Lambda; perhaps it's hitting a cap and is therefore performing slowly. Increasing the memory also increases the available CPU.
Adding log statements to the code will write to CloudWatch Logs, and these can help you identify where in the code the slowness starts. This is done by simply calling the general output/debug function of your language, i.e. print() in Python or console.log() in Node.js (see the timing sketch after this list).
If the function is still expected to run longer than 15 minutes after this, you will need to break it down into smaller functions performing logical segments of the operation.
A suggested orchestrator for this would be a Step Functions state machine handling the workflow for each stage. If you need shared storage between the Lambdas, you can attach EFS to all of them so that they do not need to upload/download between operations.
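As a minimal sketch of the logging point above, timestamped log lines around each stage make the slow step obvious in CloudWatch (the stage names are placeholders):

import time

def handler(event, context):
    start = time.monotonic()

    # ... stage 1: fetch input data (placeholder) ...
    print(f"fetch done after {time.monotonic() - start:.1f}s")

    # ... stage 2: main processing (placeholder) ...
    print(f"processing done after {time.monotonic() - start:.1f}s")

    # ... stage 3: write results (placeholder) ...
    print(f"write done after {time.monotonic() - start:.1f}s")

The last line that appears before the timeout tells you which stage hung; if the first line never appears, stage 1 is hanging (often one of the VPC connectivity issues listed above).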
Your comment about connecting to a SQL DB is likely the key. I assume that DB is in AWS, in your VPC. This requires a particular setup. Check out:
https://docs.aws.amazon.com/lambda/latest/dg/configuration-vpc.html
https://docs.aws.amazon.com/lambda/latest/dg/services-rds-tutorial.html
Another thing you can do is enable debug-level logging and then look at the details in CloudWatch after trying to run it. You didn't mention which language your Lambda uses, so how to do this may differ; here's how it would be done in Python:
import logging

# The Lambda runtime pre-configures the root logger; lower its level to DEBUG.
LOGGER = logging.getLogger()
LOGGER.setLevel(logging.DEBUG)

Is there a way to turn on SageMaker model endpoints only when I am receiving inference requests

I have created a model endpoint which is InService and deployed on an ml.m4.xlarge instance. I am also using API Gateway to create a RESTful API.
Questions:
Is it possible to have my model endpoint InService (or on standby) only when I receive inference requests? Maybe by writing a Lambda function or something that turns off the endpoint (so that it does not keep accumulating per-hour charges).
If Q1 is possible, would this cause weird latency issues for end users? It usually takes a couple of minutes for model endpoints to be created when I configure them for the first time.
If Q1 is not possible, how would choosing a cheaper instance type affect the time it takes to perform inference? (Say I'm only using the endpoints for an application with a low number of users.)
I am aware of this page that compares different instance types (https://aws.amazon.com/sagemaker/pricing/instance-types/).
But does having "moderate" network performance mean that real-time inference may take longer?
Any recommendations are much appreciated. The goal is not to burn money when users are not requesting predictions.
How large is your model? If it is under Lambda's 50 MB (zipped) deployment package limit and the dependencies are small enough, there could be a way to rely directly on Lambda as the execution engine.
If your model is larger than 50 MB, there might still be a way to run it by storing it on EFS. See EFS for Lambda.
If you're willing to wait 5-10 minutes for SageMaker to launch, you can accomplish this by doing the following:
Set up a Lambda function (or create a method in an existing function) to check your endpoint status when the API is called. If the status != 'InService', call the function in #2 (see the sketch after these steps).
Create another method that, when called, launches your endpoint and creates a metric alarm in CloudWatch to monitor your primary Lambda function's invocations. When the metric falls below your desired invocations per period, it will call the function in #3.
Create a third method to delete your endpoint and the alarm when called. Technically, the alarm can't call a Lambda function directly, so you'll need to create an SNS topic and subscribe this function to it.
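A minimal sketch of steps 1-3 with boto3 (the endpoint and config names are placeholders; the CloudWatch alarm setup is omitted):

import boto3
from botocore.exceptions import ClientError

sagemaker = boto3.client("sagemaker")

ENDPOINT = "my-model-endpoint"                # hypothetical names
ENDPOINT_CONFIG = "my-model-endpoint-config"

def check_endpoint(event, context):
    """Step 1: check endpoint status; launch it if it is gone or not InService."""
    try:
        status = sagemaker.describe_endpoint(EndpointName=ENDPOINT)["EndpointStatus"]
    except ClientError:
        status = None  # endpoint does not exist (it was deleted by step 3)
    if status is None:
        # Step 2: recreate the endpoint from its saved configuration.
        sagemaker.create_endpoint(EndpointName=ENDPOINT,
                                  EndpointConfigName=ENDPOINT_CONFIG)
        status = "Creating"
    return {"ready": status == "InService", "status": status}

def delete_endpoint(event, context):
    """Step 3: invoked via the SNS topic when the low-traffic alarm fires."""
    sagemaker.delete_endpoint(EndpointName=ENDPOINT)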
Good luck!

AWS Lambda inside VPC. 504 Gateway Timeout (ENI?)

I have a Serverless .NET Core web API Lambda application deployed on AWS.
I have this sitting inside a VPC, as I access the Elasticsearch service inside that same VPC.
I have two API microservices that connect to the Elasticsearch service.
After a period of non-use (4 hours, 6 hours, 18 hours - I'm not sure exactly, but it seems random), the function becomes unresponsive and I get a 504 gateway timeout error, "Endpoint cannot be found".
I read somewhere that if the function is "idle" for too long, the ENI is released back into the AWS system, and that triggering the Lambda again should start it up.
I can't seem to "wake" up the function by calling it, as it keeps timing out with the above error (I have also increased the timeouts from the default).
Here's the kicker: if I make any changes to the specific Lambda function and save them (even something as simple as changing the timeout value), my API calls (BOTH of them, even though they are different Lambdas) start working again, as if it has "kicked" back in. Obviously the changes do this, but why?
Obviously I don't want timeouts in a production environment, regardless of how much or how little the Lambda or API is used.
I need a bulletproof solution to this. Surely it's a config issue of some description, but I'm just not sure where to look.
I have altered route tables, public/private subnets, and CIDR blocks, and created internet gateways, NAT, etc. for the VPC. This all works, but these two Lambdas, which require VPC access, keep falling "asleep" somehow.
This is because of Lambda cold starts.
There is a feature released at re:Invent 2019: provisioned concurrency for Lambda (not to be confused with reserved concurrency).
Set provisioned concurrency to a minimum of 1 (or the number of requests to be served in parallel) to keep the Lambda always warm and ready to serve requests.
Ref: https://aws.amazon.com/blogs/aws/new-provisioned-concurrency-for-lambda-functions/
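For reference, a minimal sketch of turning it on with boto3 (function and alias names are placeholders; provisioned concurrency must target a published version or an alias, not $LATEST):

import boto3

lambda_client = boto3.client("lambda")
lambda_client.put_provisioned_concurrency_config(
    FunctionName="my-vpc-api-function",   # hypothetical function name
    Qualifier="live",                     # hypothetical alias
    ProvisionedConcurrentExecutions=1,    # keep one execution environment warm
)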
For more context: Lambda in a VPC uses Hyperplane ENIs, and functions in the same account that share the same security group and subnet pairing use the same network interfaces.
If Lambda functions in an account go idle for some time (typically no usage for 40 minutes across all functions using that ENI, per AWS support), the service will reclaim the unused Hyperplane resources, so very infrequently invoked functions may still see longer cold-start times.
Ref: https://aws.amazon.com/blogs/compute/announcing-improved-vpc-networking-for-aws-lambda-functions/

AWS - how do you share an access token between Lambda processes?

First, I have a question about the way Lambda works:
If it's only triggered by one SQS queue and that queue now contains 100 messages, would it sequentially create and tear down 100 Lambda processes, or would it do it in parallel?
My second question is the main one:
The job of my Lambda is to request an access token (for an external service) that expires every hour and, using it, perform some action on that external service.
Now, I want to be able to cache that token and only ask for it every hour, instead of every time I make a request from the Lambda.
Given the nature of how Lambda works, is there a way of doing it through code?
How can I make sure all Lambda processes use the same access token?
(I know I can create a new Redis instance and make them all point to it, but I'm looking for a "simpler" solution.)
You can stuff the token into the SSM Parameter Store as an encrypted value. Lambdas can check the last-modified date on the value to see when expiration is pending and renew it. There is no Redis instance to maintain, and the value is encrypted at rest.
https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-paramstore.html
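A minimal sketch of that pattern, assuming a hypothetical parameter name and a fetch_new_token() helper you would implement against the external service:

import datetime
import boto3

ssm = boto3.client("ssm")
PARAM = "/myapp/external-api-token"     # hypothetical parameter name
LIFETIME = datetime.timedelta(hours=1)
MARGIN = datetime.timedelta(minutes=5)  # renew a little before expiry

def get_token():
    """Read the shared token from SSM, renewing it when it is close to expiry."""
    param = ssm.get_parameter(Name=PARAM, WithDecryption=True)["Parameter"]
    age = datetime.datetime.now(datetime.timezone.utc) - param["LastModifiedDate"]
    if age < LIFETIME - MARGIN:
        return param["Value"]
    token = fetch_new_token()  # hypothetical helper calling the external service
    ssm.put_parameter(Name=PARAM, Value=token, Type="SecureString", Overwrite=True)
    return token

Two concurrent Lambdas may occasionally both renew at the same moment; that is harmless as long as the external service tolerates issuing more than one token.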
You could also use DynamoDB for this; it has lower overhead than Redis since it's serverless. If you have a lot of concurrent Lambdas, this may be preferable to SSM, because you may run into rate limiting on the SSM API. It's a little more work because you have to set up a DynamoDB table.
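A minimal sketch of the DynamoDB variant, assuming a hypothetical table named token-cache with a string partition key pk, and reusing the hypothetical fetch_new_token() helper:

import time
import boto3

table = boto3.resource("dynamodb").Table("token-cache")  # hypothetical table

def get_token():
    """Read the shared token from DynamoDB, renewing it once it has expired."""
    item = table.get_item(Key={"pk": "external-api"}).get("Item")
    if item and item["expires_at"] > int(time.time()):
        return item["token"]
    token = fetch_new_token()  # hypothetical helper
    table.put_item(Item={"pk": "external-api",
                         "token": token,
                         "expires_at": int(time.time()) + 3600})
    return token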
Another option would be a "parent" Lambda function that gets the API token, calls the "worker" Lambdas, and passes the token as a parameter.
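A minimal sketch of that last option, for a parent triggered by the SQS queue (the worker function name is a placeholder):

import json
import boto3

lambda_client = boto3.client("lambda")

def parent_handler(event, context):
    token = fetch_new_token()  # hypothetical helper; fetched once per batch
    for record in event["Records"]:  # SQS delivers messages in batches
        lambda_client.invoke(
            FunctionName="token-worker",  # hypothetical worker name
            InvocationType="Event",       # asynchronous invocation
            Payload=json.dumps({"token": token, "message": record["body"]}),
        )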