aws fallback option while fargate is scaling - amazon-web-services

Given that fargate takes a little while to spinup a new healthy container that takes requests, what's the best way to handle sudden spikes in traffic that require additional processing power?
I was thinking of using lambda as a fallback, BUT I haven't seen this mentioned as a "design pattern" anywhere AND for this to work every request would have to additionally go through a Step Function or something (?) that will switch to lambda - which cost wise might be really heavy.

Related

Calling lambda functions programatically every minute at scale

While I have worked with AWS for a bit, I'm stuck on how to correctly approach the following use case.
We want to design an uptime monitor for up to 10K websites.
The monitor should run from multiple AWS regions and ping websites if they are available and measure the response time. With a lambda function, I can ping the site, pass the result to a sqs queue and process it. So far, so good.
However, I want to run this function every minute. I also want to have the ability to add and delete monitors. So if I don't want to monitor website "A" from region "us-west-1" I would like to do that. Or the other way round, add a website to a region.
Ideally, all this would run serverless and deployable to custom regions with cloud formation.
What services should I go with?
I have been thinking about Eventbridge, where I wanted to make custom events for every website in every region and then send the result over SNS to a central processing Lambda. But I'm not sure this is the way to go.
Alternatively, I wanted to build a scheduler lambda that fetches the websites it has to schedule from a DB and then invokes the fetcher lambda. But I was not sure about the delay since I want to have the functions triggered every minute. The architecture should monitor 10K websites and even more if possible.
Feel free to give me any advise you have :)
Kind regards.
In my opinion Lambda is not the correct solution for this problem. Your costs will be very high and it may not scale to what you want to ultimately do.
A c5.9xlarge EC2 costs about USD $1.53/hour and has a 10gbit network. With 36 CPU's a threaded program could take care of a large percentage - maybe all 10k - of your load. It could still be run in multiple regions on demand and push to an SQS queue. That's around $1100/month/region without pre-purchasing EC2 time.
A Lambda, running 10000 times / minute and running 5 seconds every time and taking only 128MB would be around USD $4600/month/region.
Coupled with the management interface you're alluding to the EC2 could handle pretty much everything you're wanting to do. Of course, you'd want to scale and likely have at least two EC2's for failover but with 2 of them you're still less than half the cost of the Lambda. As you scale now to 100,000 web sites it's a matter of adding machines.
There are a ton of other choices but understand that serverless does not mean cost efficient in all use cases.

lambda and fargate errors/timeouts

i have a python api that i have tried on vms, fargate, and lambda.
vms - less errors when capacity is large enough
fargate - second less errors when capacity is large enough, but when autoscaling, i get some 500 errors. looks like it doesn't autoscale quick enough.
lambda - less consistent. when there are a lot of api calls, less errors. but from cold start, it may periodically fail. i do not pre-provision. when i do, i get less errors too.
i read on the below post, cold start for lambda is less than 1 sec? seems like it's more. one caveat is that each lambda function will check for an existing "env" file. if it does not exist, it will download from s3. however this is done only when hitting the api. the lambda function is listening and responding. when you hit api, the lambda function will respond and connect, download the .env file, and process further the api call. fargate also does the same, but less errors again. any thoughts?
i can pre-provision, but it gets kind of expensive. at that point, i might as will go back to VMs with autoscaling groups, but it's less cloud native. the vms provide the fastest response by far and harder to manage.
Can AWS Lambda coldout cause API Gateway timeout(30s)?
i'm using an ALB in front of lambda and fargate. the vms simply use round robin dns.
questions:
am i doing something wrong with fargate or lambda? are they alright for apis or should i just go back to vms?
what or who maintains api connection while lambda is starting up from a cold start? can i have it retry or hold on to the connection longer?
thanks!
am i doing something wrong with fargate or lambda? are they alright for apis or should i just go back to vms?
The one thing that strikes me is downloading env from s3. Wouldn't it be easier and faster to keep your env data in SSM Parameter Store? Or perhaps, passing them as env variables to the lambda function itself.
what or who maintains api connection while lambda is starting up from a cold start? can i have it retry or hold on to the connection longer?
API gateway. Sadly you can't extend 30 s time limit. Its hard limit.
i'm using an ALB in front of lambda and fargate.
It seems to me that you have API gateway->ALB->Lambda function. Why would you need ALB in that? Usually there is no such need.
i can pre-provision, but it gets kind of expensive.
Sadly, this is the only way to minimize cold-starts.

How to auto-scale an AWS lambda using auto-scaling service

I am trying to add the lambda in the Auto-scaling target, and getting the error "No scalable resources found" while trying to fetch by tag.
Is it possible or allowed to add lambda to the auto-scaling target?
UPDATE:
I am trying to figure out how to change the provisional concurrency during non-peak hours on the application that will help to save some cost so I was exploring the option of auto-scaling
Lambda automatically scales out for incoming requests, if all existing execution contexts (lambda instances) are busy. There is basically nothing you need to do here, except maybe set the maximum allowed concurrency if you want to throttle.
As a result of that, there is no integration with AutoScaling, but you can still use an Application Load Balancer to trigger your Lambda Function if that's what you're after.
If you're building a purely serverless application, you might want to look into the API Gateway instead of the ALB integration.
Update
Since you've clarified what you want to use auto scaling for, namely changing the provisioned concurrency of the function, there are ways to build something like that. Clément Duveau has mentioned a solution in the comments that I can get behind.
You can create a Lambda Function with two CloudWatch events triggers with Cron-Expressions. One for when you want to scale out and another one for when you want to scale in.
Inside the lambda function you can use the name of the rule that triggered the function to determine if you need to do a scale out or scale in. You can then use the PutFunctionConcurrency API-call through one of the SDKs mentioned at the bottom of the documentation to adjust the concurrency as you see fit.
Update 2
spmdc has mentioned an interesting blog post using application auto scaling to achieve this, I had missed that one - you might want to check it out, looks promising.

AWS EC2 vs Serverless Cost Comparison

I am currently using AWS EC2 for my workloads.
Now I want to convert the EC2 server to the Serverless Platform i.e(API Gateway and Lambda).
I have also followed different blogs and I am ready to go with the serverless. But, my one concern is on pricing.
How can I predict per month cost for the serverless according to my use of EC2? Will the EC2 Cloudwatch metrics help me to calculate the costing?
How can I make cost comparison?
Firstly, there is no simple answer to your question as a simple lift and shift from a VM to Lambda is not ideal. To make the most of lambda, you need to architect your solution to be serverless. This means making use of the event-driven nature of Lambda.
Now to answer the question simply, you are charged only for the time it takes to serve a request (to the next 100ms). So if your lambda responds to the request in 70ms you pay for 100ms of execution time. If you serve the request in 210ms then you pay for 300ms.
You also pay for the memory allocated to the function on the order of GB per/month.
If you have a good logging or monitoring strategy you could check how long it takes to serve each type of request and how often they occur. If your application is fairly low-scale and is not accessed often (say all requests come within an 8 hour window) then you may end up saving money with lambda as you are only paying AWS for the time spent serving the request.
Also, it may help to read the following article on common pitfalls:
https://medium.com/#emaildelivery/serverless-pitfalls-issues-you-may-encounter-running-a-start-up-on-aws-lambda-f242b404f41c

Serverless Database Connection Pooling

I’m trying to build an application on aws that is 100% serverless (minus the database for now) and what I’m running into is that the database is the bottleneck. My application can scale very well but my database has a finite number of connections it can accommodate and at some point, my lambdas will run into that limit. I can do connection pooling outside of the handler in my lambdas so that there is a database connection per lambda container instead of per invocation and while that does increase the number of concurrent invocations before I hit my connection limit, the limit still exists.
I have two questions.
1. Does serverless aurora solve this by autoscaling to increase the number of instances to meet the need for more connections.
2. Are there any other solutions to this problem?
Also, from other developers interested in serverless, am I trying to do something that’s not worth doing? I love how easy deployment is with serverless framework but is it better just to work with Microservices in something like Kubernetes instead?
I believe there are two potential solutions to that problem:
The first and the simplest option is to take advantage of "lambda hot state", it's the concept when Lambda reuses the execution context for subsequent invocations. As per AWS suggestion
Any declarations in your Lambda function code (outside the handler code, see Programming Model) remains initialized, providing additional optimization when the function is invoked again. For example, if your Lambda function establishes a database connection, instead of reestablishing the connection, the original connection is used in subsequent invocations. We suggest adding logic in your code to check if a connection exists before creating one.
Basically, while the lambda function is the hot stage it "might/should" reuse opened connection(s).
The limitations of the following:
you only reuse connection for single lambda type, so if you have 5 lambda functions invoked all the time you still will be using 5 connections
when you have a spike in lambda invocations, including parallel executions this approach becomes less effective since, lambda will be executed in a new execution context for majority of requests
The second option would be to use a connection pool, connection pool is an array of established database connections, so that the connections can be reused when future requests to the database are required.
While the second option provides a more consistent solution it requires much more infrastructure.
you would be required to run a separate instance for the pool, and if you want to do things properly probably at least two instances and a load balancer (unless use containers).
While it might be overwhelming to provision that much additional infrastructure for connection pooler, it still might be a valid option depending on the scale of the project, your existing infrastructure (may be you already using containers) and cost benefits
Best practices by AWS recommends to take advantage of hot start. You can read more about it here: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.Lambda.BestPracticesWithDynamoDB.html