Adding time delay in Step Functions vs Lambda - amazon-web-services

I need to write an endpoint that simulates the response of a third-party server so that we can run performance tests regularly. I have decided to go with API Gateway to Step Functions (which waits 2 seconds) to a Lambda that returns a response based on the request. The wait is there to emulate the 2-second latency of the third-party service. The question is what the cheapest way to do this with these services would be.
Should I go with Lambda alone (no Step Functions and maybe even no API Gateway)? If I use only Lambda, we would need many concurrent Lambdas, each running for 2 seconds. What is the cheapest solution?

When working with AWS services and trying to work out pricing, you should use the AWS Pricing Calculator. Each service also has its own pricing breakdown:
AWS Step Functions Pricing
AWS Lambda Pricing
Here's a quick work-up I did:
| Service        | Price  | Requests/day | Notes            |
|----------------|--------|--------------|------------------|
| Lambda         | $32/mo | 10k          | 5 second runtime |
| Step Functions | $76/mo | 10k          | n/a              |
From these numbers, the most cost-effective solution would be to remove Step Functions from the orchestration and just go with Lambda.
A nice side benefit of using Lambda is the ability to pass in a parameter that specifies the third-party endpoint latency, or even a range of latency, i.e., a random delay between 1 second and 3 seconds.
You can review this estimate here.
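The configurable latency mentioned above could look roughly like the sketch below. The event fields `delayMs`, `minDelayMs` and `maxDelayMs` are hypothetical names chosen for illustration, as is the 2000 ms default; nothing here comes from the original estimate.

```javascript
// Hypothetical event fields: delayMs (fixed) or minDelayMs/maxDelayMs (range).
const DEFAULT_DELAY_MS = 2000;

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Pick the delay for one request. `random` is injectable so the choice is testable.
function pickDelayMs(event = {}, random = Math.random) {
  if (Number.isFinite(event.delayMs)) {
    return Math.max(0, event.delayMs);
  }
  const { minDelayMs, maxDelayMs } = event;
  if (Number.isFinite(minDelayMs) && Number.isFinite(maxDelayMs) && maxDelayMs >= minDelayMs) {
    return Math.floor(minDelayMs + random() * (maxDelayMs - minDelayMs));
  }
  return DEFAULT_DELAY_MS;
}

// Lambda-style async handler: wait for the chosen delay, then return a canned response.
async function handler(event = {}) {
  const delayMs = pickDelayMs(event);
  await sleep(delayMs);
  return { statusCode: 200, body: JSON.stringify({ simulated: true, delayMs }) };
}
```

Invoking `handler({ minDelayMs: 1000, maxDelayMs: 3000 })` would respond after a random delay in that range. Note that the function is billed for the time it spends sleeping, which is the cost the table above is estimating.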

Consider an SQS delay queue in front of the Lambda to simulate the 2-second latency. The first million SQS messages each month are always free; the next million cost $0.40.
Your Lambdas will then execute without any artificial delay, reducing your compute cost by (more than?) half.
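As a rough illustration of the delay-queue idea, the sketch below builds the parameter object for an SQS `SendMessage` call with a per-message `DelaySeconds` of 2. The queue URL and payload are placeholders, and the helper name `buildDelayedMessage` is made up for this example; the actual send would go through whichever AWS SDK client you use.

```javascript
// SQS accepts a per-message delay of 0 to 900 seconds via DelaySeconds.
const MAX_DELAY_SECONDS = 900;

// Build the parameters for an SQS SendMessage call.
// queueUrl and payload are supplied by the caller (placeholders below).
function buildDelayedMessage(queueUrl, payload, delaySeconds = 2) {
  if (!Number.isInteger(delaySeconds) || delaySeconds < 0 || delaySeconds > MAX_DELAY_SECONDS) {
    throw new RangeError(`delaySeconds must be an integer from 0 to ${MAX_DELAY_SECONDS}`);
  }
  return {
    QueueUrl: queueUrl,
    MessageBody: JSON.stringify(payload),
    DelaySeconds: delaySeconds,
  };
}

// Placeholder queue URL and payload, for illustration only.
const params = buildDelayedMessage('https://sqs.example.invalid/queue', { orderId: 42 });
console.log(params);
```

A delay can also be configured once on the queue itself instead of per message.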

Related

Software or managed service for AWS Lambda job scheduling

I have a relatively large number of tasks that need to be executed at certain intervals: hourly, daily, weekly, etc. These tasks are easily defined as AWS Lambda functions, and I can schedule them easily enough with Amazon EventBridge.
However, in many cases jobs can fail due to delayed or missing data or other micro services going down. Take, for example, a function that is configured to run every hour and process data from hour X to hour X+1 and serialize to some data store (the ETL use case). Suppose at 1am some service becomes unavailable and the job fails until engineering is able to address the issue at 10am, at which point the code for the lambda is updated.
The desired behavior would be for that job to pick up where it left off and quickly catch up and process data from 1am to 10am (sequentially).
It would be relatively straightforward to implement some state-tracking service manually, where interval successes and failures are tracked and can be checked and registered via simple API calls. My question is whether there is existing software for this sort of application/service. As far as I can tell, Apache Airflow can do this, but it also comes with significantly more complexity and overhead than is needed.
Two options come to mind:
Track state of your application with AWS Step Functions. You can implement coordination between Lambda functions, add parallel or sequential processing etc. Step Functions also support error handling and have built-in retry mechanisms.
Depending on the volume and velocity of the data you ingest, you could go with Amazon SQS or Amazon Kinesis to stream the data to Lambda functions. With SQS, you can retry every message. If a message can't be processed, you can put it into a dead-letter queue (DLQ) for further investigation. This approach is also highly scalable and allows parallel execution of jobs.
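Whichever service holds the state, the catch-up behavior described in the question comes down to tracking the end of the last successfully processed interval and replaying the missed intervals in order. The sketch below is one minimal way to express that; the function names are hypothetical, and where the checkpoint is stored is left open.

```javascript
const HOUR_MS = 60 * 60 * 1000;

// Return the [start, end) windows not yet processed, oldest first.
// lastProcessedEnd: end of the last successfully processed window (epoch ms).
// now: current time (epoch ms). Only complete windows are returned.
function pendingWindows(lastProcessedEnd, now, intervalMs = HOUR_MS) {
  const windows = [];
  for (let start = lastProcessedEnd; start + intervalMs <= now; start += intervalMs) {
    windows.push({ start, end: start + intervalMs });
  }
  return windows;
}

// Process windows sequentially and stop at the first failure,
// so that the next scheduled run resumes from the returned checkpoint.
function catchUp(lastProcessedEnd, now, processWindow, intervalMs = HOUR_MS) {
  let checkpoint = lastProcessedEnd;
  for (const window of pendingWindows(lastProcessedEnd, now, intervalMs)) {
    try {
      processWindow(window);
    } catch (err) {
      break; // leave the checkpoint at the last success
    }
    checkpoint = window.end;
  }
  return checkpoint; // persist this value in a store of your choice
}
```

With a checkpoint at 1am and a run at 10am, `pendingWindows` yields the nine hourly windows in between, which matches the scenario in the question.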

Calling lambda functions programmatically every minute at scale

While I have worked with AWS for a bit, I'm stuck on how to correctly approach the following use case.
We want to design an uptime monitor for up to 10K websites.
The monitor should run from multiple AWS regions, ping the websites to check whether they are available, and measure the response time. With a Lambda function, I can ping the site, pass the result to an SQS queue and process it. So far, so good.
However, I want to run this function every minute. I also want to have the ability to add and delete monitors. So if I don't want to monitor website "A" from region "us-west-1" I would like to do that. Or the other way round, add a website to a region.
Ideally, all this would run serverless and be deployable to custom regions with CloudFormation.
What services should I go with?
I have been thinking about EventBridge, where I would create custom events for every website in every region and then send the result over SNS to a central processing Lambda. But I'm not sure this is the way to go.
Alternatively, I wanted to build a scheduler lambda that fetches the websites it has to schedule from a DB and then invokes the fetcher lambda. But I was not sure about the delay since I want to have the functions triggered every minute. The architecture should monitor 10K websites and even more if possible.
Feel free to give me any advice you have :)
Kind regards.
In my opinion Lambda is not the correct solution for this problem. Your costs will be very high and it may not scale to what you want to ultimately do.
A c5.9xlarge EC2 instance costs about USD $1.53/hour and has a 10 Gbit network. With 36 vCPUs, a threaded program could take care of a large percentage, maybe all 10k, of your load. It could still be run in multiple regions on demand and push to an SQS queue. That's around $1100/month/region without pre-purchasing EC2 time.
A Lambda, running 10000 times / minute and running 5 seconds every time and taking only 128MB would be around USD $4600/month/region.
Coupled with the management interface you're alluding to, the EC2 instance could handle pretty much everything you want to do. Of course, you'd want to scale and likely have at least two EC2 instances for failover, but with two of them you're still at less than half the cost of the Lambda approach. As you scale to 100,000 websites, it's a matter of adding machines.
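The figures above can be reproduced with a short calculation. The Lambda unit prices in this sketch ($0.0000166667 per GB-second and $0.20 per million requests) are assumed list prices that vary by region and over time, the free tier is ignored, and a 30-day month is assumed; the EC2 hourly price is the one quoted in the answer.

```javascript
// Assumed Lambda list prices; check the current pricing page for your region.
const LAMBDA_PRICE_PER_GB_SECOND = 0.0000166667;
const LAMBDA_PRICE_PER_MILLION_REQUESTS = 0.20;
const EC2_PRICE_PER_HOUR = 1.53; // c5.9xlarge figure quoted in the answer

// Monthly Lambda cost = compute (GB-seconds) + request charges.
function lambdaMonthlyCost({ invocationsPerMinute, durationSeconds, memoryMb, days = 30 }) {
  const invocations = invocationsPerMinute * 60 * 24 * days;
  const gbSeconds = invocations * durationSeconds * (memoryMb / 1024);
  const computeCost = gbSeconds * LAMBDA_PRICE_PER_GB_SECOND;
  const requestCost = (invocations / 1e6) * LAMBDA_PRICE_PER_MILLION_REQUESTS;
  return computeCost + requestCost;
}

// Monthly on-demand EC2 cost for a number of always-on instances.
function ec2MonthlyCost(pricePerHour, instances = 1, days = 30) {
  return pricePerHour * 24 * days * instances;
}

const lambdaCost = lambdaMonthlyCost({ invocationsPerMinute: 10000, durationSeconds: 5, memoryMb: 128 });
const ec2Cost = ec2MonthlyCost(EC2_PRICE_PER_HOUR);
const twoEc2Cost = ec2MonthlyCost(EC2_PRICE_PER_HOUR, 2);

console.log(`Lambda:    $${lambdaCost.toFixed(0)} per month per region`);
console.log(`One EC2:   $${ec2Cost.toFixed(0)} per month per region`);
console.log(`Two EC2s:  $${twoEc2Cost.toFixed(0)} per month per region`);
```

Under these assumptions the Lambda total comes out near $4600 and a single instance near $1100, in line with the numbers in the answer.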
There are a ton of other choices but understand that serverless does not mean cost efficient in all use cases.

Maximizing number of parallel operation in AWS Lambda

I have an AWS Lambda which has to invoke an API endpoint for 2 million records. Considering that the maximum execution period of Lambda is 15 minutes, I have to somehow process all these records using one Lambda (that is, in 15 minutes if possible). The API endpoint which I want to invoke can handle 3000 TPS. I want to maximize/parallelize my calls so I can utilize the TPS provided and run the operations using a single Lambda. I have created my invocations within parallelStream in Java. Is it possible to do it using the current approach? If yes, what changes would I have to make in the Lambda runtime in order to use multiple cores?
"Considering that the maximum execution period of Lambda is 15 minutes. I have to somehow process all these records using one Lambda(that is in 15 minutes if possible)."
Why? This defeats the entire reason you would use AWS Lambda for this task. Why limit yourself to a single Lambda function invocation to do all this work?
If you wrote a script to take your 2 million records and add them to an SQS queue, then you could have the AWS Lambda service automatically feed these records into multiple, parallel instances of your AWS Lambda function. This would allow you to easily tune the number of Lambda functions you want to have running in parallel, and also automatically handle retries in the case of failures.
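A minimal sketch of the enqueueing side is shown below, assuming the records are already in memory as plain objects. It only prepares `SendMessageBatch`-shaped entry lists (SQS accepts at most 10 entries per batch request); actually sending them would go through your AWS SDK client. The helper names are hypothetical. The second function gives the lower bound on wall-clock time that the downstream 3000 TPS limit imposes, no matter how many Lambda instances run in parallel.

```javascript
const SQS_MAX_BATCH_SIZE = 10; // SendMessageBatch accepts at most 10 entries

// Split records into lists of SendMessageBatch entries.
function toSqsBatches(records, batchSize = SQS_MAX_BATCH_SIZE) {
  const batches = [];
  for (let i = 0; i < records.length; i += batchSize) {
    const entries = records.slice(i, i + batchSize).map((record, j) => ({
      Id: String(i + j), // must be unique within a single batch
      MessageBody: JSON.stringify(record),
    }));
    batches.push(entries);
  }
  return batches;
}

// Lower bound on total run time imposed by the downstream rate limit.
function minSecondsAtRate(totalCalls, callsPerSecond) {
  return Math.ceil(totalCalls / callsPerSecond);
}

// Made-up sample records, for illustration only.
const sampleRecords = Array.from({ length: 25 }, (_, n) => ({ recordId: n }));
const sampleBatches = toSqsBatches(sampleRecords);
console.log(`${sampleBatches.length} batches for ${sampleRecords.length} records`);
console.log(`2M calls at 3000 TPS need at least ${minSecondsAtRate(2000000, 3000)} seconds`);
```

Since 2 million calls at 3000 TPS take a little over 11 minutes at best, a single 15-minute invocation leaves almost no margin for retries, which is another argument for spreading the work across invocations.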

AWS EC2 vs Serverless Cost Comparison

I am currently using AWS EC2 for my workloads.
Now I want to move from the EC2 server to a serverless platform, i.e., API Gateway and Lambda.
I have also followed different blogs and I am ready to go with the serverless. But, my one concern is on pricing.
How can I predict the monthly cost of serverless based on my current EC2 usage? Will the EC2 CloudWatch metrics help me calculate the cost?
How can I make cost comparison?
Firstly, there is no simple answer to your question as a simple lift and shift from a VM to Lambda is not ideal. To make the most of lambda, you need to architect your solution to be serverless. This means making use of the event-driven nature of Lambda.
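Now, to answer the question simply: you are charged only for the time it takes to serve a request, rounded up to the next 100 ms. So if your Lambda responds to a request in 70 ms, you pay for 100 ms of execution time; if it serves the request in 210 ms, you pay for 300 ms. (AWS has since reduced the billing granularity to 1 ms, so these rounded figures reflect pricing at the time of writing.)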
The charge also scales with the memory allocated to the function: compute is billed in GB-seconds, i.e., allocated memory multiplied by billed duration.
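The rounding in the examples above can be written as a small helper. The 100 ms increment is the figure used in the answer; it is a parameter here because the billing granularity is something to confirm against the current Lambda pricing page.

```javascript
// Round a measured duration up to the next billing increment.
function billedMs(durationMs, incrementMs = 100) {
  return Math.ceil(durationMs / incrementMs) * incrementMs;
}

// Billed compute in GB-seconds: billed duration times allocated memory.
function billedGbSeconds(durationMs, memoryMb, incrementMs = 100) {
  return (billedMs(durationMs, incrementMs) / 1000) * (memoryMb / 1024);
}

console.log(billedMs(70));   // 100
console.log(billedMs(210));  // 300
console.log(billedGbSeconds(210, 512));
```

Feeding per-request durations from your existing logs through a helper like this, multiplied by request counts, gives a first approximation of the compute part of a Lambda bill.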
If you have a good logging or monitoring strategy you could check how long it takes to serve each type of request and how often they occur. If your application is fairly low-scale and is not accessed often (say all requests come within an 8 hour window) then you may end up saving money with lambda as you are only paying AWS for the time spent serving the request.
Also, it may help to read the following article on common pitfalls:
https://medium.com/#emaildelivery/serverless-pitfalls-issues-you-may-encounter-running-a-start-up-on-aws-lambda-f242b404f41c

Is AWS Lambda good for real-time API Rest?

I'm learning about AWS Lambda and I'm worried about synchronized real-time requests.
The fact that Lambda has a "cold start" doesn't sound good for handling GET requests.
Imagine a user is using the application and makes an HTTP GET request to fetch a product or a list of products. If the Lambda is sleeping, it will take 10 seconds to respond, which I don't see as an acceptable response time.
Is it good or bad practice to use AWS Lambda for a classic (synchronous response) REST API?
Like most things, I think you should measure before deciding. A lot of AWS customers use Lambda as the back-end for their webapps quite successfully.
There's a lot of discussion out there on Lambda latency, for example:
2017-04 comparing Lambda performance using Node.js, Java, C# or Python
2018-03 Lambda call latency
2019-09 improved VPC networking for AWS Lambda
2019-10 you're thinking about cold starts all wrong
In December 2019, AWS Lambda introduced Provisioned Concurrency, which improves things. See:
2019-12 AWS Lambda announces Provisioned Concurrency
2020-09 AWS Lambda Cold Starts: Solving the Problem
You should measure latency for an environment that's representative of your app and its use.
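When summarizing such measurements, percentiles are usually more telling than averages, because a handful of cold starts can hide in a mean. The sketch below uses the nearest-rank method on a made-up set of samples; the sample values are illustrative only and not real measurements.

```javascript
// Nearest-rank percentile over latency samples in milliseconds.
function percentile(samples, p) {
  if (samples.length === 0) {
    throw new Error('no samples');
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Summarize a run as median, tail and worst-case latency.
function summarize(samples) {
  return {
    p50: percentile(samples, 50),
    p99: percentile(samples, 99),
    max: Math.max(...samples),
  };
}

// Made-up samples: nine warm invocations and one cold start.
const latenciesMs = [120, 95, 110, 3000, 105, 100, 98, 102, 101, 99];
console.log(summarize(latenciesMs));
```

In this illustrative set the median stays near 100 ms while the tail is dominated by the single slow request, which is the pattern to look for when judging whether cold starts matter for your traffic.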
A few things that are important factors related to request latency:
cold starts => higher latency
request patterns are important factors in cold starts
if you need to deploy in VPC (attachment of ENI => higher cold start latency)
using CloudFront --> API Gateway --> Lambda (more layers => higher latency)
choice of programming language (Java likely highest cold-start latency, Go lowest)
size of Lambda environment (more RAM => more CPU => faster)
Lambda account and concurrency limits
pre-warming strategy
Update 2019-12: see Predictable start-up times with Provisioned Concurrency.
Update 2021-08: see Increasing performance of Java AWS Lambda functions using tiered compilation.
As an AWS Lambda + API Gateway user (with Serverless Framework) I had to deal with this too.
The problem I faced:
Few requests per day per lambda (not enough to keep lambdas warm)
Time critical application (the user is on the phone, waiting for text-to-speech to answer)
How I worked around that:
The idea was to find a way to call the critical lambdas often enough that they don't get cold.
If you use the Serverless Framework, you can use the serverless-plugin-warmup plugin that does exactly that.
If not, you can copy its behavior by creating a worker that invokes the Lambdas every few minutes to keep them warm. To do this, create a Lambda that invokes your other Lambdas and schedule a CloudWatch Events rule to trigger it every 5 minutes or so. Make sure to call your to-keep-warm Lambdas with a custom event.source so you can exit them early without running any actual business code, by putting the following code at the very beginning of the function:
// Exit early on warm-up pings so that no business logic runs.
if (event.source === 'just-keeping-warm') {
  console.log('WarmUP - Lambda is warm!');
  return callback(null, 'Lambda is warm!');
}
Depending on the number of Lambdas you have to keep warm, this can be a lot of "warming" calls. AWS offers 1,000,000 free Lambda calls every month, though.
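To get a feel for how many warming calls that is, the sketch below counts invocations per month for a given ping interval. It assumes a 30-day month, one ping per Lambda per interval, and counts requests only (not the billed duration of each ping or the invocations of the warmer itself); the 1,000,000 figure is the free allowance quoted above.

```javascript
const FREE_REQUESTS_PER_MONTH = 1000000; // figure quoted in the answer

// Number of warm-up invocations per month for a set of Lambdas.
function warmupCallsPerMonth(lambdaCount, intervalMinutes = 5, days = 30) {
  const callsPerLambda = Math.floor((days * 24 * 60) / intervalMinutes);
  return lambdaCount * callsPerLambda;
}

// How many Lambdas can be pinged before the free request allowance is used up.
function maxLambdasWithinFreeRequests(intervalMinutes = 5, days = 30) {
  return Math.floor(FREE_REQUESTS_PER_MONTH / warmupCallsPerMonth(1, intervalMinutes, days));
}

console.log(warmupCallsPerMonth(1));          // pings per Lambda per month
console.log(warmupCallsPerMonth(20));         // pings for 20 Lambdas
console.log(maxLambdasWithinFreeRequests());  // Lambdas covered by the free requests
```

At a 5-minute interval each Lambda receives 8,640 pings a month, so the free request allowance alone covers on the order of a hundred functions if nothing else is using it.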
We have used AWS Lambda quite successfully with reasonable and acceptable response times. (REST/JSON based API + AWS Lambda + Dynamo DB Access).
In our measurements, invoking the functions always accounted for the smallest share of the latency; most of the time was spent in application logic.
There are warm up techniques as mentioned in the above posts.