How do I handle idle database connections made outside of a Lambda function handler?

Our current implementation is to open one database connection outside of the Lambda handler. When the backing Lambda container terminates, the connection is then left open/idle.
Can I make the new container close the previous container's database connection?
Are there any hooks available like an onContainerClose()?
How can we close the previous open connection which cannot be used anymore, when the Lambda cold starts?

In the background, AWS Lambda functions execute in a container that isolates them from other functions & provides the resources, such as memory, specified in the function’s configuration.
Any variable outside the handler function will be 'frozen' between Lambda invocations and possibly reused. 'Possibly' because, depending on the volume of executions, the container is almost always reused, though this is never guaranteed.
You can test this yourself by invoking a Lambda with the source code below multiple times & taking a look at the response:
let counter = 0

exports.handler = async (event) => {
    counter++
    const response = {
        statusCode: 200,
        body: JSON.stringify(counter),
    };
    return response;
};
This also includes database connections that you may want to create outside of the handler, to maximise the chance of reuse between invocations & to avoid creating a new connection every time.
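As a rough sketch of that pattern (Python with the pymysql driver here; the environment-variable names are placeholders of my own, not anything prescribed by Lambda):
import os
import pymysql

# Runs once per container, during the init phase outside the handler.
connection = pymysql.connect(
    host=os.environ["DB_HOST"],
    user=os.environ["DB_USER"],
    password=os.environ["DB_PASSWORD"],
    database=os.environ["DB_NAME"],
)

def handler(event, context):
    # Warm invocations reuse the container-level connection created above.
    with connection.cursor() as cursor:
        cursor.execute("SELECT 1")
        return {"statusCode": 200, "body": str(cursor.fetchone()[0])}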
Regardless of whether the Lambda function is reused, a connection made outside of the handler will eventually be closed when the Lambda container is terminated by AWS. Granted, the problem of "zombie" connections is much smaller when the connection is reused, but it is still there.
When you start to reach a high number of concurrent Lambda executions, the main question is how to end the unused connections left over by terminated Lambda function containers. AWS Lambda is quite good at reliably terminating connections when a container expires, but you may still run into issues as you get close to your max_connections limit.
How can we close the previous open connection which cannot be used anymore, when the Lambda cold starts?
There is no native workaround via your application code or Lambda settings to completely get rid of these zombie connections, unless you handle opening and closing them yourself inside the handler and take the added duration hit of creating a new connection on every invocation (still a very small number).
To clear zombie connections (if you must), a workaround would be to trigger a Lambda which lists, inspects & kills idle leftover connections. You could either trigger it via an EventBridge rule operating on a schedule, or trigger it when you get close to maxing out the database connections.
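A minimal sketch of what such a cleanup Lambda could look like, assuming MySQL, the pymysql driver, a database user with the PROCESS and CONNECTION_ADMIN (or SUPER) privileges, and an arbitrary 900-second idle threshold:
import os
import pymysql

IDLE_SECONDS = 900  # arbitrary threshold for this sketch

def handler(event, context):
    conn = pymysql.connect(
        host=os.environ["DB_HOST"],
        user=os.environ["DB_USER"],
        password=os.environ["DB_PASSWORD"],
    )
    with conn.cursor() as cursor:
        # Find connections that have been sleeping longer than the threshold.
        cursor.execute(
            "SELECT id FROM information_schema.processlist "
            "WHERE command = 'Sleep' AND time > %s",
            (IDLE_SECONDS,),
        )
        for (process_id,) in cursor.fetchall():
            cursor.execute("KILL %s", (process_id,))
    conn.close()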
These are also great guidelines to follow:
Ensure your Lambda concurrent executions limit does not exceed your database's maximum connections limit: this prevents the database from maxing out its connections
Reduce database timeouts (if supported): limit the amount of time that connections can be left open & idle. In MySQL, for example, lowering the wait_timeout variable from the default 28800 seconds (8 hours) to 900 seconds (15 minutes) can be a great start (see the sketch after this list)
Reduce the number of database connections: try your best to reduce the connections you need to make to the database via good application design & caching
If all else fails, look into increasing the max connections limit on the database
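For the timeout guideline above, a small sketch of how you could inspect & lower wait_timeout from code (SET GLOBAL requires the SUPER or SYSTEM_VARIABLES_ADMIN privilege; on RDS you would normally change this in the parameter group instead):
import os
import pymysql

conn = pymysql.connect(
    host=os.environ["DB_HOST"],
    user=os.environ["DB_USER"],
    password=os.environ["DB_PASSWORD"],
)
with conn.cursor() as cursor:
    cursor.execute("SHOW VARIABLES LIKE 'wait_timeout'")
    print(cursor.fetchone())                         # e.g. ('wait_timeout', '28800')
    cursor.execute("SET GLOBAL wait_timeout = 900")  # 15 minutes
conn.close()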

Related

AWS Lambda execution duration randomly spikes and causes time-outs

I'm building a server-less web-tracking system which serves its tracking pixel using AWS API Gateway, which calls a Lambda function whenever a tracking request arrives to write the tracking event into a Kinesis stream.
The Lambda function itself does not do anything fancy. It just takes the incoming event (its own argument) and writes it to the stream. Essentially, it's just:
import boto3

kinesis_client = boto3.client("kinesis")
kinesis_stream = "my_stream_name"

def return_tracking_pixel(event, context):
    ...
    new_record = ...(event)
    kinesis_client.put_record(
        StreamName=kinesis_stream,
        Data=new_record,
        PartitionKey=...
    )
    return ...
Sometimes I experience a weird spike in the Lambda execution duration that causes some of my Lambda function invocations to time-out and the tracking requests to be lost.
This is the graph of 1-minute invocation counts of the Lambda function in the affected time period:
Between 20:50 and 23:10 I suddenly see many invocation errors (1-minute error counts):
which are obviously caused by the Lambda execution time-out (maximum duration in 1-minute intervals):
There is nothing weird going on with either my Kinesis stream (data-in, number of put records, put_record success count, etc. all look normal) or my API GW (the number of invocations corresponds to the number of API GW calls, well within the limits of the API GW).
What could be causing the sudden (and seemingly randomly occurring) spike in the Lambda function execution duration?
EDIT: the Lambda functions are not being throttled either, which was my first idea.
Just to add my 2 cents, because there's not much investigative work possible without extra logging or some X-Ray analysis.
AWS Lambda will sometimes force-recycle containers, which will feel like cold starts even though your function is being reasonably exercised and kept warm. This can bring back all the cold-start-related issues, like extra delays for ENIs if your Lambda is attached to a VPC, and so on... but even for a simple function like yours, a 1-second timeout is sometimes too optimistic for a cold start.
I don't know of any documentation on those forced recycles, other than some people having evidence for it.
"We see a forced recycle about 7 times a day." source
"It also appears that even once warmed, high concurrency functions get recycled much faster than those with just a few in memory." source
I wonder how you could confirm this is the case. Perhaps you could check whether the errors appearing in the CloudWatch log streams come from containers that never appeared before.
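One hedged way to approximate that check with boto3 (the log group name below is a guess based on the handler name in the question): list the function's log streams and see whether brand-new streams, i.e. brand-new containers, were created right around the error spike:
import boto3

logs = boto3.client("logs")
# The log group name is a placeholder for this sketch.
response = logs.describe_log_streams(
    logGroupName="/aws/lambda/return_tracking_pixel",
    orderBy="LastEventTime",
    descending=True,
)
for stream in response["logStreams"]:
    # A creationTime close to the error spike suggests a fresh container.
    print(stream["logStreamName"], stream["creationTime"])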

Handle timeout in AWS API Gateway

I'm working on a project where I'm using a Lambda function to connect to a relational database and to DynamoDB at the same time. To access that function I'm using API Gateway, but I found a problem: my Lambda function, written in Java, takes more than 10 seconds to start due to the creation of both database connections.
I know the API Gateway timeout is 10 seconds, and that's not a problem when executing my function, which takes less than 1 second; the problem is when it has to start.
I would like to know how to catch this timeout exception and notify the user that they need to retry the request.
Is there a way to do so without moving to Node.js or accessing lambda function directly?
Since the cost of establishing a connection to a relational database is so high, I would encourage you to open the connection in the initialization code of your Lambda function (outside of the handler).
The database connection will then be re-used across multiple invocations for the lifetime of the Lambda container. Within your Lambda function handler you may want to ensure the connection is alive and hasn't timed out, and re-open as required.
The first call through API Gateway may timeout, but subsequent calls will reuse the connection for the lifetime of the container.
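The question is about Java, but here is a minimal sketch of that check-and-reopen pattern in Python with pymysql, whose ping(reconnect=True) re-opens a dead connection in place (the environment-variable names are placeholders):
import os
import pymysql

# Opened once during container initialization; reused across warm invocations.
connection = pymysql.connect(
    host=os.environ["DB_HOST"],
    user=os.environ["DB_USER"],
    password=os.environ["DB_PASSWORD"],
)

def handler(event, context):
    # Re-establish the connection if the database closed it while idle.
    connection.ping(reconnect=True)
    with connection.cursor() as cursor:
        cursor.execute("SELECT NOW()")
        return str(cursor.fetchone()[0])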
Another trick is to create a scheduled function to periodically call your function to keep the container "warm".
Cheers,
Ryan

How we can use JDBC connection pooling with AWS Lambda?

Can we use JDBC connection pooling with AWS Lambda? Since an AWS Lambda function gets called on a specific event, does its lifetime persist even after it finishes one of its calls?
No. Technically, you could create a connection pool outside of the handler function, but since you can only make use of a single connection per invocation, all you would be doing is tying up database connections and allocating a pool of which you could only ever use one.
After uploading your Lambda function to AWS, the first time it is invoked AWS will create a container and run the setup code (the code outside of your handler function that creates the pool- let's say N connections) before invoking the handler code.
When the next request arrives, AWS may re-use the container again (or may not. It usually does, but that's down to AWS and not under your control).
Assuming it reuses the container, your handler function will be invoked (the setup code will not be run again) and your function will use one of the N connections from the pool (held at the container level). This will most likely be the first connection in the pool, since it is guaranteed not to be in use - it's impossible for two invocations to run at the same time within the same container. Read on for an explanation.
If AWS does not reuse the container, it will create a new container and your code will allocate another pool of N connections. Depending on the turnover of containers, you may exhaust the database pool entirely.
If two requests arrive concurrently, AWS cannot invoke the same handler at the same time. If this were possible, you'd have a shared state problem with the variables defined at the container scope level. Instead, AWS will use two separate containers and these will both allocate a pool of N connections each, i.e. 2N connections to your database.
It's never necessary for a single function invocation to use more than one connection (unless, of course, you need to communicate with two independent databases within the same context).
The only time a connection pool would be useful is if it were at one level above the container scope, that is, handed down by the AWS environment itself to the container. This is not possible.
The best case you can hope for is to have a single connection per container. Even then you would have to manage this single connection to ensure the database server hasn't disconnected or rebooted. If it has, your container's connection will die and your handler will never be able to connect again (until the container dies), unless you write some code in your function to check for dropped connections. On a busy server, the container might take a long time to die.
Also keep in mind that if your handler function fails, for example halfway through a transaction or while holding a table lock, the next invocation will get the dirty connection state from the container. The first invocation may have opened a transaction and died; the second invocation may commit and include all the previous queries up to the failure.
I recommend not managing state outside of the handler function at all, unless you have a specific need to optimise. If you do, then use a single connection, not a pool.
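One defensive sketch against that dirty-state problem (Python and pymysql here for brevity; the counters table is purely illustrative): roll back at the top of every invocation so a previous invocation's half-finished transaction can never bleed into the current one.
import os
import pymysql

connection = pymysql.connect(
    host=os.environ["DB_HOST"],
    user=os.environ["DB_USER"],
    password=os.environ["DB_PASSWORD"],
    autocommit=False,
)

def handler(event, context):
    connection.ping(reconnect=True)  # recover from a dropped connection
    connection.rollback()            # discard any state left by a failed invocation
    with connection.cursor() as cursor:
        # 'counters' is an illustrative table, not part of the original answer.
        cursor.execute("UPDATE counters SET n = n + 1 WHERE id = %s", (1,))
    connection.commit()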
Yes, the lambda is mostly persistent, so JDBC connection pooling should work. The first time a lambda function is invoked, the environment will be created and it may or may not get reused. But in practice, subsequent invocations will often reuse the same lambda process along with all program state if your triggering events occur often.
This short lambda function demonstrates this:
package test;

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

public class TestLambda implements RequestHandler<String, String> {

    private int invocations = 0;

    public String handleRequest(String request, Context context) {
        invocations++;
        System.out.println("invocations = " + invocations);
        return request;
    }
}
Invoke this from the AWS console with any string as the test event. In the CloudWatch logs, you'll see the invocations number increment each time.
Kudos to AWS RDS Proxy: you can now use pooled MySQL and PostgreSQL connections without any Lambda-specific configuration in your Java (or any other) code. All you need to do is create a database proxy and add it to the AWS Lambda function whose connections you want to reuse/pool. See the how-to here.
Note: AWS RDS Proxy is not included in the Free Tier (more here).
It has caveats:
There is no destroy method that ensures the pool is closed; one might argue that the database's connection idle timeout would handle this.
What if the same database is also used for other use cases, such as a pool maintained on a regular machine like EC2?
As many have said, a sudden spike in requests can create chaos for the database, since there will always be some maximum connection setting on the database side per user.

Is it possible to make an HTTP request from one Lambda function, and handle the response in another?

AWS Lambda functions are supposed to respond quickly to events. I would like to create a function that fires off a quick request to a slow API, and then terminates without waiting for a response. Later, when a response comes back, I would like a different Lambda function to handle the response. I know this sounds kind of crazy, when you think about what AWS would have to do to hang on to an open connection from one Lambda function and then send the response to another, but this seems to be very much in the spirit of how Lambda was designed to be used.
Ideas:
Send messages to an SQS queue that represent a request to be made. Have some kind of message/HTTP proxy type service on an EC2 / EB cluster listen to the queue and actually make the HTTP requests. It would put response objects on another queue, tagged to identify the associated request, if necessary. This feels like a lot of complexity for something that would be trivial for a traditional service.
Just live with it. Lambda functions are allowed to run for 60 seconds, and the API calls that I make don't generally take longer than 10 seconds. I'm not sure how costly it would be to have LFs spend 95% of their running time waiting on a response, but "waiting" isn't what LFs are for.
Don't use Lambda for anything that interacts with 3rd party APIs that aren't lightning fast :( That is what most of my projects do these days, though.
It depends on how many calls this Lambda will execute monthly and how much memory you allocate for it. The new timeout for Lambda is 5 minutes, which should (hopefully :p) be more than enough for an API to respond. I think you should let Lambda deal with all of it rather than over-complicate the workflow. Lambda pricing is generally really cheap.
E.g.: a Lambda executed 1 million times with 128 MB allocated, running for 10 seconds each time, would cost approximately $20 - and that's without considering the potential free tier.
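The arithmetic behind that estimate, as a sketch (the per-GB-second and per-request rates are the published us-east-1 prices at the time and may differ by region):
invocations = 1_000_000
memory_gb = 128 / 1024                              # 0.125 GB
duration_s = 10

gb_seconds = invocations * memory_gb * duration_s   # 1,250,000 GB-seconds
compute_cost = gb_seconds * 0.0000166667            # ~ $20.83
request_cost = invocations * 0.20 / 1_000_000       # $0.20
print(round(compute_cost + request_cost, 2))        # ~ 21.03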

AWS Lambda Hot and Cold Start

Hello, I am new to AWS Lambda. I want to know what we mean by a hot Lambda function (hot start) and a cold Lambda function (cold start). Can anyone please explain in detail what the difference is between a hot Lambda and a cold Lambda?
After uploading your code, or after periods of inactivity, your Lambda is shut down or "cold". When a new event comes in, there is a brief moment where Lambda spins up a new instance of your code - this includes whatever initialization AWS does to start up the "container" as well as the initialization of the code that you uploaded.
So an event that is able to hit an initialized ("hot") Lambda will in theory be processed faster than one hitting a cold one. There isn't a guarantee on how long a Lambda will stay hot after the last event, but it could be as long as 5 minutes.
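You can observe the difference yourself with a tiny handler (a sketch in Python): a module-level flag is set during initialization, so the first invocation in each container reports a cold start and subsequent ones report warm.
import time

# Runs once per container, during the cold start.
init_time = time.time()
is_cold_start = True

def handler(event, context):
    global is_cold_start
    start_type = "cold" if is_cold_start else "warm"
    is_cold_start = False
    return {"start": start_type, "container_age_s": round(time.time() - init_time, 1)}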
It's a common belief that when people refer to "warm starts" they mean that the same container/sandbox is ready to receive a new invocation - but that's not accurate.
Warm start - invoking a function using a warm container with a prebaked, unused sandbox; resources from previous invocations are not recycled.
Cold start - invoking a function when no container/sandbox is ready to receive the request, so a new container must be created and the runtime and user code loaded. Cold-start latency is mostly an internal metric; externally, cold starts are only one part of the total overhead that can affect the end-user experience. In some scenarios we encounter only a portion of the full cold start - think of scale prediction and statistical algorithms.
However, using the terms "cold start" and "warm start" might be misleading; as a developer you should care about "invocation overhead" - the time it takes to call the user's function and return the response.