I'm trying to find a good architecture for connecting to the database. I want to avoid repeating the database connection code in every Lambda function; connecting from each Lambda individually would also create many connections instead of one shared connection. Can I implement the structure shown in the figure below, where one Lambda connects to the database and all the other Lambdas use its connection from their own code?
Your proposed architecture will not work as intended. Unless your DB Lambda is invoked frequently enough to always stay warm, and you cache the connection (for example in a variable declared outside the handler) for reuse on subsequent invocations, the DB Lambda will create a new connection for each invocation. Moreover, if simultaneous requests cause Lambda to spin up multiple containers of the DB Lambda, you will end up with that many connections anyway, instead of just one.
The ideal solution would be to replace the DB Lambda with a tiny EC2 instance.
The DB connection can be cached in your "DB Lambda" as long as the Lambda stays warm. If it does not stay warm, invoking it pays the price of a cold start, which may be dominated by having to recreate the DB connection, or by whatever other work "DB Lambda" does.
How frequently you expect your Lambdas to go cold is something to take into account, and it depends on the statistics of your incoming traffic. Another consideration is whether you are willing to suffer the latency of recreating a DB connection once in a while.
Managing a tiny EC2 instance, as someone else suggested, could be a lot of extra work depending on whether your cloud service is a complex set of backend services and whether that service shuts down during periods of inactivity. Managing EC2 instances is more work than managing Lambdas.
I do see one potential problem with your architecture. If for whatever reason your "DB Lambda" fails, the calling Lambda won't know. That could be a problem if you need to handle that situation and do cleanup.
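For illustration, here is a minimal sketch of the caching pattern described above, written in TypeScript with the pg client (the event shape and environment variable names are placeholders of mine, not something from the question): the client lives outside the handler, so a warm invocation of the "DB Lambda" reuses it, and only a cold start pays for reconnecting.

```typescript
import { Client } from "pg";

// Cached in the execution environment; survives warm invocations of this "DB Lambda".
let client: Client | undefined;

async function getClient(): Promise<Client> {
  if (!client) {
    client = new Client({
      host: process.env.DB_HOST,
      user: process.env.DB_USER,
      password: process.env.DB_PASSWORD,
      database: process.env.DB_NAME,
    });
    await client.connect(); // paid only on a cold start
  }
  return client;
}

// Hypothetical event shape: the calling Lambdas pass the query they want executed.
export const handler = async (event: { sql: string; params?: unknown[] }) => {
  const db = await getClient();
  // Real code would also handle the case where the cached connection has gone
  // stale (e.g. dropped by the database) and reconnect before retrying.
  const result = await db.query(event.sql, event.params);
  return result.rows;
};
```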
The Postgres query builder my Lambda functions use, Knex, uses prepared statements, so I'm unable to fully take advantage of RDS Proxy because the sessions get pinned. I try to ensure that the Lambdas run for as little time as possible so that the pinned session completes as quickly as possible and its connection is returned to the pool.
I was wondering how I might make the sessions shorter and more granular, and I'm thinking about creating and closing a connection to AWS RDS Proxy with each query.
What performance considerations should I take into account to determine the viability of this approach?
Things I'm thinking of:
RDS Proxy connection overhead (latency and memory)
The time that RDS Proxy takes to return a closed connection back to the pool and make it reusable by others (haven't been able to find documentation on this)
Overhead of Knex's local connection pool
Using RDS Proxy when building applications with Lambda functions is an infrastructure pattern recommended by AWS. Relational databases are not built to handle huge numbers of connections, while Lambdas can scale to thousands of instances.
RDS Proxy connection overhead (latency and memory)
This would definitely increase your latency, but you will see a great improvement in the CPU and memory usage of your database, which ultimately prevents unnecessary failures. It's a good trade-off when you can make a lot of other optimizations on the Lambda side.
The time that RDS Proxy takes to return a closed connection back to the pool and make it reusable by others (haven't been able to find documentation on this)
While working with Lambdas, you should drop the connection to your RDS Proxy as soon as you finish processing your logic, without worrying about the time RDS Proxy takes to return the closed connection to the pool. Once the connection is dropped, RDS Proxy keeps it warm in the pool of connections it maintains for a certain duration. If another Lambda tries to connect in the meantime, it can reuse the same connection that is still warm in the pool. Dropping the database connection at the right time saves Lambda processing time, and therefore money.
Overhead of Knex's local connection pool
I would suggest not relying on Knex's local connection pool with Lambda, as it won't do any good (keep the pool max at 1). Every Lambda execution is independent of the others, so the pool will never be shared, and the connection doesn't persist after the execution completes, unless you plan to use it with something like serverless-offline as a local framework for development purposes.
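As an illustration only (the endpoint, credentials, and table name below are placeholders, not from the question), this is roughly what that looks like with Knex pointed at the RDS Proxy endpoint: no local pooling beyond a single connection, and the instance destroyed as soon as the work is done so the pinned session goes back to the proxy.

```typescript
import knex, { Knex } from "knex";

// Builds a fresh Knex instance per invocation, pointed at the RDS Proxy endpoint.
function buildDb(): Knex {
  return knex({
    client: "pg",
    connection: {
      host: process.env.PROXY_ENDPOINT, // RDS Proxy endpoint, not the cluster endpoint
      user: process.env.DB_USER,
      password: process.env.DB_PASSWORD,
      database: process.env.DB_NAME,
    },
    pool: { min: 0, max: 1 }, // no real local pooling; RDS Proxy does the multiplexing
  });
}

export const handler = async (event: { id: string }) => {
  const db = buildDb();
  try {
    return await db("items").where({ id: event.id }).first();
  } finally {
    await db.destroy(); // hand the (possibly pinned) session back to the proxy immediately
  }
};
```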
Read More about AWS Lambda + RDS Proxy usage: https://aws.amazon.com/blogs/compute/using-amazon-rds-proxy-with-aws-lambda/
AWS Documentation on where to use RDS Proxy: https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/rds-proxy-planning.html
I have a live production system with a Google Cloud SQL Postgres instance. The application will soon undergo a long-running database schema modification to accommodate a change to the way the business operates. We have a deployment plan that will allow the business to continue operating during the schema change: it essentially pauses replication to our read replica and queues up API requests that would mutate the database, for replay after the schema change is complete. Once the deployment is complete, the last step is to un-pause replication. But while the read replica is catching up, the schema changes will lock tables, causing a lot of failing read requests. So before we un-pause read replication, we're going to divert all API database queries to the main instance, which will have just finished the schema changes. So far so good, but I can't find a way to programmatically tell when the read replica has finished catching up, so that we can again split our DB queries, with writes going to the main instance and reads going to the replica.
Is there a Pub/Sub topic or metric stream our application could subscribe to that would fire when replication catches up? I would also be happy with something that reports replication lag as a transaction count (or time) that the application could receive, so that when the trailing average drops below a threshold it switches back to reading from the replica. The least desirable but still acceptable option would be continuous polling of an API or metric stream.
I know I can do this directly by querying the replica database itself for replication status, but that means we have to implement custom traffic directing in our application. Currently the framework we use allows us to route DB traffic in config. I know there should be metrics that are available from CloudSQL, but I cannot find them.
I know this doesn't fully answer your question, but maybe you will be able to use it. It seems you might be interested in Cloud Monitoring and this metric:
database/mysql/replication/seconds_behind_master
According to the reference, it reflects the lag of the replica behind the master.
Either that or database/replication/replica_lag should work. I don't think you can get this through Pub/Sub, though. In any case, you should take a look at the reference, as it contains all the available metrics.
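If polling turns out to be acceptable, a rough sketch of reading that metric through the Cloud Monitoring API (Node.js client, shown in TypeScript; the project and instance identifiers are placeholders, and the exact value field may differ depending on the metric's value type) could look like this:

```typescript
import { MetricServiceClient } from "@google-cloud/monitoring";

const client = new MetricServiceClient();

// Returns the most recent replica-lag sample (in seconds) for one Cloud SQL instance.
// projectId and databaseId ("my-project:my-replica") are placeholders.
async function replicaLagSeconds(
  projectId: string,
  databaseId: string
): Promise<number | undefined> {
  const nowSeconds = Math.floor(Date.now() / 1000);
  const [series] = await client.listTimeSeries({
    name: client.projectPath(projectId),
    filter:
      'metric.type = "cloudsql.googleapis.com/database/replication/replica_lag" ' +
      `AND resource.labels.database_id = "${databaseId}"`,
    interval: {
      startTime: { seconds: nowSeconds - 300 }, // look back 5 minutes
      endTime: { seconds: nowSeconds },
    },
    view: "FULL",
  });
  // Points come back newest-first; adjust the value field if your metric reports int64Value.
  const latest = series[0]?.points?.[0];
  return latest?.value?.doubleValue ?? undefined;
}
```

The application could poll this on an interval and flip read traffic back to the replica once the trailing average drops below your threshold.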
I am going to describe my needs and what I currently have in place, so bear with me. First, there is a Lambda function, say F1, which when invoked gets 100 links from a site. Most of these links, say about 95, are the same as when F1 was invoked the previous time, so further processing must be done only for the roughly 5 "new" links. One solution was to write the already-processed links to a DynamoDB table and, each time F1 is invoked, query the table and skip those links. But I found that the database read, although only milliseconds, was doubling the Lambda runtime, and this can add up, especially if F1 is called frequently and there are, say, a million processed links. So I decided to use ElastiCache with Redis.
I quickly found that Redis can be accessed only when F1 runs in the same VPC, and because F1 also needs access to the internet, you need a NAT gateway (I don't know much about networking). So I followed the guidelines, set up the VPC and NAT, and got everything working. I was delighted with the performance improvements, which cut the expected Lambda cost almost in half, to about $30 per month. But then I found that NAT is not included in the free tier, and I have to pay almost $30 per month just for the NAT gateway. This is not ideal for me, as this project could be in development for months, and I feel like I am paying as much for internet access as for compute.
I would like to know if I am making any fundamental mistakes. Am I using ElastiCache in the right way? Is there a better way to access both Redis and the internet? Is there any way to structure my stack differently so that I retain the performance without essentially paying twice as much after the free tier ends? Maybe add another Lambda function? I don't have any ideas. Even small improvements are much appreciated. Thank you.
There are many ways to accomplish this, and all of them have some trade-offs. A few other ideas for you to consider:
Run F1 without a VPC. It will have direct connectivity to DynamoDB without needing a NAT, saving you the cost of the NAT gateway (see the sketch after this list).
Run your function on a micro EC2 instance rather than in Lambda, and persist your link lookups to a file on local disk, or even a local Redis. With all the serverless hype, I think people sometimes overestimate the difficulty (and underestimate the stability) of simply running an OS. It's not that hard to manage, it's easy to set up backups, and it may be an option depending upon your availability requirements and other needs.
Save your link data to S3 and set up an S3 gateway VPC endpoint. Not sure if it would be fast enough for your needs.
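To make the first option concrete, here is a rough sketch (TypeScript, AWS SDK v3; the table and attribute names are placeholders) of checking all ~100 links in a single DynamoDB BatchGetItem round trip instead of one read per link, which is the part that tends to dominate the Lambda runtime:

```typescript
import { DynamoDBClient, BatchGetItemCommand } from "@aws-sdk/client-dynamodb";

const ddb = new DynamoDBClient({});

// Returns only the links that are not already in the (hypothetical) ProcessedLinks table.
// BatchGetItem accepts up to 100 keys, which matches the ~100 links per run.
async function filterNewLinks(links: string[]): Promise<string[]> {
  const unique = [...new Set(links)]; // BatchGetItem rejects duplicate keys
  const { Responses } = await ddb.send(
    new BatchGetItemCommand({
      RequestItems: {
        ProcessedLinks: {
          Keys: unique.map((url) => ({ url: { S: url } })),
          ProjectionExpression: "#u",
          ExpressionAttributeNames: { "#u": "url" },
        },
      },
    })
  );
  // Production code should also retry any UnprocessedKeys returned by DynamoDB.
  const seen = new Set((Responses?.ProcessedLinks ?? []).map((item) => item.url?.S));
  return unique.filter((link) => !seen.has(link));
}
```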
I am accessing an Aurora database from my Java Lambda code, and I have set my Lambda concurrency to 1.
Since creating/closing a database connection is an expensive process, I have created the MySQL connection and made it static, so the same connection is reused every time. I haven't added any code to close the connection.
Will it cause any problems?
Will it automatically close after some days?
Most certainly yes! When your Lambda "cools down", its connection to the database will be broken. The next time you invoke your Lambda, it goes through a cold start, and your Lambda code should initialize the connection again. This is a standard issue when working with persistent connections from serverless infrastructure.
What you need is something like a REST API for your data access, which is something Aurora Serverless supports, in beta, as the Data API.
https://aws.amazon.com/about-aws/whats-new/2018/11/aurora-serverless-data-api-beta/
Each request is an independent HTTP request, so you don't end up managing persistent connections yourself.
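A rough sketch of what a handler looks like against the Data API, shown here in TypeScript with AWS SDK v3 rather than Java purely for illustration (the ARNs, database name, and query are placeholders):

```typescript
import { RDSDataClient, ExecuteStatementCommand } from "@aws-sdk/client-rds-data";

const rds = new RDSDataClient({});

// Every statement is a plain HTTPS call: nothing to open, keep static, or close.
export const handler = async () => {
  const result = await rds.send(
    new ExecuteStatementCommand({
      resourceArn: process.env.CLUSTER_ARN, // Aurora Serverless cluster ARN
      secretArn: process.env.SECRET_ARN,    // Secrets Manager secret with DB credentials
      database: process.env.DB_NAME,
      sql: "SELECT id, name FROM users LIMIT 10",
    })
  );
  return result.records;
};
```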
I'm trying to build an application on AWS that is 100% serverless (minus the database for now), and what I'm running into is that the database is the bottleneck. My application can scale very well, but my database has a finite number of connections it can accommodate, and at some point my Lambdas will run into that limit. I can do connection pooling outside of the handler in my Lambdas so that there is a database connection per Lambda container instead of per invocation, and while that does increase the number of concurrent invocations before I hit the connection limit, the limit still exists.
I have two questions.
1. Does Serverless Aurora solve this by autoscaling to increase the number of instances and meet the need for more connections?
2. Are there any other solutions to this problem?
Also, a question for other developers interested in serverless: am I trying to do something that's not worth doing? I love how easy deployment is with the Serverless Framework, but is it better just to build microservices on something like Kubernetes instead?
I believe there are two potential solutions to that problem:
The first and simplest option is to take advantage of the Lambda "hot state", i.e. the fact that Lambda reuses the execution context for subsequent invocations. As AWS suggests:
Any declarations in your Lambda function code (outside the handler code, see Programming Model) remains initialized, providing additional optimization when the function is invoked again. For example, if your Lambda function establishes a database connection, instead of reestablishing the connection, the original connection is used in subsequent invocations. We suggest adding logic in your code to check if a connection exists before creating one.
Basically, while the Lambda function is in the hot state, it "might/should" reuse the opened connection(s).
The limitations of this approach:
you only reuse a connection within a single Lambda function, so if you have 5 Lambda functions being invoked all the time, you will still be using 5 connections
when you have a spike in Lambda invocations, including parallel executions, this approach becomes less effective, since the Lambda will be executed in a new execution context for the majority of requests
The second option would be to use a connection pool: a pool of established database connections that can be reused when future requests to the database are made.
While the second option provides a more consistent solution, it requires much more infrastructure.
You would need to run a separate instance for the pooler, and if you want to do things properly, probably at least two instances and a load balancer (unless you use containers).
While it might seem overwhelming to provision that much additional infrastructure just for a connection pooler, it can still be a valid option depending on the scale of the project, your existing infrastructure (maybe you are already using containers), and the cost benefits.
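To illustrate what the Lambda side looks like once an external pooler is in place (whether that is a self-managed pooler or a managed one like RDS Proxy from the earlier answer; the host, port, and table names below are placeholders), the function connects to the pooler endpoint exactly as if it were the database, and closing the connection only tears down the cheap client-to-pooler link rather than a real database connection:

```typescript
import { Client } from "pg";

// The pooler holds the small set of real database connections; the Lambda only
// ever talks to the pooler endpoint. POOLER_HOST / POOLER_PORT are placeholders.
export const handler = async (event: { id: string }) => {
  const client = new Client({
    host: process.env.POOLER_HOST, // pooler endpoint, not the DB endpoint
    port: Number(process.env.POOLER_PORT ?? 5432),
    user: process.env.DB_USER,
    password: process.env.DB_PASSWORD,
    database: process.env.DB_NAME,
  });
  await client.connect();
  try {
    const { rows } = await client.query("SELECT * FROM items WHERE id = $1", [event.id]);
    return rows[0];
  } finally {
    await client.end(); // cheap: only the client-to-pooler connection is closed
  }
};
```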
AWS best practices recommend taking advantage of the hot start. You can read more about it here: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.Lambda.BestPracticesWithDynamoDB.html