Conceptual question regarding scaling REST API - amazon-web-services

I have a general question regarding how APIs are scaled. I have a basic RESTful API powered by Django rest framework, the backend uses RDS for database management. Right now, I'm deploying my django application to a digitalocean droplet but thinking of switching over to EC2 or potentially EKS. Is my understanding correct that I can effectively point my application to the RDS endpoint and spin up several EC2 instances with the same Django application fronted by an ELB? Would this take care of incoming traffic and the scalability of the django application?
This isn't exactly a coding question so I'm not sure if this is the best stackexchange site to ask this.

My two cents here:
I`ve been using lambda to serve Django and Flask apis for quite some time now, and it works great. You don't need to worry about scalability at all, unless there is a chance that your API would receive more than 10,000 requests per second (very unlikely on most scenarios). It will be way cheaper than EKS, even cheaper than EC2. I have a app with 400k active users which is served by an API running on lambda, I never paid more than $25 on invokations.
You can use Zappa (which is exclusively for python, I recommend) or Serverless framework, they will take care of most of the heavy work and make the deployment very easy.
But have in mind that lambda is not very good for long running tasks, like cronjobs. If you have have crons that might take some time to be executed your invokations can get a little expensive if you invoke it ofteen (lambdas can run up to 15 minutes, but those 15 minutes will be much more expensive than EC2). Also, the apigateway in front of the lambda function have a 30 seconds timeout, so your requests must be processed before that. If you think your requests will take longer, you will need to leverage some async requests. I think it is a very small price to have a full service without having to worry about the infrastructure.

You are right, but you can think not only about ec2 and EKS. You can also look into ECS and Fargate options. ELB distribute traffic across compute resources inside Target Group and it can be Autoscaling Group for EC2. Also, with RDS you can scale read replicas for handling mor read traffic independent from master node

Related

Optimizing latency between application server (EC2) and RDS

here's how the story goes.
We started transforming a monolith, single-machine, e-commerce application (Apache/PHP) to cloud infrastructure. Obviously, the application and the database (MySQL) were on the same machine.
We decided to move to AWS. And as the first step of transformation, we decided to split the database and application. Hosting application on a c4.xlarge machine. And hosting database to RDS Aurora MySQL on a db.r5.large machine, with default options.
This setup performed well. Especially the database performance went up high.
Unfortunately, when the traffic spiked up, we started experiencing long response times. Looked like RDS, although being really fast for executing queries, wasn't returning results fast enough over the network to the EC2 machine.
So that was our conclusion after an in-depth analysis of the setup including Apache/MySQL/PHP tuning parameters. The delayed response time was definitely due to the network latency between EC2 and RDS/Aurora machine, both machines being in the same region.
Before adding additional resources (ex: ElastiCache etc) we'd first like to look into any default configuration we can play around with to solve this problem.
What do you think we missed there?
One of the bigest strength with the cloud is the scalability and you should always design your application to utilise it and it sounds like your RDS instance is getting chocked due to nr of request more than the process time for the queries. So rather go more small instances with load balancing than one big doing all the job. And with Load Balancers you will get away from a singel point of failure due to you can have replicas of your database and they can even be placed in different AZ.
Here is a blogpost you can read on the topic:
https://aws.amazon.com/blogs/database/scaling-your-amazon-rds-instance-vertically-and-horizontally/
Good luck in your aws journey.
The Best answer to your question is using read replicas, but remember only your read requests could be sent to your read replicas so you would need to design your application that way
Also for some cost savings, you should try aurora serverless
One more option is passing traffic between ec2 and rds through a private network rather than using the public internet to connect your ec2 to rds that can be one of the mistakes that might be happening

AWS: How often Lambda face Cold Starts in production?

My company is developing a game using aws and I'm only backend developer.
Since I'm alone, I thought that going serverless would be easiest and safest solution for me.
I'm using .net core Lambdas with dynamodb as database as backend stack.
However, whenever cold start occurs response time of my login function takes around 16s 20s since dynamodb client initialization takes long. I load only 2mb data from db.
For development purposes, I have scheduled a lambda to avoid cold starts and I also know that I can provision some ready instances. But I think that scheduled calls will provision only one container and in production need will be a lot more. My company is also an advertising company so we expect a lot of users.
How much percentage of calls might face cold starts in production? Is is more then %20?
Also, I'm using terraform to manage infrastructure and I'm using route 53 + regional apigateways + custom domains setup to have multiple region latency based routing.
Would you recomment me to go beanstalk in stead of lambda? Its first time for me to create such environment that should be highly scalable.
Thank you
Kind regards

What is best way to performance and load test AWS cloud applications

Want to keep this question generic and expected answer in terms of best practice/approach/guidelines,
We need to know the best way to performance test and load test AWS cloud based applications.
What we have tried:
We used Gatling and Jmeter to execute our performance tests. These frameworks are pretty useful to test our functionality and to benchmark our applications latency and request rate.
Problem:
Performance benchmarks and limits of AWS managed services like Lambda and DDB are already specified by AWS e.g. Lambda concurrency behavior and DDB autoscaling under load etc. AWS also provides high availability and guaranteed performance of managed services.
Is it really worth executing expensive performance test and load test jobs for AWS managed services?
How to ensure that we are testing our application and not actually testing AWS limits which are already known.
What is the best practice and approach to performance test cloud based applications.
Any suggestions will help tremendously.
Thanks,
It depends on how you can use the AWS infrastructure
If you don't use Auto Scaling you can treat the AWS cloud-based application as a "normal" application just deployed not on your company premises, but somewhere at Amazon
If you use Auto Scaling you might want to come up with a some form of scalability testing. AWS instances can scale up in order to adapt to the increased load but your application must be able to scale as well. So you can test:
how fast are scale-up and scale-down processes, i.e. if you rapidly increase the load do AWS/your application react fast enough or there will be performance drop
scalability factor. For example with 100 virtual users you have 50 requests per second. Ideally with 200 virtual users you should have 100 requests per second, with 300 virtual users - 150 requests per second. Response time should remain the same. But normally the situation differs from the "ideal" so it would be good to know this scalability factor.
Also if your application is behind the AWS ELB you will need to add DNS Cache Manager to your JMeter test plan otherwise you will be hitting only one backend instance while others will be idle.

AWS: How to deploy docker instance near instantly on ec2 closest to my customer?

I have a basic docker image containing a python script that comes in under 100mb. I'm not sure what distro I'm going to use but preferably one that results in the smallest file size as possible.
The goal is to deploy a docker image on t2.nano ec2 instance but it must meet the following conditions:
from the time a customer requests access via URL, it should respond as quickly as possible, preferably under a few seconds.
the latency between the customer and newly deployed docker ec2 instance should be as small as possible, meaning ec2 on the closest availability region.
Is this possible?
It's not possible to deploy an EC2 instance in under a few seconds, especially not t2.nano type instances. EC2 instances follow the same rules as physical compute resources, thus larger instance types boot faster, and t2.nano is currently the smallest/least powerful/slowest instance size. That said, even the fastest instances would take a minute or so to be provisioned and fully booted.
It sounds like you should look into using AWS Lambda. It's their proprietary, managed, containerized compute resource service, and designed for the kind of workflow you are describing. You don't manage the containers themselves, but deploy the code (of which Python is a supported language) and its dependent libraries to the service, and it handles launching it in a container on demand, with overhead in the sub-second range.
Note that it is not designed for hosting "websites" directly, if that's your intention. Lambda functions are invoked by other AWS services, one of which is API Gateway, which would be the best route in providing a public interface to your Lambda functions. This could be used in conjunction with something like S3 static website hosting to provide the building blocks for a "serverless" web application.
As for your second question, Route 53 does support latency based routing, but I don't believe it supports API Gateway endpoints as targets yet. So if global latency is a big concern, your best bet may be to deploy a few full-time EC2 instances around the world and use latency based routing. If it's mainly static assets you're worried about, CloudFront can cache these at edge locations, as an alternative.

EC2 Architecture design for Website

I have a site that I will be launching soon. Not entirely sure how heavy the traffic will get.
I am using Django+Nginx+Gunicorn+Mysql. There will be support for SSL/HTTPS.
As a starting point, I am thinking of having two micro instances balanced by Elastic Load Balancing.
The MySql database will be on one of the instances. If traffic gets heavy, I might move static files to a CDN. The micro instances serve as front-end servers responsible for only churning out HTML/JSON and serving static files. Static files are mainly CSS/js and several images (not many). I foresee database will be read-heavy and less writes.
Questions:
Assuming the traffic rises to 100k page views per day, will the 2 micro instances suffice?
Do I have to move the database to a separate instance? And what instance type would be good?
What if the traffic is only 1k page views per day?
How many gunicorn processes to run on a micro instance?
In general, what type of metrics will help me determine what kind and how many instances I would need? What is the methodology to decide what kind of architecture I would need?
Thanks a lot!
Completely dependant on how dynamic the site is planning to be. Do users generate content towards the service or is it mostly static? If the former you're going to get a lot from putting stuff like avatars, images etc. into S3 and putting that on Cloudfront. Same with your static files... keeping your servers stateless will allow you scale with ease.
At 100k page views a day you will definitely struggle with just micros... you really should only use those in a development environment and aren't meant to handle stuff like serving users. I'd use at a minimum a single small instance in-front of a Load Balancer, may sound strange but you will be able to throw in another instance when things get busy without having to mess with Route 53 or potentially having your site fail. The stateless stuff is quite important now as user-generated assets may only be reference able from one instance and not the other.
At 1k page views I'd still use a small for web serving and another small for MySQL. You can look into RDS which is great if you're doing this solo, forget about needing to upgrade versions and stuff like maintenance, backups etc.
You will also be able to one-click spin up read replicas for peak. Look into the Amazon CLI as well to help automate those tasks. Cronjobs will make it a cinch if you're time stressed otherwise Opsworks, Cloudformation and Auto-Scaling will all help with the above.
Oh and just as a comparison, an Application server of mine running Apache, PHP with APC to serve our users starts to struggle with about 80 concurrent users. Runs on a small EC2 Instance with a Small RDS (which sits at about 15% at the same time as the Application Server is going downhill)
Probably not. Micro instances are not designed for heavy production loads. They use a burstable CPU profile. They can run at 2 ECU for a couple of minutes, and then they get locked at 0.1-0.2 ECU. I tend to like c1.medium, but small may be enough for you.
Maybe, as long as they are spread out during the day and not all in a short window.
1-2 per core. Micro only has 1 core.
Every application is different. The best thing to do is run your own benchmarks using tools like ab (Apache Bench)
Following the AWS best practices architecture diagram is always a good start.
http://media.amazonwebservices.com/architecturecenter/AWS_ac_ra_web_01.pdf
I strongly advise you to store all your files on Amazon S3, and use a Route 53 DNS (or any other DNS if you want) in front of it to distribute the files, because later on if you decide to use CloudFront CDN it will be very easy to change. And, just to mention using CloudFront as CDN will increase your cost only a little bit, not a huge thing.
Doesn't matter the scenario, if we're talking a about production, you should definitely go for separate instances, at least 1 EC2 for web and 1 EC2/RDS for database.
If you are geek and like to get into the nitty gritty details, create your own infrastructure and feel free to use any automation tool (puppet, chef) or not. Or if you just want to collect the profit, or have scarce resources to take care of everything, you should try Elastic Beanstalk (http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/create_deploy_Python_django.html)
Anyway, going to create your own infrastructure or choose elastic beanstalk, always execute stress tests to have a better overview of your capacity planning needs. After you choose your initial environment, stress it using apache bench, siege or whatever other tool you may like.
Hope this helps.
I would suggest to use small instances instead of micro as micro instances often stop responding on heavy load and then it requires a stop-start. Use s3 for static files which helps in faster loading and have a look over cloud front.
The region for instance also helps in serving requests and if you target any specific region, create the instance selecting that region.
Create the database in new instance and attach ebs volume to that instance. Automate backup script to copy database files and store in ebs to avoid any issues. The instance selected here can be iops for faster processing over standard. Aws services provide lot of flexibility but you need to have scripts running to scale up and down the servers as per the timings.
Spot instance can help in future as they come cheap in case you are scaling up.