Load testing workflow for POST API calls - amazon-web-services

I have a few questions about server-cost estimations.
How do you decide what type of instance is required for X number of concurrent users? Is it based purely on experience, or is there a rule of thumb that you follow?
I was using JMeter for load testing, and I was wondering, how do you test POST APIs with separate data for each user? Or is there any other platform that you use?
In the case of POST API calls, do we need to create a separate DB for load testing (which I think we should)? If yes, should we create a test DB in the same DB instance (i.e., in the same AWS RDS)? And does it need to have some data present in it? That might change its performance, right?
How do you load test a workflow? Suppose we want 5,000 users to hit the Auth API, which consists of two calls: one to request an OTP and another to use that OTP to get a token.
Please help me out with this. I am quite new to scaling and was wondering if someone with experience could help.
Thanks.

This doesn't look like a single "question" to me; going forward you might want to split it into four separate ones.
Just measure it; I don't think it's possible to predict resource usage in advance. Start the load test with 1 virtual user and gradually increase the load to the anticipated number of users, while watching resource consumption in AWS CloudWatch or another monitoring solution like the JMeter PerfMon Plugin. If you find that CPU or RAM is the bottleneck, switch to a larger instance type and repeat the test.
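The answer above uses JMeter; purely as an illustration of that gradual ramp, here is a minimal sketch with Locust instead (my substitution, not something the original answer relies on), where the host and the endpoint are placeholders:

```python
# Minimal ramp-up sketch using Locust (assumed here as an alternative to JMeter).
# The host and the /api/health endpoint are placeholders.
from locust import HttpUser, task, between

class ApiUser(HttpUser):
    wait_time = between(1, 3)  # think time between requests per virtual user

    @task
    def hit_api(self):
        self.client.get("/api/health")

# Start small and ramp up while watching CloudWatch as the user count grows, e.g.:
#   locust -f ramp_test.py --host https://your-test-env.example.com \
#          --users 5000 --spawn-rate 10 --headless
```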
There are multiple ways of doing parameterization in JMeter tests; the most commonly used approach is the CSV Data Set Config, so that each virtual user reads the next line from a CSV file containing the test data on each iteration.
The DB should live on a separate host: if you place it on the same machine as the application server, they will interfere with each other and you might face resource contention. With regards to the database size - if possible, make a clone of the production data.
You should simulate real usage of the application with 100% accuracy: if a user needs to authenticate before making an API call, your load test script should do the same.
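To make the per-user data and the two-step OTP flow concrete, here is a rough Python sketch of one virtual user's iteration; the endpoint paths, JSON field names, and CSV layout are assumptions, not your actual API (in JMeter itself the OTP would be pulled out with a JSON Extractor rather than in code):

```python
# Rough sketch of one virtual user's iteration: per-user data + the 2-step OTP flow.
# URLs, JSON field names, and the CSV layout (one phone number per row) are assumptions.
import csv
import requests

BASE_URL = "https://test-env.example.com"  # load-test environment, not production

def run_user(phone: str) -> str:
    session = requests.Session()

    # Step 1: request an OTP for this user's phone number
    resp = session.post(f"{BASE_URL}/auth/request-otp", json={"phone": phone})
    resp.raise_for_status()

    # In a test environment the OTP is usually echoed back or exposed via a test hook;
    # in JMeter this value would come from a JSON Extractor instead.
    otp = resp.json()["otp"]

    # Step 2: exchange the OTP for a token
    resp = session.post(f"{BASE_URL}/auth/token", json={"phone": phone, "otp": otp})
    resp.raise_for_status()
    return resp.json()["token"]

if __name__ == "__main__":
    # each "user" takes its own row of test data, like a CSV Data Set Config
    with open("users.csv") as f:
        for (phone,) in csv.reader(f):
            print(phone, run_user(phone)[:8], "...")
```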

Related

Recommended way to run a web server on demand, with auto-shutdown (on AWS)

I am trying to find the best way to architect a low cost solution to provide an on-demand web server for a certain amount of time.
The context is as follows: I have a large amount of data sitting on S3. From time to time, users will want to consult that data. I've written a Flask app that can display the data in a nice way for them. Being poorly written, it really only accepts a single user session at a time. Currently, therefore, they have to download the Flask app and run it on their own machine.
I would like to find a way for users to request a cloud-based web server that would run the Flask app (through a docker container for example) on-demand, and give them access to it quickly, without having to do much if anything on their own machine.
Every user wanting to view the data would have their own web server created on demand (to avoid multiple users sharing the same web server, which wouldn't work with my Flask app)
Critically, and in order to avoid cost, the web server would terminate itself automatically after some (configurable) idle time (possibly with the Flask app informing the user that it's about to shut down, so that they can "renew" the lease).
Initially I thought that AWS Fargate might be a good fit: it can run Docker containers, is quite configurable in terms of CPU/disk (my Flask app is resource-hungry), and at least on paper could be used in a way that incurs zero cost when users are not consulting the data (bar S3 costs, naturally). But it's when it comes to the details that I'm not sure...
How to ensure that every new user gets their own Fargate instance?
How to shut down the instance automatically after idle time?
Is Fargate quick enough in terms of boot time?
The closest I can think of is AWS App Runner. It's built on top of Fargate and provides an intelligent scale-out mechanism (which you are probably not interested in) as well as a scale-to-(almost)-zero capability. The way it works is that when the endpoint is being used and doing work, you pay for the entire Fargate task (CPU/memory) you have selected in the configuration. If the endpoint is doing nothing, you only pay for the memory (note the memory cost is roughly 20% of the entire cost, so it's not scale-to-zero but "quasi"). Check out the pricing examples at the bottom of this page.
Please note you can further optimize costs by pausing/starting the endpoint (when it's paused you pay nothing), but in that case you need to create the logic that pauses/restarts it.
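For that pause/resume logic, a minimal sketch using boto3's App Runner client might look like this; the service ARN and whatever triggers the idle check (a scheduled Lambda, a cron job, ...) are placeholders you would supply:

```python
# Minimal sketch of pausing/resuming an App Runner service with boto3.
# SERVICE_ARN is a placeholder; how you detect "idle" (scheduled Lambda,
# CloudWatch alarm, cron, ...) is up to you.
import boto3

apprunner = boto3.client("apprunner")
SERVICE_ARN = "arn:aws:apprunner:us-east-1:123456789012:service/my-flask-app/abc123"

def sync_state(idle: bool) -> None:
    status = apprunner.describe_service(ServiceArn=SERVICE_ARN)["Service"]["Status"]
    if idle and status == "RUNNING":
        apprunner.pause_service(ServiceArn=SERVICE_ARN)   # stops compute charges
    elif not idle and status == "PAUSED":
        apprunner.resume_service(ServiceArn=SERVICE_ARN)  # takes a moment to come back up
```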
Another option you may want to explore is using Lambda this way (which would allow you to use the same container image and benefit from Lambda's intrinsic scale to zero). But given your comment that "Lambda doesn't have enough power, and the timeout is not flexible", that may be a show-stopper.

Display real time data on website that scales?

I am starting a project where I want to create a website which will display LIVE flight information and status. We have all seen this at the airport. An example is given here - http://www.computronics.biz/productimages/prodairport4.jpg. As you can see, this information changes continuously. The website will talk to a backend API, and this backend API will talk to a database. Now the important part is that the flight information in the database will be updated by the airlines themselves. There could be several airlines and they will each update their own data. I have drawn a diagram and uploaded it here - https://imgur.com/a/ssw1S.
Now those airlines will obviously have an interface (website talking to some backend API) through which they will update the database.
Now here is my attempt to solve it. We need some sort of trigger so that if any airline updates a flight detail in the database between current time - 1 hour and current time + 4 hours (the website will only display a few hours of flights), we call the web API and then push the update to the website in real time. The user must not have to refresh the page at all. At the same time the website needs to scale well, i.e. if 1 million users are on the website and there is an update in the database in the correct time range, all 1 million users' pages should get updated within a decent amount of time.
I did some research and it looks like we need an event-based approach. For example, we need to create a function (AWS Lambda or Azure Function) that is called whenever there is an update in the database (DynamoDB, for example) within the correct time range. This function should then call an API which updates the website, through WebSocket technology for example.
I am not looking for any code, just some alternative suggestions on how this can be solved in a scalable way. Also, how do we test scalability?
Don't use serverless functions (Lambda/Azure Functions)
Although I am a huge fan of serverless functions, and currently run a full web app on Lambda, I don't think it's needed for your use case and it doesn't make sense economically. As you've answered in the comments, each airline will not write directly to the database; they'll push to an API, meaning you are explicitly told when flights have changed. When an airline has sent you new data you can simply propagate this to all the browser endpoints via websockets. This keeps the design very simple. There is no need to artificially create a database event that then triggers a function that then tells you a flight has been updated. That's like removing your doorbell and replacing it with a motion detector that triggers a doorbell :)
Cost
Money always deserves its own section. Lambda is more of an economic breakthrough than a technological one, and you have to know when it's cost-effective. You pay per request, so if you're dealing with a process that handles 10,000 operations a month, or something that only fires 1,000 times a day, then Lambda is dirt cheap and practically free. You also pay for the length of time the function executes and the memory consumed while executing. Generally, it makes sense to use Lambda where a dedicated server would be sitting idle for most of the time: instead of a whole EC2 instance, AWS provides you with a container on demand. There are points at which high request rates and constantly running processes make Lambda more expensive than EC2. This article discusses how it is generally cheaper to use Lambda up to a point -> https://www.trek10.com/blog/lambda-cost/ The same applies to Azure Functions and Google's equivalent; they are all just containers offered on demand.
If you're dealing with flight information, I would imagine you will have thousands of flights being updated every minute, so your Lambda functions will be firing constantly, as if you were running an EC2 instance, and you will end up paying a lot more than for EC2. When you have a service that needs to stay up and run 24/7 with high activity, that is most certainly a valid use case for a dedicated server or servers.
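As a back-of-the-envelope illustration of that break-even point (the prices below are my assumptions, roughly the published list prices, so check the current pricing pages before relying on them):

```python
# Back-of-the-envelope Lambda vs EC2 cost comparison.
# Prices are assumptions (roughly published us-east-1 list prices); verify before use.
REQ_PRICE = 0.20 / 1_000_000        # $ per request
GB_SECOND_PRICE = 0.0000166667      # $ per GB-second of execution
EC2_HOURLY = 0.0416                 # e.g. a t3.medium on-demand, assumed

def lambda_monthly_cost(requests_per_month, avg_duration_s, memory_gb):
    compute = requests_per_month * avg_duration_s * memory_gb * GB_SECOND_PRICE
    return requests_per_month * REQ_PRICE + compute

def ec2_monthly_cost(hours=730):
    return EC2_HOURLY * hours

# ~1,000 invocations/day at 200 ms / 512 MB -> a few cents a month; Lambda wins easily.
print(lambda_monthly_cost(30_000, 0.2, 0.5), ec2_monthly_cost())
# Thousands of updates per minute, running constantly -> Lambda overtakes a small EC2 box.
print(lambda_monthly_cost(150_000_000, 0.2, 0.5), ec2_monthly_cost())
```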
Proposed Solution
These are the components I would use below:
Message Queue of some sort (RabbitMQ or AWS SQS with SNS perhaps)
Web Socket Backend (The choice will depend on programming language)
Airline input API (REST, GraphQL, or maybe AWS Kinesis Data Firehose)
The airlines publish their data to a back-end API. The updates are stored on a message queue, and the web application that actually displays the results to users, via websockets, reads from the queue.
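A stripped-down sketch of that websocket piece, assuming Python with the websockets library (v10+) and an SQS queue (the queue URL and message format are placeholders):

```python
# Stripped-down sketch: read flight updates from SQS and broadcast them to every
# connected browser over websockets. Queue URL and message shape are assumptions.
import asyncio

import boto3
import websockets

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/flight-updates"
sqs = boto3.client("sqs")
clients = set()

async def handler(ws):
    clients.add(ws)                # register the browser connection
    try:
        await ws.wait_closed()
    finally:
        clients.discard(ws)

async def pump_updates():
    loop = asyncio.get_running_loop()
    while True:
        # long-poll SQS off the event loop so the websocket server stays responsive
        resp = await loop.run_in_executor(
            None,
            lambda: sqs.receive_message(QueueUrl=QUEUE_URL, WaitTimeSeconds=20,
                                        MaxNumberOfMessages=10))
        for msg in resp.get("Messages", []):
            websockets.broadcast(clients, msg["Body"])  # assume body is already JSON
            sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])

async def main():
    async with websockets.serve(handler, "0.0.0.0", 8765):
        await pump_updates()

if __name__ == "__main__":
    asyncio.run(main())
```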
Scalability
For scalability you can run the websocket application on multiple EC2 instances (all reading from the same queuing service) in an auto-scaling group, so with extra load more instances will be created automatically, hence the name "autoscaling". Those instances can sit behind an Elastic Load Balancer; there is lots of AWS documentation on how to do this, and it's their flagship design pattern. If you use AWS SQS you don't have to manage the scalability details yourself; AWS handles that. The only real components to scale are your websocket application and the flight-data input endpoint. You can run the flight API in an auto-scaling group as well, but AWS does offer an additional tool for high-traffic data processing; I detail that below.
Testing Scalability
It would be fairly easy to have a mock airline blast your service with thousands and thousands of fake updates, and on the other end you can run multiple threads of Selenium tests simulating browser clicks and validating that the UI is still operational.
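The mock airline can be as simple as a script hammering the input endpoint with fabricated updates; a rough sketch (the URL and payload fields are made up):

```python
# Toy "mock airline" that blasts the flight-input endpoint with fake updates.
# URL and payload fields are made up; tune the worker count to the load you want.
import random
import time
from concurrent.futures import ThreadPoolExecutor

import requests

API_URL = "https://staging.example.com/airlines/updates"

def send_fake_update(i: int) -> int:
    payload = {
        "flight": f"XX{random.randint(100, 999)}",
        "status": random.choice(["ON TIME", "DELAYED", "BOARDING", "CANCELLED"]),
        "scheduled": time.time() + random.randint(-3600, 4 * 3600),
    }
    return requests.post(API_URL, json=payload, timeout=5).status_code

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=50) as pool:
        codes = list(pool.map(send_fake_update, range(10_000)))
    print({c: codes.count(c) for c in set(codes)})  # status-code histogram
```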
Additional tools
If it ends up being large amounts of data, rather than using a conventional REST API for your flight update service you could consider a service AWS offers specifically for dealing with large amounts of real-time updates, Kinesis Data Firehose: https://aws.amazon.com/kinesis/data-firehose/ But I've never used it.
First, please don't overthink this. This is a trivial problem to solve and doesn't require any special techniques, technologies, or trendy patterns & frameworks.
You actually have three functional areas you can address almost separately.
Ingestion - collection and normalization of the data from the various sources. For this, you'll need a process and transformation engine, Logic Apps or the like.
Your databases. You'll quickly learn that not all flights are the same ;). While it might seem so, the amount of data isn't that much. Instances of MySQL/SQL Server tuned for a particular function will work just fine. Hint, you don't need to have data for every movement ready to present all the time.
Presentation - the data API and UIs. This, really, is the easy part. I would suggest you use basic polling at first (a rough sketch follows below). For reasons you will never have any control over, the SLA for flight data is ~5 minutes, so time spent on a real-time client notification system is better spent elsewhere at first.
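To show how little the basic-polling version needs, here is a rough Django sketch; the Flight model, its fields, and the URL are hypothetical, and the time window is the -1h/+4h range from the question:

```python
# Rough Django polling view; the Flight model and its fields are hypothetical.
from datetime import timedelta

from django.http import JsonResponse
from django.utils import timezone

from flights.models import Flight  # hypothetical app/model

def board(request):
    now = timezone.now()
    window = Flight.objects.filter(
        scheduled__gte=now - timedelta(hours=1),
        scheduled__lte=now + timedelta(hours=4),
    ).values("number", "destination", "scheduled", "status")
    return JsonResponse({"flights": list(window)})

# The front end polls this endpoint every 30-60 s; with a ~5 minute SLA on the
# upstream data that is usually good enough before investing in websockets.
```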

REST API or "direct" database access for remote Celery/Django workers?

I'm working on a project that will have multiple celery workers on machines in different locations in the US that will communicate over the internet.
Am I better off distributing my Django project to each machine and configuring them with the database credentials to my database host, or should I have a "main" Django/database host that presents a REST API for remote celery tasks and workers to hit for the database access?
Mostly looking for pros/cons and any factors I haven't thought of.
I can provide single simple API endpoints that provide all the data my tasks need to query and simple POST API endpoints that can create all the database entries my tasks need to create.
I'll have no more than, say, 10 remote workers that would maybe altogether be doing 1 request per minute.
I'm thinking this probably means my concerns are not so much about the request/response overhead but more about maintainability, architecture, security...
The answer depends on too many variables, concerns, forces and whatnots to be anything else than "it depends...".
I assume you already thought of the following pros and cons, but anyway:
Using an API will make for longer request/response cycles (obviously) and put quite some load on the Django project (front server, app server etc). Also it means that your tasks won't be able to use all of the database features (complex queries, aggregations, whatever).
OTOH, adding an API layer will isolate the workers from the internal DB schema, which can make migrations (on the Django side) and deployment easier, as you won't have to stop all the workers, deploy to every one of them, and restart them. It even makes it possible to change the technology on the API side without impacting the workers (not that I see much reason to do so, but anyway...). But it also means you have a whole API to maintain, and chances are that model changes - or at least part of them - will impact your API and/or your task code anyway (if the changes are about adding features that the workers should use, etc.).
IOW, it really depends (yeah, I already said so, didn't I?) on your project's needs and constraints, and only you/your team know which solution will best match your project.
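Purely for illustration, the difference from a worker's point of view is roughly the sketch below; the endpoint path, payload, token handling, and model name are assumptions:

```python
# Sketch of the two options from a remote Celery worker's point of view.
# The endpoint path, payload, auth token, and Job model are assumptions.
import os

import requests
from celery import shared_task

API_BASE = os.environ["MAIN_HOST_API"]      # e.g. https://main.example.com/api
API_TOKEN = os.environ["WORKER_API_TOKEN"]

@shared_task
def report_result_via_api(job_id: int, result: dict) -> None:
    # Option B: the remote worker only knows the REST API, not the DB schema.
    requests.post(
        f"{API_BASE}/jobs/{job_id}/result/",
        json=result,
        headers={"Authorization": f"Token {API_TOKEN}"},
        timeout=10,
    ).raise_for_status()

@shared_task
def report_result_via_orm(job_id: int, result: dict) -> None:
    # Option A: the worker ships the full Django project + DB credentials and
    # writes through the ORM directly (full query power, tighter coupling).
    from myproject.jobs.models import Job  # hypothetical model
    Job.objects.filter(pk=job_id).update(result=result)
```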

Pattern for sharing a large amount of data between the web application and a backend service in a Service Oriented Application

I have a web application which performs CRUD operations on a database. At times, I have to run a backend job to do a fair amount of number crunching/analytics on this data. This backend job will be written as a different service in a concurrent language, which will be independent of the main web application.
But actually sharing the DB between the 2 applications is probably not a best practice as it will lead to tight coupling. What is the right pattern to use here? Since this data might amount to millions of DB rows, I'm not sure using a message queue / REST APIs would be the best way to go.
This is perhaps a very common scenario and many companies/devs have already solved this problem. Any pointers will be helpful.
From the question, it would seem that the background job does not modify the state of the database.
The simplest way to avoid a performance hit on the main application while the background job is running is to take a database dump and perform the analysis on that dump.
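A minimal sketch of that approach with PostgreSQL tooling (database names, hosts, and credentials are placeholders):

```python
# Minimal sketch: snapshot the production DB and run the analytics against the copy.
# Database names, hosts, and credentials are placeholders.
import subprocess

PROD_DSN = "postgresql://readonly@prod-db:5432/app"
ANALYTICS_DSN = "postgresql://analyst@analytics-db:5432/app_snapshot"

def refresh_snapshot() -> None:
    # pg_dump | pg_restore straight into the analytics instance
    dump = subprocess.Popen(["pg_dump", "--format=custom", PROD_DSN],
                            stdout=subprocess.PIPE)
    subprocess.run(["pg_restore", "--clean", "--if-exists", "--no-owner",
                    "--dbname", ANALYTICS_DSN],
                   stdin=dump.stdout, check=True)
    dump.stdout.close()
    dump.wait()

# The number-crunching service then connects only to ANALYTICS_DSN, so the main
# application's database never sees the analytical load.
```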

Speeding up page loads while using external API's

I'm building a site with Django that lets users move content around between a bunch of photo services. As you can imagine, the application does a lot of API hits.
For example: a user connects Picasa, Flickr, Photobucket, and Facebook to their account. Now we need to pull content from 4 different APIs to keep this user's data up to date.
Right now I have a function that updates each API, and I run them all simultaneously via threading. (All the APIs that are not enabled return false on the second line, so no, it's not much overhead to run them all.)
Here is my question:
What is the best strategy for keeping content up to date using these APIs?
I have two ideas that might work:
Update the APIs periodically (like a cron job), and whatever we have at the time is what the user gets.
benefits:
It's easy and simple to implement.
We'll always have pretty good data when a user loads their first page.
pitfalls:
we have to do API hits all the time for users that are not active, which wastes a lot of bandwidth
It will probably make the api providers unhappy
Trigger the updates when the user logs in (on a pageload)
benefits:
we save a bunch of bandwidth and run less risk of pissing off the API providers
doesn't require NEARLY the amount of resources on our servers
pitfalls:
we either have to do the update asynchronously (and won't have anything on first login), or...
the first page will take a very long time to load because we're getting all the API data (I've measured 26 seconds this way)
Edit: the design is very light; it has only two images, an external CSS file, and two external JavaScript files.
Also, the 26-second number comes from the Firebug network monitor running on a machine that was on the same LAN as the server.
Personally, I would opt for the second method you mention. The first time you log in, you can query each of the services asynchronously, showing the user some kind of activity/status bar while the processes are running. You can then populate the page as you get the results back from each of the services.
You can then cache the results of those calls per user so that you don't have to call the APIs each time.
That lightens the load on your servers, loads your page fast, and provides the user with some indication of activity (along with incremental updates to the page as their content loads). I think those add up to the best user experience you can provide.
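In Django terms, that could be as small as the sketch below; the per-service fetch functions and the 15-minute cache TTL are assumptions:

```python
# Sketch: fetch all connected services in parallel on login, then cache per user.
# The fetch_* client functions and the 15-minute TTL are assumptions.
from concurrent.futures import ThreadPoolExecutor

from django.core.cache import cache

def get_user_content(user, fetchers):
    """fetchers: mapping like {"flickr": fetch_flickr, "picasa": fetch_picasa, ...}"""
    key = f"photo-content:{user.pk}"
    content = cache.get(key)
    if content is None:
        # fan the API calls out in parallel instead of hitting them one by one
        with ThreadPoolExecutor(max_workers=len(fetchers) or 1) as pool:
            futures = {name: pool.submit(fn, user) for name, fn in fetchers.items()}
            content = {name: f.result() for name, f in futures.items()}
        cache.set(key, content, timeout=60 * 15)  # serve cached data for 15 minutes
    return content
```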