Is it possible to roughly estimate how many concurrent requests an API might receive?
Let's say it's a super simple API that just returns "hello" to a GET request, deployed on a 16 GB machine. In general, how many concurrent requests could it support before it starts to melt or say nah?
If it failed because of too many concurrent requests, what would happen?
Requests over the threshold would time out
Machine would crash
As PiRocks suggested, I ran an experiment
Deployed a simple Node.js API app to Heroku (machine specs TBD; still looking around to see if they even list them)
Signed up for a free account on loader.io
Unfortunately, the maximum on the free tier is 10k requests over 15s, i.e. about 667 QPS. That resulted in a 2ms average response time, no timeouts, and no errors. Might upgrade to see what it looks like from there.
Update: seems like 2K QPS is where I started to see errors. More details here
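As a rough cross-check of the numbers above, average concurrency can be estimated with Little's Law (requests in flight = arrival rate × average time per request). This is only a sketch: the function name is made up for illustration, and it assumes a steady arrival rate, which a real load test only approximates.

```python
def average_concurrency(requests: int, window_s: float, avg_latency_s: float) -> float:
    """Little's Law: in-flight requests = arrival rate * average latency."""
    qps = requests / window_s
    return qps * avg_latency_s


# Free loader.io tier from the experiment: 10k requests over 15 s at ~2 ms each.
qps = 10_000 / 15
in_flight = average_concurrency(10_000, 15, 0.002)
print(f"{qps:.0f} QPS, about {in_flight:.2f} requests in flight on average")
```

In other words, at 2 ms per response, roughly 667 QPS keeps only one or two requests in flight at any moment, so this run says little about how the app behaves under genuinely high concurrency.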
Related
While load/performance testing an API on DNS in AWS using JMeter, we observed relatively high response times (~230 ms) on an AWS Windows machine. When the same test is run on my local machine, the response times are around 110 ms. The throughput (number of samples served) changes widely because of this difference in response time.
The tests were run three times on each machine, for 1 hour each, with no delay. The only difference I see is that my machine has 16 GB of RAM while the AWS machine has 4 GB. Will this really make such a big difference, or is there something I am missing?
AWS Machine configuration:
My local machine configuration:
Can anyone share their thoughts?
I can think of 2 possible reasons:
Your AWS machine is located in a region that is geographically farther from the endpoint than your local machine
It might really be the case that JMeter lacks resources on the AWS instance and hence cannot send requests fast enough, so make sure to:
Monitor the resources available to JMeter using Windows PerfMon, Amazon CloudWatch or the JMeter PerfMon Plugin, as JMeter can be very resource intensive and should have sufficient headroom to operate
Follow JMeter Best Practices
We have been maintaining a project internally which has both web and mobile application platform. The backend of the project is developed in Django 1.9 (Python 3.4) and deployed in AWS.
The server stack consists of Nginx, Gunicorn, Django and PostgreSQL. We use Redis based cache server to serve resource intensive heavy queries. Our AWS resources include:
t1.medium EC2 (2 core, 4 GB RAM)
PostgreSQL RDS with one additional read-replica.
Right now Gunicorn is set to create 5 workers (by following the 2*n+1 rule). Load wise, there are like 20-30 mobile users making requests in every minute and there are 5-10 users checking the web panel every hour. So I would say, not very much load.
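For reference, the 2*n+1 rule mentioned above can be expressed in a Gunicorn config file, which is plain Python. This is a minimal sketch, assuming the file is passed to Gunicorn with `-c`; the helper name `recommended_workers` is made up for illustration.

```python
import multiprocessing


def recommended_workers(cores: int) -> int:
    """Rule of thumb from the post: two workers per core, plus one."""
    return 2 * cores + 1


# Gunicorn reads the module-level name `workers` from its config file.
workers = recommended_workers(multiprocessing.cpu_count())

# On the 2-core instance described above this gives 5.
print(recommended_workers(2))
```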
Now this setup works all right about 80% of the time. But when something goes wrong, the server stops behaving normally and starts responding with 504 gateway timeout errors. For example, we detect a bug in the live system and have to switch the server off for a few hours of maintenance. In the meantime, the mobile apps build up a queue of pending requests, so when we bring the backend back up, a lot of users hit the system at the same time.
Surprisingly, every time this happened we found the server resources (CPU, memory) to be 70-80% free, and the database connection pools were mostly free too.
Any idea where the problem is? How to debug? If you have already faced a similar issue, please share the fix.
Thank you,
I have been doing load tests at my company for a long time, but the throughput never passed 500 transactions per minute. I have a more challenging problem right now.
Problem:
My company will start a campaign in which it asks its customers a question, and the first correct answer will be rewarded. Analysts expect 100.000 k requests per second at the global maximum. (That doesn't seem realistic to me, but it is negotiable.)
Resources:
Jmeter,
2 different service requests,
5 slaves with 8 GB RAM each,
80 Mbps internet connection,
3.0 GHz CPU
Master computer with the same capabilities as the slaves.
Question:
How can I simulate this scenario, and is it even possible? What are the limitations? What should the load model look like? Are there any alternative ways to do this?
Any comment is welcome.
Your load test always needs to represent real usage of the application by real users, so first of all carefully implement your test scenario to mimic a real human using a real browser, with all its related behaviour, such as:
cookies
headers
embedded resources (proper handling of images, scripts, styles, fonts, etc.)
cache
think times
etc.
Make sure your test is following JMeter Best Practices, i.e.:
being run in non-GUI mode
all listeners are disabled
JVM settings are optimised for maximum performance
etc.
Once done, you need to set up monitoring of your JMeter engines' health metrics (CPU, RAM, swap usage, network and disk IO, JVM stats, etc.) in order to see whether there is headroom to continue. The JMeter PerfMon Plugin is very handy, as its results can be correlated with the test metrics.
Start your test with 1 virtual user and gradually increase the load until you reach the target throughput, your application under test dies, or the JMeter engine(s) run out of resources, whichever comes first. Depending on the outcome, you will either report success, report a defect, or need to request more hosts to use as JMeter engines or upgrade the existing hardware.
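The ramp-up logic described above can be sketched in a few lines. This is an illustration only, not JMeter's API: `run_step` is a hypothetical callback that would run one load step (for example by launching a JMeter run) and report what it observed, and doubling the users each step is just one possible way to increase the load gradually.

```python
from typing import Callable, NamedTuple, Tuple


class StepResult(NamedTuple):
    users: int
    throughput: float  # requests per second observed at this step
    error_rate: float  # fraction of failed requests, 0.0 to 1.0


def ramp_until_limit(
    run_step: Callable[[int], StepResult],
    target_throughput: float,
    max_error_rate: float = 0.01,
    max_users: int = 10_000,
) -> Tuple[str, StepResult]:
    """Increase virtual users step by step until a stop condition is hit."""
    users = 1
    while True:
        result = run_step(users)
        if result.error_rate > max_error_rate:
            return "application degraded", result
        if result.throughput >= target_throughput:
            return "target reached", result
        if users >= max_users:
            return "load generators exhausted", result
        users = min(users * 2, max_users)


def fake_step(users: int) -> StepResult:
    # Stand-in for a real test run: 10 req/s per user, errors above 500 users.
    return StepResult(users, users * 10.0, 0.0 if users <= 500 else 0.2)


outcome, last = ramp_until_limit(fake_step, target_throughput=2_000)
print(outcome, last.users)
```

With the made-up `fake_step` above, the loop stops at 256 users because the target throughput is reached before any errors appear; a real run would plug in measured values instead.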
I have an Apache2 and Django (mod_wsgi) setup that provides a RESTful API. I have a set of automated tests for it that execute ~1000 API requests (pure HTTP GET/POST/PUT/DELETE) in sequential order.
The problem is that roughly every 80 requests I get a strange lag/timeout of exactly 5s or 10s. See the timestamp examples here:
Request 1: 2013-08-30T03:49:20.915
Response 1: 2013-08-30T03:49:30.940
Request 2: 2013-08-30T03:50:32.559
Response 2: 2013-08-30T03:50:37.597
I can't figure out why this happens. My Apache config has KeepAlive Off (the recommended setting for Django), but otherwise it is a standard install on Ubuntu 12.04 LTS.
I'm running the tests from the same server that hosts the webserver. I first thought this was some kind of DNS cache issue, but I've added the hostname I'm requesting to /etc/hosts and the problem persists.
The system is idle and has plenty of CPU and memory available when these lags/timeouts happen.
The lag is not specific to a certain request (URL), it seems kinda random.
Considering that it's always exactly to the millisecond 5s or 10s, it feels like this is some specific setting somewhere causing this.
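For reference, the delay in each of the sample request/response pairs above can be computed directly from the timestamps. A minimal sketch; the helper name `latency_seconds` is made up for illustration.

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S.%f"

# (request sent, response received) pairs copied from the post.
samples = [
    ("2013-08-30T03:49:20.915", "2013-08-30T03:49:30.940"),
    ("2013-08-30T03:50:32.559", "2013-08-30T03:50:37.597"),
]


def latency_seconds(sent: str, received: str) -> float:
    """Elapsed time between two ISO-style timestamps, in seconds."""
    delta = datetime.strptime(received, FMT) - datetime.strptime(sent, FMT)
    return delta.total_seconds()


latencies = [latency_seconds(sent, received) for sent, received in samples]
for value in latencies:
    print(f"{value:.3f} s")
```

The two samples come out at roughly 10.025 s and 5.038 s, that is, a few tens of milliseconds of normal processing on top of a 10 s or 5 s wait.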
In case it provides some insight, watch my talk from PyCon US.
http://lanyrd.com/2013/pycon/scdyzk/
The talk deals with things like process churn and startup costs. One thing you shouldn't do is set maximum requests if you don't really need it.
Also consider trying New Relic to help diagnose where the issue is. That will save a lot of guessing about whether it is a web application or a backend service infrastructure issue.
As far as seeing how such monitoring can help, watch another one of my PyCon talks.
http://lanyrd.com/2012/pycon/spcdg/
This was a DNS issue; adding the domain name I used locally to /etc/hosts actually solved the problem. I just hadn't rebooted the server for the changes to take effect. I thought restarting networking would take care of that, but apparently not.
I'm developing a web application in python for which each user request makes an API call to an external service and takes about 20 seconds to receive response. As a result, in the event of several concurrent requests being made, the CPU load goes crazy (>95%) with several idle processes.
The server consists of a 1.6 GHz dual core Atom 330 with 2GB RAM.
The web app is developed in Python and served through Apache with mod_wsgi.
My question is the following: will a non-blocking webserver such as Tornado improve CPU load and thus handle more concurrent users (I'm also interested in why)? Can you suggest any other scalable solution?
This only partly has to do with blocking. The 20-second request blocks one thread, so another has to be used for the next request, whereas with quick requests the threads basically round-robin.
However, this really shouldn't be spiking your CPU usage. Webservers have an upper limit on the number of "workers" that get spawned, and when they're all tied up, they're all tied up. The server won't go past that limit, so unless the limit (whether you set it or it is the default) is higher than your box is capable of running, it shouldn't push your CPU that high.
Regardless, all that is merely informational and doesn't really solve your problem. With such a long-running request, you should be offloading the work from your webserver as quickly as possible. The webserver should merely hand off the request to another process that can handle it asynchronously, and then use polling to let the client know when the response is ready. Node.js is used a lot in similar scenarios, but I really don't have enough experience with it to give you any real guidance beyond that.
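On the "why" part of the question: a non-blocking server does not tie up a thread or process while a request is merely waiting on the external service. The sketch below illustrates the idea with the standard library's asyncio rather than Tornado itself; `call_external_service` is a made-up stand-in, and a 0.1 s sleep replaces the 20 s API call.

```python
import asyncio
import time


async def call_external_service(request_id: int, delay_s: float) -> int:
    # asyncio.sleep stands in for a slow network call; while it waits,
    # the event loop is free to serve other requests.
    await asyncio.sleep(delay_s)
    return request_id


async def handle_many(n: int, delay_s: float) -> list:
    """Serve n waiting requests concurrently on a single thread."""
    calls = (call_external_service(i, delay_s) for i in range(n))
    return await asyncio.gather(*calls)


start = time.perf_counter()
results = asyncio.run(handle_many(50, 0.1))
elapsed = time.perf_counter() - start
print(f"{len(results)} requests finished in {elapsed:.2f} s")
```

All 50 simulated requests wait at the same time, so the total is close to one delay rather than fifty. Note that this only helps with time spent waiting; it does not reduce any CPU work the application does per request.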
You should look into using message queues to offload tasks so that your user requests are not blocked.
You could look into python libs kombu and celery to handle messages and tasks.
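The hand-off-and-poll pattern looks roughly like the sketch below. To keep it self-contained it uses a standard library thread pool as a stand-in for a real message queue, so none of this is Celery's or kombu's API; `submit_job`, `job_status` and `slow_external_call` are made-up names, and a 0.2 s sleep replaces the 20 s external call.

```python
import time
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Dict

# Stand-in for a task queue plus workers (hypothetical, not Celery).
executor = ThreadPoolExecutor(max_workers=4)
jobs: Dict[str, Future] = {}


def slow_external_call(payload: str) -> str:
    time.sleep(0.2)  # stand-in for the ~20 s external API call
    return payload.upper()


def submit_job(payload: str) -> str:
    """Called by the web handler: enqueue the work and return at once."""
    job_id = uuid.uuid4().hex
    jobs[job_id] = executor.submit(slow_external_call, payload)
    return job_id


def job_status(job_id: str) -> dict:
    """Called by the client's polling request."""
    future = jobs[job_id]
    if not future.done():
        return {"state": "pending"}
    return {"state": "done", "result": future.result()}


job_id = submit_job("hello")
while job_status(job_id)["state"] != "done":
    time.sleep(0.05)  # the client would poll over HTTP instead
final = job_status(job_id)
print(final)
executor.shutdown()
```

With a real queue the workers would run in separate processes or on separate machines, and the job state would live in a shared store rather than an in-memory dict.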
You are likely using the prefork MPM with Apache and mod_wsgi in embedded mode. This is a bad combination by default, because Apache is set up for PHP and not for fat Python web applications. Read:
http://blog.dscpl.com.au/2009/03/load-spikes-and-excessive-memory-usage.html
which explains this exact sort of issue.
Use mod_wsgi daemon mode at the minimum, and preferably also change to worker MPM for Apache.