I have a Django app without views; I only use it to provide a REST API via the django-piston package.
Since deploying it to Amazon EC2 with mod_wsgi, it freezes after a number of requests, and CPU usage goes to 100%, split between the python and httpd processes.
I'm using PostgreSQL 8.4, Python 2.5, and Django with 'ENGINE': 'django.contrib.gis.db.backends.postgis'.
The logs don't show any problem. How can I debug this?
Sounds like you're on a micro instance. Micro instances can burst large amounts of CPU for a VERY short time; after that they must drop to very low background levels for an extended duration, or Amazon will throttle them harshly. If you're getting concurrent requests, even a lightly CPU-intensive app will most likely cause the throttling to kick in.
Micro instances are only usable for very, very light traffic, something like a very basic blog, and that's about it.
Their user guide goes into this in detail: Micro Instance guide.
Hi everyone, at this point I am at a loss. I am in the process of bringing our application up on Cloud Run. I did have a little bit of worry about running "serverless" for this application stack. I have one frontend Next.js application and a backend GraphQL (Node) application. I use schema stitching against the backend to connect to our managed CMS service.
I bumped the max number of instances on the Serverless VPC Access connector, and this really helped: slow requests dropped from 5 in 6 to 1 in 6.
Right now, 1 in 6 requests is very slow (nearly 4 minutes to resolve). The slow request is when the browser asks the API to return information from the CMS plus some information about that CMS data from our internal DB. I have this same application running on Heroku with fewer resources and it runs smoothly, so to me that rules out a code issue.
Areas I have checked:
Cloud SQL CPU and connection pool (very low)
Container Image size (small as a node app can be)
Keeping min and max number of containers at the same number
Allocating CPU
I am at the point of thinking the backend application is just not a good fit for Cloud Run, but I do like the simplicity of getting applications running on Cloud Run and their approach to serverless.
Any help appreciated
Is it possible to roughly estimate how many concurrent requests an API can handle?
Let's say it's a super simple API that just returns "hello" to a GET request, deployed on a 16 GB machine. In general, how many concurrent requests could it support before it starts to melt or say nah?
If it failed because of too many concurrent requests, what would happen?
Requests over the threshold would time out
Machine would crash
As PiRocks suggested, I ran an experiment
Wrote a simple Node.js API app
Deployed the app to Heroku (machine specs TBD - looking around to see if they even list them)
Signed up for a free account on loader.io
Unfortunately, the maximum for free is 10k requests over 15s, aka 666 QPS. That resulted in a 2ms average response time, no timeouts, and no errors. Might upgrade to see what it looks like from there.
Update: seems like 2K QPS is where I started to see errors. More details here
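If you want to go past the free tier, a rough load test is also easy to script yourself; below is a minimal asyncio/aiohttp sketch (the URL, request count, and concurrency are placeholders, and aiohttp needs to be installed):

    import asyncio
    import time
    import aiohttp

    URL = "https://example.com/hello"   # hypothetical endpoint
    TOTAL_REQUESTS = 1000
    CONCURRENCY = 100

    async def one_request(session, sem, latencies, errors):
        async with sem:
            start = time.perf_counter()
            try:
                async with session.get(URL) as resp:
                    await resp.read()
                    if resp.status >= 400:
                        errors.append(resp.status)
            except aiohttp.ClientError as exc:
                errors.append(str(exc))
            latencies.append(time.perf_counter() - start)

    async def main():
        sem = asyncio.Semaphore(CONCURRENCY)   # cap in-flight requests
        latencies, errors = [], []
        async with aiohttp.ClientSession() as session:
            await asyncio.gather(*(one_request(session, sem, latencies, errors)
                                   for _ in range(TOTAL_REQUESTS)))
        print(f"avg latency: {sum(latencies) / len(latencies) * 1000:.1f} ms, "
              f"errors: {len(errors)}")

    asyncio.run(main())

Bear in mind that a single client machine and its network link can easily become the bottleneck before the API does, so numbers from a test like this are only a lower bound.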
I have a working Django application that runs locally against an SQLite3 database without problems. However, when I change the Django database settings to use my external AWS RDS database, all my pages start taking upwards of 40 seconds to load. I have checked my AWS metrics and my instance is not even close to being fully utilized. When I make a request to a view with no database read/write operations, I get the same problem. My activity monitor shows my local CPU spiking with each request, with a process named 'WindowsServer' using most of the CPU.
I am aware that more latency is expected when using a remote database, but I don't think it should result in 40-second page loads. What other problems could be causing this behaviour?
(Screenshots: AWS database monitoring; local machine activity monitor.)
So your computer is connecting to a server at Amazon, and that's where the latency comes from. Production servers should be in the same place as the DB servers (or should have a very, very good connection, so the latency is as low as possible).
--edit--
So we need more details. Who is your ISP? What are your connection properties: uplink, downlink? What are your pings to the AWS servers?
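As a quick check, you could also time a single round trip from your Django app to the RDS instance; the query below is trivial, so almost all of the elapsed time is network latency (a minimal sketch, run from python manage.py shell):

    import time
    from django.db import connection

    # One trivial query; the elapsed time is essentially the network
    # round trip between your machine and the RDS instance.
    with connection.cursor() as cursor:
        start = time.perf_counter()
        cursor.execute("SELECT 1")
        cursor.fetchone()
        print(f"round trip: {(time.perf_counter() - start) * 1000:.1f} ms")

If one round trip is on the order of 100 ms and a page triggers a few hundred queries (easy to do accidentally with an ORM), a 40-second page load is plausible; something like django-debug-toolbar will show the query count per page.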
We have been maintaining a project internally which has both web and mobile application platforms. The backend of the project is developed in Django 1.9 (Python 3.4) and deployed on AWS.
The server stack consists of Nginx, Gunicorn, Django and PostgreSQL. We use a Redis-based cache server to serve resource-intensive heavy queries. Our AWS resources include:
t1.medium EC2 (2 core, 4 GB RAM)
PostgreSQL RDS with one additional read-replica.
Right now Gunicorn is set to create 5 workers (following the 2*n+1 rule). Load-wise, there are around 20-30 mobile users making requests every minute and 5-10 users checking the web panel every hour. So I would say, not very much load.
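For reference, a gunicorn.conf.py following that rule would look roughly like this (a sketch, not our exact file; the timeout values shown are Gunicorn's defaults):

    import multiprocessing

    # 2 cores -> (2 * 2) + 1 = 5 synchronous workers
    workers = 2 * multiprocessing.cpu_count() + 1
    timeout = 30            # seconds before a silent worker is killed and restarted
    graceful_timeout = 30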
Now, this setup works fine about 80% of the time. But when something goes wrong (for example, we detect a bug in the live system and have to switch the server off for maintenance for a few hours; in the meantime the mobile apps queue up requests, so when we bring the backend back online a lot of users hit the system at the same time), the server stops behaving normally and starts responding with 504 Gateway Timeout errors.
Surprisingly, every time this has happened we found the server resources (CPU, memory) to be 70-80% free and the database connection pools mostly idle.
Any idea where the problem is? How to debug? If you have already faced a similar issue, please share the fix.
Thank you,
I'm developing a web application in Python in which each user request makes an API call to an external service and takes about 20 seconds to receive a response. As a result, when several concurrent requests are made, the CPU load goes crazy (>95%), with several idle processes.
The server is a 1.6 GHz dual-core Atom 330 with 2 GB of RAM.
The web app is developed in Python and served through Apache with mod_wsgi.
My question is the following: will a non-blocking web server such as Tornado reduce the CPU load and thus handle more concurrent users (I'm also interested in why)? Can you suggest any other scalable solution?
This really doesn't have anything to do with blocking; well, it does, but it doesn't. The 20-second request blocks one thread, so another has to be used for the next request, whereas with quick requests the threads basically round-robin.
However, this really shouldn't be spiking your CPU. Web servers have an upper limit on how many "workers" get spawned, and when they're all tied up, they're all tied up. The server won't go past that limit, so unless you've set it (or the default is) higher than the box can actually handle, it shouldn't push your CPU that high.
Regardless, all that is merely informational and doesn't really solve your problem. With such a long-running request, though, you should be offloading the work from your web server as quickly as possible. The web server should merely hand the request off to another process that can handle it asynchronously, and then use polling to notify the client when the response is ready. Node.js is used a lot in similar scenarios, but I really don't have enough experience with it to give you any real guidance beyond that.
You should look into using message queues to offload tasks so that your user requests are not blocked.
You could look into the Python libraries kombu and celery to handle messages and tasks.
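As a minimal sketch (assuming a Redis broker on localhost; the task name and external URL are placeholders, not part of your app):

    # tasks.py
    import requests
    from celery import Celery

    app = Celery("tasks",
                 broker="redis://localhost:6379/0",
                 backend="redis://localhost:6379/1")

    @app.task
    def call_external_service(payload):
        # The ~20 second external call now runs in a Celery worker process,
        # not in the web server process that handled the user's request.
        resp = requests.post("https://api.example.com/slow", json=payload, timeout=30)
        return resp.json()

The view would enqueue the work with call_external_service.delay(payload) and return a task id right away; the client then polls a lightweight status endpoint that checks AsyncResult(task_id).ready() until the result is available.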
You are likely using the prefork MPM with Apache and mod_wsgi embedded mode. This is a bad combination by default, because Apache's defaults are set up for PHP, not fat Python web applications. Read:
http://blog.dscpl.com.au/2009/03/load-spikes-and-excessive-memory-usage.html
which explains this exact sort of issue.
Use mod_wsgi daemon mode at a minimum, and preferably also change to the worker MPM for Apache.
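A minimal daemon-mode sketch for the Apache virtual host (the process group name, process/thread counts, and path are placeholders to adapt, not tuned values):

    # Run the app in its own mod_wsgi daemon processes instead of embedded mode.
    WSGIDaemonProcess myapp processes=2 threads=15 display-name=%{GROUP}
    WSGIProcessGroup myapp
    WSGIApplicationGroup %{GLOBAL}
    WSGIScriptAlias / /path/to/project/wsgi.py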