How do I make a highly scalable chat app with Django? - django

I am currently using AJAX: on a set interval the page refreshes, giving a real-time feel. But I fear that in the long run, as it scales, clients will have problems, since the page refreshes every second and I think it may overload the server. What is the best option for a real-time chat app using Django, other than Redis (not free, and it has limits)?
The app should scale up to 1000 concurrent connections.
The server is a VPS with 4 cores, 4 GB RAM, and 4 TB of bandwidth.
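One commonly suggested alternative to polling is WebSockets via Django Channels, which pushes messages to clients instead of having every client refresh each second. Below is only a minimal sketch, not a drop-in solution: the ChatConsumer class, the "lobby" room name and the routing are illustrative, and it assumes Channels configured with its in-memory channel layer (which avoids Redis but only works within a single server process, so spreading 1000 connections across several processes would need a shared layer).

    # chat/consumers.py -- hypothetical consumer; assumes Django Channels is
    # installed and an ASGI websocket route points at ChatConsumer.
    import json

    from channels.generic.websocket import AsyncWebsocketConsumer


    class ChatConsumer(AsyncWebsocketConsumer):
        async def connect(self):
            self.room = "lobby"  # single illustrative room
            # Join a group so every connected client receives broadcasts.
            await self.channel_layer.group_add(self.room, self.channel_name)
            await self.accept()

        async def disconnect(self, close_code):
            await self.channel_layer.group_discard(self.room, self.channel_name)

        async def receive(self, text_data):
            message = json.loads(text_data)["message"]
            # Push the message to everyone in the room instead of waiting for
            # each client's next AJAX poll.
            await self.channel_layer.group_send(
                self.room, {"type": "chat.message", "message": message}
            )

        async def chat_message(self, event):
            # Handler for the "chat.message" group event.
            await self.send(text_data=json.dumps({"message": event["message"]}))

With CHANNEL_LAYERS pointing at channels.layers.InMemoryChannelLayer in settings, no Redis is needed; whether a single 4-core VPS can hold 1000 mostly idle WebSocket connections in one process is plausible but something you would want to load-test.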

Related

504 gateway timeout for any request to Nginx with lots of free resources

We have been maintaining a project internally which has both web and mobile application platforms. The backend of the project is developed in Django 1.9 (Python 3.4) and deployed on AWS.
The server stack consists of Nginx, Gunicorn, Django and PostgreSQL. We use a Redis-based cache server to serve resource-intensive heavy queries. Our AWS resources include:
t1.medium EC2 (2 core, 4 GB RAM)
PostgreSQL RDS with one additional read-replica.
Right now Gunicorn is set to create 5 workers (following the 2*n+1 rule). Load-wise, there are around 20-30 mobile users making requests every minute and 5-10 users checking the web panel every hour. So I would say, not much load.
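For reference, the worker rule described above could be expressed in a Gunicorn config file roughly like this; the file name, bind address and timeout value are illustrative guesses, not the poster's actual settings.

    # gunicorn.conf.py -- illustrative sketch of the described setup.
    import multiprocessing

    bind = "127.0.0.1:8000"

    # The 2*n+1 rule: with 2 cores this evaluates to 5 sync workers.
    workers = multiprocessing.cpu_count() * 2 + 1

    # If all 5 workers stay busy longer than Nginx's proxy timeout, queued
    # requests surface to clients as 504s even though CPU and RAM look idle.
    timeout = 30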
Now this setup works all right 80% of the time. But when something goes wrong (for example, we detect a bug in the live system and have to switch off the server for maintenance for a few hours; in the meantime, the mobile apps build up a queue of requests, so when we bring the backend back up, a lot of users hit the system at the same time), the server stops behaving normally and starts responding with 504 gateway timeout errors.
Surprisingly, every time this happened we found the server resources (CPU, memory) to be 70-80% free and the connection pool in the database mostly free.
Any idea where the problem is? How to debug? If you have already faced a similar issue, please share the fix.
Thank you,

Slow initial response time

I have a website that is experiencing some slow initial response time. The site is built with Django, and runs on an Apache2 server on Ubuntu. I have been using the Django Debug Toolbar for debugging and optimization.
When making a request to a user profile page, the browser is 'waiting' for ~800ms and receiving for ~60ms for the initial request. However, the Django Debug Toolbar shows that time spent on the CPU and time spent on SQL queries only adds up to ~425ms.
(Chrome DevTools and Django Debug Toolbar screenshots omitted.)
Even a request to the index page (which has no SQL queries and almost no processing - it just responds with the template) shows ~250ms wait time.
I tried temporarily upgrading the VM to a much more powerful CPU, but that did not (noticeably) change this metric.
This leads me to believe that the wait is not due to inefficient code or database latency, but instead due to some Apache or Ubuntu settings.
After the initial response, the other requests to load page resources (js files, images etc) have a much more reasonable wait time of ~20ms.
What could account for the relatively large initial 'waiting' time?
What tools can I use to get a better picture of where that time is going?
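One way to narrow down where the unaccounted-for ~375 ms goes is to time requests at the outermost WSGI layer: if that number matches the Debug Toolbar's ~425 ms rather than the browser's ~800 ms, the delay lives in Apache, mod_wsgi, or the network rather than in Django. A rough sketch, assuming you can edit the project's wsgi.py (the wrapper class and settings module name are placeholders):

    # wsgi.py -- wraps the Django WSGI application with a coarse wall-clock timer.
    import os
    import sys
    import time

    from django.core.wsgi import get_wsgi_application

    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")  # placeholder


    class TimingWrapper(object):
        """Logs wall-clock time spent inside the WSGI application."""

        def __init__(self, app):
            self.app = app

        def __call__(self, environ, start_response):
            start = time.time()
            try:
                return self.app(environ, start_response)
            finally:
                elapsed_ms = (time.time() - start) * 1000
                # Under mod_wsgi, stderr ends up in the Apache error log.
                sys.stderr.write("%s took %.1f ms\n" % (environ.get("PATH_INFO"), elapsed_ms))


    application = TimingWrapper(get_wsgi_application())

This does not count time spent streaming the response body, but it is enough to separate server-side time from everything in front of it (Apache MPM settings, KeepAlive, process startup, DNS, and so on).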

Should I use MSMQ or IIS

I have a web site that exposes a web service to all my desktop clients.
Randomly, these clients will invoke the web service, which in turn will add a message (a JPEG in byte-array format) to the MSMQ.
I have a service application that reads from this queue and performs an enhancement on this jpeg and saves it to the hard drive.
The number of clients uploading at any one time is unpredictable.
I chose this method because I do not want to put any strain on IIS. The enhancement my service application performs is not much work, but it exists nevertheless.
However, after realizing that my service application had stopped for some time and required restarting, I noticed the RAM leap up to clear the backlog. Whilst I have corrected this and the service is now coded to restart automatically on failure, I surmised that a backlog could exist at busy times, which again gives higher RAM usage.
Now, should I just do the processing all within my web service and then save to the hard drive, or am I correct in using MSMQ?
I am using C# and ASP.NET.

Apache too slow to respond, but CPU and memory not maxed out

The problem
Two Apache servers have long response times, but I do not see CPU or memory maxed out.
Details
I have 2 Apache servers serving static content for clients.
This web site has a lot of traffic.
At high traffic I have ~10 requests per second (html, css, js, images).
Each HTML page makes ~30 other requests to the servers to load js, css, and images.
Safari's developer tools show that ~2 MB is transferred each time I hit an HTML page.
These two servers are running on Amazon Web Services.
Both instances are m1.large (2 CPUs, 7.5 GB RAM).
I'm serving images from the same servers.
The servers are in the US, but a lot of traffic comes from Europe.
I tried
changing from prefork to worker
increasing processes
increasing threads
increasing time out
I'm running benchmarks with ab (apachebench) and I do not see improvement.
My questions are:
Is it possible that serving the images and large resources like JS (400 KB) might be slowing down the server?
Is it possible that 5 requests per second per server is just too much traffic and there is no tuning I can do, so the only solution is to add more servers?
Does Amazon Web Services have a problem with bandwidth?
New Info
My files are being read from a mounted directory on GlusterFS
Metrics collected with ab (ApacheBench) run on an EC2 instance on the same network:
Connections: 500
Concurrency: 200
Server with files on mounted directory (files on GlusterFS):
Requests per second: 25.26
Time per request: 38.954 ms
Transfer rate: 546.02 KB/s
Server without files on mounted directory (files on local storage):
Requests per second: 1282.62
Time per request: 0.780 ms
Transfer rate: 27104.40 KB/s
New Question
Is it possible that reading the resources (HTML, JS, CSS, images) from a mounted directory (NFS or GlusterFS) might dramatically slow down Apache's performance?
Thanks
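One way to confirm where the slowdown is, independent of Apache, is to time raw reads of the same files from the GlusterFS mount and from local disk. A quick sketch; both paths are placeholders.

    # read_timing.py -- compares raw read time for files on the GlusterFS mount
    # versus a local copy; the directory paths below are placeholders.
    import os
    import time

    MOUNTED_DIR = "/mnt/glusterfs/static"  # placeholder: files on GlusterFS
    LOCAL_DIR = "/var/www/static"          # placeholder: same files on local disk


    def time_reads(directory):
        """Reads every file in a directory once and returns (count, seconds)."""
        start = time.time()
        count = 0
        for name in os.listdir(directory):
            path = os.path.join(directory, name)
            if os.path.isfile(path):
                with open(path, "rb") as f:
                    f.read()
                count += 1
        return count, time.time() - start


    for label, directory in (("glusterfs", MOUNTED_DIR), ("local", LOCAL_DIR)):
        count, elapsed = time_reads(directory)
        print("%s: read %d files in %.3f s" % (label, count, elapsed))

If the mounted directory is similarly slower here, the ab numbers are explained by the network filesystem (every open and stat is a round trip on GlusterFS) rather than by Apache tuning.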
It is absolutely possible (and indeed probable) that serving up large static resources could slow down your server. You have to keep Apache worker threads open the entire time that each one of these pieces of content is being downloaded. The larger the file, the longer the download, and the longer you have to hold a thread open. You might be reaching your max-thread limits before reaching any memory limitations you have set for Apache.
First, I would recommend getting all of your static content off of your server and into CloudFront or a similar CDN. That way your web server only has to worry about the primary web requests. This might take the requests per second (and related number of open Apache threads) down from 10 requests/second to around 0.3 requests/second (based on your 30:1 ratio of secondary content requests to primary requests).
Reducing the number of requests you are serving by over an order of magnitude will certainly help server performance, and may allow you to reduce to a single server (or, if you still want multiple servers - which is a good idea - to reduce the size of your servers).
One thing you will find that basically all high volume websites have in common is that they leave the business of serving up static content to a CDN. Once you get to the point of being a high volume site, you must absolutely consider this (or at least serve static content from different servers using Nginx, Lighty, or some other web server better suited for serving static content than Apache is).
After offloading your static traffic, you can really start worrying about tuning your web servers to handle the primary requests. When you get to that point, you will need to know a few things:
The average memory usage for a single request thread
The amount of memory that you have allocated to Apache (maybe 70-80% of overall instance memory if this is a dedicated Apache server)
The average amount of time it takes your application to respond to requests
Based on that, there is a pretty simple formula to get a good starting point for tuning your max-thread settings.
Say you had the following:
Apache memory: 4000KB
Avg. thread memory: 20KB
Avg. time per request: 0.5 s
That means your configuration could handle request throughput as follows:
400 requests/second = 4000 KB / (20 KB * 0.5 seconds/request)
In other words, memory allows 200 threads (4000 KB / 20 KB), and with each request averaging 0.5 s those threads can serve up to about 400 requests/second. If your target load were, say, 100 requests/second, you would need about 50 threads (100 * 0.5) to handle that throughput.
Obviously, you would want to set your max threads higher than that 50 to account for request spikes and such (while staying within the 200-thread memory ceiling), but at least this gives you a good place to start.
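Written out as a quick calculation with the example numbers above (these are the illustrative figures, not measurements from the servers in question):

    # Starting-point estimate for Apache thread settings, using the example figures.
    apache_memory_kb = 4000.0    # memory budget for Apache
    thread_memory_kb = 20.0      # average memory per request thread
    seconds_per_request = 0.5    # average application response time
    target_load_rps = 100.0      # example load to plan for

    max_threads = apache_memory_kb / thread_memory_kb        # 200 threads fit in memory
    max_throughput = max_threads / seconds_per_request       # ~400 requests/second ceiling
    threads_needed = target_load_rps * seconds_per_request   # ~50 threads for 100 req/s

    print("memory allows %d threads" % max_threads)
    print("throughput ceiling: %d req/s" % max_throughput)
    print("threads needed at %d req/s: %d" % (target_load_rps, threads_needed))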
Try to start/stop the instance. This will move you to a different host. If the host your instance is on is having any issues, that will mitigate it.
Beyond checking system load numbers, take a look at memory usage, IO and CPU usage.
Look at your system log to see if anything produced an error that may explain the current situation.
Check out Eric J.'s answer in this thread: Amazon EC2 Bitnami Wordpress Extremely Slow.

Django app freezing with a few concurrent requests

I have a Django app without views; I only use it to provide a REST API using the django-piston package.
Since I deployed it to Amazon EC2 with mod_wsgi, it freezes after some requests, and CPU usage goes to 100%, split between the python and httpd processes.
I'm using Postgres 8.4, Python 2.5 and Django with 'ENGINE': 'django.contrib.gis.db.backends.postgis'.
Logs don't show me any problem. How can I debug the problem?
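When the logs are silent, one low-tech way to see what those python processes are doing at 100% CPU is to dump every thread's stack on a signal. A sketch along those lines, assuming you can add it to the WSGI script that mod_wsgi loads (mod_wsgi normally blocks signal registration unless WSGIRestrictSignal Off is set, so treat this as an option to verify, not a guaranteed recipe):

    # Dumps the stack of every Python thread to stderr (the Apache error log
    # under mod_wsgi) when the process receives SIGUSR1. sys._current_frames()
    # is available from Python 2.5 onward.
    import signal
    import sys
    import traceback


    def dump_stacks(signum, frame):
        for thread_id, stack in sys._current_frames().items():
            sys.stderr.write("Thread %s:\n" % thread_id)
            traceback.print_stack(stack, file=sys.stderr)
            sys.stderr.write("\n")


    signal.signal(signal.SIGUSR1, dump_stacks)

Then kill -USR1 a busy python process and check the Apache error log to see where the requests are stuck.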
Sounds like you're on a micro instance. Micro instances are able to burst large amounts of CPU for a VERY short amount of time; after that they must drop to very low background levels for an extended duration, or else Amazon will harshly throttle them. If you're getting concurrent requests, most likely even a lightly CPU-intensive app would cause the throttling to kick in.
Micro instances are only usable for very, very light traffic on something like a very basic blog, and that's about it.
Their user guide goes into this in detail: Micro Instance guide.