Django - What happens when setting WEB_CONCURRENCY too high on a Heroku 1X dyno?

I have a Django app running on Heroku with Gunicorn on a 1X dyno, and I just found out that you can set WEB_CONCURRENCY.
What's the optimal WEB_CONCURRENCY?

The article Deploying Python Applications with Gunicorn describes the various Gunicorn parameters and their effect on Heroku.
Below is the text from that article regarding WEB_CONCURRENCY:
Gunicorn forks multiple system processes within each dyno to allow a Python app to support multiple concurrent requests without requiring them to be thread-safe. In Gunicorn terminology, these are referred to as worker processes (not to be confused with Heroku worker processes, which run in their own dynos).
Each forked system process consumes additional memory. This limits how many processes you can run in a single dyno. With a typical Django application memory footprint, you can expect to run 2–4 Gunicorn worker processes on a 1X dyno. Your application may allow for a variation of this, depending on your application’s specific memory requirements.
We recommend setting a configuration variable for this setting, so you can tweak it without editing code and redeploying your application.
$ heroku config:set WEB_CONCURRENCY=3
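One common way to wire this up is to read the config var in a Gunicorn config file. Below is a minimal sketch; the filename gunicorn.conf.py and the fallback value are assumptions, and newer Gunicorn releases can also pick up WEB_CONCURRENCY on their own:
# gunicorn.conf.py (hypothetical filename for this sketch)
import os

# Read the config var set via `heroku config:set WEB_CONCURRENCY=3`,
# falling back to a conservative default when it is unset. Each extra
# worker costs additional memory, so on a 1X dyno values much above
# 2-4 will start to exhaust the dyno's memory.
workers = int(os.environ.get("WEB_CONCURRENCY", 3))
The Procfile then points Gunicorn at this file, e.g. web: gunicorn -c gunicorn.conf.py myproject.wsgi (the module name myproject is a placeholder).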

Related

Prevent multiple instances of Flask app on IIS

We developed a Flask webapp and want to deploy it on IIS. During development, we started the app via flask run, which launches a single instance of our app. On IIS, however, we observed (via the Task Manager) that our app runs multiple instances concurrently.
The problem is that our app is not designed to run in parallel. For example, our app reads a file from the file system and keeps it in memory for efficiency. This optimization is correct only if it is guaranteed that no other process changes the content of the file.
Is there a way to prevent IIS from starting multiple instances?
In IIS, you can go to the FastCGI settings, where you can see all the applications used by websites on your server. In the "Max. Instances" column, the script you are talking about is probably set to 0 (or some value greater than 1), meaning it can be started multiple times. Limiting this to 1 will solve your problem.
You could use the code below to ensure only one instance of a program runs:
from tendo import singleton
me = singleton.SingleInstance()  # exits with sys.exit(-1) if another instance is already running
Install it with:
pip install tendo
Reference links:
Check to see if python script is running
How to make only one instance of the same script running at any time
https://github.com/pycontribs/tendo/blob/master/tendo/singleton.py

running nginx/postgres with supervisord - required?

In all the standard Django production setup templates I've seen, Gunicorn is run under Supervisor, whereas nginx/Postgres are not configured under Supervisor.
Any reason? Is this required for a production system? If not, why not?
In this architecture, Gunicorn works as the application server that runs our Django code. Supervisor is just a process-management utility that restarts the Gunicorn server if it crashes. The Gunicorn server may crash because of our own bad code, while nginx and Postgres remain intact. So in the basic config we only look after the Gunicorn process through Supervisor, though we could do the same for nginx and Postgres too.
You need Supervisor for Gunicorn because it is simply a server, without any built-in tools to restart it, run it at system startup, stop it at system shutdown, or reload it when it crashes.
PostgreSQL and nginx can take care of themselves in that respect, so there is no need for them to run under Supervisor.
Actually, you can just use init.d, upstart or systemd to start, stop and restart Gunicorn; Supervisor is simply an easier way to handle small servers like Gunicorn.
Consider also that it is common to run multiple Django apps on one system, which requires multiple separate instances of Gunicorn. Supervisor will handle them better than init, upstart or systemd.
There is also the uWSGI server, which doesn't need Supervisor because it has built-in features to handle multiple instances, starting, stopping, and auto-reloading on code changes. Look at the uWSGI Emperor system.
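For reference, a typical Supervisor program stanza for Gunicorn looks roughly like this sketch (all paths, names and the port are placeholders, not recommendations):
[program:mysite]
command=/srv/mysite/env/bin/gunicorn mysite.wsgi --bind 127.0.0.1:8000
directory=/srv/mysite
user=mysite
autostart=true
autorestart=true
redirect_stderr=true
stdout_logfile=/var/log/mysite/gunicorn.log
Supervisor restarts this process if it dies; nginx and PostgreSQL ship with their own init scripts, so they don't need the same babysitting.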

Heroku, Django and celery on RabbitMQ

I'm building a Django project on Heroku.
I understand that Gunicorn is recommended as the web server, so I need an event-loop type of worker, and I use gevent for that.
It seems that gevent's monkey patching does most of the work for me so I can have concurrency, but how am I supposed to connect to RabbitMQ without real threads or jamming the whole loop?
I am baffled by this, since Heroku themselves recommend Gunicorn, Celery and RabbitMQ, but I don't see how all of these work together.
Do you understand that Celery and Gunicorn are used for different purposes?
Gunicorn is the web server responding to requests made by users, serving them web pages or JSON data.
Celery is an asynchronous task manager, i.e. it lets you run arbitrary Python code irrespective of web requests to your server.
Do you understand this distinction?
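To make the distinction concrete, here is a minimal sketch of how the pieces typically fit together (the module names, the broker URL and the send_welcome_email task are made up for illustration):
# myapp/tasks.py -- executed by the Celery worker process, not by Gunicorn
from celery import Celery

# The broker URL points at RabbitMQ; on Heroku it would come from the
# config var exposed by your RabbitMQ add-on.
app = Celery("myapp", broker="amqp://guest:guest@localhost//")

@app.task
def send_welcome_email(user_id):
    # arbitrary Python, run outside the request/response cycle
    print("sending email to", user_id)

# myapp/views.py -- executed inside the Gunicorn/gevent web process
from django.http import HttpResponse
from myapp.tasks import send_welcome_email

def signup(request):
    # .delay() only publishes a small message to RabbitMQ and returns,
    # so the gevent event loop is not blocked by the actual work.
    send_welcome_email.delay(request.user.id)
    return HttpResponse("ok")
The web dyno runs Gunicorn, a separate worker dyno runs the Celery worker, and RabbitMQ is just the queue that connects the two.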

Do I need to run a separate celeryd if I run multiple sites in Django?

Currently I have one Django site, but I am planning to run more Django sites.
So I want to know whether I need to run celeryd for every new site, or whether one is enough.
I am running the celeryd daemon via Supervisor.
If each site is going to be running different code, and you plan on using a different Celery backend so that the messages don't collide, then you will need to run one celeryd for each new site.
Here is some more info, not a lot but it is something:
http://groups.google.com/group/celery-users/browse_thread/thread/85e5aa8458310439
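If the sites must stay isolated, one straightforward approach (a sketch; the vhost names and credentials are made up) is to give each site its own RabbitMQ vhost in its settings and run one Supervisor-managed celeryd per site:
# settings.py of the first site
BROKER_URL = "amqp://myuser:mypassword@localhost:5672/site1"

# settings.py of the second site
BROKER_URL = "amqp://myuser:mypassword@localhost:5672/site2"
Each celeryd is then started with its own DJANGO_SETTINGS_MODULE, so tasks published by one site never end up in the other site's queues.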

Different methods to deploy Django project and their pros and cons?

I am quite a noob when it comes to deploying a Django project. I'd like to know the various methods to deploy a Django project and which one is most preferred.
The Django documentation lists Apache/mod_wsgi, Apache/mod_python and FastCGI etc.
mod_python is deprecated now; one should use mod_wsgi instead.
Django with mod_wsgi is easy to set up, but:
you can only use one Python version at a time [edit: you can even only use the Python version mod_wsgi was compiled for]
[edit: it seems I was wrong about mod_wsgi not supporting virtualenv: it does]
So for multiple sites (targeting different Django/Python versions) on one server, mod_wsgi is not the best solution.
FastCGI can be used with virtualenv, and also with different Python versions, as you run it with
./manage.py runfcgi …
and then configure your web server to use this FastCGI interface.
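Whichever route you pick, the server ultimately needs a WSGI entry point for your project. For recent Django versions that is a small wsgi.py module, roughly like this sketch (the settings path mysite.settings is a placeholder):
# mysite/wsgi.py -- the WSGI entry point that mod_wsgi (or Gunicorn) loads
import os

from django.core.wsgi import get_wsgi_application

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "mysite.settings")

# `application` is the callable name WSGI servers look for by default
application = get_wsgi_application()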
The new, hot stuff in Django deployment seems to be Gunicorn. It's a web server that implements WSGI and is typically used as a backend with a "big" web server as proxy.
Deployment with Gunicorn feels a lot like FastCGI: you run a process that does the Django processing with manage.py, and a web server as the frontend to the world.
But Gunicorn deployment has some advantages over FastCGI:
speed: I didn't find the sources, but benchmarks say FastCGI is not as fast as the F suggests
config files: with FastCGI you must do all configuration on the command line when executing the manage.py command, which gets unwieldy when running multiple Django instances via init.d (the Unix-like OS' system service startup). With Gunicorn it's always the same command line, with just different configuration files (see the sketch after this list)
Gunicorn can drop privileges: no need to do this in your init.d script, and it's easy to switch to one user per Django instance
Gunicorn behaves more like a daemon: writing a pidfile and a logfile, forking to the background, etc., which again makes using it in an init.d script easier.
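A per-instance Gunicorn config file illustrating the last three points might look roughly like the sketch below (the path, port and user name are placeholders, not recommendations):
# /etc/gunicorn/mysite.conf.py -- one such file per Django instance
bind = "127.0.0.1:8000"
workers = 3

# drop privileges: each instance runs as its own unprivileged system user
user = "mysite"
group = "mysite"

# daemon-like behaviour: fork to the background and write a pidfile and
# a logfile, which keeps the init.d script trivial
daemon = True
pidfile = "/var/run/gunicorn/mysite.pid"
errorlog = "/var/log/gunicorn/mysite.log"
The init.d script then only has to run something like gunicorn -c /etc/gunicorn/mysite.conf.py mysite.wsgi for each instance.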
Thus, I would suggest using the Gunicorn solution, unless you have a single low-traffic site on a single server, in which case you could use the mod_wsgi solution. But I think in the long run you'll be happier with Gunicorn.
If you have a Django-only web server, I would suggest using nginx as the frontend proxy, as it performs best (again, this is based on benchmarks I read in some blog posts; I don't have the URL anymore).
Personally I use Apache as the frontend proxy, as I need it for other sites hosted on the server.
Simple setup instructions for Django deployment can be found here:
http://ericholscher.com/blog/2010/aug/16/lessons-learned-dash-easy-django-deployment/
My init.d script for Gunicorn is on GitHub:
https://gist.github.com/753053
Unfortunately I have not blogged about it yet, but an experienced sysadmin should be able to do the required setup.
Use Nginx/Apache/mod_wsgi and you can't go wrong.
If you prefer a simpler alternative, just use Apache.
There is a very good deployment document: http://lethain.com/entry/2009/feb/13/the-django-and-ubuntu-intrepid-almanac/
I myself have faced a lot of problems in deploying Django projects and automating the deployment process. Apache and mod_wsgi were like a curse for Django deployment. There are several tools like Nginx, Gunicorn, SupervisorD and Fabric which are trending for Django deployment. At first I used and configured them individually without deployment automation, which took a lot of time (I had to maintain testing as well as production servers for my client and had to update them as soon as a new feature was tested and approved). But then I stumbled upon django-fagungis, which totally automates my Django deployment, from cloning my project from Bitbucket to deploying it on my remote server (it uses Nginx, Gunicorn, SupervisorD, Fabric and virtualenv, and also installs all the dependencies on the fly), all with just three commands :) You can find more about it in my blog post here. Now I don't even have to get involved in this process (which used to take a lot of my time): one of my junior developers runs those three commands of django-fagungis mentioned here on his local machine, and we get a crisp new copy of our project deployed in minutes without any hassle :)