Can you daemonize Celery through your django site? - django

Reading the daemonization Celery documentation, if you're running Celery on Linux with systemd, you set it up with two files:
/etc/systemd/system/celery.service
/etc/conf.d/celery
I'm using Celery in a Django site with django-celery-beat, and the documentation is a little confusing on this point:
Example Django configuration
Django users now uses [sic] the exact same template as above, but make sure that the module that defines your Celery app instance also sets a default value for DJANGO_SETTINGS_MODULE as shown in the example Django project in First steps with Django.
The docs don't just come out and say, put your daemonization settings in settings.py and it will all work out, bla, bla. From another SO posts this user seems to have run into the same confusion where Django instructions imply you use init.d method.
Bonus point if you can answer if it's possible to run Celery and RabbitMQ both configured and with the Django instance (if that makes sense).
I'm thinking not Celery if only because daemon variables include CELERYD_ and first steps with django say: "...all Celery configuration options must be specified in uppercase instead of lowercase, and start with CELERY_"

Related

Decouple and Dockerize Django and Celery

I am wondering what is the best way to decouple Celery from Django in order to dockerize the two parts and use docker swarm service? Typically one starts their celery workers and celery beat using a command that references there Django application:
celery worker -A my_app
celery beat -A my_app
From this I believe celery picks up config info from settings file and a celery.py file which is easy to move to a microservice. What I don't totally understand is how the tasks would leverage the Django ORM? Or is that not really the microservices mantra and Celery should be designed to make GET/POST calls to Django REST Framework API for the data it needs to complete the task?
I use a setup where the code for both the django app and its celery workers is the same (as in a single repository).
When deploying I make sure to have the same code release everywhere, to avoid any surprises with the ORM, etc...
Celery starts with a reference to the django app, so that it has access to the models, etc...
Communication between the workers and the main app happens either through the messaging queue (rabbitmq or redis...) or via the database (as in, the celery worker works directly in the db, since it knows the models, etc...).
I'm not sure if that follows the microservices mantra, but it does work :)
Celery's .send_task or .signature might be helpful:
https://www.distributedpython.com/2018/06/19/call-celery-task-outside-codebase/

Django and loading config file before server starts

My django app is communicating with external server and before running django server i would like load some config file. Variables from this file are going to be used by some module while app is running
Problem is that config file can be located in many places.
My dream would be to run manage.py --cfg "/path/to/cfg/file.cfg" or
manage.py runserver --cfg "/path/to/cfg/file.cfg"
and some variables (like globals?) are going to be loaded and they are going to be avaible for django modules to be used. After django server shutdown those variables can dissapear
Is there some nice way to accomplish this?
There seem to be two parts to your problem:
How do I support changing which set of variables (as defined in a config file) are used for a given run
How can I load these variables such that they are visible to all the modules of my application.
The standard mechanism for doing the 2nd is to put stuff in settings.py.
If you do FOO="bar" in settings.py, in your module you can do:
from django.conf import settings
if settings.FOO == "bar":
# Do something
As far as how you can support multiple configurations, the first thing I could come up with is to rename your real settings.py to be real_settings.py and then create a series of config1_settings.py, config2_settings.py, config3_settings.py ... which look like:
from real_settings.py import *
from path_to_configX.py import *
where configX.py has all the values for whatever variables you want for configuration X.
You would then start django's built in server via:
manage.py runserver --settings=configX_settings.py
Note that doing this for a production server (where you can't as easily just pass something on the command line to kick it off) may be a bit trickier, but you're going to need to provide more use case details for that.

Custom celery settings file for django-celery

Django-celery seems to use the current django project's settings.py file to read the configuration for celery, and, correct me if I am wrong, there is no way to override this behavior. The problem is that my celery configuration is in a different file outside of the django project, and I need to somehow tell django-celery to use that file instead. How do I do this?
Generally, django-celery used for one project(but many applications). Of course, you can make symlink of the settings.py from one project to other and run Django-celery as:
./manage.py celeryd --settings=path.to.symlink
but in my opinion would be better decision to use celery as daemon with common settings and CELERY_IMPORTS to tasks from any django projects

Running Django with Gunicorn - Best Practice

There are 3 ways to run a django application with gunicorn:
Standard gunicorn + wsgi (ref django doc)
gunicorn project.wsgi:application
Using gunicorn django integration (ref gunicorn doc and django doc):
python manage.py run_gunicorn
Using gunicorn_django command (ref gunicorn doc)
gunicorn_django [OPTIONS] [SETTINGS_PATH]
Django's documentation suggests using 1., which is not even listed as an option on Gunicorn documentation.
Is there any best practice on the best way to run a django app with gunicorn, and what are the foreseable advantages/disadvantages of these different solutions?
Taking a glimpse at gunicorn's code it looks like they pretty much all do the same: 2. seems to be creating a wsgi app using django's internals, and 3. uses 2.
If that's the case, I wouldn't even understand what's the reason for not simply using "1." all the time, especially since a wsgi.py file is autocreated for you since django 1.4; if that's true maybe simply a documentation improvement should be suggested...
Also, best practice for gunicorn settings with django would be great. Using 1., does it make sense to set some defaults in the wsgi file and avoid additional settings?
References:
Should I use django-gunicorn integration or wsgi? only concerns choices 1. and 3., there's no hint for the settings and the answer gives no rationale
Deploying Django with gunicorn and nginx give some broader information but is not strictly related nor answer this question
Django Gunicorn wsgi about version "4", which is launching gunicorn -c configfile and configfile will point to django_settings to django
Django WSGI and Gunicorn is just a bit confusing :) mixing up 1. and 3. Of course wsgi.py is used only with 1.
After checking out I'd say that the best way is using gunicorn + wsgi
$ gunicorn project.wsgi:application
It's now both confirmed in gunicorn docs: if you run Django 1.4 or newer, it’s highly recommended to simply run your application with the WSGI interface using the gunicorn command and django as linked above.
It also avoids adding gunicorn as installed app, which means it's not a requirement to install gunicorn to test your app which might be useful from time to time.
About Settings
The Django settings file to be used can be passed through an ENV variable, or customized in the wsgi.py file. I sometimes create several wsgi.py files if I have multiple settings (eg. multiple websites) that have to run from the same project - See Django Doc for more info.
A one-liner solution that does not require any new file from Carl's comment:
DJANGO_SETTINGS_MODULE=project.settings.prod gunicorn project.wsgi:application
sounds like a nicer way (though I'll probably end up writing it in some shell commands to make it easy to "remember").
Gunicorn settings can be passed as -c settings_file, but I'm exploring other ways and will try to update this answer if I find any. Using environment variables seems a workaroud, but only for limited cases
In particular it would be nice to get/share some settings between django and gunicorn; gunicorn documentation says:
Currently, only Paster applications have access to framework
specific settings. If you have ideas for providing settings to WSGI
applications or pulling information from Django’s settings.py feel
free to open an issue to let us know.
(Update: haven't found any smarter way, but after all env variables are enough for my most-common cases).
I guess using run_gunicorn is the way to go, it's also the simplest way to use it.
It's basically the same as usign gunicorn project.wsgi:application but needs gunicorn to be added to INSTALLED_APPS so that django recognizes the run_gunicorn command, therefore it's probably not the default way...
Using gunicorn_django is more or less deprecated, as the documentation also states here...
Found a solution that works both for manage.py (on local machine) and with gunicorn
create a file that has all the evniroment stuff
# mysite/settings/set_env.py
import os
_environment = os.getenv('environment', None)
if _environment == "production":
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "mysite.settings.production")
elif _environment == "development":
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "mysite.settings.development")
else:
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "mysite.settings.local")
and import this file in your manage.py and wsgi.py files, no need to change anything else in gunicorn execution

Does django's runserver option provide a hook for running other restart scripts?

I've recently been playing around with django and celery. One annoying thing during development is the fact that I have to restart the celery daemon each time I modify a task. When I'm developing, I usually like to use 'manage.py runserver' which automatically reloads the django framework on modifications to my apps.
Is there a way to add a hook to the reloading process that runserver does so that it automatically restarts the celery daemon I have running?
Alternatively, does celery have a similar monitor-and-reload-on-change mode that I should be using for development?
Django-supervisor works very well for this purpose. You can have it start the Django server, Celery, and anything else you need, and have different configurations for development and production servers. It also knows to reload the celery daemon when your code changes.
https://github.com/rfk/django-supervisor
I believe you can set CELERY_ALWAYS_EAGER to true.
Yes. Django provides auto reload hook, which can be used to restart other scripts.
Here is a simple management command which prints a message on reload
import subprocess
from django.core.management.base import BaseCommand
from django.utils import autoreload
def reload():
print('Code changed. Auto reloading...')
class Command(BaseCommand):
def handle(self, *args, **options):
autoreload.main(reload)
Now you can save to a reload.py and run it with python manage.py reload. A management command to reload celery workers is available here.
Celery didn't have any feature for reload code or for auto restart when the code change, than you have to restart it manually.
There isn't a way for add an hook, and I think not worthwhile of edit the source code of django just for perform a restart.
Personally while I'm developing i prefere to see the output shell of celery that is decorated with color instead of tail the logs, is more readable.
Celery 2.5 has an experimental runtime option --autoreload that could be used for this purpose, too. Here's more detail in the release notes. That being said, I think django-supervisor (via #Lee Semel) looks like the better way of doing things. I thought I would post this alternative here in case other readers do not want to have to configure another app for asynchronous processing.