How to configure Celery Daemon with Django - django

From what I can tell, there are two documents describing how to set up celery. There's "Running the worker as a daemon" and there's "First steps with Django".
In the Django docs, it says:
We also add the Django settings module as a configuration source for Celery. This means that you don’t have to use multiple configuration files, and instead configure Celery directly from the Django settings.
Which sounds awesome. However, from what I can tell, these are the files that are needed for a complete Celery daemonization:
/etc/init.d/celeryd
/etc/defaults/celery
/my-proj/celery.py
/my-proj/__init__.py
And possibly:
/my-proj/settings.py
Boy that's a lot of files. I think I've got them all set up properly:
/etc/init.d/celeryd has the default init.d script provided by celery.
/etc/defaults/celery has almost nothing. Just a pointer to my app:
export DJANGO_SETTINGS_MODULE='cl.settings'
/my-proj/celery.py has the recommended file from the First Steps with Django:
from __future__ import absolute_import
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'proj.settings')
from django.conf import settings # noqa
app = Celery('proj')
# Using a string here means the worker will not have to
# pickle the object when using Windows.
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
/my-proj/__init__.py has the recommended code from the First Steps with Django:
from __future__ import absolute_import
# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from .celery import app as celery_app
And I have all the celery-related settings like the following in my settings.py file:
CELERY_BIN = '/var/www/.virtualenvs/my-env/bin/celery'
CELERYD_USER = 'www-data'
CELERYD_GROUP = 'www-data'
CELERYD_CONCURRENCY = 20
BROKER_URL = 'redis://'
BROKER_POOL_LIMIT = 30
Yet, when I start celery using sudo service celeryd start, it doesn't work. Instead, it's clear that it hasn't picked up my settings from my Django project, because it says:
Nov 05 20:51:59 pounamu celeryd[30190]: celery init v10.1.
Nov 05 20:51:59 pounamu celeryd[30190]: Using config script: /etc/default/celeryd
Nov 05 20:51:59 pounamu celeryd[30190]: No passwd entry for user 'celery'
Nov 05 20:51:59 pounamu su[30206]: No passwd entry for user 'celery'
Nov 05 20:51:59 pounamu su[30206]: FAILED su for celery by root
Nov 05 20:51:59 pounamu su[30206]: - ??? root:celery
Nov 05 20:51:59 pounamu systemd[1]: celeryd.service: control process exited, code=exited status=1
Any ideas where the baling wire isn't working? Am I missing something major?

You are attempting to run celery as the system user "celery", which is the default used by the init script. Either create this user, or override it by setting CELERYD_USER in /etc/defaults/celery.
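For example, a minimal sketch of that override (www-data is borrowed from the settings.py shown above; substitute whatever account should own the workers):

# /etc/defaults/celery
CELERYD_USER="www-data"
CELERYD_GROUP="www-data"

Or create the default user instead: sudo useradd --system --no-create-home celery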
Personally I prefer to use supervisord to manage celery.
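For reference, a minimal supervisor program entry looks something like this (a sketch only; the command path, project directory, and app name are placeholders inferred from the question):

[program:celery]
command=/var/www/.virtualenvs/my-env/bin/celery -A cl worker -l info
directory=/my-proj
user=www-data
autostart=true
autorestart=true
redirect_stderr=true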

You are missing the CELERY_APP setting. Set it in the configuration file /etc/defaults/celery or as a parameter for the worker command:
celery worker -A my-proj
Otherwise, Celery has no idea it should look at /my-proj/celery.py. Setting the DJANGO_SETTINGS_MODULE environment variable alone does not tell Celery which app instance to load.
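In the init-script configuration that would look something like this (a sketch; the app must be importable as a Python module, so a hyphenated directory name like my-proj would need an importable package name such as my_proj):

# /etc/defaults/celery
export DJANGO_SETTINGS_MODULE='cl.settings'
CELERY_APP="my_proj"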

Related

Celery - Received unregistered task of type 'core.tasks.scrape_dev_to'

Trying to get a celery-based scraper up and running. The celery worker seems to function on its own, but when I also run the celery beat server, the worker gives me this KeyError:
File "c:\users\myusername\.virtualenvs\django-news-scraper-dbqk-dk5\lib\site-packages\celery\worker\consumer\consumer.py", line 555, in on_task_received
strategy = strategies[type_]
KeyError: 'core.tasks.scrape_dev_to'
[2020-10-04 16:51:41,231: ERROR/MainProcess] Received unregistered task of type 'core.tasks.scrape_dev_to'.
The message has been ignored and discarded.
I've been through many similar answers on stackoverflow, but none solved my problem. I'll list things I tried at the end.
Project structure:
core/
    tasks.py
newsscraper/
    celery.py
    settings.py
tasks.py:
import time
from newsscraper.celery import shared_task, task
from .scrapers import scrape

@task
def scrape_dev_to():
    URL = "https://dev.to/search?q=django"
    scrape(URL)
    return
settings.py:
INSTALLED_APPS = [
    'django.contrib.admin',
    ...
    'django_celery_beat',
    'core',
]
...
# I added this setting while troubleshooting; got a new ModuleNotFound error for core.tasks
# CELERY_IMPORTS = (
#     'core.tasks',
# )
CELERY_BROKER_URL = 'redis://localhost:6379'
CELERY_BEAT_SCHEDULE = {
    "ScrapeStuff": {
        'task': 'core.tasks.scrape_dev_to',
        'schedule': 10  # crontab(minute="*/30")
    }
}
celery.py:
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'newsscraper.settings')
app = Celery('newsscraper')
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks()
When I run the celery worker with debug logging, I see that celery doesn't have the task I want (scrape_dev_to) registered. Shouldn't the app.autodiscover_tasks() call in celery.py take care of this? Here's the output:
. celery.accumulate
. celery.backend_cleanup
. celery.chain
. celery.chord
. celery.chord_unlock
. celery.chunks
. celery.group
. celery.map
. celery.starmap
I also get a ModuleNotFoundError when I try to add core.tasks to a CELERY_IMPORTS setting. This is my best guess for where the problem is, but I don't know how to solve it.
Things I tried:
Adding core.tasks to a CELERY_IMPORTS setting. This causes a new error when I try to run celery beat: 'No module named core.tasks'.
Hardcoding the name in the task: name='core.tasks.scrape_dev_to'
Specifying the celery config explicitly when calling the worker: celery -A newsscraper worker -l INFO -settings=celeryconfig
Playing with my imports (from newsscraper.celery instead of from celery, for instance)
Adding some config code to the __init__.py for the module containing tasks (I already had it in the __init__.py for the module containing settings and celery.py)
Running python manage.py check, which identified no issues
Calling the worker with core.tasks explicitly: celery -A core.tasks worker -l INFO
I had the same problem and this setup solved it for me.
In your settings:
CELERY_IMPORTS = [
    'app_name.tasks',
]
and in the tasks module:
# app_name/tasks.py
from celery import shared_task

@shared_task
def my_task(*args, **kwargs):
    pass
See the Celery docs for the imports setting.
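Once the tasks module is imported, the worker's startup banner should list the task alongside the built-ins, which is a quick way to confirm registration (the output below is illustrative):

celery -A newsscraper worker -l INFO
# [tasks]
#   . app_name.tasks.my_task
#   . celery.accumulate
#   ...

You can also query a running worker with celery -A newsscraper inspect registered.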
This can also occur when you configured a Celery task and then removed or renamed it: messages for the old task name may still be sitting in the queue. Purge the queued messages and configure again:
$ celery -A proj purge
or
from proj.celery import app
app.control.purge()
In settings.py, I added the lines below:
CELERY_IMPORTS = [
    'app_name.tasks',
]
and it worked for me.

How do I change my CELERY_BROKER_URL in an already daemonized Celery process?

I'm daemonizing my Celery worker using Supervisord. The issue is I had a typo in my CELERY_BROKER_URL and the worker is not properly connecting to RabbitMQ.
When I run celery -A mysite report it shows the old environment variable.
My /etc/supervisor/conf.d/celery.conf file does not include the environment variables:
[program:celery]
command=/webapps/mysite/scripts/celery/celery_start
autostart=true
autorestart=true
user=myuser
stdout_logfile=/webapps/mysite/logs/celery.log
redirect_stderr = true
The environment variables are picked up via my virtual environment in the celery_start script:
#!/bin/sh
DJANGODIR=/webapps/mysite/mysite
# Activate the virtual environment.
cd $DJANGODIR
. /webapps/mysite/bin/activate
. /webapps/mysite/bin/postactivate
# Programs meant to be run under supervisor should not daemonize themselves
# (do not use --daemon).
exec celery -A mysite worker -E -l info --concurrency=2
When I check the CELERY_BROKER_URL environment variable after activating the environment it is correct. I've tried supervisorctl restart celery which doesn't pick up the new environment variable (celery -A mysite report shows the old CELERY_BROKER_URL). I've tried supervisorctl shutdown and then supervisord which also won't pick up the new environment variable.
When I run ps aux | grep 'celery worker' I don't see anything, presumably because Celery is daemonized by Supervisor, so I'm not sure of a way to completely destroy the current Celery process.
No matter what, it feels like Celery is not picking up the new environment variable. How could I make this happen?
[EDIT]
My Celery settings in settings.py are as follows:
# Celery settings.
CELERY_BROKER_URL = os.environ.get(
    'BROKER_URL', 'amqp://guest:guest@127.0.0.1//')
CELERY_TASK_SOFT_TIME_LIMIT = 60
CELERY_RESULT_BACKEND = 'django-db'
And my mysite/celery.py file is:
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'settings.local')
APP = Celery('mysite')
# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
APP.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
APP.autodiscover_tasks()
@APP.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
It turns out I was using a password for my broker that for some reason contained an invalid character.
The password was #qahrKscbW#3!HkMJg#jFcyaOR7HtK%j08Jt$yY2.
What was happening was that my broker_url was invalid, and so it was defaulting back to amqp://guest:password instead of amqp://myuser:#qahrKscbW#3!HkMJg#jFcyaOR7HtK%j08Jt$yY2@localhost/mysite.
I changed the password to only use alphanumeric characters and it worked.
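If the password has to keep special characters, percent-encoding it is a safer route than avoiding them entirely. A sketch using the standard library (the password below is a made-up stand-in):

from urllib.parse import quote

password = "p@ss#w:rd"  # hypothetical password containing URL-reserved characters
# Percent-encode the password so '@' and '#' become %40 and %23
# and the URL parses unambiguously.
broker_url = "amqp://myuser:{}@localhost:5672/mysite".format(quote(password, safe=""))
print(broker_url)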

Creating the first Celery task - Django. Error - "ERROR/MainProcess] consumer: Cannot connect to amqp://guest:**@127.0.0.1:5672//:"

I'm trying to create my first Celery task. The task will send the same e-mail every one minute to the same person.
According to the documentation, I create my first task in my project.
from __future__ import absolute_import, unicode_literals
from celery import shared_task
from django.core.mail import send_mail
@shared_task
def send_message():
    to = ['test@test.com', ]
    send_mail('TEST TOPIC',
              'TEST MESSAGE',
              'test@test.com',
              to)
Then, in my project's folder, I add the celery.py file, which looks like this:
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
from django.conf import settings
from celery.schedules import crontab
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'app_rama.settings')
app = Celery('app_rama')
# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks(settings.INSTALLED_APPS)
app.conf.beat_schedule = {
    'send-message-every-single-minute': {
        'task': 'app.tasks.send_message',
        'schedule': crontab(),  # change to crontab(minute=0, hour=0) if you want it to run daily at midnight
    },
}
Then, in the __init__.py file of my project, I added:
from __future__ import absolute_import, unicode_literals
# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from .celery import app as celery_app
__all__ = ('celery_app',)
And the last thing I try to do is run the command:
celery -A app_rama worker -l info
And then I receive the following error:
[2019-06-27 16:01:26,750: ERROR/MainProcess] consumer: Cannot connect to amqp://guest:**@127.0.0.1:5672//: [WinError 10061]
I tried many solutions from the forum, but I did not find the correct one.
I was also not helped by adding the following settings to my settings.py file:
CELERY_BROKER_URL = 'amqp://guest:guest@localhost:5672//'
How can I solve this error so that my task works in the background of the application?
Your Celery broker is probably misconfigured. Read the "Using RabbitMQ" document to find out how to set up RabbitMQ properly (I assume you want to use RabbitMQ, since you had the "amqp" protocol in your example).
I recommend learning Celery with Redis, as it is easier to set up and manage. Then, once you learn the basics, you may decide to move to RabbitMQ or some other supported broker...
Also, verify that your RabbitMQ server is running properly. If you use Windows, make sure no software on it prevents user processes from connecting to localhost:5672.
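If you do try Redis first, the change is small. A sketch, assuming a local redis-server and pip install "celery[redis]"; note that the CELERY_ prefix on these settings is only honored when celery.py passes namespace='CELERY' to config_from_object:

# settings.py
CELERY_BROKER_URL = 'redis://localhost:6379/0'
CELERY_RESULT_BACKEND = 'redis://localhost:6379/0'

# celery.py
app.config_from_object('django.conf:settings', namespace='CELERY')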

celery daemon production local config file without django

I am a newbie to Celery. I created a project as per the instructions provided by the Celery 4.1 docs. Below are my project folders and files:
mycelery/
    test_celery/
        celery_app.py
        tasks.py
        __init__.py
1-celery_app.py
from __future__ import absolute_import
import os
from celery import Celery
from kombu import Queue, Exchange
from celery.schedules import crontab
import datetime
app = Celery('test_celery',
             broker='amqp://jimmy:jimmy123@localhost/jimmy_v_host',
             backend='rpc://',
             include=['test_celery.tasks'])

# Optional configuration, see the application user guide.
app.conf.update(
    result_expires=3600,
)
if __name__ == '__main__':
    app.start()

app.name
2-tasks.py
from __future__ import absolute_import
from test_celery.celery_app import app
import time
from kombu import Queue, Exchange
from celery.schedules import crontab
import datetime
app.conf.beat_schedule = {
    'planner_1': {
        'task': 'test_celery.tasks.printTask',
        'schedule': crontab(minute='*/1'),
    },
}

@app.task
def longtime_add(x, y):
    print 'long time task begins'
    # sleep 5 seconds
    time.sleep(5)
    print 'long time task finished'
    return x + y

@app.task
def printTask():
    print 'Hello i am running'
    time = str(datetime.datetime.now())
    file = open('/home/hub9/mycelery/data.log', 'ab')
    file.write(time)
    file.close()
I copied the celeryd and celerybeat init scripts from the Celery GitHub project into /etc/init.d/ and made them executable. Then I created celeryd and celerybeat files in /etc/default/.
I- /etc/default/celeryd
# Names of nodes to start
# most will only start one node:
#CELERYD_NODES="worker1"
# but you can also start multiple and configure settings
# for each in CELERYD_OPTS (see `celery multi --help` for examples).
CELERYD_NODES="worker1 worker2 worker3"
# Absolute or relative path to the 'celery' command:
CELERY_BIN="/usr/local/bin/celery"
#CELERY_BIN="/virtualenvs/def/bin/celery"
# Where to chdir at start. path to folder containing task
CELERYD_CHDIR="/home/hub9/mycelery/test_celery/"
# App instance to use
# comment out this line if you don't use an app
#CELERY_APP = "file/locatin/of/app"
# or fully qualified:
CELERY_APP="test_celery.celery_app:app"
# Extra command-line arguments to the worker
CELERYD_OPTS="--time-limit=3000 --concurrency=3 --config=celeryconfig"
# %N will be replaced with the first part of the nodename.
CELERYD_LOG_FILE="/var/log/celery/%N.log"
CELERYD_PID_FILE="/var/run/celery/%N.pid"
# Workers should run as an unprivileged user.
# You need to create this user manually (or you can choose
# a user/group combination that already exists, e.g. nobody).
CELERYD_USER="celery"
CELERYD_GROUP="celery"
# If enabled pid and log directories will be created if missing,
# and owned by the userid/group configured.
CELERY_CREATE_DIRS=1
II- /etc/default/celerybeat
(The contents are identical, line for line, to /etc/default/celeryd above.)
After that I created the celery user and group.
Here is my problem: I can successfully run this project using the celery -A test_celery.celery_app worker -l info --beat command, but when I start it using sudo service celeryd start OR sudo service celerybeat start,
it gives me an import error: no module named test_celery.celery_app.
Please give me a hint about what I am doing wrong.

Celery-Django as Daemon: ImportError: No module named django.conf

I am working on a Django project which uses Celery. In development, Celery works fine and my tasks get scheduled properly. For the daemon, I created /etc/init.d/celeryd and /etc/defaults/celeryd as per the documentation. When I run bash -x /etc/init.d/celeryd start, I get the error No module named django.conf.
Here is my celery.py:
from __future__ import absolute_import
import os
from celery import Celery
from django.conf import settings
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'axonatorprj.settings')
app = Celery('axonatorprj')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
app.conf.update(
    CELERY_RESULT_BACKEND='djcelery.backends.database:DatabaseBackend',
)
@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
Here is my celeryd:
# Names of nodes to start
# most will only start one node:
CELERYD_NODES="worker1"
# but you can also start multiple and configure settings
# for each in CELERYD_OPTS (see `celery multi --help` for examples).
CELERYD_NODES="worker1 worker2 worker3"
# Absolute or relative path to the 'celery' command:
CELERY_BIN="/usr/local/bin/celery"
#CELERY_BIN="/virtualenvs/def/bin/celery"
# App instance to use
# comment out this line if you don't use an app
CELERY_APP="axonatorprj"
# or fully qualified:
#CELERY_APP="proj.tasks:app"
# Where to chdir at start.
CELERYD_CHDIR="/home/projects/axonator"
# Extra command-line arguments to the worker
CELERYD_OPTS="--time-limit=300 --concurrency=8"
# %N will be replaced with the first part of the nodename.
CELERYD_LOG_FILE="/var/log/celery/%N.log"
CELERYD_PID_FILE="/var/run/celery/%N.pid"
# Workers should run as an unprivileged user.
# You need to create this user manually (or you can choose
# a user/group combination that already exists, e.g. nobody).
CELERYD_USER="celery"
CELERYD_GROUP="celery"
# If enabled pid and log directories will be created if missing,
# and owned by the userid/group configured.
CELERY_CREATE_DIRS=1
export DJANGO_SETTINGS_MODULE="axonatorprj.settings"
export PYTHONPATH=$PYTHONPATH:/home/projects/axonator