Django Celery: scheduling a manage.py command

I need to update the solr index on a schedule with the command:
(env)$ ./manage.py update_index
I've looked through the Celery docs and found info on scheduling, but haven't been able to find a way to run a Django management command on a schedule and inside a virtualenv. Would this be better run from a normal cron job? And if so, how would I run it inside the virtualenv? Does anyone have experience with this?
Thanks for the help!

Django Celery Task Scheduling
Project structure:
[appname]/
├── [appname]/
│ ├── __init__.py
│ ├── settings.py
│ ├── urls.py
│ ├── celery.py
│ └── wsgi.py
├── [project1]/
│ ├── __init__.py
│ ├── tasks.py
│
└── manage.py
Add the configuration below to your settings.py file:
STATIC_URL = '/static/'
BROKER_URL = 'redis://localhost:6379/0'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_RESULT_BACKEND = 'redis'
from celery.schedules import crontab
CELERY_TIMEZONE = 'UTC'
celery.py (holds the Celery task scheduler):
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
import django

# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'appname.settings')

from django.conf import settings

app = Celery('appname')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks()

@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))

# scheduler
app.conf.beat_schedule = {
    'add-every-30-seconds': {
        'task': 'project1.tasks.cleanup',
        'schedule': 30.0,
        'args': ()
    },
}
app.conf.timezone = 'UTC'
__init__.py:
from __future__ import absolute_import, unicode_literals
# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from .celery import app as celery_app
__all__ = ['celery_app']
tasks.py (in project1):
from celery import shared_task
import celery
import time
from django.core import management

@celery.task  # (name='cleanup')
def cleanup():
    try:
        print("in celery module")
        """Cleanup expired sessions by using Django management command."""
        management.call_command("clearsessions", verbosity=0)
        # PUT MANAGEMENT COMMAND HERE
        return "success"
    except Exception as e:
        print(e)
The task will run every 30 seconds.
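If you need a calendar-style schedule instead of a fixed interval (for example, rebuilding the search index from the original question every night), beat also accepts a crontab expression. A minimal sketch, assuming a hypothetical task project1.tasks.update_search_index that wraps the update_index management command:

# celery.py (sketch): crontab-based entry instead of the 30-second interval
from celery.schedules import crontab

app.conf.beat_schedule = {
    'rebuild-search-index-nightly': {
        'task': 'project1.tasks.update_search_index',
        # run every day at 03:00 in app.conf.timezone ('UTC' above)
        'schedule': crontab(hour=3, minute=0),
        'args': ()
    },
}

# project1/tasks.py (sketch): hypothetical task wrapping the command
from celery import shared_task
from django.core import management

@shared_task
def update_search_index():
    # same command as ./manage.py update_index
    management.call_command("update_index")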
Requirements for Windows:
Redis server should be running
Celery worker and celery beat should be running
Run each of the commands below in a different terminal:
celery -A appname worker -l info
celery -A appname beat -l info
Requirements for Linux:
Redis server should be running
Celery worker and celery beat should be running
Celery beat and the worker can be started together on the same server (the -B flag embeds beat in the worker):
celery -A appname worker -l info -B
@tzenderman please let me know if I missed something.
For me this is working fine.

To run your command periodically from a cron job, just wrap the command in a bash script that loads the virtualenv. For example, here is what we do to run manage.py commands:
django_cmd.sh:
#!/bin/bash
cd /var/www/website/
source venv/bin/activate
/var/www/website/manage.py $1 --settings=$2
Crontab:
MAILTO=webmaster@website.com
SETTINGSMODULE=website.settings_prod
5 * * * * /var/www/website/django_cmd.sh update_index $SETTINGSMODULE >> /dev/null
0 10 * * * /var/www/website/django_cmd.sh update_accounts $SETTINGSMODULE
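One assumption baked into this setup: the wrapper script has to be executable, and the crontab entries are installed for the user that owns the virtualenv, for example:

chmod +x /var/www/website/django_cmd.sh   # let cron execute the wrapper
crontab -e                                # edit the crontab for the current user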

I actually found a nice way of doing this using fabric + celery and I'm working on it now:
In app/tasks.py, create a fabric function with the manage.py commands you need, then decorate it with @periodic_task, add it to your celery schedule and it should be good to go.
UPDATE: I wasn't able to actually use Fabric + Celery because using fabric in the module caused it to be recognized as a fabfile, and the celery calls in the file didn't work.

Related

Can't run Celery task in Django - I either get "AppRegistryNotReady: Apps aren't loaded yet" or "RuntimeError: populate() isn't reentrant"

I'm trying to set up a task with Celery in Django to run every day at 23:00.
app = Celery('App.tasks', broker='redis://localhost')
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "App.settings")
django.setup()  # <== PROBLEM

@app.on_after_configure.connect
def setup_periodic_tasks(sender, **kwargs):
    sender.add_periodic_task(
        crontab(hour=23),
        calc_average_rating.s(),
    )

@app.task
def calc_average_rating(final_content_id):
The problem is that in this function I have Rating = apps.get_model(app_label='App', model_name='Rating'), and if I don't call django.setup() then I get django.core.exceptions.AppRegistryNotReady: Apps aren't loaded yet.
However, if I call django.setup(), the tasks run fine but I can't do manage.py runserver because I get RuntimeError: populate() isn't reentrant.
Any solutions?
I'm not sure exactly how to reproduce the environment you're in, so here are some observations from my environment; I hope they help.
The only place I have a Celery() object is in a standalone file, kept within the package generated by "manage.py startproject".
I think the way I lay out a Django app is unusual compared to most Django users, so to describe it:
# .git/ # top folder is my vcs
# setup.py # packaging for exampleapp
# env/ # python venv created to this service
# exampleapp/ # package generated from startapp
# exampleapp/tasks.py # package generated from startapp
# exampleproject/ # folder generated from startproject
# exampleproject/exampleproject/ # package generated by startproject
# exampleproject/exampleproject/settings.py # generated
# exampleproject/exampleproject/celery.py # created based on celery docs
# exampleproject/exampleproject/celery.py
import os
from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'exampleproject.settings')

app = Celery('exampleproject')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()

@app.task(bind=True)
def debug_task(self):
    print('Request: {self.request!r}'.format(self=self))

if __name__ == '__main__':
    app.start()
and I start the celery jobs as follows, where my Python virtualenv folder 'env' is a sibling of the generated exampleproject package:
(
cd exampleproject
../env/bin/python3 -m celery -A exampleproject worker -l INFO
# or
../env/bin/python3 -m celery -A exampleproject beat -l INFO --scheduler django_celery_beat.schedulers:DatabaseScheduler
)
# and for django
./env/bin/python3 exampleproject/manage.py runserver
Maybe of interest as well:
# exampleapp/tasks.py
from celery import shared_task

@shared_task
def add(x, y):
    return x + y
# exampleproject/exampleproject/settings.py
# suffixed to end of generated file
INSTALLED_APPS.extend([
    'django_celery_results',
    'django_celery_beat',
])
CELERY_TASK_TRACK_STARTED = True
CELERY_TASK_TIME_LIMIT = 30 * 60
CELERY_RESULT_BACKEND = 'django-db'
#CELERY_RESULT_BACKEND = 'django-cache'
With these parts in place, I haven't noticed any issues loading the entry points.
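For the original error: with a layout like the one above you shouldn't need to call django.setup() yourself; celery.py only sets DJANGO_SETTINGS_MODULE, and the worker's Django integration initializes the app registry before tasks run. A minimal sketch of the periodic task under that assumption (App and Rating are the names from the question; the aggregation body is left out):

# App/tasks.py (sketch): resolve the model at call time, not import time
from celery import shared_task
from django.apps import apps

@shared_task
def calc_average_rating(final_content_id):
    # looking the model up here avoids AppRegistryNotReady at import time
    Rating = apps.get_model(app_label='App', model_name='Rating')
    # ... aggregate Rating objects for final_content_id here ...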

Celery - Received unregistered task of type 'core.tasks.scrape_dev_to'

Trying to get a celery-based scraper up and running. The celery worker seems to function on its own, but when I also run the celery beat server, the worker gives me this KeyError:
File "c:\users\myusername\.virtualenvs\django-news-scraper-dbqk-dk5\lib\site-packages\celery\worker\consumer\consumer.py", line 555, in on_task_received
strategy = strategies[type_]
KeyError: 'core.tasks.scrape_dev_to'
[2020-10-04 16:51:41,231: ERROR/MainProcess] Received unregistered task of type 'core.tasks.scrape_dev_to'.
The message has been ignored and discarded.
I've been through many similar answers on stackoverflow, but none solved my problem. I'll list things I tried at the end.
Project structure:
core/
    tasks.py
newsscraper/
    celery.py
    settings.py
tasks.py:
import time
from newsscraper.celery import shared_task, task
from .scrapers import scrape

@task
def scrape_dev_to():
    URL = "https://dev.to/search?q=django"
    scrape(URL)
    return
settings.py:
INSTALLED_APPS = [
    'django.contrib.admin',
    ...
    'django_celery_beat',
    'core',
]
...
# I added this setting while troubleshooting, got a new ModuleNotFound error for core.tasks
#CELERY_IMPORTS = (
#    'core.tasks',
#)
CELERY_BROKER_URL = 'redis://localhost:6379'
CELERY_BEAT_SCHEDULE = {
    "ScrapeStuff": {
        'task': 'core.tasks.scrape_dev_to',
        'schedule': 10  # crontab(minute="*/30")
    }
}
celery.py:
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'newsscraper.settings')
app = Celery('newsscraper')
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks()
When I run debug for the celery worker, I see that celery doesn't have the task I want (scrape_dev_to) registered. Shouldn't the app.autodiscover_tasks() call in celery.py take care of this? Here's the output:
. celery.accumulate
. celery.backend_cleanup
. celery.chain
. celery.chord
. celery.chord_unlock
. celery.chunks
. celery.group
. celery.map
. celery.starmap
I also get a ModuleNotFoundError when I try to add core.tasks to a CELERY_IMPORTS setting. This is my best guess for where the problem is, but I don't know how to solve it.
Things I tried:
Adding core.tasks to a CELERY_IMPORTS setting. This causes a new error when I try to run celery beat: 'No module named core.tasks'.
Hardcoding the name in the task: name='core.tasks.scrape_dev_to'
Specifying the celery config explicitly when calling the worker: celery -A newsscraper worker -l INFO -settings=celeryconfig
Playing with my imports (from newsscraper.celery instead of from celery, for instance)
Adding some config code to the __init__.py for the module containing tasks (I already had it in the __init__.py for the module containing settings and celery.py)
python manage.py check identified no issues
Calling the worker with core.tasks explicitly: celery -A core.tasks worker -l INFO
I had the same problem and this setup solved it for me.
In your settings:
CELERY_IMPORTS = [
    'app_name.tasks',
]
and
# app_name/tasks.py
from celery import shared_task

@shared_task
def my_task(*args, **kwargs):
    pass
Docs ref for imports.
This can also occur when you configured a celery task and then removed it. In that case, purge the stale tasks and configure them again:
$ celery -A proj purge
or
from proj.celery import app
app.control.purge()
In settings.py, I added the line below:
CELERY_IMPORTS = [
    'app_name.tasks',
]
and it worked for me.

"module not found" when running Celery with supervisor

I'm trying to run celery with django using supervisor.
supervisor_celery.conf
[program:supervisor-celery]
command=/home/user/project/virtualenvironment/bin/celery worker -A project --loglevel=INFO
directory=/home/user/project/project
user=nobody
numprocs=1
stdout_logfile=/home/user/project/logs/celery.log
stderr_logfile=/home/user/project/logs/celery.log
autostart=true
autorestart=true
startsecs=10
stopwaitsecs = 600
stopasgroup=true
priority=1000
On running supervisor, I got the following error in the log file:
Unable to load celery application.
The module project was not found.
The project structure is:
project
|-project
|  |-settings
|  |  |-production.py
|  |-__init__.py
|  |-celery.py
|  |-urls.py
|  |-wsgi.py
|-app
The contents of __init__.py are:
from __future__ import absolute_import, unicode_literals
from .celery import app as celery_app
__all__ = ('celery_app',)
The content of celery.py is:
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'project.settings.production')
app = Celery('project')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()
It would be helpful if anyone could tell me why it's not working.
It seems that your directory is wrong (in the supervisor conf); it should be:
directory=/home/user/project
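Putting that together, the corrected program block would look something like this (a sketch based on the paths in the question, so adjust them to your layout):

[program:supervisor-celery]
command=/home/user/project/virtualenvironment/bin/celery worker -A project --loglevel=INFO
; directory must be the folder that contains manage.py / the outer "project" package
directory=/home/user/project
user=nobody
numprocs=1
stdout_logfile=/home/user/project/logs/celery.log
stderr_logfile=/home/user/project/logs/celery.log
autostart=true
autorestart=true
startsecs=10
stopwaitsecs=600
stopasgroup=true
priority=1000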

Running Management command with celery?

I have never used Celery before and am trying to configure it correctly. I am using Redis as the broker and hosting on Heroku. This is my first time trying to run asynchronous tasks and I'm struggling. I have a management command that I would like to run periodically.
celery.py
from __future__ import absolute_import, unicode_literals
import os
import celery
from celery import Celery
import django
from django.conf import settings

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'coffee.settings')

app = Celery('coffee')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)

@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))

app.conf.beat_schedule = {
    'add-every-30-seconds': {
        'task': 'inventory.tasks.boarshead',
        'schedule': 30.0,
        'args': ()
    },
}
settings.py
CACHES = {
"default": {
"BACKEND": "redis_cache.RedisCache",
"LOCATION": os.environ.get('REDIS_URL'),
}
}
tasks.py
from celery import shared_task
import celery
import time
from django.core import management

@celery.task
def boarshead():
    try:
        print("in celery module")
        """Boarshead expired sessions by using Django Management Command."""
        management.call_command("clearsessions", verbosity=0)
        CreateBoarsHeadList.py
        return "success"
    except Exception as e:
        print(e)
__init__.py:
from __future__ import absolute_import, unicode_literals
from .celery import app as celery_app
Procfile:
worker: celery worker --app=tasks.inventory.app
On Celery + Rabbit (and Redis, which I have not used as a backend for years) you will need a Procfile entry for the "web" (Django) process and one for the worker, which I did not see listed. Worker/dyno allocation is what gives you access to the management functionality. Here is the Procfile from one of my apps:
web: gunicorn SOME_APP.wsgi --log-file -
worker: celery worker -A QUEUE_APP_NAME -l info --without-gossip --without-mingle --without-heartbeat
QUEUE_APP_NAME is the name of a module (app) where I have all my Celery work and code. The worker is called via the Procfile against the QUEUE_APP_NAME module (dir), with code similar to your Celery file. This may not solve your problem, but getting Celery working is a slow battle.
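For the project in the question, the usual starting point would be a Procfile along these lines (a sketch; it assumes the Celery app defined in coffee/celery.py and a coffee.wsgi module, and adds a beat process since a beat_schedule is configured):

web: gunicorn coffee.wsgi --log-file -
worker: celery -A coffee worker -l info
beat: celery -A coffee beat -l info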

Celery, Django, Heroku -- ImportError: No module named tasks

I am trying to run celery with IronMQ and cache in a Django project on Heroku but I am receiving the following:
2013-04-14T22:29:17.479887+00:00 app[celeryd.1]: ImportError: No module named tasks
What am I doing wrong? The following is my relevant code and djcelery and my app are both in installed apps:
REQUIREMENTS (Rabbit AMQP is in there because I tried that before IronMQ):
Django==1.5.1
amqp==1.0.11
anyjson==0.3.3
billiard==2.7.3.27
boto==2.8.0
celery==3.0.18
dj-database-url==0.2.1
django-celery==3.0.17
django-storages==1.1.8
gunicorn==0.17.2
iron-cache==0.2.0
iron-celery==0.3.1
iron-core==1.0.2
iron-mq==0.4
iso8601==0.1.4
kombu==2.5.10
psycopg2==2.4.6
python-dateutil==2.1
pytz==2013b
requests==1.2.0
six==1.3.0
wsgiref==0.1.2
PROCFILE:
web: gunicorn myapp.wsgi
celeryd: celery -A tasks worker --loglevel=info -E
SETTINGS:
BROKER_URL = 'ironmq://'
CELERY_RESULT_BACKEND = 'ironcache://'
import djcelery
import iron_celery
djcelery.setup_loader()
TASKS:
from celery import task
@task()
def batchAdd(result_length, result_amount):
VIEWS:
from app import tasks
r = batchAdd.delay(result_length, result_amount)
return HttpResponse(r.task_id)
ALSO TRIED (in VIEWS):
from tasks import batchAdd
r = batchAdd.delay(result_length, result_amount)
return HttpResponse(r.task_id)
AND TRIED THIS AS WELL (in VIEWS):
from app.tasks import batchAdd
r = batchAdd.delay(result_length, result_amount)
return HttpResponse(r.task_id)
Also here is my structure:
projectname
--app
----__init__.py
----__init__.pyc
----admin.py
----admin.pyc
----forms.py
----forms.pyc
----models.py
----models.pyc
----tasks.py
----tests.py
----views.py
----views.pyc
--manage.py
--Procfile
--projectname
----__init__.py
----__init__.pyc
----settings.py
----settings.pyc
----static
----templates
----urls.py
----urls.pyc
----wsgi.py
----wsgi.pyc
--requirements.txt
Have you tried to load celery via manage.py?
python manage.py celery worker --loglevel=info
You can't just run your celery using:
celery -A tasks worker --loglevel=info -E
Celery requires a celeryconfig file with the -A option. You should run your celery as described in the djcelery docs:
python manage.py celery worker --loglevel=info
Also, you should fix your views.py as:
from app.tasks import batchAdd
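If you keep a separate dyno for the worker, the matching Procfile entry under this approach would look roughly like the following (a sketch; the dyno name celeryd and the -E flag are carried over from the question):

web: gunicorn myapp.wsgi
celeryd: python manage.py celery worker --loglevel=info -E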