I schedule repeating tasks for my Django app by including them in the CELERYBEAT_SCHEDULE dictionary in my settings.py. For instance:
from datetime import timedelta

CELERYBEAT_SCHEDULE = {
    'tasks.rank_photos': {
        'task': 'tasks.rank_photos',
        'schedule': timedelta(seconds=5*60),
    },
    'tasks.trim_whose_online': {
        'task': 'tasks.trim_whose_online',
        'schedule': timedelta(seconds=10*60),
    },
}
These tasks run periodically for the life of the app.
Is there a way for a regular user of my app to kick off a periodic task? That is, can this kind of scheduling be controlled from views.py? If not, why not? And if so, an illustrative example would be great. Thanks in advance.
You could use the django-celery-beat package. It defines several models (e.g. PeriodicTask), and it allows you to schedule tasks from your views by simply using those models to create or edit periodic tasks. It is mentioned in the official docs.
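For instance, here's a minimal sketch of a view that schedules one of the question's tasks at runtime (the view name and the per-user task name are made up for illustration; the models and constants are django-celery-beat's):

# views.py
from django.http import HttpResponse
from django_celery_beat.models import IntervalSchedule, PeriodicTask

def start_photo_ranking(request):
    # Create (or reuse) a 5-minute interval.
    schedule, _ = IntervalSchedule.objects.get_or_create(
        every=5,
        period=IntervalSchedule.MINUTES,
    )
    # Register the periodic task; beat picks up the change at runtime.
    PeriodicTask.objects.get_or_create(
        name='rank-photos-%s' % request.user.username,
        defaults={'task': 'tasks.rank_photos', 'interval': schedule},
    )
    return HttpResponse('scheduled')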
There’s also the django-celery-beat extension that stores the schedule in the Django database, and presents a convenient admin interface to manage periodic tasks at runtime.
I use the django-apscheduler package to run cron (scraping) jobs. The package stores past jobs along with some information/properties (e.g. local run time, duration, etc.) in the database for display in the admin backend.
When I want to access this information about the jobs programmatically in views.py (e.g. to show the last run time of a job in the context/template), how would I do that?
In views.py:

from django_apscheduler.models import DjangoJobExecution

for accessing the data of executed jobs, or

from django_apscheduler.models import DjangoJob

for accessing the scheduled jobs.
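For example, a sketch of pulling the most recent execution of a job into a template context (the job id 'scrape_job' and the template name are hypothetical; run_time and duration are fields on DjangoJobExecution):

# views.py
from django.shortcuts import render
from django_apscheduler.models import DjangoJobExecution

def job_status(request):
    # Latest execution of the (hypothetical) job id "scrape_job".
    last_exec = (
        DjangoJobExecution.objects
        .filter(job_id='scrape_job')
        .order_by('-run_time')
        .first()
    )
    context = {
        'last_run': last_exec.run_time if last_exec else None,
        'duration': last_exec.duration if last_exec else None,
    }
    return render(request, 'job_status.html', context)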
I'm using django-celery-beat in a Django app (this stores the schedule in the database instead of a local file). I've configured my schedule in a celery_beat dictionary that Celery is initialized with via app.config_from_object(...).
I recently renamed/removed a few tasks and restarted the app. The new tasks showed up, but the tasks removed from the celery_beat dictionary didn't get removed from the database.
Is this the expected workflow -- requiring manual removal of tasks from the database? Is there a workaround that automatically reconciles the schedule at Django startup?
I tried deleting the obsolete PeriodicTask rows in celery/__init__.py:
def _clean_schedule():
    from django.conf import settings
    from django.db import transaction
    from django_celery_beat.models import PeriodicTask
    from django_celery_beat.models import PeriodicTasks

    with transaction.atomic():
        PeriodicTask.objects \
            .exclude(task__startswith='celery.') \
            .exclude(name__in=settings.CELERY_CONFIG.celery_beat.keys()) \
            .delete()
        PeriodicTasks.update_changed()

_clean_schedule()
but that is not allowed because Django isn't properly started up yet:
django.core.exceptions.AppRegistryNotReady: Apps aren't loaded yet.
You also can't use Django's AppConfig.ready() because making queries / db connections in ready() is not supported.
Looking at how django-celery-beat actually installs the schedules, I thought maybe I could hook into that process.
It doesn't happen when Django starts -- it happens when beat starts. It calls setup_schedule() against the class passed on the beat command line.
Therefore, we can just override the scheduler with
--scheduler=myproject.lib.scheduler:DatabaseSchedulerWithCleanup
to do cleanup:
import logging

from django.db import transaction
from django_celery_beat.models import PeriodicTask
from django_celery_beat.models import PeriodicTasks
from django_celery_beat.schedulers import DatabaseScheduler


class DatabaseSchedulerWithCleanup(DatabaseScheduler):

    def setup_schedule(self):
        schedule = self.app.conf.beat_schedule
        with transaction.atomic():
            num, info = PeriodicTask.objects \
                .exclude(task__startswith='celery.') \
                .exclude(name__in=schedule.keys()) \
                .delete()
            logging.info("Removed %d obsolete periodic tasks.", num)
            if num > 0:
                PeriodicTasks.update_changed()
        super(DatabaseSchedulerWithCleanup, self).setup_schedule()
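For reference, the full beat invocation would then look something like this (assuming the class lives at myproject/lib/scheduler.py and your Celery app is named myproject):

$ celery -A myproject beat --scheduler=myproject.lib.scheduler:DatabaseSchedulerWithCleanup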
Note, you only want this if you are exclusively managing tasks with beat_schedule. If you add tasks via the Django admin or programmatically, they will also be deleted.
I am using Redis as a broker between Django and Celery. The Redis instance I have access to is shared with many other applications, so the broker is not reliable (the Redis keys it uses get deleted by others, and messages often get delivered to workers belonging to other applications). Switching to a different Redis database does not solve the problem (there are only a few databases and many applications).
How can I configure Celery to prefix all the keys it uses with a custom string? The docs mention ways to add prefixes to queue names, but that does not affect the Redis keys. As far as I can tell, the underlying library (Kombu) does not let the user prefix the keys it uses.
The functionality to add a prefix to all the Redis keys has since been added to Kombu. You can now configure it like this:
from celery import Celery

BROKER_URL = 'redis://localhost:6379/0'

celery = Celery('tasks', broker=BROKER_URL, backend=BROKER_URL)
celery.conf.broker_transport_options = {'global_keyprefix': 'prefix'}
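If I recall correctly, global_keyprefix landed in Kombu 5.1, so make sure your Kombu and Celery versions are recent enough.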
This is not supported by Celery yet. A pull request on this subject is currently stalled due to a lack of workforce:
https://github.com/celery/kombu/pull/912
You can just override the key prefix the result backend uses for your Celery task:

from celery import shared_task

@shared_task(bind=True)
def task(self, params):
    self.backend.task_keyprefix = b'new-prefix'
I have a single Django application that allows the user to create multiple distinct blogs. Each blog needs to collect model data (e.g. number of visits, clicks, etc.) hourly/daily/weekly, and the interval at which data is collected may differ between blogs. Additionally, at some point the users may want to change the frequency of data collection, e.g. from weekly to daily, through the user interface.
Looking into Periodic Tasks in the official documentation, it appears that I would have to hardcode the interval values into the settings file, and that I can only specify each interval once, e.g.:
from celery.schedules import crontab

CELERYBEAT_SCHEDULE = {
    # Executes every Monday morning at 7:30 A.M.
    'add-every-monday-morning': {
        'task': 'tasks.add',
        'schedule': crontab(hour=7, minute=30, day_of_week=1),
        'args': (16, 16),
    },
}
How do I go about this? Is it even possible for Celery to schedule multiple tasks of the same kind at different intervals AND to change the values through the user interface (via AJAX)?
As noted by @devxplorer, django-celery provides a database backend. You could use this to manage tasks via the Django admin, programmatically, or by exposing the model through an API.
from djcelery.models import PeriodicTask

PeriodicTask.objects.create(
    name="My First Task",
    ...
)

all_tasks = PeriodicTask.objects.all()
...
Then start the beat process with:
$ celery -A proj beat -S djcelery.schedulers.DatabaseScheduler
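To cover the "change the frequency via AJAX" part, a view along these lines could create or update a per-blog schedule (the task name tasks.collect_stats and the view wiring are hypothetical; the models are djcelery's):

# views.py
import json

from djcelery.models import IntervalSchedule, PeriodicTask

def set_collection_interval(blog_id, every, period='hours'):
    # Reuse a matching schedule row if one exists, otherwise create it.
    schedule, _ = IntervalSchedule.objects.get_or_create(
        every=every, period=period,
    )
    # One periodic task per blog; point it at the new interval.
    task, _ = PeriodicTask.objects.get_or_create(
        name='collect-stats-blog-%s' % blog_id,
        defaults={
            'task': 'tasks.collect_stats',  # hypothetical task
            'args': json.dumps([blog_id]),
            'interval': schedule,
        },
    )
    task.interval = schedule
    task.save()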
My User collection contains data such as
{"user1":"zera",
"my_status":"active",
"date_creation" : ISODate("2013-10-01T10:15:52.055Z")
}
{"user2":"dfgf",
"my_status":"noactive",
"date_creation": ISODate("2013-10-01T08:55:41.212Z")
}
I need to find each user with my_status: "active" and update their my_status 24 hours after that user's date_creation.
Can anyone suggest a method to do this using Django?
Well, I'd write an async task that keeps polling the database for users with "active" status and updates each one once 24 hours have passed since its date_creation.
For the asynchronous tasks you can use python-rq; to make things easier, there's a Django module for python-rq called django-rq. Celery is another popular and good option, and it also has a Django integration.
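As an illustration only, a Celery sketch of such a polling task (the Profile model and its fields mirror the question's documents but are assumptions; adjust to your schema):

# tasks.py
from datetime import timedelta

from celery import shared_task
from django.utils import timezone

from myapp.models import Profile  # hypothetical model with the fields below

@shared_task
def deactivate_stale_users():
    # Deactivate users created more than 24 hours ago.
    cutoff = timezone.now() - timedelta(hours=24)
    Profile.objects.filter(
        my_status='active',
        date_creation__lte=cutoff,
    ).update(my_status='noactive')

Schedule it to run every few minutes via CELERYBEAT_SCHEDULE or django-celery-beat, as discussed above.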