Using celery beat as a scheduler for irregular intervals? - django

I have a single django application that allows the user to create multiple distinct blogs. Each blog needs to collect model data (e.g. number of visits, clicks, etc.) hourly/daily/weekly etc. and the interval at which data is collected may be different between blogs. Additionally, at some point in time, the users may want to change the frequency of data collection e.g. from weekly to daily on the user interface.
Looking into Periodic Tasks from the official documentation, it appears that I would have to 'hardcode' the interval values into the settings file and I can only specify the interval once e.g.
from celery.schedules import crontab

CELERYBEAT_SCHEDULE = {
    # Executes every Monday morning at 7:30 A.M.
    'add-every-monday-morning': {
        'task': 'tasks.add',
        'schedule': crontab(hour=7, minute=30, day_of_week=1),
        'args': (16, 16),
    },
}
How do I go about this or is it even possible for celery to schedule multiple tasks of the same kind at different intervals AND change the values through the user interface (via AJAX)?

As noted by #devxplorer, django-celery provides a database backend for the schedule. You can use it to manage tasks via the Django admin, programmatically, or by exposing the model through an API.
from djcelery.models import PeriodicTask

# Model instances have no .create(); create rows through the manager.
PeriodicTask.objects.create(
    name="My First Task",
    ...
)
all_tasks = PeriodicTask.objects.all()
...
Then start the beat process with:
$ celery -A proj beat -S djcelery.schedulers.DatabaseScheduler
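To connect this to the per-blog frequencies in the question, here is a minimal sketch of how a frequency chosen in the UI might be translated into the `every`/`period` values that djcelery's `IntervalSchedule` stores. The mapping, the helper names, and the task name `tasks.collect_stats` are hypothetical, and the ORM calls are shown only as comments because they need a configured Django project.

```python
# Hypothetical mapping from the frequency a user picks in the UI to the
# (every, period) pair stored on djcelery's IntervalSchedule model.
FREQUENCY_TO_INTERVAL = {
    "hourly": (1, "hours"),
    "daily": (1, "days"),
    "weekly": (7, "days"),
}


def interval_fields(frequency):
    """Return keyword arguments for IntervalSchedule for a UI frequency."""
    try:
        every, period = FREQUENCY_TO_INTERVAL[frequency]
    except KeyError:
        raise ValueError("unsupported frequency: %r" % (frequency,))
    return {"every": every, "period": period}


def task_name_for_blog(blog_id):
    """Build a unique PeriodicTask name so each blog has its own entry."""
    return "collect-stats-blog-%d" % blog_id


# Inside a Django view (not executed here), the result could be applied as:
#   schedule, _ = IntervalSchedule.objects.get_or_create(**interval_fields("daily"))
#   PeriodicTask.objects.update_or_create(
#       name=task_name_for_blog(blog.pk),
#       defaults={"task": "tasks.collect_stats", "interval": schedule,
#                 "args": json.dumps([blog.pk])},
#   )
```

Because each blog gets its own named `PeriodicTask` row, changing a blog from weekly to daily would only mean pointing that row at a different interval, which an AJAX view could do.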

Related

access django-apscheduler's jobs properties programmatically

I use the django-apscheduler package to run cron (scraping) jobs. The package stores past jobs with some information/properties (e.g. local runtime, duration, etc.) somewhere in the database for display in the admin backend.
When I want to access this information about the jobs programmatically in views.py (e.g. to show the last runtime of a job in the context/template), how would I do that?
In views.py, use
from django_apscheduler.models import DjangoJobExecution
for accessing the data of executed jobs, or
from django_apscheduler.models import DjangoJob
for accessing the scheduled jobs.

Run celery task for specific time period

I'm developing a web app using Django, and I'm using Celery to run tasks in the background. Everything is working fine, but I have one issue: I want to run a Celery task during a specific time period, like from 2pm to 3pm.
I suppose you're using Celery beat to run periodic tasks. Your requirement should be possible using a crontab schedule. Specifically, look at this example from the documentation:
crontab(minute=0, hour='*/3,8-17')
Execute every hour divisible by 3, and every hour during office hours (8am-5pm).
EDIT: If you want to run the task only once but want to specify the time when it's going to be started, specify ETA when calling the task. Example from the documentation:
>>> from datetime import datetime, timedelta
>>> tomorrow = datetime.utcnow() + timedelta(days=1)
>>> add.apply_async((2, 2), eta=tomorrow)
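If the one-off run should start at a fixed clock time rather than "one day from now", the ETA can be computed first. The following is a minimal sketch using only the standard library; `next_occurrence` is a hypothetical helper, and the reference time is assumed to be in the same timezone Celery is configured to use.

```python
from datetime import datetime, timedelta


def next_occurrence(now, hour, minute=0):
    """Return the first datetime at hour:minute that is strictly after `now`."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # That time has already passed today, so use tomorrow instead.
        candidate += timedelta(days=1)
    return candidate


# Example: the next 14:00 after a fixed reference time.
reference = datetime(2024, 1, 1, 15, 30)
eta = next_occurrence(reference, hour=14)
print(eta)  # 2024-01-02 14:00:00

# With Celery, the value could then be passed as:
#   add.apply_async((2, 2), eta=eta)
```

For the 2pm to 3pm window itself, a periodic schedule such as crontab(minute='*/10', hour=14) would fire every ten minutes from 14:00 to 14:50.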

Google App Engine, tasks in Task Queue are not executed automatically

My tasks are added to the Task Queue, but nothing is executed automatically. I need to click the "Run now" button to run the tasks, and then they execute without problems. Have I missed some configuration?
I use the default queue configuration on standard App Engine with Python 2.7.
from google.appengine.api import taskqueue

taskqueue.add(
    url='/inserturl',
    params={'name': 'tablename'})
The same idea applies to the API you are now mentioning: you need to specify when you want the task to be executed. In this case you have different options, such as countdown or eta; see the documentation for the method you are using to add a task to the queue (taskqueue.add).
ORIGINAL ANSWER
If you follow this tutorial to create queues and tasks, you will see it is based on the following GitHub repo. The file where the tasks are created (create_app_engine_queue_task.py) is where you should specify the time at which the task must be executed. In this tutorial, to finally create the task, they use the following command:
python create_app_engine_queue_task.py --project=$PROJECT_ID --location=$LOCATION_ID --queue=$QUEUE_ID --payload=hello
However, this command does not say when the task should be executed. It should look like this:
python create_app_engine_queue_task.py --project=$PROJECT_ID --location=$LOCATION_ID --queue=$QUEUE_ID --payload=hello --in_seconds=["countdown" for when the task will be executed, in seconds]
Basically, the key is in this part of the code in create_app_engine_queue_task.py:
if in_seconds is not None:
    # Convert "seconds from now" into an rfc3339 datetime string.
    d = datetime.datetime.utcnow() + datetime.timedelta(seconds=in_seconds)

    # Create Timestamp protobuf.
    timestamp = timestamp_pb2.Timestamp()
    timestamp.FromDatetime(d)

    # Add the timestamp to the tasks.
    task['schedule_time'] = timestamp
If you create the task now and go to your console, you will see your task execute and disappear from the queue after the number of seconds you specified.
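The "seconds from now" conversion mentioned in the code comment can be illustrated on its own. This is a minimal Python 3 sketch using only the standard library, not the tutorial's code; `schedule_time_rfc3339` and its `now` parameter are hypothetical names added for illustration.

```python
from datetime import datetime, timedelta, timezone


def schedule_time_rfc3339(in_seconds, now=None):
    """Return an RFC 3339 UTC timestamp `in_seconds` seconds after `now`."""
    if now is None:
        now = datetime.now(timezone.utc)
    scheduled = now + timedelta(seconds=in_seconds)
    # isoformat() on a timezone-aware UTC datetime includes the +00:00 offset.
    return scheduled.isoformat()


reference = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(schedule_time_rfc3339(90, now=reference))  # 2024-01-01T12:01:30+00:00
```

Passing `now` explicitly keeps the helper deterministic for testing; in real use it would default to the current UTC time.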

How to auto delete the expires data in the database?

Suppose I store a row in a database table (a model instance), and the table has a field named expire_time. Once the current time passes expire_time, I want the row to be deleted.
To do that, I could query the table every time, traverse every row, and delete the ones that have expired.
But if I don't query I can not realize the requirement.
So, is there a method to do that?
I use Python with Django, and the database is MariaDB.
You can write a custom management command to do this for you. Save this in myapp/management/commands/delete_expired.py for example:
from django.core.management.base import BaseCommand
from django.utils import timezone

from myapp.models import MyModel


class Command(BaseCommand):
    help = 'Deletes expired rows'

    def handle(self, *args, **options):
        now = timezone.now()
        MyModel.objects.filter(expire_time__lt=now).delete()
Then either call that command from a cron task or a queue. To do it on the command line you can call:
python manage.py delete_expired
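To see what the filter-and-delete in the command amounts to, here is a self-contained sketch that uses sqlite3 from the standard library as a stand-in for MariaDB and the Django ORM. The table name `items` and the sample rows are made up for illustration; only the `expire_time` column comes from the question.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, expire_time TEXT)")

now = datetime(2024, 1, 1, 12, 0, 0)
rows = [
    (1, (now - timedelta(hours=1)).isoformat()),  # already expired
    (2, (now + timedelta(hours=1)).isoformat()),  # still valid
]
conn.executemany("INSERT INTO items VALUES (?, ?)", rows)

# Similar in spirit to MyModel.objects.filter(expire_time__lt=now).delete():
# one DELETE statement removes every row whose expire_time is in the past.
deleted = conn.execute(
    "DELETE FROM items WHERE expire_time < ?", (now.isoformat(),)
).rowcount
remaining = [row[0] for row in conn.execute("SELECT id FROM items")]
print(deleted, remaining)  # 1 [2]
```

The point is that no row-by-row traversal is needed: a single filtered delete handles all expired rows, so the command only has to run often enough for your tolerance of stale data.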
I am not sure what you mean by:
I can not realize the requirement.
But I think you might want to consider one of these:
- Write a custom manage.py command and run it from cron using your virtualenv's Python.
- Add django-cron to routinely check for expired data and delete it.
- Try Celery as an alternative to cron, though it could be too complicated for your case.
- Add an event to MariaDB and schedule it to run periodically.
The drawback of the custom manage.py command and the MariaDB event is that if you migrate to a new server, you must remember to add the cron job or event again so the database keeps being cleaned periodically.
I don't know a database-level approach to do that (maybe you want to add the mariadb tag if you are looking for a database-specific solution).
At the application level, one approach comes to mind. You may use Celery and, whenever you store a row, schedule a task to delete it at its expire_time. The Celery task should check that expire_time has actually passed before deleting (can that field be modified or updated after the row is created?).
You can also (in addition or as an alternative) have a Celery beat job that periodically fetches the row with the smallest expire_time. If that row should be removed, the job removes it and calls itself again. Otherwise, it waits for the next beat.

Periodic task schedules set through view functions (Django app)

I schedule repeating tasks for my Django app by including them in the CELERYBEAT_SCHEDULE dictionary in my settings.py. For instance:
from datetime import timedelta

CELERYBEAT_SCHEDULE = {
    'tasks.rank_photos': {
        'task': 'tasks.rank_photos',
        'schedule': timedelta(seconds=5*60),
    },
    'tasks.trim_whose_online': {
        'task': 'tasks.trim_whose_online',
        'schedule': timedelta(seconds=10*60),
    },
}
These tasks periodically run (for the life of the app).
I was wondering whether there's a way for a regular user of my app to kick off a periodic task? I.e. is there a way to control this kind of scheduling from views.py? If not, why not? And if yes, an illustrative example would be great. Thanks in advance.
You could use the django-celery-beat package. It defines several models (e.g. PeriodicTask), and you can schedule tasks from your views simply by using those models to create or edit periodic tasks. It is mentioned in the official docs:
There’s also the django-celery-beat extension that stores the schedule in the Django database, and presents a convenient admin interface to manage periodic tasks at runtime.
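Since the question asks for an illustrative example, here is a minimal sketch of a helper a view could call. It assumes django-celery-beat is installed and that beat runs with its database scheduler; `schedule_periodic_task` and its parameters are hypothetical names, while `IntervalSchedule` and `PeriodicTask` are the package's models.

```python
import json


def schedule_periodic_task(name, task, every_seconds, args=()):
    """Create or update a periodic task that runs `task` every `every_seconds`.

    `name` must be unique per schedule entry; `task` is the dotted task name,
    e.g. 'tasks.rank_photos' from the settings shown in the question.
    """
    if every_seconds <= 0:
        raise ValueError("every_seconds must be positive")

    # Imported lazily so this module can be loaded before Django is set up.
    from django_celery_beat.models import IntervalSchedule, PeriodicTask

    schedule, _ = IntervalSchedule.objects.get_or_create(
        every=every_seconds, period=IntervalSchedule.SECONDS
    )
    periodic_task, _ = PeriodicTask.objects.update_or_create(
        name=name,
        defaults={
            "task": task,
            "interval": schedule,
            "args": json.dumps(list(args)),
        },
    )
    return periodic_task
```

A view could then call something like schedule_periodic_task('rank-photos-user-42', 'tasks.rank_photos', 300) when the user submits a form. Using update_or_create keyed on the name means a second submission changes the existing schedule instead of adding a duplicate.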