Run a periodic Celery task with a dynamic schedule in a Django application

I am wondering if it is possible to have my end users dynamically adjust the schedule of a periodic task.
So something along these lines:
# celery.py
def get_schedule():
    config = get_user_config()  # returns a model object of sorts
    return config.frequency_in_seconds

app.conf.beat_schedule = {
    'my_periodic_task': {
        'task': 'my_periodic_task',
        'schedule': get_schedule,  # schedule updated based on `get_schedule` function
    },
}
This way, if a user were to change the frequency_in_seconds field in their user config setting, it would dynamically update the beat schedule.
My preference would be to do this outside of the Django Admin site and without any additional packages (e.g. django-celery-beat).
Any thoughts or ideas would be much appreciated.
Thanks

If you're using django, you can use django-celery-beat to allow end-users to control the schedule using the django admin panel.
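For illustration, here is a minimal sketch of what that can look like with django-celery-beat's models; the task path and the 10-second interval are placeholders, not values from the question:
from django_celery_beat.models import IntervalSchedule, PeriodicTask

# Reuse or create an interval row; changing `every` later adjusts the schedule.
interval, _ = IntervalSchedule.objects.get_or_create(
    every=10,
    period=IntervalSchedule.SECONDS,
)
PeriodicTask.objects.create(
    interval=interval,
    name='my_periodic_task',
    task='yourapp.tasks.my_periodic_task',
)
Because the schedule lives in the database (beat must run with django_celery_beat.schedulers:DatabaseScheduler), any view or form that updates the IntervalSchedule row changes the schedule without going through the admin site.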

If you're using Redis, you can use this library: https://github.com/parad0x96/django-redbeat
Create the dynamic periodic task:
from django_redbeat import PeriodicTasksEntry

task = PeriodicTasksEntry.objects.create(
    name="The verbose name of the task",
    task="yourapp.tasks.task_name",
    args=[arg1, arg2],
    schedule=10,  # the schedule in seconds
)
This creates a dynamic periodic task whose schedule and creation you control.
Run Celery beat like this:
celery -A your_app_name beat -l INFO -S redbeat.RedBeatScheduler --max-interval 10

Related

Best way to update my Django model from an external API source?

I am getting my data by requesting an API source, then putting it into my Django model. However, the data updates daily, so how can I update it without rendering the view every time?
def index(request):
    session = requests.Session()
    response = session.get('https://api.coincap.io/v2/assets').json()
    final_result = response['data']  # list of dicts, one per coin
    for coin in final_result:
        obj, created = Coincap.objects.update_or_create(
            symbol=coin['symbol'],
            name=coin['name'],
            defaults={'price': coin['priceUsd']},
        )
    return render(request, 'home.html')
Right now, I have to go to /home.html if I want my data to update. However, my goal is to later serialize it and expose it as REST API data, so I wouldn't touch the Django template anymore. Is there any way for it to update internally once a day after I run manage.py runserver?
For those that are looking for an example:
from django.core.management.base import BaseCommand
import requests

from yourapp.models import Coincap  # adjust the import to your app


class Command(BaseCommand):
    def handle(self, *args, **kwargs):
        # Your API request here, e.g.:
        final_result = requests.get('https://api.coincap.io/v2/assets').json()['data']
        for coin in final_result:
            obj, created = Coincap.objects.update_or_create(
                symbol=coin['symbol'],
                name=coin['name'],
                defaults={'price': coin['priceUsd']},
            )
Then you run it with cron just as Nikita suggested.
One simple and common solution is to create a custom Django admin command and use Cron to run it at specified intervals. You can write a command's code to your liking and it can have access to all of the models, settings and other parts of your Django project.
You would put your code making a request and writing data to the DB, using your Django models, in your new Command class's handle() method (obviously request parameter is no longer needed). And then, if for example you have named your command update_some_data, you can run it as python manage.py update_some_data.
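For reference, the command module has to live in a management/commands/ package inside one of your installed apps; the layout below uses placeholder names:
yourapp/
    management/
        __init__.py
        commands/
            __init__.py
            update_some_data.py   # contains the Command class shown above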
Assuming cron exists and is running on the machine, you could then set up cron to run this command for you at specified intervals. For example, create a file /etc/cron.d/your_app_name and put:
0 4 * * * www-data /usr/local/bin/python /path/to/your/manage.py update_some_data >> /var/log/update_some_data.log 2>&1
This would make your update run every day at 04:00. If your command produces any output, it will be written to the /var/log/update_some_data.log file.
Of course this is just an example, so your server user running your app (www-data here) and path to the Python executable on the server (/usr/local/bin/python here) should be adjusted for particular use.
See links for further guidance.

Does celery task id change after redistribution

I have a Django model which has a column called celery_task_id. I am using RabbitMQ as the broker. There's a celery function called test_celery which takes a model object as a parameter. Now I have the following lines of code which create a celery task.
def create_celery_task():
    celery_task_id = test_celery.apply_async((model_obj,), eta='Future Datetime Object')
    model_obj.celery_task_id = celery_task_id
    model_obj.save()
----
----
Now inside the celery function I am verifying whether the task id is the same as the one stored in the DB.
@app.task
def test_celery(model_obj):
    if model_obj.celery_task_id == test_celery.request.id:
        ## Do something
My problem is that there are a lot of cases where I can see the task being received and succeeding in the log, but the code inside the if condition is not executed.
Is it possible that the celery task id changes after redistribution, or are there other reasons?
One of the recommendations is not to pass database/ORM objects into Celery tasks because they may contain stale data. Try to rewrite the task as:
@app.task
def test_celery(model_obj_id):
    # Fetch a fresh copy; .first() returns None if the row no longer exists.
    model_obj = YourModel.objects.filter(pk=model_obj_id).first()
    if model_obj:
        if model_obj.celery_task_id == test_celery.request.id:
            ## Do something
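On the calling side, a hedged sketch under the same approach: pass only the primary key and store the id string from the returned AsyncResult (some_future_datetime stands in for the question's "Future Datetime Object"):
def create_celery_task():
    result = test_celery.apply_async((model_obj.pk,), eta=some_future_datetime)
    model_obj.celery_task_id = result.id  # the plain task-id string, not the AsyncResult
    model_obj.save()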

Where to define Celery subtask queues

I have a pluggable app I'm developing for a Django system. In it, I have a task for creating notifications that looks something like so:
installable_app.tasks
@app.task(name='tasks.generate_notifications')
def generate_notifications(...):
    clients = get_list_of_clients()
    for client in clients:
        client_generate_notification.delay(client['name'], client['id'])
    return "Notification Generation Complete"

@app.task
def client_generate_notification(client_name, client_id):
    ...
    return result
Now I want this to run periodically, which can be accomplished with Celery beat using settings. I also want it to run on its own queue:
settings.py:
CELERYBEAT_SCHEDULE = {
    'generate_schedule_notifications': {
        'task': 'tasks.generate_notifications',
        'schedule': crontab(hour=6, minute=0),
        'options': {'queue': 'notification_gen'},
        'args': ('schedule', 'Equipment', 'HVAC'),
    },
}
The first task, generate_notifications, runs correctly on the notification_gen queue, but the client_generate_notification subtasks run on the default queue.
I know I can specify the queues explicitly in the @task decorator, but since this is a Django app, I would rather they be specified where the task is actually run.
I've looked into using the CELERY_ROUTES option but when I tried it, it seemed to overwrite the queues for other tasks I was running.
Is the best practice to define all the possible queues in CELERY_ROUTES or is there a better way to set up my task so that they will both run on the same queue?
Do you want something like this?
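For example, a minimal sketch (not from the original answer; the queue name comes from the question's settings): the parent task can pass the queue explicitly when it schedules the subtasks, so both run on notification_gen:
@app.task(name='tasks.generate_notifications')
def generate_notifications(*args):
    clients = get_list_of_clients()
    for client in clients:
        # Route each subtask to the same queue the parent runs on.
        client_generate_notification.apply_async(
            args=(client['name'], client['id']),
            queue='notification_gen',
        )
    return "Notification Generation Complete"
The queue name is hard-coded here only for illustration; it could just as well be read from a setting supplied by the installing project, which keeps the pluggable app free of routing decisions.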

How to remove celery task ?

I am practicing with django-celery
settings.py
#import datetime
#CELERYBEAT_SCHEDULE = {
#    'hello_task': {
#        'task': 'hello_task',
#        'schedule': datetime.timedelta(seconds=20),
#    },
#}

import datetime

CELERYBEAT_SCHEDULE = {
    'add-every-30-seconds': {
        'task': 'app1.tasks.myfunc',
        'schedule': datetime.timedelta(seconds=30),
    },
}
I tried the hello_task schedule first, then commented it out and tried add-every-30-seconds.
But hello_task still executes when its time comes.
So I checked the database and found that the record was saved in it.
Why isn't it deleted when I comment it out?
Is there a command or normal way to delete it?
Or is it fine if I just delete it from the database?
If you're using the django-celery database scheduler, the periodic tasks in the CELERYBEAT_SCHEDULE dict will be added to Django's database, as you found out. django-celery's scheduler then reads its settings primarily from the database. Removing entries from the dict just means that django-celery has nothing to add to the database.
To delete the task properly, remove it from the Django admin page (Djcelery > Periodic Tasks).
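If you prefer not to go through the admin, a minimal sketch of doing the same thing programmatically (assuming the old django-celery package, which exposes the schedule through djcelery.models):
from djcelery.models import PeriodicTask

# Remove the stale beat entry left over from the commented-out schedule.
PeriodicTask.objects.filter(name='hello_task').delete()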
When you change the code that Celery works with, you should restart Celery so it picks up the changes, just as you need to restart the Django server whenever you change your Python code.

Django-celery task and django transaction

I have a question regarding transactions and celery tasks. It's no mystery to me that if a transaction and a celery task access the same table/records, we'll have a race condition.
However, consider the following piece of code:
def f(self):
    # method of a class that inherits from models.Model
    self.field_a = datetime.now()
    self.save()
    transaction.commit_unless_managed()
    # depending on the configuration of this module
    # this might return None or a datetime object.
    eta = self.get_task_eta()
    if eta:
        celery_task_do_something.apply_async(args=(self.pk, self.__class__),
                                             eta=eta)
    else:
        celery_task_do_something.delay(self.pk, self.__class__)
Here's the celery task:
def celery_task_do_something(pk, cls):
o = cls.objects.get(pk=pk)
if o.field_a:
# perform something
return True
return False
As you can see, before creating the task we call transaction.commit_unless_managed, and it should commit, since the Django transaction is not currently managed.
However, when the celery task runs, the field field_a is not set.
My question:
Since we do commit before creating the task, is it still possible that there's a race condition?
Additional info
We're using Postgres version 9.1
Every transaction is run with READ COMMITTED isolation level
On a different db with engine dowant.lib.db.backends.postgresql_psycopg2_debugger field_a is already set and the task works as expected. With engine dowant.lib.db.backends.postgresql_psycopg2_hstore_ready the described issue appears (not sure if it's related with the engine).
Celery version is 2.2
I tried different databases. Still the same behavior, except when the engines change. So that's why I mentioned this.
Thanks a lot.
Try adding self.__class__.objects.select_for_update().get(pk=self.pk) before the save and see what happens.
It should block other transactions from updating this row until the commit is done.
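A minimal sketch of that suggestion applied to the question's f(), assuming it runs inside a database transaction (select_for_update only takes effect inside one):
def f(self):
    # Lock the row so no competing transaction can modify it until we commit.
    locked = self.__class__.objects.select_for_update().get(pk=self.pk)
    locked.field_a = datetime.now()
    locked.save()
    transaction.commit_unless_managed()
    # ... schedule the celery task as before ...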
This is late, but since Django 1.9 you can use:
transaction.on_commit(lambda: enqueue_atask())
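A minimal sketch of how that applies to the code in the question (assumes Django >= 1.9): the task is only enqueued once the surrounding transaction commits, so the worker is guaranteed to see field_a:
from django.db import transaction

def f(self):
    self.field_a = datetime.now()
    self.save()
    eta = self.get_task_eta()
    if eta:
        transaction.on_commit(
            lambda: celery_task_do_something.apply_async(
                args=(self.pk, self.__class__), eta=eta))
    else:
        transaction.on_commit(
            lambda: celery_task_do_something.delay(self.pk, self.__class__))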