Where to define Celery subtask queues - django

I have a pluggable app I'm developing for a Django system. In it, I have a task for creating notifications that looks something like so:
installable_app.tasks
@app.task(name='tasks.generate_notifications')
def generate_notifications(...):
    clients = get_list_of_clients()
    for client in clients:
        client_generate_notification.delay(client['name'], client['id'])
    return "Notification Generation Complete"

@app.task
def client_generate_notification(client_name, client_id):
    ...
    return result
Now I want this to run periodically, which can be accomplished with Celery Beat via settings. I also want it to run on its own queue:
settings.py:
CELERYBEAT_SCHEDULE = {
    'generate_schedule_notifications': {
        'task': 'tasks.generate_notifications',
        'schedule': crontab(hour=6, minute=0),
        'options': {'queue': 'notification_gen'},
        'args': ('schedule', 'Equipment', 'HVAC'),
    },
}
The first task, generate_notifications is run correctly on the queue notification_gen but the client_generate_notification subtasks are run on the default queue.
I know I can specify the queues explicitly in the @task decorator, but since this is a Django app, I would rather they be specified where the task is actually run.
I've looked into using the CELERY_ROUTES option, but when I tried it, it seemed to overwrite the queues for other tasks I was running.
Is the best practice to define all the possible queues in CELERY_ROUTES, or is there a better way to set up my tasks so that they both run on the same queue?
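For illustration only (my sketch, not part of the original question), the two approaches mentioned above could look roughly like this, assuming the subtask is registered as installable_app.tasks.client_generate_notification and reusing the notification_gen queue from the settings above:

# Option 1: pass the queue where the subtask is dispatched
client_generate_notification.apply_async(
    args=(client['name'], client['id']),
    queue='notification_gen')

# Option 2: a routing entry in settings.py
CELERY_ROUTES = {
    'installable_app.tasks.client_generate_notification': {'queue': 'notification_gen'},
}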


Related

Run periodic celery task with a dynamic schedule in django application

I am wondering if it is possible to have my end users dynamically adjust the schedule of a periodic task.
So something along these lines:
# celery.py
def get_schedule():
    config = get_user_config()  # returns a model object of sorts
    return config.frequency_in_seconds

app.conf.beat_schedule = {
    'my_periodic_task': {
        'task': 'my_periodic_task',
        'schedule': get_schedule,  # schedule updated based on `get_schedule` function
    },
}
This way, if a user were to change the frequency_in_seconds field in their user config setting, it would dynamically update the beat schedule.
My preference would be to do this outside of the Django Admin site and without any additional packages (e.g. django-celery-beat).
Any thoughts or ideas would be much appreciated.
Thanks
If you're using Django, you can use django-celery-beat to allow end users to control the schedule through the Django admin panel.
If you're using Redis as your result backend, you can use this library: https://github.com/parad0x96/django-redbeat
Create the dynamic periodic task:
from django_redbeat import PeriodicTasksEntry

task = PeriodicTasksEntry.objects.create(
    name="The verbose name of the task",
    task="yourapp.tasks.task_name",
    args=[arg1, arg2],
    schedule=10  # the schedule in seconds
)
This will create a dynamic periodic task, giving you control over both its schedule and its creation.
Run Celery beat like this:
celery -A your_app_name beat -l INFO -S redbeat.RedBeatScheduler --max-interval 10
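As a rough sketch of the django-celery-beat route mentioned at the start of this answer (my illustration, not code from the original answer), the schedule could also be created programmatically from the user's config instead of through the admin; config here is assumed to be the user config object from the question, and beat must run with the database scheduler:

from django_celery_beat.models import IntervalSchedule, PeriodicTask

# build (or reuse) an interval from the user's configured frequency
schedule, _ = IntervalSchedule.objects.get_or_create(
    every=config.frequency_in_seconds,
    period=IntervalSchedule.SECONDS,
)
# create or update the periodic task bound to that interval
PeriodicTask.objects.update_or_create(
    name='my_periodic_task',
    defaults={'interval': schedule, 'task': 'my_periodic_task'},
)

Beat then picks up schedule changes when started with the database scheduler:
celery -A your_app_name beat -l INFO --scheduler django_celery_beat.schedulers:DatabaseScheduler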

Mock async_task of Django-q

I'm using django-q and I'm currently working on adding tests using mock for my existing tasks. I could easily create tests for each task without depending on django-q, but one of my tasks calls another task via async_task. Here's an example:
import requests
from django_q.tasks import async_task

def task_a():
    response = requests.get(url)
    # process response here
    if condition:
        async_task('task_b')

def task_b():
    response = requests.get(another_url)
And here's how I test them:
import requests
from unittest import mock

from .tasks import task_a
from .mock_responses import task_a_response

@mock.patch.object(requests, "get")
@mock.patch("django_q.tasks.async_task")
def test_async_task(self, mock_async_task, mock_task_a):
    mock_task_a.return_value.status_code = 200
    mock_task_a.return_value.json.return_value = task_a_response
    mock_async_task.return_value = "12345"
    # execute the task
    task_a()
    self.assertTrue(mock_task_a.called)
    self.assertTrue(mock_async_task.called)
I know for a fact that async_task returns the task ID, hence the line mock_async_task.return_value = "12345". However, after running the test, mock_async_task.called is False and the task is actually added to the queue (I can see a bunch of 01:42:59 [Q] INFO Enqueued 1 lines from the server), which is what I'm trying to avoid. Is there any way to accomplish this?
In order to prevent the task from being added to the queue, you need to set the sync configuration option to True when the tests are running. You can find more info about the configurations here
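For illustration (my sketch, not from the original answer), django-q reads its options from the Q_CLUSTER dict in settings.py, so a test-settings override along these lines should make async_task run synchronously and keep tasks off the queue ('myproject' is a placeholder name):

# test settings
Q_CLUSTER = {
    'name': 'myproject',  # placeholder cluster name
    'sync': True,         # run async_task() calls immediately instead of enqueueing
}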

How to schedule a celery task without blocking Django

I have a Django service that registers a lot of clients and renders a payload containing a timer (let's say 800s), after which the client should be suspended by the service (change status REGISTERED to SUSPENDED in MongoDB).
I'm running celery with rabbitmq as broker as follows:
celery/tasks.py
@app.task(bind=True, name='suspend_nf')
def suspend_nf(pk):
    collection.update_one({'instanceId': str(pk)},
                          {'$set': {'nfStatus': 'SUSPENDED'}})
and calling the task inside the Django view like:
api/views.py
def put(self, request, pk):
    now = datetime.datetime.now(tz=pytz.timezone(TIME_ZONE))
    timer = now + datetime.timedelta(seconds=response_data["heartBeatTimer"])
    suspend_nf.apply_async(eta=timer)
    response = Response(data=response_data, status=status.HTTP_202_ACCEPTED)
    response['Location'] = str(request.build_absolute_uri())
What am I missing here?
Are you saying that your view blocks completely, or that it waits for the ETA before completing execution?
Did you receive any error?
Try using the countdown parameter instead of eta.
In your case it's better because you don't need to manipulate dates.
Like this: suspend_nf.apply_async(countdown=response_data["heartBeatTimer"])
Let's see if your view behaves differently.
I have finally found a workaround. Since I'm working on a small project, I don't really need Celery + RabbitMQ; a simple thread does the job.
The task looks like this:
import time

def suspend_nf(pk, timer):
    time.sleep(timer)
    collection.update_one({'instanceId': str(pk)},
                          {'$set': {'nfStatus': 'SUSPENDED'}})
And calling it inside the view like this:
timer = int(response_data["heartBeatTimer"])
thread = threading.Thread(target=suspend_nf, args=(pk, timer), daemon=True)
thread.start()

How to remove a celery task?

I'm practicing with django-celery
settings.py
#import datetime
#CELERYBEAT_SCHEDULE = {
#    'hello_task': {
#        'task': 'hello_task',
#        'schedule': datetime.timedelta(seconds=20),
#    },
#}

import datetime
CELERYBEAT_SCHEDULE = {
    'add-every-30-seconds': {
        'task': 'app1.tasks.myfunc',
        'schedule': datetime.timedelta(seconds=30),
    },
}
I tried the hello_task schedule at first, then I commented it out and tried add-every-30-seconds.
But hello_task still executes when its scheduled time comes.
So I checked the database and found the record was saved in it.
Why wasn't it deleted when I commented it out?
Is there any command or normal way to delete it?
Or is it fine to just delete it from the database?
If you're using the django-celery database scheduler, those periodic tasks in the CELERYBEAT_SCHEDULE dict will be added to Django's database, as you found out. django-celery's scheduler then reads its settings primarily from the database. Removing entries from the dict just means that django-celery has nothing to add to the database.
To delete the task properly, remove it from the Django admin page (Djcelery > Periodic Tasks).
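If you'd rather do it from code than through the admin, here is a rough sketch using django-celery's model (assuming the djcelery app is installed and the stale entry is still named hello_task):

from djcelery.models import PeriodicTask

# remove the schedule entry left over from the commented-out config
PeriodicTask.objects.filter(name='hello_task').delete()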
When you change the code that Celery is working with, you should restart Celery so it picks up the changes, just as you need to restart the Django server whenever you change your Python code.

Send a success signal when the group of tasks in celery is finished

So I have a basic configuration: Django 1.6 + Celery 3.1. Say I have an example task:
@app.task
def add(x, y):
    time.sleep(6)
    return {'result': x + y}
And a function that groups tasks and returns the job id:
def nested_add(x, y):
    grouped_task = group(add.s(x, y) for i in range(0, 2))
    job = grouped_task.apply_async()
    job.save()
    return job.id
Now I want to perform some action when that group of tasks is finished, but if I put the app.task decorator on nested_add and try to catch the task_success signal, it doesn't work properly. Any tips on what I should use?
There are actually several options. The simplest is to use a chord. A chord waits until all sub-tasks are finished with some result and then returns the overall result back. More can be found at http://ask.github.io/celery/userguide/tasksets.html. Another simple approach is to leverage the AsyncResult API's collect() method. More can be found here: http://celery.readthedocs.org/en/latest/reference/celery.result.html.
Don't forget to configure your result backend. More can be found at http://celery.readthedocs.org/en/latest/getting-started/first-steps-with-celery.html#keeping-results. If you are using RabbitMQ as a broker then configure it as a result backend too.
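For illustration (my sketch, not from the original answer), a chord version of the example above might look like this; on_group_complete is a hypothetical callback name:

from celery import chord

@app.task
def on_group_complete(results):
    # results is the list of return values from the add() sub-tasks
    return {'all_results': results}

def nested_add(x, y):
    # header: the group of add() tasks; body: callback that runs when all of them finish
    job = chord(add.s(x, y) for i in range(0, 2))(on_group_complete.s())
    return job.id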