Python Celery: Updating task chain during execution - python-2.7

When executing a Celery task chain, it is possible to stop the chain's execution as follows:
@celeryapp.task(bind=True)
def task(self, continue_task_chain):
    if not continue_task_chain:
        self.request.chain = None
Instead of cancelling the whole chain, however, I would like to replace self.request.chain with an alternate chain:
@celeryapp.task(bind=True)
def task(self, continue_task_chain):
    if not continue_task_chain:
        self.request.chain = another_chain
Is there a way to do this?
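One approach that may work (a minimal, untested sketch, assuming Celery 4.x, where self.request.chain holds the remaining task signatures in reverse execution order) is to assign a plain list of signatures instead of a chain object. Here alternate_task and final_task are hypothetical task names:

@celeryapp.task(bind=True)
def task(self, continue_task_chain):
    if not continue_task_chain:
        # request.chain is consumed from the end of the list, so the
        # task that should run last goes first (hypothetical tasks).
        self.request.chain = [
            final_task.s(),      # runs last
            alternate_task.s(),  # runs first
        ]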

Related

Mock async_task of Django-q

I'm using django-q and I'm currently adding tests for my existing tasks using mock. I can easily test each task without depending on django-q, but one of my tasks calls another async_task. Here's an example:
import requests
from django_q.tasks import async_task

def task_a():
    response = requests.get(url)
    # process response here
    if condition:
        async_task('task_b')

def task_b():
    response = requests.get(another_url)
And here's how I test them:
from unittest import mock

import requests

from .tasks import task_a
from .mock_responses import task_a_response

@mock.patch.object(requests, "get")
@mock.patch("django_q.tasks.async_task")
def test_async_task(self, mock_async_task, mock_task_a):
    mock_task_a.return_value.status_code = 200
    mock_task_a.return_value.json.return_value = task_a_response
    mock_async_task.return_value = "12345"
    # execute the task
    task_a()
    self.assertTrue(mock_task_a.called)
    self.assertTrue(mock_async_task.called)
I know for a fact that async_task returns the task ID, hence the line mock_async_task.return_value = "12345". However, after running the test, mock_async_task.called is False and the task is actually added to the queue (I can see a bunch of 01:42:59 [Q] INFO Enqueued 1 messages from the server), which is what I'm trying to avoid. Is there any way to accomplish this?
In order to prevent the task from being added to the queue, you need to set the sync configuration option to True while the tests are running. You can find more info about the configuration here.
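For reference, a minimal sketch of what that could look like in a dedicated test settings module (the sync flag is part of django-q's Q_CLUSTER configuration; the other values here are illustrative):

# settings_test.py -- hypothetical test settings module
Q_CLUSTER = {
    'name': 'myproject',  # illustrative project name
    'sync': True,         # run async_task calls inline during tests
}

With sync set to True, async_task executes the task synchronously instead of enqueuing it, so nothing is sent to the broker while the tests run.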

How to schedule a celery task without blocking Django

I have a Django service that registers a lot of clients and renders a payload containing a timer (let's say 800 s), after which the client should be suspended by the service (changing its status from REGISTERED to SUSPENDED in MongoDB).
I'm running Celery with RabbitMQ as the broker, as follows:
celery/tasks.py
@app.task(bind=True, name='suspend_nf')
def suspend_nf(pk):
    collection.update_one({'instanceId': str(pk)},
                          {'$set': {'nfStatus': 'SUSPENDED'}})
and calling the task inside a Django view like:
api/views.py
def put(self, request, pk):
    now = datetime.datetime.now(tz=pytz.timezone(TIME_ZONE))
    timer = now + datetime.timedelta(seconds=response_data["heartBeatTimer"])
    suspend_nf.apply_async(eta=timer)
    response = Response(data=response_data, status=status.HTTP_202_ACCEPTED)
    response['Location'] = str(request.build_absolute_uri())
What am I missing here?
Are you saying that your view blocks entirely, or that it waits for the ETA to elapse before completing? Did you receive any error?
Try using the countdown parameter instead of eta. In your case it's better because you don't need to manipulate dates:
suspend_nf.apply_async(countdown=response_data["heartBeatTimer"])
Let's see if your view behaves differently.
I have finally found a workaround: since I'm working on a small project, I don't really need Celery + RabbitMQ; simple threading does the job.
The task looks like this:
def suspend_nf(pk, timer):
    time.sleep(timer)
    collection.update_one({'instanceId': str(pk)},
                          {'$set': {'nfStatus': 'SUSPENDED'}})
And it is called inside the view like:
timer = int(response_data["heartBeatTimer"])
thread = threading.Thread(target=suspend_nf, args=(pk, timer))
thread.daemon = True
thread.start()
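As a side note, the same idea can be written with threading.Timer, which schedules the delay itself instead of sleeping inside the task (a sketch under the same assumptions as the code above, i.e. collection, response_data and pk come from the surrounding module/view):

import threading

def suspend_nf(pk):
    collection.update_one({'instanceId': str(pk)},
                          {'$set': {'nfStatus': 'SUSPENDED'}})

timer = int(response_data["heartBeatTimer"])
t = threading.Timer(timer, suspend_nf, args=(pk,))
t.daemon = True
t.start()

Either way, keep in mind that a daemon thread dies with the process, so pending suspensions are lost on restart; that is the trade-off against a broker-backed queue.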

Does celery task id change after redistribution

I have a Django model with a column called celery_task_id. I am using RabbitMQ as the broker. There's a Celery task called test_celery which takes a model object as a parameter. Now I have the following lines of code which create a celery task.
def create_celery_task():
    celery_task_id = test_celery.apply_async((model_obj,), eta='Future Datetime Object')
    model_obj.celery_task_id = celery_task_id
    model_obj.save()
    # ...
Now, inside the celery function, I am verifying whether the task id is the same as the one stored in the DB.
@app.task
def test_celery(model_obj):
    if model_obj.celery_task_id == test_celery.request.id:
        pass  ## Do something
My problem is that there are a lot of cases where I can see the task being received and succeeding in the log, but the code inside the if condition is not executed.
Is it possible that the celery task id changes after redistribution? Or are there any other reasons?
One of the recommendations is not to pass database/ORM objects into Celery tasks, because they may contain stale data. Try rewriting the task as:
@app.task
def test_celery(model_obj_id):
    # .filter().first() returns None when missing (unlike .get(), which
    # raises DoesNotExist), so the check below works
    model_obj = YourModel.objects.filter(id=model_obj_id).first()
    if model_obj:
        if model_obj.celery_task_id == test_celery.request.id:
            pass  ## Do something
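Note also that apply_async returns an AsyncResult object, not a bare id, so on the calling side you would store its .id attribute and pass the primary key instead of the object (a sketch reusing the names above; some_future_datetime stands in for the elided datetime):

def create_celery_task():
    result = test_celery.apply_async((model_obj.id,), eta=some_future_datetime)
    model_obj.celery_task_id = result.id  # store the string id, not the AsyncResult
    model_obj.save()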

Send a success signal when the group of tasks in celery is finished

So I have a basic configuration: Django 1.6 + Celery 3.1. Say I have an example task:
import time

@app.task
def add(x, y):
    time.sleep(6)
    return {'result': x + y}
And a function that groups the tasks and returns the job id:
def nested_add(x, y):
    grouped_task = group(add.s(x, y) for i in range(0, 2))
    job = grouped_task.apply_async()
    job.save()
    return job.id
Now I want to perform some action when that group of tasks is finished, but if I put the app.task decorator on nested_add and try to catch the task_success signal, it doesn't work properly. Any tips on what I should use?
There are actually several options. The simplest is to use a chord. A chord will wait until all sub-tasks are finished with some result and then pass the overall result to a callback. More can be found at http://ask.github.io/celery/userguide/tasksets.html. Another simple approach is to leverage the AsyncResult API's collect() method. More can be found here: http://celery.readthedocs.org/en/latest/reference/celery.result.html.
Don't forget to configure your result backend; more can be found at http://celery.readthedocs.org/en/latest/getting-started/first-steps-with-celery.html#keeping-results. If you are using RabbitMQ as the broker, then configure it as a result backend too.
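A minimal sketch of the chord approach with the add task above (on_group_done is a hypothetical callback name):

from celery import chord

@app.task
def on_group_done(results):
    # results is the list of return values from the header tasks,
    # e.g. [{'result': 3}, {'result': 3}]
    print('group finished:', results)

def nested_add(x, y):
    # chord(header)(body): the body runs once every header task has finished
    job = chord(add.s(x, y) for i in range(2))(on_group_done.s())
    return job.id

Here job is the AsyncResult of the callback, so job.id identifies the chord's final result.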

Celery: clean way of revoking the entire chain from within a task

My question is probably pretty basic, but I still can't find a solution in the official docs. I have defined a Celery chain inside my Django application, performing a set of tasks dependent on each other:
chain(tasks.apply_fetching_decision.s(x, y),
      tasks.retrieve_public_info.s(z, x, y),
      tasks.public_adapter.s())()
Obviously the second and third tasks need the output of the parent; that's why I used a chain.
Now the question: I need to programmatically revoke the 2nd and 3rd tasks if a test condition in the 1st task fails. How do I do that in a clean way? I know I can revoke the tasks of a chain from within the method where I have defined the chain (see this question and this doc), but inside the first task I have no visibility of the subsequent tasks nor of the chain itself.
Temporary solution
My current solution is to skip the computation inside the subsequent tasks based on the result of the previous task:
@shared_task
def retrieve_public_info(result, x, y):
    if not result:
        return []
    ...

@shared_task
def public_adapter(result, z, x, y):
    for r in result:
        ...
But this "workaround" has some flaws:
It adds unnecessary logic to each task (based on the predecessor's result), compromising reuse
It still executes the subsequent tasks, with all the resulting overhead
I haven't played much with passing references to the chain into the tasks, for fear of messing things up. I also admit I haven't tried the exception-throwing approach, because I think that choosing not to proceed through the chain can be a functional (thus non-exceptional) scenario...
Thanks for helping!
I think I found the answer to this issue: this seems the right way to proceed, indeed. I wonder why such a common scenario is not documented anywhere, though.
For completeness, I post the basic code snippet:
@app.task(bind=True)  # Note that we need bind=True for self to work
def task1(self, other_args):
    # do_stuff
    if end_chain:
        self.request.callbacks[:] = []
    ...
Update
I implemented a more elegant way to cope with the issue and I want to share it with you. I am using a decorator called revoke_chain_authority, so that it can revoke the chain automatically without rewriting the code I previously described.
from functools import wraps

class RevokeChainRequested(Exception):
    def __init__(self, return_value):
        Exception.__init__(self, "")
        # Now for your custom code...
        self.return_value = return_value

def revoke_chain_authority(a_shared_task):
    """
    @see: https://gist.github.com/bloudermilk/2173940
    @param a_shared_task: a @shared_task(bind=True) celery function.
    @return:
    """
    @wraps(a_shared_task)
    def inner(self, *args, **kwargs):
        try:
            return a_shared_task(self, *args, **kwargs)
        except RevokeChainRequested as e:
            # Drop subsequent tasks in chain (if not EAGER mode)
            if self.request.callbacks:
                self.request.callbacks[:] = []
            return e.return_value
    return inner
This decorator can be used on a shared task as follows:
@shared_task(bind=True)
@revoke_chain_authority
def apply_fetching_decision(self, latitude, longitude):
    # ...
    if condition:
        raise RevokeChainRequested(False)
Please note the use of @wraps. It is necessary to preserve the signature of the original function; otherwise it will be lost and Celery will make a mess of calling the right wrapped task (e.g. it will always call the first registered function instead of the right one).
As of Celery 4.0, what I found to work is to remove the remaining tasks from the current task instance's request using the statement:
self.request.chain = None
Let's say you have a chain of tasks a.s() | b.s() | c.s(). You can only access the self variable inside a task if you bind the task by passing bind=True as an argument to the task's decorator.
@app.task(name='main.a', bind=True)
def a(self):
    if something_happened:
        self.request.chain = None

If something_happened is truthy, b and c won't be executed.
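For completeness, a minimal sketch of wiring such a chain up (b and c are ordinary tasks here; .si() makes their signatures immutable so they don't expect a's return value):

from celery import chain

@app.task(name='main.b')
def b():
    print('b ran')

@app.task(name='main.c')
def c():
    print('c ran')

# If a() clears self.request.chain, neither b nor c is ever executed.
chain(a.s(), b.si(), c.si()).apply_async()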