We are running Django v1.10 with Celery v4.0.2, RabbitMQ v3.5.7 and Flower v0.9.1, and we are pretty new to Celery, RabbitMQ and Flower.
There is a function x() that was set to retry after 7 days in case of failure. We have thousands of instances of x re-queued in production. We have fixed the underlying issue and would like to retry those instances as soon as possible.
Is there a way to force the retry before its scheduled time?
If you can get a list of the tasks, it's a matter of calling task.retry(exc=exc) on each one. See docs.
Try celery.task.control.inspect().reserved() and see if you can filter the tasks that way. Example here.
You can get a handle on a task (an AsyncResult) from its id with this, per this answer:
result = MyTask.AsyncResult(task_id)
result.get()
So after trying a lot of things, I ended up solving it by getting the list of scheduled tasks along with their arguments and calling the functions in a for loop with those arguments.
There is apparently no way to retry a task manually by design. You have to create another task with the same parameters. This is what I finally did:
from celery.task.control import inspect, revoke

i = inspect()
scheduled = i.scheduled()
for key in scheduled:
    for element in scheduled[key]:
        reqDict = element['request']
        if reqDict['type'] == 'module.function':
            # re-submit a new task with the same arguments...
            module.function.delay(converted_arguments)
            # ...and revoke the old scheduled instance
            revoke(reqDict['id'], terminate=True)
In my Django project I'm using Celery with a RabbitMQ broker for asynchronous tasks. How can I record information about all of my tasks (e.g. created time (when the task appears in the queue), when a worker consumes the task, execution time, status, ...) to monitor how Celery is doing?
I know there are solutions like Flower, but that seems like too much for what I need. django-celery-results looks like what I want, but it's missing some of the information I need, like the task created time.
Thanks!
It seems like you often find the answer yourself after asking on SO. I settled on using Celery signals to do all the recording I want and store the results in a database table.
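For reference, here is a minimal sketch of that signal-based approach (TaskRecord is a hypothetical Django model, not something Celery or django-celery-results provides):

# signals.py -- a rough sketch; TaskRecord is a hypothetical model with
# task_id, name, created_at, started_at, finished_at and state fields.
from celery.signals import before_task_publish, task_prerun, task_postrun
from django.utils import timezone

from myapp.models import TaskRecord  # hypothetical


@before_task_publish.connect
def record_created(sender=None, headers=None, **kwargs):
    # Runs in the publishing process when the task is put on the queue,
    # so this is the "created time" the question asks about.
    TaskRecord.objects.create(task_id=headers['id'], name=sender,
                              created_at=timezone.now(), state='PENDING')


@task_prerun.connect
def record_started(task_id=None, task=None, **kwargs):
    # Runs in the worker just before the task body executes.
    TaskRecord.objects.filter(task_id=task_id).update(
        started_at=timezone.now(), state='STARTED')


@task_postrun.connect
def record_finished(task_id=None, task=None, state=None, **kwargs):
    # Runs in the worker after the task returns or fails.
    TaskRecord.objects.filter(task_id=task_id).update(
        finished_at=timezone.now(), state=state)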
We've got Celery/SQS set up for asynchronous task management. We're running Django for our framework. We have a Celery task that has a self.retry() in it, with max_retries set to 15. The retries happen with exponential backoff, and it takes 182 hours to complete all 15 retries.
Last week, this task went haywire, I think due to a bug in our code not properly handling a service outage. It resulted in exponential creation (retrying?) of the same Celery task. It eventually used up all available memory and the worker crashed. Restarting the worker results in another crash a couple of hours later, since all those tasks (and their retries) keep retrying and spawning new retries until we run out of memory again. Ultimately we ended up with nearly 600k tasks created!
We need our workers to ignore all the tasks with a specific Celery GUID. Ideally we could just get rid of them for good. I was going to use revoke(), but per the documentation (http://docs.celeryproject.org/en/3.1/userguide/workers.html#commands) this is only implemented for Redis and RabbitMQ, not SQS. Furthermore, when I go to the SQS service in the AWS console, it shows zero messages in flight, so it's not like I can just flush it.
Is there a way to delete or revoke a specific message from SQS using the Celery task ID? Or is there another way to fix this problem? Obviously we need to fix our code so we don't get into this situation again, but first we need to get our worker up and running, because without it our website has reduced functionality. Thanks!
So I'm struggling to figure out the optimal way to schedule events to happen at some point in the future using Celery. An example of this is when a new user has registered, we want to send them an email the next day.
We have Celery set up, and some of you may point to the eta parameter of apply_async. However, that won't work for us: we use SQS, whose visibility timeout would conflict with it, and in general the eta param shouldn't be used for lengthy periods.
One solution we've implemented at this point is to create events and store them in the database with a 'to-process' timestamp (which indicates when to process the event). We use the Celery beat scheduler with a task that runs literally every second to see if there are any new events that are ready to process. If there are, we carry out the subsequent tasks.
This solution works, although it doesn't feel great since we're queueing a task every second on SQS. Any thoughts or ideas on this would be great.
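For what it's worth, here is a rough sketch of that polling pattern (the Event model, task names and app names are made up for illustration):

# tasks.py -- a rough sketch; Event is a hypothetical model with a
# to_process timestamp and a processed flag.
from celery import shared_task
from django.utils import timezone

from myapp.models import Event  # hypothetical


@shared_task
def process_due_events():
    # Scheduled by celery beat; picks up events whose time has come.
    due = Event.objects.filter(processed=False, to_process__lte=timezone.now())
    for event in due:
        handle_event.delay(event.pk)
        Event.objects.filter(pk=event.pk).update(processed=True)


@shared_task
def handle_event(event_pk):
    # The actual work, e.g. sending the day-after registration email.
    ...


# and in the Celery config (where `app` is your Celery instance):
# app.conf.beat_schedule = {
#     'process-due-events': {
#         'task': 'myapp.tasks.process_due_events',
#         'schedule': 1.0,  # every second, as described above
#     },
# }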
I just installed Celery and I want to create a simple status page that shows the current number of workers and their status.
Is this possible? From web searches, the best I found was celery.current_app.control.inspect(), but as far as I can see, it doesn't say anything about workers. (I'm using Kombu with SQS as the backend, if that matters.)
The documentation on Celery workers explains the output of the inspect commands.
By default, celery.current_app.control.inspect() returns an "inspector" object that allows you to ask for the state of all the running workers. For example, if you execute this code with two running workers named 'adder' and 'sleeper':
import celery

i = celery.current_app.control.inspect()
i.registered()
the call to i.registered() could return something like:
{
    'adder@example.com': ['tasks.add'],
    'sleeper@example.com': ['tasks.sleeptask'],
}
In conclusion, the "inspector" methods registered, active, scheduled, etc. return a dictionary with the results keyed by the workers selected when celery.current_app.control.inspect() was called (if no workers are passed as arguments, all workers are implicitly selected).
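Building on that, a small sketch of how a status page could count the workers and see what each one is currently doing, using the ping() and active() inspector methods:

import celery

i = celery.current_app.control.inspect()

pongs = i.ping() or {}      # e.g. {'adder@example.com': {'ok': 'pong'}, ...}
print('workers online:', len(pongs))

active = i.active() or {}   # tasks each worker is currently executing
for worker, tasks in active.items():
    print(worker, 'is running', len(tasks), 'task(s)')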
1) I am currently working on a web application that exposes a REST API and uses Django and Celery to handle requests and solve them. For a request to get solved, a set of Celery tasks has to be submitted to an AMQP queue so that they get executed on workers (located on other machines). Each task is very CPU-intensive and takes a very long time (hours) to finish.
I have configured Celery to also use amqp as the result backend, and I am using RabbitMQ as Celery's broker.
Each task returns a result that needs to be stored afterwards in a DB, but not by the workers directly. Only the "central node" - the machine running django-celery and publishing tasks to the RabbitMQ queue - has access to this storage DB, so the results from the workers have to somehow get back to this machine.
The question is: how can I process the results of the task executions afterwards? After a worker finishes, its result gets stored in the configured result backend (amqp), but now I don't know what the best way would be to get the results from there and process them.
All I could find in the documentation is that you can either check on the result's status from time to time with:
result.state
which means that I basically need a dedicated piece of code that periodically runs this check, and therefore keeps a whole thread/process busy with only this, or block everything with:
result.get()
until a task finishes, which is not what I want.
The only solution I can think of is to have an extra thread on the "central node" that periodically runs a function checking the async_results returned by each task at submission time, and takes action when a task has reached a finished status.
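To make that concrete, here is a rough sketch of such a polling loop (the `submitted` list and `store_result` callable are placeholders for whatever the "central node" keeps at submission time):

# a rough sketch of the polling-thread idea; `submitted` is a list of
# AsyncResult objects kept when the tasks were published, and
# `store_result` is a placeholder for writing into the storage DB.
import threading
import time


def poll_results(submitted, store_result, interval=30):
    pending = list(submitted)
    while pending:
        for result in list(pending):
            if result.ready():
                if result.successful():
                    store_result(result.id, result.result)
                pending.remove(result)
        time.sleep(interval)


# run it in the background on the "central node":
# threading.Thread(target=poll_results, args=(submitted, store_result), daemon=True).start()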
Does anyone have any other suggestion?
Also, since the result processing takes place on the "central node", what I aim for is to minimize the impact of this operation on that machine.
What would be the best way to do that?
2) How do people usually solve the problem of dealing with the results returned from the workers and stored in the result backend? (assuming that a result backend has been configured)
I'm not sure if I fully understand your question, but take into account that each task has a task id. If tasks are being sent by users, you can store the ids and then check for the results as JSON as follows:
# urls.py
from django.conf.urls import patterns, url  # patterns() is from older Django versions
from djcelery.views import is_task_successful

urlpatterns += patterns('',
    url(r'(?P<task_id>[\w\d\-\.]+)/done/?$', is_task_successful,
        name='celery-is_task_successful'),
)
Another related concept is that of signals: each finished task emits a signal. A finished task will emit a task_success signal. More can be found in the Celery documentation on real-time processing.
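For example, a minimal sketch of hooking the task_success signal:

from celery.signals import task_success


@task_success.connect
def on_task_success(sender=None, result=None, **kwargs):
    # sender is the task that finished, result is its return value
    print('task %s finished with result %r' % (sender.name, result))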