I use:
Celery
Django-Celery
RabbitMQ
I can see all my tasks in the Django admin page, but at the moment it has just a few states, like:
RECEIVED
RETRY
REVOKED
SUCCESS
STARTED
FAILURE
PENDING
That's not enough information for me. Is it possible to add more details about a running process to the admin page, like a progress bar or a finished-jobs counter, etc.?
I know how to use Celery's logging functions, but a GUI is better in my case for various reasons.
So, is it possible to send some tracing information to the Django-Celery admin page?
Here's my minimal progress-reporting Django backend using your setup. I'm still a Django n00b and it's the first time I'm messing with Celery, so this can probably be optimized.
from time import sleep
from celery import task, current_task
from celery.result import AsyncResult
from django.http import HttpResponse, HttpResponseRedirect
from django.core.urlresolvers import reverse
from django.utils import simplejson as json
from django.conf.urls import patterns, url
@task()
def do_work():
    """ Get some rest, asynchronously, and update the state all the time """
    for i in range(100):
        sleep(0.1)
        current_task.update_state(state='PROGRESS',
                                  meta={'current': i, 'total': 100})
def poll_state(request):
    """ A view to report the progress to the user """
    if 'job' in request.GET:
        job_id = request.GET['job']
    else:
        return HttpResponse('No job id given.')
    job = AsyncResult(job_id)
    data = job.result or job.state
    return HttpResponse(json.dumps(data), mimetype='application/json')
def init_work(request):
    """ A view to start a background job and redirect to the status page """
    job = do_work.delay()
    return HttpResponseRedirect(reverse('poll_state') + '?job=' + job.id)
urlpatterns = patterns('webapp.modules.asynctasks.progress_bar_demo',
    url(r'^init_work$', init_work),
    url(r'^poll_state$', poll_state, name="poll_state"),
)
I'm starting to figure this out myself. Start by defining a custom PROGRESS state exactly as explained in the Celery user guide; then all you need is to add some JavaScript to your template that updates the progress bar.
Thanks @Florian Sesser for your example!
I made a complete Django app that shows users the progress of creating 1000 objects: http://iambusychangingtheworld.blogspot.com/2013/07/django-celery-display-progress-bar-of.html
Everyone can download and use it!
I would recommend a library called celery-progress for this. It is designed to make it as easy as possible to drop a basic end-to-end progress bar setup into a Django app with as little scaffolding as possible, while also supporting heavy customization on the front end if desired. There are lots of docs and references for getting started in the README.
Full disclosure: I am the author/maintainer of said library.
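For reference, here is a minimal sketch of what a task can look like with celery-progress, based on its README at the time of writing; the task name and timings below are made up for illustration:

from time import sleep
from celery import shared_task
from celery_progress.backend import ProgressRecorder

@shared_task(bind=True)
def long_job(self, total=100):
    progress_recorder = ProgressRecorder(self)
    for i in range(total):
        sleep(0.1)  # the real work goes here
        progress_recorder.set_progress(i + 1, total)
    return 'done'

The library also ships a progress URL and front-end JavaScript keyed by the task id, so on the view side it is mostly a matter of wiring those up (see the README).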
In my project, I use the django-celery-beat package to execute scheduled tasks. It works well, but there is one case I can't handle.
All the tasks have a PeriodicTask that schedules them.
So, in the following task:
from celery import shared_task

@shared_task
def foo(**kwargs):
    # Here I can do things like this:
    whatever_method(kwargs["bar"])
I don't know if it's luck, but it turns out that kwargs "points" to the kwargs attribute of the PeriodicTask model.
My questions are:
How can I access the PeriodicTask instance that made the task run?
If I have two PeriodicTask entries that use the same shared_task but with different schedules/parameters, can I find out which one was the source of a particular run?
Thanks in advance for your help.
OK, I found a way to do this.
As I said in the comment, making use of @app.task solves my needs.
I ended up with a task like this:
@app.task(bind=True)
def foo(self, **kwargs):
    # The information I personally need is in self's properties, like so:
    desired_info = self.request.properties
    # Do whatever is needed with desired_info...
    # Do whatever else...
Where app is the Celery app as described in the docs.
As I understand it, bind=True is necessary so that the task gets its own request and thus has access to self with that information.
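As a rough sketch for the second part of the question (telling apart two PeriodicTask entries that point at the same task), one simple option is to give each PeriodicTask a distinguishing entry in its kwargs field and read it back inside the bound task; the "source" kwarg below is made up for illustration:

@app.task(bind=True)
def foo(self, **kwargs):
    # e.g. one PeriodicTask has kwargs {"source": "nightly"}, the other {"source": "hourly"}
    source = kwargs.get("source", "unknown")
    print("run %s was triggered by the %s schedule" % (self.request.id, source))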
In my Django application I am using Celery. In a post_save signal handler, I am updating the index in Elasticsearch. But for some reason the task hangs and never actually executes the code:
What I use to run celery:
celery -A collegeapp worker -l info
The Signal:
@receiver(post_save, sender=University)
def university_saved(sender, instance, created, **kwargs):
    """
    University save signal
    """
    print('calling celery task')
    update_university_index.delay(instance.id)
    print('finished')
The task:
@task(name="update_university_index", bind=True, default_retry_delay=5, max_retries=1, acks_late=True)
def update_university_index(instance_id):
    print('updating university index')
The only output I get is calling celery task. After waiting over 30 minutes, it never gets to any of the other print statements and the view continues to wait. Nothing ever shows in the Celery terminal.
Versions:
Django 3.0,
Celery 4.3,
Redis 5.0.9,
Ubuntu 18
UPDATE:
After doing some testing, using the debug_task defined inside the celery.py file in place of update_university_index does not lead to hanging; it behaves as expected. I thought it might be app.task vs. the task decorator, but it seems that's not it.
@app.task(bind=True)
def debug_task(text, second_value):
    print('printing debug_task {} {}'.format(text, second_value))
This happened to me once; I had made the dumbest error. Django expects Celery tasks to be specified in a tasks.py file and uses that for task discovery; once I fixed that, it worked. Could you provide more insight into the directory structure, e.g. the output of the tree command?
This tutorial is for Flask, but the same can be achieved in Django. Where this particular tutorial shines is that after you tell Celery to execute a task, it also provides you with a UUID, and you can ping a URL with it to monitor the progress of the task you triggered.
Verify that the tasks have been registered by Celery (do make sure that Celery is running):
from celery.task.control import inspect
i = inspect()
i.registered_tasks()
Or from bash:
$ celery inspect registered
$ celery -A collegeapp inspect registered
From https://docs.celeryproject.org/en/latest/faq.html#the-worker-isn-t-doing-anything-just-hanging
Why is Task.delay/apply*/the worker just hanging?
Answer: There’s a bug in some AMQP clients that’ll make it hang if it’s not able to authenticate the current user, the password doesn’t match or the user doesn’t have access to the virtual host specified. Be sure to check your broker logs (for RabbitMQ that’s /var/log/rabbitmq/rabbit.log on most systems), it usually contains a message describing the reason.
Change this line
@task(name="update_university_index", bind=True, default_retry_delay=5, max_retries=1, acks_late=True)
def update_university_index(instance_id):
    print('updating university index')
To
@task(name="update_university_index", bind=True, default_retry_delay=5, max_retries=1, acks_late=True)
def update_university_index(self, instance_id):
    print('updating university index')
In other words, add self to the task definition; with bind=True the task instance is passed as the first argument.
I'm still not sure why it doesn't work, but I found a solution by replacing task with app.task.
Importing app from my celery.py seems to have resolved the issue.
from collegeapp.celery import app

@app.task(name="update_university_index", bind=True, default_retry_delay=5, max_retries=1, acks_late=True)
def update_university_index(self, instance_id):
    print('updating university index')
I have several APIs as sources of data, for example blog posts. What I'm trying to achieve is to send requests to these APIs in parallel from a Django view and get the results. There is no need to store the results in the db; I need to pass them to my view response. My project is written in Python 2.7, so I can't use asyncio. I'm looking for advice on the best practice for solving this (Celery, Tornado, something else?), with examples of how to achieve it, because I'm only starting my way in async. Thanks.
One solution is to use Celery, pass your request args to it, and use AJAX on the front end.
Example:
def my_def(request):
    do_something_in_celery.delay()
    return Response(something)
To check whether a task has finished in Celery, you can put Celery's return value in a variable:
task_run = do_something_in_celery.delay()
task_run has an .id property.
You return this .id to your front end and use it to monitor the status of the task.
The function executed in Celery must have the @task decorator:
@task
def do_something_in_celery(*args, **kwargs):
    ...  # the actual work goes here
You will also need a broker to manage the tasks, such as Redis or RabbitMQ.
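A minimal sketch of the status endpoint the AJAX code could poll, assuming the hypothetical view name task_status (it mirrors the poll_state view from the first answer above):

import json
from celery.result import AsyncResult
from django.http import HttpResponse

def task_status(request, task_id):
    # task_id is the .id you returned to the front end after calling .delay()
    result = AsyncResult(task_id)
    data = {'state': result.state}
    if result.successful():
        data['result'] = result.result
    return HttpResponse(json.dumps(data), content_type='application/json')

The front end keeps requesting this URL with the stored id until the state becomes SUCCESS or FAILURE.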
Look at these URLs:
http://masnun.com/2014/08/02/django-celery-easy-async-task-processing.html
https://buildwithdjango.com/blog/post/celery-progress-bars/
http://docs.celeryproject.org/en/latest/index.html
I found a solution using ThreadPoolExecutor from concurrent.futures (available on Python 2.7 via the futures backport library).
import concurrent.futures
import urllib.request
URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()
# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))
You can also check out the rest of the concurrent.futures doc.
Important!
The ProcessPoolExecutor class has known (unfixable) problems on Python 2 and should not be relied on for mission critical work.
I realize there are many other questions related to custom Django signals that don't work, and believe me, I have read all of them several times with no luck getting my particular situation to work.
Here's the deal: I'm using django-rq to manage a lengthy background process that is set off by a particular HTTP request. When that background process is done, I want it to fire off a custom Django signal so that django-rq can be checked for any job failures/exceptions.
Two applications, both on the INSTALLED_APPS list, are at the same level. Inside of app1 there is a file:
signals.py
import django.dispatch
file_added = django.dispatch.Signal(providing_args=["issueKey", "file"])
fm_job_done = django.dispatch.Signal(providing_args=["jobId"])
and also a file jobs.py
from app1 import signals
from django.conf import settings
jobId = 23
issueKey = "fake"
fileObj = "alsoFake"
try:
    pass
finally:
    signals.file_added.send(sender=settings.SIGNAL_SENDER, issueKey=issueKey, fileName=fileObj)
    signals.fm_job_done.send(sender=settings.SIGNAL_SENDER, jobId=jobId)
then inside of app2, in views.py
from app1.signals import file_added, fm_job_done
from django.conf import settings
# Setup signal handlers
def fm_job_done_callback(sender, **kwargs):
    print "hellooooooooooooooooooooooooooooooooooo"
    logging.info("file manager job done signal fired")

def file_added_callback(sender, **kwargs):
    print "hellooooooooooooooooooooooooooooooooooo"
    logging.info("file added signal fired")

file_added.connect(file_added_callback, sender=settings.SIGNAL_SENDER, weak=False)
fm_job_done.connect(fm_job_done_callback, sender=settings.SIGNAL_SENDER, weak=False)
I don't get any feedback whatsoever, though, and am at a total loss. I know for a fact that jobs.py is executing, and therefore also that the block of code that should be firing the signals is executing, since it is in a finally block (no, the try is not actually empty; I just put pass there for simplicity). Please feel free to ask for more information; I'll respond ASAP.
Here is the solution for Django > 2.0.
settings.py:
Change your app's entry in INSTALLED_APPS from 'app2' to 'app2.apps.App2Config'.
app2/apps.py:
from django.apps import AppConfig
from app1.signals import file_added, fm_job_done

class App2Config(AppConfig):
    name = 'app2'

    def ready(self):
        from .views import fm_job_done_callback, file_added_callback
        file_added.connect(file_added_callback)
        fm_job_done.connect(fm_job_done_callback)
Use the Django @receiver decorator:
from django.dispatch import receiver
from app1.signals import file_added, fm_job_done

@receiver(fm_job_done)
def fm_job_done_callback(sender, **kwargs):
    print "helloooooooooooooo"

@receiver(file_added)
def file_added_callback(sender, **kwargs):
    print "helloooooooooooooo"
Also, I prefer to handle signals in models.py
How does one perform the following (Django 0.96) dispatcher hooks in Django 1.0?
import django.dispatch.dispatcher

def log_exception(*args, **kwds):
    logging.exception('Exception in request:')

# Log errors.
django.dispatch.dispatcher.connect(
    log_exception, django.core.signals.got_request_exception)

# Unregister the rollback event handler.
django.dispatch.dispatcher.disconnect(
    django.db._rollback_on_exception,
    django.core.signals.got_request_exception)
Incidentally, this code is from Google's Article on Using Django on GAE. Unfortunately the dispatch code in Django was rewritten between 0.96 and 1.0, and Google's example does not work with Django 1.0.
Of course, the Django people provided a helpful guide on how to do exactly this migration, but I'm not keen enough to figure it out at the moment. :o)
Thanks for reading.
Brian
The basic difference is that you no longer ask the dispatcher to connect you to some signal, you ask the signal directly. So it would look something like this:
from django.core.signals import got_request_exception
from django.db import _rollback_on_exception
def log_exception(*args, **kwds):
    logging.exception('Exception in request:')

# Log errors.
got_request_exception.connect(log_exception)

# Unregister the rollback event handler.
got_request_exception.disconnect(_rollback_on_exception)