I'm trying to execute a long running function (ex: sleep(30)) after a Django view returns a response. I've tried implementing the solutions suggested to similar questions:
How to execute code in Django after response has been sent
Execute code in Django after response has been sent to the client
However, when using a WSGI server like Gunicorn, the client's page load only completes after the long-running function has finished.
Now that Django supports asynchronous views is it possible to run a long running query asynchronously?
I was looking for a solution to the same issue: a view should start a background task and send the response to the client without waiting for that task to finish.
As far as I understand, this is not one of the objectives of async views in Django. The problem is that all executed code is tied to the worker that was started to handle the HTTP request. Once the response has been sent back to the client, that worker cannot handle any further code or tasks started asynchronously in the view. Therefore, all async functions require an "await" in front of them, and consequently the view will only send its response to the client once the awaited function has finished.
As I understand it, all background tasks must be pushed into a task queue from which another worker can pick up each new task. There are several solutions for this, like Django Channels or Django Q. However, I am not sure which is the most lightweight solution.
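For example, here is a minimal sketch of that queue-based approach using Django Q (assuming it is installed and configured with a broker; the module path and function names are only illustrative):

# tasks.py - a plain function that a Django Q worker will execute later
import time

def long_running_job(seconds):
    time.sleep(seconds)  # stand-in for the slow work
    return "done"


# views.py - enqueue the job and respond immediately
from django.http import JsonResponse
from django_q.tasks import async_task

def start_job(request):
    # async_task only pushes the job onto the queue and returns a task id;
    # the actual work happens in a separate qcluster worker process
    task_id = async_task("myapp.tasks.long_running_job", 30)
    return JsonResponse({"task_id": task_id})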
Related
I have a request that lasts more than 3 minutes. I want the view to return a 200 response immediately and deliver the result once the work has finished.
The workflow you've described is called asynchronous task execution.
The main idea is to remove time- or resource-consuming parts of work from the code that handles HTTP requests and delegate them to some kind of worker. The worker might be a different thread or process, or even a separate service that runs on a different server.
This makes your application more responsive, as the user gets the HTTP response much quicker. With this approach you can also display UI-friendly things such as progress bars and status marks for the task, create retry policies if the task fails, etc.
Example workflow:
user makes HTTP request initiating the task
the server creates the task, adds it to the queue and returns the HTTP response with task_id immediately
the front-end code starts ajax polling to get the results of the task passing task_id
the server handles polling HTTP requests and gets status information for this task_id. It returns the info (whether results or "still waiting") with the HTTP response
the front-end displays spinner if server returns "still waiting" or the results if they are ready
The most popular way to do this in Django is using the Celery distributed task queue.
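As a rough sketch of steps 1-2 with Celery (assuming a Celery app and broker are already configured; all names here are illustrative, not a fixed API):

# tasks.py
from celery import shared_task

@shared_task
def heavy_calculation(payload):
    # ... the time-consuming work runs in a Celery worker process ...
    return {"status": "ok", "payload": payload}


# views.py
from django.http import JsonResponse
from .tasks import heavy_calculation

def start_task(request):
    # .delay() only puts the task on the queue; the view returns right away
    result = heavy_calculation.delay(request.GET.dict())
    return JsonResponse({"task_id": result.id})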
Suppose a request comes in: you first have to validate it, then send the response and use some mechanism to finish the work in the background. You also need to be sure the work can actually be completed later. You can use a pipeline, where every task is put into a queue; Django-Celery is an option, but don't use it unless you really need it. Prefer the simplest way to solve the issue.
I am using django framework and ran into some performance problems.
There is a very heavy function (which costs about 2 seconds) in my views.py. Let's call it heavy().
The client uses ajax to send a request, which is routed to heavy(), and waits for a json response.
The bad thing is that, I think, heavy() is not concurrent. If there are two requests routed to heavy() at the same time, one must wait for the other. In other words, heavy() is serial: it cannot take another request before returning from the current request. I have tested and confirmed this observation on my local machine.
I am trying to make the functions in views.py concurrent and asynchronous. Ideally, when two requests come to heavy(), heavy() should hand the job off to some remote worker with a callback and return, so it can process the next request. When the task is done, the callback sends the results back to the client.
However, there is a problem: if heavy() wants to process another request, it must return; but if it returns something, the Django framework will send a (fake) response to the client, and the client may not wait for another response. Moreover, the fake response doesn't contain the correct data. I have searched through Stack Overflow and found few useful tips. I wonder if anyone has tried this and knows a good way to solve the problem.
Thanks,
First make sure that the 'inconcurrency' is actually caused by your heavy task. If you're using only one worker for Django, you will be able to process only one request at a time, no matter what it is. Consider running more workers to get some concurrency; otherwise even short requests are affected.
For returning some information when the task is done, you can do it in at least two ways:
sending AJAX requests periodically to fetch the status of your task
using SSE or websocket to subscribe for actual result
Both of them will require you to write some more JavaScript code to handle it. The first is really easy to achieve; for the second you can use uWSGI capabilities, as described here. It can be handled asynchronously that way, independently of your Django workers (Django will just create the connection and start the task in Celery; checking the status and sending it to the client will be handled by gevent).
To follow up on GwynBliedD's answer:
Celery is commonly used to process tasks and has very simple Django integration. @GwynBliedD's first suggestion is very commonly implemented using Celery and a Celery result backend.
https://www.reddit.com/r/django/comments/1wx587/how_do_i_return_the_result_of_a_celery_task_to/
A common workflow using Celery is:
client hits heavy()
heavy() queues the heavy task asynchronously
heavy() returns future task ID to client (view returns very quickly because little work was actually performed)
client starts polling a status endpoint using the task ID
when the task completes, the status endpoint returns the result to the client
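A sketch of the polling endpoint from the last two steps, assuming a Celery result backend is configured (URL wiring and names are illustrative):

# views.py - the status endpoint the client polls with the task ID
from celery.result import AsyncResult
from django.http import JsonResponse

def task_status(request, task_id):
    result = AsyncResult(task_id)  # looks the task up in the result backend
    if result.ready():
        return JsonResponse({"state": result.state, "result": result.result})
    return JsonResponse({"state": result.state})  # e.g. PENDING or STARTED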
Does Django have something similar to ASP.NET MVC's Asynchronous Controller?
I have some requests that will be handled by Celery workers but won't take long (a few seconds). I want the clients to get the response after the worker is done. I can have my view function wait for the task to complete, but I'm worried this will put too much burden on the web server.
Clarification:
Here's the flow I can have today
def my_view(request):
    # "async" is a reserved keyword in current Python, so call it async_result
    async_result = my_task.delay(params)
    result = async_result.get()  # blocks until the worker finishes
    return my_response(result)
async_result.get() can take a few seconds - not so long that the client can't wait for the HTTP response to come back.
This code might put unnecessary strain on the server. What ASP.NET MVC's AsynchronousController provides is the ability to break this function in two, something similar to this:
def my_view(request):
    async_result = my_task.delay(params)
    # DelayedResponse is hypothetical - the API I wish existed
    return DelayedResponse(async_result, lambda result: my_response(result))
This releases the web server to handle other requests until the async operation is done. Once it's done, it executes the lambda expression on the result, giving back the response.
Instead of waiting for the request to complete, you can return an "In progress" status and then send one more request to check whether the status has changed. Since you're doing pure lookups, the response will be very fast and won't put much burden on your web server.
You can outsource this specific view/feature to the Tornado web server, which is designed for async callbacks. The rest of the site may continue to run on Django.
Most likely the solution should not be technical, but lie in the UI/UX area. If something takes long, it's OK to notify the user about it, as long as the notification is clear.
Yes, you can do something only when the task completes. You would want to look into something called chain(). You can bind Celery tasks in a chain:
chain = first_function.s(set) | second_function.s(do)
chain()
Both first_function and second_function are Celery tasks. second_function is executed only when first_function finishes its execution.
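A slightly fuller, hypothetical sketch of the same idea; it assumes both functions are registered Celery tasks, and the second receives the first one's return value as its first argument:

from celery import shared_task, chain

@shared_task
def first_function(data):
    # ... heavy work ...
    return {"processed": data}

@shared_task
def second_function(result_of_first):
    # runs only after first_function finishes; receives its return value
    print("got", result_of_first)

# queue both; a worker executes them in order
chain(first_function.s({"some": "data"}), second_function.s())()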
I have now successfully set up django-celery to check my existing tasks and remind the user by email when a task is due:
@periodic_task(run_every=datetime.timedelta(minutes=1))
def check_for_tasks():
    tasks = mdls.Task.objects.all()
    # compare on whole minutes so the check matches the 1-minute schedule
    now = datetime.datetime.utcnow().replace(tzinfo=utc, second=0, microsecond=0)
    for task in tasks:
        if task.reminder_date_time == now:
            sendmail(...)
So far so good, however what if I wanted to also display a popup to the user as a reminder?
Twitter bootstrap allows creating popups and displaying them from javascript:
$(this).modal('show');
The problem, though, is: how can a Celery worker daemon run this JavaScript in the user's browser? Maybe I am going about this completely the wrong way and it is not possible at all. So the question remains: can a cron job on Celery ever be used to achieve a UI notification in the browser?
Well, you can't use the Django messages framework, because the task has no way to access the user's request, and you can't pass request objects to the workers either, because they aren't picklable.
But you could definitely use something like django-notifications. You could create notifications in your task and attach them to the user in question. Then you could retrieve those messages in your view and handle them in your templates as you like. The user would see the notification on their next request (or you could use AJAX polling for near-real-time notifications, or HTML5 websockets for real real-time; see django-websocket).
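A rough sketch of that idea, assuming the django-notifications package and a hypothetical owner field on the Task model; the polling view and its URL are illustrative:

# tasks.py - inside the periodic task, attach a notification to the user
from notifications.signals import notify

def remind(task):
    # task.owner is an assumed field pointing at the user to notify
    notify.send(task, recipient=task.owner, verb="is due now")


# views.py - the front end polls this and shows the Bootstrap modal if non-empty
from django.http import JsonResponse

def unread_notifications(request):
    unread = request.user.notifications.unread()
    data = [str(n) for n in unread]
    unread.mark_all_as_read()
    return JsonResponse({"notifications": data})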
Yes, it is possible, but it is not easy. Ways to do or emulate server-to-client communication:
polling
The most trivial approach would be polling the server from JavaScript. Your Celery task could create rows in your database that can be fetched from a URL like /updates, which checks for new updates, marks the rows as read and returns them.
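A minimal sketch of such an /updates view, assuming a hypothetical Update model with user, message and read fields that the Celery task writes to:

# views.py - polled from JavaScript every few seconds
from django.http import JsonResponse
from .models import Update  # hypothetical model the Celery task writes rows to

def updates(request):
    pending = Update.objects.filter(user=request.user, read=False)
    payload = [u.message for u in pending]
    pending.update(read=True)  # mark the rows as delivered
    return JsonResponse({"updates": payload})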
long polling
Often referred to as Comet. The client makes a request to the server, which pends until the server decides to return something. See django-comet for an example.
websocket
To enable true server to client communication you need an open connection from the client to the server. django-socketio and django-websocket are examples of reusable apps that make this possible.
My advice judging by your question's context: either do some basic polling or stick with the emails.
How can I access the result of a Celery task in my main Django application process? Or, how can I publish to an existing socket connection from a separate process?
I have an application in which users receive scores. When a score is recorded, calculations are made (progress towards goals, etc), and based on those calculations notifications are sent to interested users. The calculations may take 30s+, so to avoid sluggish UI those operations are performed in a background process via a Celery task, invoked by the post_save signal of my Score model.
Ideally the post_save signal on my Notification model would publish a message to subscribed clients (I'm using django-socketio, a wrapper for gevent-socketio). This seems straightforward...
Create a Score
Do some calculations on the new Score instance in a background process
Based on those calculations, create a Notification
On Notification save, grab the instance and publish to subscribed clients via socket connection
However after trying the following I'm not sure this is possible:
passing gevent's SocketIOServer instance to the callback method invoked by the task, but this requires pickling the passed object, which isn't possible
storing the socket's session_id (different from Django's session_id) in memcached and retrieving it in the Celery task process.
using Redis pub/sub, so methods called by post_save signals on models created in a background process could simply publish to a Redis channel; but listening to that channel in the main application process (which has access to the socket connection) blocks the rest of the application.
I've also tried spawning new threads for each Redis client, one per socket subscriber. As far as I can tell this requires spawning a new gevent.greenlets.Greenlet, and gevent can't be used in multiple threads.
Surely this is a solved problem. What am I missing?
You already have django-socketio; writing a pub/sub with Redis on top of it would be a pity :)
client side:
var socket = new io.Socket();
socket.connect();
socket.on('connect', function() {
    socket.subscribe({{ score_update_channel }});
});
server side:
from django_socketio import broadcast_channel

def user_score_update(user):
    return 'score_updates_user_%s' % user.pk

channel = user_score_update(user)
broadcast_channel(score_result_data, channel)
You need to run the broadcast in the django-socketio process; if you run it from a different process (a Celery worker) it will not work, because channels are held in memory by the django-socketio process. You can solve this by wrapping the broadcast in a view that Celery calls (making a real HTTP request) when the task is complete.
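A rough sketch of that wrapping, assuming the requests library on the worker side; the URL, view name and payload shape are illustrative:

# views.py - runs inside the django-socketio process, where the channel lives
from django.http import HttpResponse
from django_socketio import broadcast_channel

def push_score_update(request):
    broadcast_channel(request.POST.get("data"), request.POST.get("channel"))
    return HttpResponse("ok")


# tasks.py - the Celery worker calls that view over HTTP when it is done
import requests

def notify_socket_process(data, channel):
    requests.post("http://localhost:8000/push-score-update/",
                  data={"data": data, "channel": channel})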