Continue request django rest framework - django

I have a request that lasts more than 3 minutes, I want the request to be sent and immediately give the answer 200 and after the end of the work - give the result

The workflow you've described is called asynchronous task execution.
The main idea is to remove time or resource consuming parts of work from the code that handles HTTP requests and deligate it to some kind of worker. The worker might be a diffrent thread or process or even a separate service that runs on a different server.
This makes your application more responsive, as the users gets the HTTP response much quicker. Also, with this approach you can display such UI-friendly things as progress bars and status marks for the task, create retrial policies if task failes etc.
Example workflow:
user makes HTTP request initiating the task
the server creates the task, adds it to the queue and returns the HTTP response with task_id immediately
the front-end code starts ajax polling to get the results of the task passing task_id
the server handles polling HTTP requests and gets status information for this task_id. It returns the info (whether results or "still waiting") with the HTTP response
the front-end displays spinner if server returns "still waiting" or the results if they are ready
The most popular way to do this in Django is using the celery disctributed task queue.

Suppose a request comes, you will have to verify it. Then send response and use a mechanism to complete the request in the background. You will have to be clear that the request can be completed. You can use pipelining, where you put every task into pipeline, Django-Celery is an option but don't use it unless required. Find easy way to resolve the issue

Related

Execute code after Response using Django's Async Views

I'm trying to execute a long running function (ex: sleep(30)) after a Django view returns a response. I've tried implementing the solutions suggested to similar questions:
How to execute code in Django after response has been sent
Execute code in Django after response has been sent to the
client
However, the client's page load only completes after the long running function completes running when using a WSGI server like gunicorn.
Now that Django supports asynchronous views is it possible to run a long running query asynchronously?
Obviously, I am looking for a solution regarding the same issue, to open a view which should start a background task and send a response to the client without waiting until started task is finished.
As far as I understand yet this is not one of the objectives of async view in Django. The problem is that all executed code is connected the the worker started to handle the http request. If the response is sent back to the client the worker cannot handle any other code / task anymore started in the view asynchronous. Therefore, all async functions require an "await" in front of. Consequently, the view will only send its response to the client if the awaited function is finished.
As I understand all background tasks must be pushed in a queue of tasks where another worker can catch each new task. There are several solution for this, like Djangp Channels or Django Q. However, I am not sure what is the most lightweighted solution.

Best place to retry a task in Django: requests or celery task

I am hitting an API using requests library in Django inside a celery task. To be very specific, it fetches some record from database, prepares a json and does a POST request. In the certain case scenario, the call fails with 500 error code. I want to retry the POST request again. What's the best way to go about it and why?
Retry the Celery task itself (See implementation)
Retry the request using urllib.util.retry (See full implementation)
Each Celery job runs in a separate process. In your case, repeating a 500-returned POST request has nothing to do with creating another process. One process will and should be enough of handling such request. So you need to retry the request using urllib.util.retry in the same Celery job and then terminate the job until you get a response with code 200.

How to update progress bar while making a Django Rest api request?

My django rest app accepts request to scrape multiple pages for prices & compare them (which takes time ~5 seconds) then returns a list of the prices from each page as a json object.
I want to update the user with the current operation, for example if I scrape 3 pages I want to update the interface like this :
Searching 1/3
Searching 2/3
Searching 3/3
How can I do this?
I am using Angular 2 for my front end but this shouldn't make a big difference as it's a backend issue.
This isn't the only way, but this is how I do this in Django.
Things you'll need
Asynchronous worker procecess
This allows you to do work outside the context of the request-response cycle. The most common are either django-rq or Celery. I'd recommend django-rq for its simplicity, especially if all you're implementing is a progress indicator.
Caching layer (optional)
While you can use the database for persistence in this case, temporary cache key-value stores make more sense here as the progress information is ephemeral. The Memcached backend is built into Django, however I'd recommend switching to Redis as it's more fully featured, super fast, and since it's behind Django's caching abstraction, does not add complexity. (It's also a requirement for using the django-rq worker processes above)
Implementation
Overview
Basically, we're going to send a request to the server to start the async worker, and poll a different progress-indicator endpoint which gives the current status of that worker's progress until it's finished (or failed).
Server side
Refactor the function you'd like to track the progress of into an async task function (using the #job decorator in the case of django-rq)
The initial POST endpoint should first generate a random unique ID to identify the request (possibly with uuid). Then, pass the POST data along with this unique ID to the async function (in django-rq this would look something like function_name.delay(payload, unique_id)). Since this is an async call, the interpreter does not wait for the task to finish and moves on immediately. Return a HttpResponse with a JSON payload that includes the unique ID.
Back in the async function, we need to set the progress using cache. At the very top of the function, we should add a cache.set(unique_id, 0) to show that there is zero progress so far. Using your own math implementation, as the progress approaches 100% completion, change this value to be closer to 1. If for some reason the operation fails, you can set this to -1.
Create a new endpoint to be polled by the browser to check the progress. This looks for a unique_id query parameter and uses this to look up the progress with cache.get(unique_id). Return a JSON object back with the progress amount.
Client side
After sending the POST request for the action and receiving a response, that response should include the unique_id. Immediately start polling the progress endpoint at a regular interval, setting the unique_id as a query parameter. The interval could be something like 1 second using setInterval(), with logic to prevent sending a new request if there is still a pending request.
When the progress received equals to 1 (or -1 for failures), you know the process is finished and you can stop polling
That's it! It's a bit of work just to get progress indicators, but once you've done it once it's much easier to re-use the pattern in other projects.
Another way to do this which I have not explored is via Webhooks / Channels. In this way, polling is not required, and the server simply sends the messages to the client directly.

Django concurrency with celery

I am using django framework and ran into some performance problems.
There is a very heavy (which costs about 2 seconds) in my views.py. And let's call it heavy().
The client uses ajax to send a request, which is routed to heavy(), and waits for a json response.
The bad thing is that, I think heavy() is not concurrent. As shown in the image below, if there are two requests routed to heavy() at the same time, one must wait for another. In another word, heavy() is serial: it cannot take another request before returning from current request. The observation is tested and proven on my local machine.
I am trying to make the functions in views.py concurrent and asynchronous. Ideally, when there are two requests coming to heavy(), heavy() should throw the job to some remote worker with a callback, and return. Then, heavy() can process another request. When the task is done, the callback can send the results back to client. The logic is demonstrated as below:
However, there is a problem: if heavy() wants to process another request, it must return; but if it returns something, the django framework will send a (fake)response to the client, and the client may not wait for another response. Moreover, the fake response doesn't contain the correct data. I have searched throught stackoverflow and find less useful tips. I wonder if anyone have tried this and knows a good way to solve this problem.
Thanks,
First make sure that 'inconcurrency' is actually caused by your heavy task. If you're using only one worker for django, you will be able to process only one request at a time, no matter what it will be. Consider having more workers for some concurrency, because it will affect also short requests.
For returning some information when task is done, you can do it in at least two ways:
sending AJAX requests periodicaly to fetch status of your task
using SSE or websocket to subscribe for actual result
Both of them will require to write some more JavaScript code for handling it. First one is really easy achievable, for second one you can use uWSGI capabilities, as described here. It can be handled asynchronously that way, independently of your django workers (django will just create connection and start task in celery, checking status and sending it to client will be handled by gevent.
To follow up on GwynBliedD's answer:
celery is commonly used to process tasks, it has very simple django integration. #GwynBlieD's first suggestion is very commonly implemented using celery and a celery result backend.
https://www.reddit.com/r/django/comments/1wx587/how_do_i_return_the_result_of_a_celery_task_to/
A common workflow Using celery is:
client hits heavy()
heavy() queues heavy() task asynchronously
heavy() returns future task ID to client (view returns very quickly because little work was actually performed)
client starts polling a status endpoint using the task ID
when task completes status returns result to client

Django asynchronous requests

Does Django have something similar to ASP.NET MVC's Asynchronous Controller?
I have some requests that will be handled by celery workers, but won't take a long time (a few seconds). I want the clients to get the response after the worker is done. I can have my view function wait for the task to complete, but I'm worried it will put too much burden on the webserver.
Clarification:
Here's the flow I can have today
def my_view(request):
async = my_task.delay(params)
result = async.get()
return my_response(result)
async.get() can take a few seconds - not too long so that the client can't wait for the HTTP response to get back.
This code might put unnecessary strain on the server. What ASP.NET MVC's AsynchronousController provides, is the ability to break this function in two, something similar to this:
def my_view(request):
async = my_task.delay(params)
return DelayedResponse(async, lambda result=>my_response(result))
This releases the webserver to handle other requests until the async operation is done. Once it done, it will execute the lambda expression on the result, giving back the response.
Instead of waiting for request to complete, you can return status "In progress" an then send one more request to check if status has changed. Since you're doing pure lookups, the response will be very fast and won't put much burden on your web server.
You can outsource this specific view/feature to Tornado web server which is designed for async callback. The rest of the site may continue to run on django.
Most likely the solution should be not technical, but in UI/UX area. If something takes long, it's ok to notify user about it if notification is clear.
Yes, you can do something only when the task completes. You would want to look into something called chain(). You can bind celery tasks in chain:
chain = first_function.s(set) | second_Function.s(do)
chain()
These two functions first_function and second_function will both are celery functions. The second_function is executed only when first_function finishes its execution.