Sustain an HTTP connection while Django processes a big request (20 mins+) - django

I've got a Django site that is producing a CSV download. The content of the CSV is dictated by user-defined parameters. It's possible that users will set parameters that require significant thinking time on the server. I need a way of sustaining the HTTP connection so the browser doesn't kick up an error message. I heard that it's possible to send intermittent HTTP headers to do this. Can anyone point me in the right direction to set this up on a Django site?
(Unfortunately I'm stuck with the possibility of slow reports - improving my SQL won't mitigate this.)

Don't do it online. Trigger an offline task, use a bit of JavaScript to repeatedly call a view that checks if the task has finished, and redirect to the finished file when it's ready.
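For example, here's a minimal sketch of that pattern using Celery (django-rq works the same way); build_report, run_slow_query, and the URL wiring are hypothetical, not from the question:

import csv
import uuid

from celery import shared_task
from celery.result import AsyncResult
from django.http import JsonResponse

@shared_task
def build_report(params):
    # Hypothetical slow work: stream rows from a slow query into a CSV file.
    path = f"/tmp/report-{uuid.uuid4().hex}.csv"
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for row in run_slow_query(params):  # hypothetical helper
            writer.writerow(row)
    return path  # stored in the Celery result backend

def start_report(request):
    # Kick off the task and return immediately with its ID.
    result = build_report.delay(dict(request.GET.items()))
    return JsonResponse({"task_id": result.id})

def report_status(request, task_id):
    # The JavaScript polls this view and redirects to the file once done.
    result = AsyncResult(task_id)
    if result.ready():
        return JsonResponse({"done": True, "path": result.get()})
    return JsonResponse({"done": False})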

Instead of blocking the user and their browser for 20 minutes (which is not a good idea), do the time-consuming task in the background. When the task finishes and the result is ready, notify the user so they can simply download it.

Related

django with heavy computation and long runtime - offline computation and send results

I have a django app where the user sends a request, and the server does some SQL lookup, followed by computation on results and finally showing the results to the user.
The SQL lookup and the computation afterwards can take a long time, maybe 30+ minutes. I have seen some webpages ask for an email address in such cases and then send you the URL later. But I'm not sure how this can be done in Django, or whether there are other options for this situation. Any pointers would be very helpful.
(I'm sorry, but as I said it's a rather general question; I don't know how I can provide minimal runnable code for this.)
One way to accomplish this would be to use something like Celery, which is a distributed task queue. The processing task would go into the queue (to run asynchronously), and when the task is complete it would call a function to send an email to the user alerting them that the result is ready.
Documentation: https://docs.celeryproject.org/en/stable/django/first-steps-with-django.html
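A rough sketch of that flow, assuming a configured Celery app and Django's email settings; run_computation and the addresses are placeholders:

from celery import shared_task
from django.core.mail import send_mail

@shared_task
def compute_and_notify(query_params, user_email):
    # Hypothetical long-running SQL lookup plus computation; assumed to
    # return a URL where the stored result can be viewed.
    result_url = run_computation(query_params)
    send_mail(
        subject="Your results are ready",
        message=f"View your results here: {result_url}",
        from_email="noreply@example.com",
        recipient_list=[user_email],
    )

The view handling the original request would just call compute_and_notify.delay(params, request.user.email) and return a "we'll email you" page immediately.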

How to force all users' browsers to refresh for a software update

I have a number of web applications that run for a number of businesses, day in and day out.
The applications are in PHP/MySQL/JS, running on a remote Apache server.
For many years, I have performed updates at late night when the software is not in use.
I would like to be able to perform updates to the software during working hours, if possible.
I have many times asked my clients to make sure they shut the software down at night, and close their browsers - but can never guarantee that they have done so.
I have a refresh timer in the JS that triggers a browser refresh at 11:59. It will happen if the browser is still open.
But I would like to be able to trigger this refresh in any open browser, whenever I want.
I have mulled over a few ways to do this - including cron and database values that can be read and reset - but:
I wonder if anyone has had success with achieving this?
You want to refresh all open browser tabs that are pointing at your xAMP-ish applications. A few questions:
Does the refresh need to be immediate, or can it be deferred? that is, do everyone's tabs need to be refreshed at the same time, regardless of user interaction; or is it acceptable to wait until the next request from each client, whenever it may be?
Can you schedule the refresh ahead of time (say, with at least 1 session-timeout interval lead-up time), or do you need a method that triggers refreshes immediately?
If you require immediate refreshes, with no ahead-of-time scheduling, you are out of luck. The only way to do this is to keep an open channel for asynchronous updates from the server to the clients, which is hard to do with plain Apache/PHP (see comet, websockets).
If you can make do with deferred refreshes (waiting until a user submits a request), you have several alternatives. For example, you can
expire all sessions (by calling a script that removes all the corresponding server-side session files, found in /var/lib/php/sessions/ on Linux). Note that your users will not appreciate losing, say, their shopping-cart contents.
use JavaScript to check a client-side version value (loaded at login-time, and kept in localStorage or similar) against incoming replies from the server (which would load it from a configuration file or a DB request). If the server-side value has changed, save whatever can be saved to localStorage (to avoid the previous scenario), inform the user, and refresh the page.
Alternatively, if you can schedule the refreshes with enough forewarning, you can include instructions in server replies that will invoke the refresh mechanism when needed. For example, such replies could change your current "reset at 11:59:59" code to read "reset at $requested_reset_time".
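The question's stack is PHP, but the server half of that version-check idea has the same shape in any stack; here is a hedged sketch in Django terms, where APP_VERSION is an assumed setting bumped on each deploy:

from django.conf import settings

class AppVersionMiddleware:
    """Attach the current application version to every response."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        response = self.get_response(request)
        # Client-side JS compares this header against the version it loaded
        # with; on a mismatch it saves what it can and refreshes the page.
        response["X-App-Version"] = settings.APP_VERSION
        return response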
As I understand the problem, you would want control over when the user sees 'fresh' content and when the cached stuff is okay. If this is right,
Add the following in your head content -
<meta http-equiv="Cache-Control" content="no-cache, no-store, must-revalidate" />
<meta http-equiv="Pragma" content="no-cache" />
<meta http-equiv="Expires" content="0" />
Upon receiving these headers, the user's browser will fetch fresh content rather than serving it from its cache. You can flip the above lines on and off to suit your needs. This might not be the most sophisticated way of achieving the desired functionality, but it's worth trying.
There are a lot of things to consider before doing something like this. For example, if someone is actively working on a page, maybe filling out a form or something and you were able to refresh their window, that could create a negative user experience. I believe some of the other answers here addressed some other concerns as well.
That said, I know from working with the LaunchDarkly feature-flag service that it can be done. I don't understand all the inner workings, unfortunately, but my understanding is that the service uses observables to watch for updates. Observables are similar to promises, except they continuously watch for new changes to their target. You could then force a page reload (or perhaps show an alert to the user, prompting one) when the target updates.

How to handle file processing request in Django?

I am making a Django REST framework based server, and in one of the requests I get an audio file from the front-end, on which I need to run an ML-based algorithm (I have a script for this) and respond back to the user with the result. The problem is that this request might take 5-10 seconds to execute. I am trying to understand the following things:
Will Celery help me reduce the workload on the server, given that in any case I need to wait for the result of the ML algorithm and respond back to the user?
Should I create a different server to handle this type of request? Will that be a better approach?
Also, is my flow of doing things correct? First, upload the file to some cloud platform for storage and serialize the instance to get the URL of the file. Second, run the script using Celery and wait for the result. Third, respond back with the result.
Thanks for helping.
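For reference, a hedged sketch of the flow described in the question, assuming Celery and Django REST framework; AudioUploadSerializer and run_ml_algo are hypothetical placeholders:

from celery import shared_task
from celery.result import AsyncResult
from rest_framework.response import Response
from rest_framework.views import APIView

@shared_task
def analyze_audio(file_url):
    # Hypothetical wrapper around the ML script.
    return run_ml_algo(file_url)

class AudioAnalysisView(APIView):
    def post(self, request):
        # Step 1: upload/serialize; the storage backend gives us a URL.
        serializer = AudioUploadSerializer(data=request.data)
        serializer.is_valid(raise_exception=True)
        instance = serializer.save()
        # Step 2: enqueue the script and respond without blocking.
        task = analyze_audio.delay(instance.file.url)
        return Response({"task_id": task.id}, status=202)

    def get(self, request):
        # Step 3: the client polls here for the result.
        result = AsyncResult(request.query_params["task_id"])
        if result.ready():
            return Response({"status": "done", "result": result.get()})
        return Response({"status": "pending"})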

Returning the result of a Celery task to the client in a Django template

So I'm trying to accomplish the following. The user browses the webpage, and at the same time there is a task running in the background. When the task completes, it should return args where one of the args is flag: True, in order to trigger JavaScript that shows a modal form.
I tested it before without async tasks and it works, but now with Celery it just stores the results in the database. I did some research on tornado-celery and related stuff, but some of the components like tornado-redis are not maintained anymore, so in my opinion it would not be wise to use them.
So what are my options? Thanks.
If I understand you correctly, then you want to communicate something from the server side back to the client. You generally have three options for that:
1) Make a long-pending request to the server - kinda bad. Skipping over the details: it will bog down your web server if it isn't configured to handle that, it will make your site score low on performance tests, and if the request fails, everything fails.
2) Poll the server with repeated requests at a time interval (0.2 s, something like that) - better. It will increase the traffic, but the requests will be tiny and will not interfere with the site's performance very much. If you set a long interval so as not to load the server with pointless requests, the users will see the data with a bit of a delay. On the upside, this will not fail (if written correctly) even if the connection is interrupted.
3) WebSockets, where the server can just hit the client with any message whenever needed - nice, but takes some time to get used to. If you want to try it, you can use django-channels, which is a nice library for Django WebSockets.
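A bare-bones django-channels consumer for option 3 might look like the following; the routing configuration and the code that fires the group message (e.g. from the finished Celery task) are omitted, and the group and event names are arbitrary:

import json

from channels.generic.websocket import AsyncWebsocketConsumer

class TaskStatusConsumer(AsyncWebsocketConsumer):
    async def connect(self):
        # Join a group so the server side can broadcast to this socket.
        await self.channel_layer.group_add("task_updates", self.channel_name)
        await self.accept()

    async def disconnect(self, close_code):
        await self.channel_layer.group_discard("task_updates", self.channel_name)

    async def task_finished(self, event):
        # Handles group messages of type "task.finished"; forwards the
        # flag the question's JavaScript looks for.
        await self.send(text_data=json.dumps({"flag": True, "args": event["args"]}))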
If I did not understand you correctly, and this is not the problem at hand, and you are instead figuring out how to get data back from a Celery task into Django, then you can store the Celery task IDs and use them to first check whether the task is completed, and then fetch the result from Celery.
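For that last approach, a minimal status view, assuming the task ID was saved when the task was queued:

from celery.result import AsyncResult
from django.http import JsonResponse

def task_status(request, task_id):
    result = AsyncResult(task_id)
    if result.successful():
        # flag: True is what the question's JavaScript waits for.
        return JsonResponse({"flag": True, "data": result.get()})
    return JsonResponse({"flag": False, "state": result.state})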

How to update a progress bar while making a Django REST API request?

My Django REST app accepts requests to scrape multiple pages for prices and compare them (which takes ~5 seconds), then returns a list of the prices from each page as a JSON object.
I want to update the user on the current operation; for example, if I scrape 3 pages I want to update the interface like this:
Searching 1/3
Searching 2/3
Searching 3/3
How can I do this?
I am using Angular 2 for my front end but this shouldn't make a big difference as it's a backend issue.
This isn't the only way, but this is how I do this in Django.
Things you'll need
Asynchronous worker processes
This allows you to do work outside the context of the request-response cycle. The most common are either django-rq or Celery. I'd recommend django-rq for its simplicity, especially if all you're implementing is a progress indicator.
Caching layer (optional)
While you could use the database for persistence in this case, a temporary key-value cache makes more sense here, as the progress information is ephemeral. The Memcached backend is built into Django; however, I'd recommend switching to Redis, as it's more fully featured, super fast, and, since it's behind Django's caching abstraction, does not add complexity. (It's also a requirement for using the django-rq worker processes above.)
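For example, one possible settings.py cache block, assuming the django-redis package and a local Redis instance (the URL and database number are arbitrary):

# settings.py
CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379/1",
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient",
        },
    }
}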
Implementation
Overview
Basically, we're going to send a request to the server to start the async worker, and poll a different progress-indicator endpoint which gives the current status of that worker's progress until it's finished (or failed).
Server side
Refactor the function you'd like to track the progress of into an async task function (using the @job decorator in the case of django-rq).
The initial POST endpoint should first generate a random unique ID to identify the request (possibly with uuid). Then, pass the POST data along with this unique ID to the async function (in django-rq this would look something like function_name.delay(payload, unique_id)). Since this is an async call, the interpreter does not wait for the task to finish and moves on immediately. Return an HttpResponse with a JSON payload that includes the unique ID.
Back in the async function, we need to set the progress using the cache. At the very top of the function, add cache.set(unique_id, 0) to show that there is zero progress so far. Using your own math, as the progress approaches 100% completion, move this value closer to 1. If for some reason the operation fails, you can set this to -1.
Create a new endpoint to be polled by the browser to check the progress. This looks for a unique_id query parameter and uses it to look up the progress with cache.get(unique_id). Return a JSON object with the progress amount; a sketch of these server-side pieces follows below.
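Put together, a sketch of those three server-side steps using django-rq and the Django cache; expand, process, and the view wiring are hypothetical:

import uuid

from django.core.cache import cache
from django.http import JsonResponse
from django_rq import job

@job
def long_task(payload, unique_id):
    cache.set(unique_id, 0)  # zero progress so far
    try:
        items = expand(payload)  # hypothetical work plan
        for i, item in enumerate(items, start=1):
            process(item)  # hypothetical unit of work
            cache.set(unique_id, i / len(items))  # approaches 1 as we finish
    except Exception:
        cache.set(unique_id, -1)  # signal failure to the poller

def start(request):
    unique_id = uuid.uuid4().hex
    long_task.delay(request.POST.dict(), unique_id)  # returns immediately
    return JsonResponse({"unique_id": unique_id})

def progress(request):
    return JsonResponse({"progress": cache.get(request.GET["unique_id"])})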
Client side
After sending the POST request for the action and receiving a response, that response should include the unique_id. Immediately start polling the progress endpoint at a regular interval, setting the unique_id as a query parameter. The interval could be something like 1 second using setInterval(), with logic to prevent sending a new request if there is still a pending request.
When the progress received equals 1 (or -1 for a failure), you know the process is finished and you can stop polling.
That's it! It's a bit of work just to get progress indicators, but once you've done it once it's much easier to re-use the pattern in other projects.
Another way to do this, which I have not explored, is via WebSockets / Channels. That way no polling is required, and the server simply pushes messages to the client directly.