Faking a Streaming Response in Django to Avoid Heroku Timeout

I have a Django app that uses django-wkhtmltopdf to generate PDFs on Heroku. Some of the responses exceed the 30-second timeout. Because this is a proof of concept running on the free tier, I'd prefer not to tear apart what I have to move to a worker/poll process. My current view looks like this:
def dispatch(self, request, *args, **kwargs):
    do_custom_stuff()
    return super(MyViewClass, self).dispatch(request, *args, **kwargs)
Is there a way I can override the dispatch method of the view class to fake a streaming response like this, or with the empty-chunking approach mentioned here, to send an empty response until the PDF is rendered? Sending an empty byte resets the timeout clock, giving plenty of time to send the PDF.
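For reference, the empty-chunking idea might look something like the sketch below. This is only an illustration, not a vetted fix: render_pdf is a hypothetical stand-in for the wkhtmltopdf call, and the keep-alive bytes become part of the response body, which a PDF viewer may or may not tolerate.

import threading
from django.http import StreamingHttpResponse

def streaming_pdf_view(request):
    result = {}

    def render():
        result['pdf'] = render_pdf(request)  # hypothetical blocking render call

    worker = threading.Thread(target=render)
    worker.start()

    def body():
        while True:
            worker.join(timeout=5.0)   # wait up to 5s for the render to finish
            if not worker.is_alive():
                break
            yield b' '                 # each byte sent resets Heroku's 30s window
        yield result.get('pdf', b'')   # the real payload (empty if rendering failed)

    return StreamingHttpResponse(body(), content_type='application/pdf')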

I solved a similar problem using Celery, something like this:
def start_long_process_view(request, pk):
    task = do_long_processing_stuff.delay()
    return HttpResponse(f'{{"task": "{task.id}"}}')
Then you can have a second view that checks the task state:
from celery.result import AsyncResult

def check_long_process(request, task_id):
    result = AsyncResult(task_id)
    return HttpResponse(f'{{"state": "{result.state}"}}')
Finally, using JavaScript you can fetch the status right after the task is started. Polling every half second will be more than enough to give your users good feedback.
If you think Celery is too much, there are lightweight alternatives that will work just fine: https://djangopackages.org/grids/g/workers-queues-tasks/

Django - on_rollback using transaction.atomic

What exactly I want to do
On rollback, I want to run a do_something function.
Is it possible to do so?
Right now, transaction.atomic does its job perfectly, but it only takes care of the database. On rollback, I want to take care of a few other things using that do_something function I mentioned before.
My use cases for transaction.atomic are pretty simple and there's not much to say about, I just use it on typical views which are creating and updating objects.
Updated
Example View
@transaction.atomic
def create(self, request, *args, **kwargs):
    serializer = ConvertSerializer(data=request.data)
    serializer.is_valid(raise_exception=True)
    instance = ConvertModel.objects.create(
        request.user, request.converter, serializer.validated_data
    )
    # I have some code here which creates a JSON file on disk based on that instance
    create_json(request.user, request.converter)
    return Response(serializer.data, status=HTTP_201_CREATED)
Example on_rollback function
This function simply removes the JSON file which was created by the view.
def remove_json(file_path):
    os.remove(file_path)
The example is not exactly what I do now, but it's a pretty similar case. After the rollback, the database is okay, but I'm left with an invalid JSON file on disk, which can cause huge problems for me.
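Django ships transaction.on_commit but no on_rollback counterpart, so a common workaround is to catch the exception that triggers the rollback and compensate by hand. A sketch of the view above rewritten that way, assuming create_json returns the path of the file it writes (the original code doesn't show its return value):

from django.db import transaction

def create(self, request, *args, **kwargs):
    serializer = ConvertSerializer(data=request.data)
    serializer.is_valid(raise_exception=True)
    file_path = None
    try:
        with transaction.atomic():
            instance = ConvertModel.objects.create(
                request.user, request.converter, serializer.validated_data
            )
            file_path = create_json(request.user, request.converter)
    except Exception:
        if file_path is not None:
            remove_json(file_path)  # the DB was rolled back; remove the orphaned file
        raise
    return Response(serializer.data, status=HTTP_201_CREATED)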

How to implement callbacks asynchronously in Django?

I am building a web app using Django, which has two parts:
deploy Docker containers and store their expiry time in the database,
kill the containers when their expiry time is reached; however, a user of the app can choose to extend a container's life.
How do I implement the second part without polling the database?
I tried using asyncio and implemented a custom middleware in Django, but it blocks execution. Is there another way to do the job asynchronously?
import asyncio
from threading import Thread

def callback_func(eventloop):
    """
    check DB
    if now:
        kill
    else:
        register a new callback with new updated time
    """
    # Logic to kill a container goes here
    print("Inside callback")

class KillerMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response
        self._eventloop = asyncio.new_event_loop()
        asyncio.set_event_loop(self._eventloop)
        self._t = Thread(target=lambda: self._eventloop.run_forever())
        self._t.daemon = True
        self._t.start()

    def __call__(self, request):
        response = self.get_response(request)
        self._eventloop.call_later(86400, callback_func, self._eventloop)
        return response
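One caveat worth noting about the snippet above (an observation, not part of the original question): call_later is not thread-safe, and here it is invoked from the request thread while the loop runs in its own thread. The documented way to hand work to a loop owned by another thread is call_soon_threadsafe, along these lines:

# inside __call__, instead of calling call_later directly:
self._eventloop.call_soon_threadsafe(
    self._eventloop.call_later, 86400, callback_func, self._eventloop
)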
I really don't know much about python-asyncio, but it sounds like Django Channels would be right for this.

How to execute code in Django after response has been sent to the client (on PythonAnywhere)?

I'm looking for a way to execute code in Django after the response has been sent to the client. I know the usual way is to implement a task queue (e.g., Celery). However, the PaaS service I'm using (PythonAnywhere) doesn't support task queues as of May 2019. It also seems overly complex for a few simple use cases. I found the following solution on SO: Execute code in Django after response has been sent to the client. The accepted answer works great when run locally. However, in production on PythonAnywhere, it still blocks the response from being sent to the client. What is causing that?
Here's my implementation:
from time import sleep
from datetime import datetime
from django.http import HttpResponse

class HttpResponseThen(HttpResponse):
    """
    WARNING: THIS IS STILL BLOCKING THE PAGE LOAD ON PA
    Implements HttpResponse with a callback, i.e.,
    the callback function runs after the HTTP response.
    """
    def __init__(self, data, then_callback=lambda: 'hello world', **kwargs):
        super().__init__(data, **kwargs)
        self.then_callback = then_callback

    def close(self):
        super().close()
        return_value = self.then_callback()
        print(f"Callback return value: {return_value}")

def my_callback_function():
    sleep(20)
    print('This should print 20 seconds AFTER the page loads.')
    print('On PA, the page actually takes 20 seconds to load')

def test_view(request):
    return HttpResponseThen("Timestamp: " + str(datetime.now()),
                            then_callback=my_callback_function)  # This is still blocking on PA
I'm expecting the response to be sent to the client immediately, but it actually takes a full 20 seconds for the page to load. (On my laptop, the code works great. The response is sent immediately and the print statements execute 20 seconds later.)

How to send asynchronous HTTP requests from Django and wait for results in python2.7?

I have several APIs as sources of data, for example, blog posts. What I'm trying to achieve is to send requests to these APIs in parallel from a Django view and get the results. There's no need to store the results in the db; I need to pass them to my view response. My project is written in Python 2.7, so I can't use asyncio. I'm looking for advice on the best practice to solve this (Celery, Tornado, something else?) with examples of how to achieve it, because I'm only starting my way in async. Thanks.
One solution is to use Celery, passing your request args to it, and using AJAX on the front end.
Example:
def my_def(request):
    do_something_in_celery.delay()
    return Response(something)
To check whether a task has finished in Celery, you can keep the return value of .delay() in a variable:
task_run = do_something_in_celery.delay()
task_run has a property .id. You return this .id to your front end and use it to monitor the status of the task.
And the function executed by Celery must have the @task decorator:
@task
def do_something_in_celery(*args, **kwargs):
    ...
You will also need a broker to manage the tasks, such as Redis or RabbitMQ.
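For example, with Redis the broker is typically pointed at in settings.py (a sketch; it assumes the standard Celery-with-Django setup that reads settings under the CELERY_ namespace):

# settings.py
CELERY_BROKER_URL = 'redis://localhost:6379/0'
CELERY_RESULT_BACKEND = 'redis://localhost:6379/0'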
See these URLs:
http://masnun.com/2014/08/02/django-celery-easy-async-task-processing.html
https://buildwithdjango.com/blog/post/celery-progress-bars/
http://docs.celeryproject.org/en/latest/index.html
I found a solution using ThreadPoolExecutor from concurrent.futures (available on Python 2.7 via the futures backport lib).
import concurrent.futures
import urllib.request  # NB: Python 3; on 2.7 you'd use urllib2.urlopen instead

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))
You can also check out the rest of the concurrent.futures doc.
Important!
The ProcessPoolExecutor class has known (unfixable) problems on Python 2 and should not be relied on for mission-critical work.

Execute code in Django after response has been sent to the client

In my Django application I want to keep track of whether a response has been sent to the client successfully. I am well aware that there is no "watertight" way in a connectionless protocol like HTTP to ensure the client has received (and displayed) a response, so this will not be mission-critical functionality, but still I want to do this at the latest possible time. The response will not be HTML so any callbacks from the client (using Javascript or IMG tags etc.) are not possible.
The "latest" hook I can find would be adding a custom middleware implementing process_response at the first position of the middleware list, but to my understanding this is executed before the actual response is constructed and sent to the client. Are there any hooks/events in Django to execute code after the response has been sent successfully?
The method I am going for at the moment uses a subclass of HttpResponse:
from django.http import HttpResponse

# use custom response class to override HttpResponse.close()
class LogSuccessResponse(HttpResponse):
    def close(self):
        super(LogSuccessResponse, self).close()
        # do whatever you want, this is the last codepoint in request handling
        if self.status_code == 200:
            print('HttpResponse successful: %s' % self.status_code)

# this would be the view definition
def logging_view(request):
    response = LogSuccessResponse('Hello World', content_type='text/plain')
    return response
By reading the Django code I am very much convinced that HttpResponse.close() is the latest point to inject code into the request handling. I am not sure if there really are error cases that are handled better by this method compared to the ones mentioned above, so I am leaving the question open for now.
The reasons I prefer this approach to the others mentioned in lazerscience's answer are that it can be set up in the view alone and does not require middleware to be installed. Using the request_finished signal, on the other hand, wouldn't allow me to access the response object.
If you need to do this a lot, a useful trick is to have a special response class like:
from rest_framework import status              # assumed: the example pairs Response
from rest_framework.response import Response   # and status from Django REST Framework

class ResponseThen(Response):
    def __init__(self, data, then_callback, **kwargs):
        super().__init__(data, **kwargs)
        self.then_callback = then_callback

    def close(self):
        super().close()
        self.then_callback()

def some_view(request):
    # ...code to run before response is returned to client
    def do_after():
        # ...code to run *after* response is returned to client
        pass

    return ResponseThen(some_data, do_after, status=status.HTTP_200_OK)
...helps if you want a quick/hacky "fire and forget" solution without bothering to integrate a proper task queue or split off a separate microservice from your app.
I suppose when talking about middleware you are thinking about the middleware's process_request method, but there's also a process_response method that is called when the HttpResponse object is returned. I guess that will be the latest moment where you can find a hook that you can use.
Furthermore, there's also a request_finished signal being fired.
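For reference, wiring up that signal looks something like this (a minimal sketch; note the handler receives neither the request nor the response object, which is the limitation mentioned above):

from django.core.signals import request_finished
from django.dispatch import receiver

@receiver(request_finished)
def after_request(sender, **kwargs):
    # runs once the response has been closed; no response object is available here
    print('response sent and closed')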
I modified Florian Ledermann's idea a little bit, so someone can use HttpResponse normally but can also define a function and bind it to that specific response.
old_response_close = HttpResponse.close
HttpResponse.func = None

def new_response_close(self):
    old_response_close(self)
    if self.func is not None:
        self.func()

HttpResponse.close = new_response_close
It can be used via:
def myview(request):
    def myfunc():
        print("stuff to do")
    resp = HttpResponse(status=200)
    resp.func = myfunc
    return resp
I was looking for a way to send a response, then execute some time-consuming code after... but if I can get a background task (most likely Celery) to run, that renders this useless to me: I can just kick off the background task before the return statement. It should be asynchronous, so the response will be returned before the code finishes executing.
---EDIT---
I finally got Celery to work with AWS SQS. I basically posted a "how to"; check out my answer on this post:
Cannot start Celery Worker (Kombu.asynchronous.timer)
I found a filthy trick to do this by accessing a protected member in HttpResponse.
def some_view(request):
    # ...code to run before response is returned to client
    def do_after():
        # ...code to run *after* response is returned to client
        pass

    response = HttpResponse()
    response._resource_closers.append(do_after)
    return response
It works in Django 3.0.6; see the close() method in the source of HttpResponse:
def close(self):
    for closer in self._resource_closers:
        try:
            closer()
        except Exception:
            pass
    # Free resources that were still referenced.
    self._resource_closers.clear()
    self.closed = True
    signals.request_finished.send(sender=self._handler_class)