How to have Django give an HTTP response before continuing on to complete a task associated with the request? - django

In my django piston API, I want to yield/return an HTTP response to the client before calling another function that will take quite some time. How do I make the yield give an HTTP response containing the desired JSON and not a string relating to the creation of a generator object?
My piston handler method looks like so:
def create(self, request):
    data = request.data
    *other operations......................*
    incident.save()
    response = rc.CREATED
    response.content = {"id": str(incident.id)}
    yield response
    manage_incident(incident)
Instead of the response I want, like:
{"id":"13"}
The client gets a string like this:
"<generator object create at 0x102c50050>"
EDIT:
I realise that using yield was the wrong way to go about this. In essence, what I am trying to achieve is that the client receives a response right away, before the server moves on to the time-costly manage_incident() function.

This doesn't have anything to do with generators or yielding, but I've used the following code and decorator to have things run in the background while returning the client an HTTP response immediately.
Usage:
@postpone
def long_process():
    do things...

def some_view(request):
    long_process()
    return HttpResponse(...)
And here's the code to make it work:
import atexit
import Queue
import threading

from django.core.mail import mail_admins


def _worker():
    while True:
        func, args, kwargs = _queue.get()
        try:
            func(*args, **kwargs)
        except:
            import traceback
            details = traceback.format_exc()
            mail_admins('Background process exception', details)
        finally:
            _queue.task_done()  # so we can join at exit


def postpone(func):
    def decorator(*args, **kwargs):
        _queue.put((func, args, kwargs))
    return decorator


_queue = Queue.Queue()
_thread = threading.Thread(target=_worker)
_thread.daemon = True
_thread.start()


def _cleanup():
    _queue.join()  # so we don't exit too soon


atexit.register(_cleanup)
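Applied to the question, the asker's manage_incident() could simply be decorated with @postpone so the handler returns at once; a minimal sketch reusing the names from the question (the handler body is abbreviated):

    @postpone
    def manage_incident(incident):
        ...  # the time-costly work now runs on the queue thread

    def create(self, request):
        # ... build and save the incident as before ...
        response = rc.CREATED
        response.content = {"id": str(incident.id)}
        manage_incident(incident)  # enqueues and returns immediately
        return response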

Perhaps you could do something like this (be careful though):
import threading

def create(self, request):
    data = request.data
    # do stuff...
    t = threading.Thread(target=manage_incident,
                         args=(incident,))
    t.setDaemon(True)
    t.start()
    return response
Has anyone tried this? Is it safe? My guess is that it's not, mostly because of concurrency issues, but also because if you get a lot of requests you might end up with a lot of long-running threads. Still, it might be worth a shot.
Otherwise, you could just add the incident that needs to be managed to your database and handle it later via a cron job or something like that.
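A minimal sketch of that cron-based approach, assuming a hypothetical needs_processing flag on the incident model and a custom management command (all names here are illustrative, not from the original):

    # management/commands/process_incidents.py
    from django.core.management.base import BaseCommand

    from myapp.models import Incident            # hypothetical model path
    from myapp.incidents import manage_incident  # hypothetical helper path

    class Command(BaseCommand):
        help = "Process incidents queued by the API; run this from cron."

        def handle(self, *args, **options):
            for incident in Incident.objects.filter(needs_processing=True):
                manage_incident(incident)
                incident.needs_processing = False
                incident.save()

A crontab entry can then run python manage.py process_incidents every minute or so.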
I don't think Django is built for either concurrency or very time-consuming operations.
Edit
Someone has tried it; it seems to work.
Edit 2
These kinds of things are often better handled by background jobs. The Django Background Tasks library is nice, but there are others of course.
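For example, with django-background-tasks the long-running work could be queued like this (a sketch; it assumes the library is installed and its process_tasks worker is running, and task arguments must be JSON-serializable, hence passing the id rather than the object):

    from background_task import background

    @background(schedule=0)  # eligible to run as soon as a worker picks it up
    def manage_incident_task(incident_id):
        incident = Incident.objects.get(id=incident_id)
        manage_incident(incident)

    # In the view: calling the decorated function only queues the task.
    manage_incident_task(incident.id)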

You've turned your view into a generator thinking that Django will pick up on that fact and handle it appropriately. Well, it won't.
def create(self, request):
    return HttpResponse(real_create(request))
EDIT:
Since you seem to be having trouble... visualizing it...
def stuff():
    print 1
    yield 'foo'
    print 2

for i in stuff():
    print i
output:
1
foo
2

Related

Does Django Channels support a synchronous long-polling consumer?

I'm using Channels v2.
I want to integrate long-polling into my project.
The only consumer I see in the documentation for http long polling is the AsyncHttpConsumer.
The code I need to run in my handle function is not asynchronous. It connects to another device on the network using a library that is not asynchronous. From what I understand, this will cause the event loop to block, which is bad.
Can I run my handler synchronously, in a thread somehow? There's a SyncConsumer, but that seems to have something to do with Web Sockets. It doesn't seem applicable to Long Polling.
Using AsyncHttpConsumer as a reference, I was able to write an almost exact duplicate of the class, but subclassing SyncConsumer instead of AsyncConsumer as AsyncHttpConsumer does.
After a bit of testing, I soon realized that because my code was all running in a single thread, the disconnect() method wouldn't be triggered until the handle() method finished running. That meant there was no way to interrupt a long-running handle() method, even if the client disconnected.
The following new version runs handle() in a thread, and gives the user 2 ways to check if the client disconnected:
from threading import Thread, Event

from channels.consumer import SyncConsumer
from channels.exceptions import StopConsumer


# We can't pass self.client_disconnected to handle() as a reference if it's
# a regular bool. That means if we use a regular bool, and the variable
# changes in this thread, it won't change in the handle() method. Using a
# class fixes this.
# Technically, we could just pass the Event() object
# (self.client_disconnected) to the handle() method, but then the client
# needs to know to use .is_set() instead of just checking if it's True or
# False. This is easier for the client.
class RefBool:
    def __init__(self):
        self.val = Event()

    def set(self):
        self.val.set()

    def __bool__(self):
        return self.val.is_set()

    def __repr__(self):
        current_value = bool(self)
        return f"RefBool({current_value})"
class SyncHttpConsumer(SyncConsumer):
    """
    Sync HTTP consumer. Provides basic primitives for building synchronous
    HTTP endpoints.
    """

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.handle_thread = None
        self.client_disconnected = RefBool()
        self.body = []

    def send_headers(self, *, status=200, headers=None):
        """
        Sets the HTTP response status and headers. Headers may be provided as
        a list of tuples or as a dictionary.

        Note that the ASGI spec requires that the protocol server only starts
        sending the response to the client after ``self.send_body`` has been
        called the first time.
        """
        if headers is None:
            headers = []
        elif isinstance(headers, dict):
            headers = list(headers.items())
        self.send(
            {"type": "http.response.start", "status": status, "headers": headers}
        )

    def send_body(self, body, *, more_body=False):
        """
        Sends a response body to the client. The method expects a bytestring.

        Set ``more_body=True`` if you want to send more body content later.
        The default behavior closes the response, and further messages on
        the channel will be ignored.
        """
        assert isinstance(body, bytes), "Body is not bytes"
        self.send(
            {"type": "http.response.body", "body": body, "more_body": more_body}
        )

    def send_response(self, status, body, **kwargs):
        """
        Sends a response to the client. This is a thin wrapper over
        ``self.send_headers`` and ``self.send_body``, and everything said
        above applies here as well. This method may only be called once.
        """
        self.send_headers(status=status, **kwargs)
        self.send_body(body)

    def handle(self, body, client_disconnected):
        """
        Receives the request body as a bytestring, plus a RefBool that
        becomes truthy once the client disconnects. The response may be
        composed using the ``self.send*`` methods; the return value of this
        method is thrown away.
        """
        raise NotImplementedError(
            "Subclasses of SyncHttpConsumer must provide a handle() method."
        )

    def disconnect(self):
        """
        Overrideable place to run disconnect handling. Do not send anything
        from here.
        """
        pass

    def http_request(self, message):
        """
        Sync entrypoint - concatenates body fragments and hands off control
        to ``self.handle`` when the body has been completely received.
        """
        if "body" in message:
            self.body.append(message["body"])
        if not message.get("more_body"):
            full_body = b"".join(self.body)
            self.handle_thread = Thread(
                target=self.handle,
                args=(full_body, self.client_disconnected),
                daemon=True,
            )
            self.handle_thread.start()

    def http_disconnect(self, message):
        """
        Let the user do their cleanup and close the consumer.
        """
        self.client_disconnected.set()
        self.disconnect()
        self.handle_thread.join()
        raise StopConsumer()
The SyncHttpConsumer class is used very similarly to how you would use the AsyncHttpConsumer class - you subclass it, and define a handle() method. The only difference is that the handle() method takes an extra arg:
class MyClass(SyncHttpConsumer):
    def handle(self, body, client_disconnected):
        while not client_disconnected:
            ...
Or you could, just like with the AsyncHttpConsumer class, override the disconnect() method instead if you prefer.
I'm still not sure if this is the best way to do this, or why Django Channels doesn't include something like this in addition to AsyncHttpConsumer. If anyone knows, please let us know.
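For completeness, wiring such a consumer up in Channels v2 routing might look something like this (a sketch; the longpoll/ path and module names are illustrative):

    # routing.py
    from channels.http import AsgiHandler
    from channels.routing import ProtocolTypeRouter, URLRouter
    from django.urls import path, re_path

    from myapp.consumers import MyClass  # hypothetical module path

    application = ProtocolTypeRouter({
        "http": URLRouter([
            # Channels v2 references consumer classes directly
            # (Channels v3 would use MyClass.as_asgi() instead).
            path("longpoll/", MyClass),
            # Everything else falls through to normal Django views.
            re_path(r"", AsgiHandler),
        ]),
    })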

How to turn a Django Rest Framework API View into an async one?

I am trying to build a REST API that will manage some machine learning classification tasks. I have written an API view, which when hit, will trigger the start of a classification task (such as: training an SVM classifier with the data the user provided previously). However, this is a long running task, so I would ideally not have the user wait once they have made a request to this view. Instead, I would like to start this task in the background and give them a response immediately. They can later view the results of the classification in a separate view (haven't implemented that yet.)
I am using ASGI_APPLICATION = 'mlxplorebackend.asgi.application' in settings.py.
Here's my API view in views.py
import asyncio
from concurrent.futures import ProcessPoolExecutor

from django import setup as SetupDjango
# ... other imports

loop = asyncio.get_event_loop()

def DummyClassification():
    result = sum(i * i for i in range(10 ** 7))
    print(result)
    return result

# ... other API views
# ... other API views
class TaskExecuteView(APIView):
    """
    Once an API call is made to this view, the classification algorithm will start being processed.
    Depends on:
    1. Parser for the classification algorithm type and parameters
    2. Classification algorithm implementation
    """
    def get(self, request, taskId, *args, **kwargs):
        try:
            task = TaskModel.objects.get(taskId=taskId)
        except TaskModel.DoesNotExist:
            raise Http404
        else:
            # this is basically the classification task for now
            # need to turn this to an async view
            with ProcessPoolExecutor(initializer=SetupDjango) as pool:
                loop.run_in_executor(pool, DummyClassification)

            return Response({"message": "The task with id: {} has been started".format(task.taskId)}, status=status.HTTP_200_OK)
The problem I am facing is the following:
When I do not use with ProcessPoolExecutor(initializer = SetupDjango) as pool: i.e. without the initializer, I get django.core.exceptions.AppRegistryNotReady: Apps aren't loaded yet. (full traceback at: https://paste.ubuntu.com/p/ctjmFNYMXW/)
When I do use the initializer, the view no longer remains async; it gets blocked. The response returns only after the task is completed, which takes about 5 seconds on my machine. I do realize I am not really making use of asyncio.sleep() inside my DummyClassification() function, but I can't figure out how to do so.
I am guessing this is not the way to do it, so any suggestions would be appreciated. I would like to avoid celery if I can, since it seems a tad too complicated for me.
Edit:
If I get rid of ProcessPoolExecutor() and simply do loop.run_in_executor(None, DummyClassification), it works as expected, but then only one worker thread is working on the task, which doesn't seem remotely ideal for a classification task.
This was a ride. At first I went through the pain of setting up celery, only to find out that the original problem, the classification task using one CPU core, remained. Then I switched to django-rq with redis, and it is currently working as expected.
from .tasks import Pipeline

class TaskExecuteView(APIView):
    """
    Once an API call is made to this view, the classification algorithm will start being processed.
    Depends on:
    1. Parser for the classification algorithm type
    2. Classification algorithm implementation
    """
    def get(self, request, taskId, *args, **kwargs):
        try:
            task = TaskModel.objects.get(taskId=taskId)
        except TaskModel.DoesNotExist:
            raise Http404
        else:
            Pipeline.delay(taskId)  # this is async now ✔
            # mark this as an in-progress task
            TaskModel.objects.filter(taskId=taskId).update(inProgress=True)
            return Response({"message": "The task with id: {}, title: {} has been started".format(task.taskId, task.taskTitle)}, status=status.HTTP_200_OK)
tasks.py
from django_rq import job

@job('default', timeout=3600)
def Pipeline(taskId):
    ...  # classification task
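For this to work, django-rq also needs a Redis-backed queue configured in settings.py and at least one worker process running; a minimal sketch (host, port and db are the usual defaults, adjust to your setup):

    RQ_QUEUES = {
        'default': {
            'HOST': 'localhost',
            'PORT': 6379,
            'DB': 0,
        }
    }

The worker then runs alongside the Django server: python manage.py rqworker default.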

Number of queries executed over psycopg2 connection

I would like to know the number of sql queries which were executed on a psycopg2 connection.
Is there a way to get this number?
I would like to warn if an HTTP request produces too many statements.
I am running a django application. If DEBUG is True, then I have connection.queries. But I would like to get this value from a production server
Update
I want numbers (statistics) from the prod environment. This question is not about debugging a particular http request.
Have a look at django-silk. It is a profiling tool that records metrics like response times and the number of queries.
If you want to roll your own solution and you are using Django 2.0 or newer, you can create a middleware with a connection wrapper. The documentation even showcases a QueryLogger class:
import time

from django.db import connection


class QueryLogger:
    def __init__(self):
        self.queries = []

    def __call__(self, execute, sql, params, many, context):
        current_query = {'sql': sql, 'params': params, 'many': many}
        start = time.time()
        try:
            result = execute(sql, params, many, context)
        except Exception as e:
            current_query['status'] = 'error'
            current_query['exception'] = e
            raise
        else:
            current_query['status'] = 'ok'
            return result
        finally:
            duration = time.time() - start
            current_query['duration'] = duration
            self.queries.append(current_query)
class QueryLoggingMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        ql = QueryLogger()
        with connection.execute_wrapper(ql):
            response = self.get_response(request)
        # do something with ql.queries here
        return response
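To address the original goal of warning when a request produces too many statements, the placeholder comment could log a warning above some threshold; a sketch, with an arbitrary logger name and threshold:

    import logging

    logger = logging.getLogger('query_stats')  # arbitrary logger name

    # inside QueryLoggingMiddleware.__call__, after the with block:
    if len(ql.queries) > 50:  # arbitrary threshold
        logger.warning('%s executed %d queries', request.path, len(ql.queries))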
The number of queries made in production and in development is the same, provided your database and the rest of the environment are the same.
I recommend using Django Debug Toolbar, as mentioned, to see how many queries your views are making, and rethinking your code based on that. If you want to look at the performance of those queries, I recommend PostgreSQL's explain command.
I usually copy the query and run it with explain inside my PostgreSQL database shell. See this: http://recordit.co/rGZ2SAo7PX

Flask global exception handling

How could one handle exceptions globally with Flask? I have found ways to use the following to handle custom db interactions:
try:
    sess.add(cat2)
    sess.commit()
except sqlalchemy.exc.IntegrityError, exc:
    reason = exc.message
    if reason.endswith('is not unique'):
        print "%s already exists" % exc.params[0]
    sess.rollback()
The problem with try-except is that I would have to run it on every aspect of my code. I can find better ways to do that for custom code. My question is directed more towards global catching and handling for:
apimanager.create_api(
    Model,
    collection_name="models",
    **base_writable_api_settings
)
I have found that this apimanager accepts validation_exceptions: [ValidationError] but I have found no examples of this being used.
I still would like a higher tier of handling that affects all db interactions, with a simple concept of "if this error, show this; if another error, show something else" that just runs on all interactions/exceptions automatically, without me including it on every apimanager (putting it in my base_writable_api_settings is fine, I guess). (IntegrityError, NameError, DataError, DatabaseError, etc.)
I tend to set up an error handler on the app that formats the exception into a json response. Then you can create custom exceptions like UnauthorizedException...
import sys
import traceback

from flask import jsonify


class Unauthorized(Exception):
    status_code = 401


@app.errorhandler(Exception)
def _(error):
    trace = traceback.format_exc()
    status_code = getattr(error, 'status_code', 400)
    response_dict = dict(getattr(error, 'payload', None) or ())
    response_dict['message'] = getattr(error, 'message', None)
    response_dict['traceback'] = trace
    response = jsonify(response_dict)
    response.status_code = status_code
    traceback.print_exc(file=sys.stdout)
    return response
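With that in place, any view can simply raise one of the custom exceptions and the handler builds the JSON response; a sketch, with an illustrative route and check:

    @app.route('/admin')
    def admin_page():
        if not current_user_is_admin():  # hypothetical check
            raise Unauthorized()
        return jsonify(status='ok')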
You can also handle specific exceptions using this pattern...
@app.errorhandler(ValidationError)
def handle_validation_error(error):
    ...  # do something with the error
Error handlers get attached to the app, not the apimanager. You probably have something like
app = Flask()
apimanager = ApiManager(app)
...
Put this somewhere using that app object.
My preferred approach uses decorated view-functions.
You could define a decorator like the following:
from functools import wraps

def handle_exceptions(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except ValidationError:
            ...  # do something
        except HTTPException:
            ...  # do something else
        except MyCustomException:
            ...  # do a third thing
    return wrapper
Then you can simply decorate your view-functions, e.g.
@app.route('/')
@handle_exceptions
def index():
    ...
I unfortunately do not know about the hooks Flask-Restless offers for passing view-functions.

Django - Check users messages every request

I want to check if a user has any new messages each time they load the page. Up until now, I have been doing this inside of my views but it's getting fairly hard to maintain since I have a fair number of views now.
I assume this is the kind of thing middleware is good for: a check that will happen on every single page load. What I need it to do is this:
Check if the user is logged in
If they are, check if they have any messages
Store the result so I can reference the information in my templates
Has anyone ever had to write any middleware like this? I've never used middleware before so any help would be greatly appreciated.
You could use middleware for this purpose, but perhaps context processors are more in line with what you want to do.
With middleware, you are attaching data to the request object. You could query the database and find a way to jam the messages into the request. But context processors allow you to make available extra entries into the context dictionary for use in your templates.
I think of middleware as providing extra information to your views, while context processors provide extra information to your templates. This is in no way a rule, but in the beginning it can help to think this way (I believe).
def messages_processor(request):
    # Guard against AnonymousUser, which can't be used in the filter
    # (this also covers the "check if the user is logged in" step).
    if not request.user.is_authenticated:
        return {'new_messages': []}
    return {'new_messages': Message.objects.filter(unread=True, user=request.user)}
Include that processor in your settings.py under context processors. Then simply reference new_messages in your templates.
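A sketch of the wiring, assuming the processor lives in a hypothetical myapp/context_processors.py (adjust the dotted path to your project):

    # settings.py
    TEMPLATES = [
        {
            # ...
            'OPTIONS': {
                'context_processors': [
                    # ... the default processors ...
                    'myapp.context_processors.messages_processor',
                ],
            },
        },
    ]

Any template can then use {{ new_messages.count }} or iterate over new_messages directly.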
I have written this middleware on my site for rendering messages. It checks a cookie; if the cookie is not present, it appends the message to the request and saves a cookie. Maybe you can do something similar:
class MyMiddleware:
    def __init__(self):
        # print 'Initialized my Middleware'
        pass

    def process_request(self, request):
        user_id = False
        if request.user.is_active:
            user_id = str(request.user.id)
        self.process_update_messages(request, user_id)

    def process_response(self, request, response):
        self.process_update_messages_response(request, response)
        return response

    def process_update_messages(self, request, user_id=False):
        update_messages = UpdateMessage.objects.exclude(expired=True)
        render_message = False
        request.session['update_messages'] = []
        for message in update_messages:
            if message.expire_time < datetime.datetime.now():
                message.expired = True
                message.save()
            else:
                if request.COOKIES.get(message.cookie(), True) == True:
                    render_message = True
                if render_message:
                    request.session['update_messages'].append({'cookie': message.cookie(), 'cookie_max_age': message.cookie_max_age})
                    messages.add_message(request, message.level, message)
                    break

    def process_update_messages_response(self, request, response):
        try:
            update_messages = request.session['update_messages']
        except:
            update_messages = False
        if update_messages:
            for message in update_messages:
                response.set_cookie(message['cookie'], value=False, max_age=message['cookie_max_age'], expires=None, path='/', domain=None, secure=None)
        return response