How to run two requests in parallel in Django REST

I have two requests that are called from a React front end. One request runs in a loop, returning an image per request; the other registers a user. Both work perfectly on their own, but while the image request is running in its loop and I register a user from another tab, that request stays in pending status. If I stop the image requests, the user gets registered. How can I run them in parallel at the same time?
urls.py
url(r'^create_user/$', views.CreateUser.as_view(), name='CreateAbout'),
url(r'^process_image/$', views.AcceptImages.as_view(), name='AcceptImage'),
views.py
class CreateUser(APIView):
    def get(self, request):
        return Response([UserSerializer(dat).data for dat in User.objects.all()])

    def post(self, request):
        payload = request.data
        serializer = UserSerializer(data=payload)
        if serializer.is_valid():
            instance = serializer.save()
            instance.set_password(instance.password)
            instance.save()
            return Response(serializer.data, status=status.HTTP_201_CREATED)
        return Response(serializer.errors, status=status.HTTP_400_BAD_REQUEST)
class AcceptImages(APIView):
    def post(self, request):
        global video  # presumably a module-level cv2.VideoCapture shared across requests
        stat, img = video.read()
        frame = img
        retval, buffer_img = cv2.imencode('.jpg', frame)
        resdata = base64.b64encode(buffer_img)
        return Response(resdata)
I call these endpoints from React. The second endpoint is called in a loop, and when I register a user from another tab at the same time, the request shows a pending status; only if I stop the image endpoint does the user get registered. How can I make these two requests run in parallel?
I have researched a lot but can't find an appropriate solution. One suggestion is to use Celery, but I don't know whether it solves my problem, and if it does, how to implement it for the above scenario.

You should first determine whether the bottleneck is the frontend or the backend.
frontend: Chrome allows at most six concurrent connections per domain (up to HTTP/1.1), so a tight request loop can starve other requests from the same browser.
backend: If you use python manage.py runserver, switch to gunicorn or uWSGI; as the Django documentation says, runserver should not be used in production. Set the process and thread counts in gunicorn or uWSGI to 2 or higher and try again.
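For example, a minimal gunicorn invocation with more than one worker and thread (myproject is a placeholder for your project) would be:
gunicorn myproject.wsgi:application --workers 2 --threads 4 --bind 0.0.0.0:8000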

Related

Offloading extensive calculations in save method of Django custom FileField

I'm building a gallery web app based on Django (4.1.1) and Vue. I want to upload and display videos as well (not only images). For formats that don't work in an HTML video tag, I convert them to mp4 via pyffmpeg.
For that I created a custom model field based on FileField. In its save method I take the file's content, convert it, and save the result. This is called by the serializer through a corresponding ViewSet. It works, but the video conversion takes so long that the web request from my Vue app (made with axios) runs into a timeout.
Clearly I need to offload the conversion somehow, return a corresponding response directly, and save the data in the database as soon as the conversion is finished.
Is this even possible? Or do I need to write a custom view apart from the ViewSet to do the calculation? Can you give me a hint on how to offload that calculation? I only have rudimentary knowledge of things like asyncio.
TL;DR: How do I run extensive calculations asynchronously on file data before saving it to a model with a FileField, returning a response before the calculation ends?
I can provide my current code if necessary.
I've now solved my problem, though I'm still interested in other/better solutions. My solution works, but it might not be the best, and it feels a bit hacky in places.
TL;DR: I installed django-q as a task queue manager with a Redis backend, connected it to Django, and then called the function that transcodes the video file from my view via
taskid = async_task("apps.myapp.services.transcode_video", data)
This gives a robust system that handles the transcode tasks in parallel without blocking the request.
I found this tutorial about Django-Q. Django-Q manages and executes tasks from Django. It runs in parallel with Django and is connected to it via its broker (a Redis database in this case).
First I installed django-q and the redis client modules via pip
pip install django-q redis
Then I set up a Redis database (here running in a Docker container on my machine, using the official redis image). How to do this depends largely on your platform.
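For local development, something like the following is enough (a sketch; container name and port mapping are up to you):
docker run -d --name redis -p 6379:6379 redis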
Then I configured Django to use Django-Q by adding the configuration to settings.py (note that I disabled the timeout, because the transcode tasks can take rather long; I may change that in the future):
Q_CLUSTER = {
    'name': 'django_q_django',
    'workers': 8,
    'recycle': 500,
    'timeout': None,
    'compress': True,
    'save_limit': 250,
    'queue_limit': 500,
    'cpu_affinity': 1,
    'label': 'Django Q',
    'redis': {
        'host': 'redishostname',
        'port': 6379,
        'password': 'mysecureredisdbpassword',
        'db': 0,
    }
}
and then activating Django-Q by adding it to the installed apps in settings.py:
INSTALLED_APPS = [
    ...
    'django_q',
]
Then migrate the model definitions of Django Q via:
python manage.py migrate
and start Django Q via (the Redis database should run at this point):
python manage.py qcluster
This runs in a separate terminal from the typical
python manage.py runserver
Note: of course these two commands are only for development. I don't yet know how to deploy Django Q in production.
Now we need a file for our functions. As in the tutorial I added the file services.py to my app. There I simply defined the function to run:
def transcode_video(data):
    # Doing my transcoding stuff here
    return {'entryid': entry.id, 'filename': target_name}
This function can then be called inside the view code via:
taskid = async_task("apps.myapp.services.transcode_video", data)
So I can provide data to the function and get a task ID as a return value. The return value of the executed function will appear in the result field of the created task, so that you can even return data from there.
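Django-Q also ships a matching helper to read that result field later: django_q.tasks.result(task_id) returns the function's return value, or None while the task is still running. A minimal sketch:
from django_q.tasks import async_task, result

task_id = async_task("apps.myapp.services.transcode_video", data)
# ... later, e.g. in a polling view:
transcode_result = result(task_id)  # None until the task has finished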
I encountered a problem at this stage: the data contains a TemporaryUploadedFile object, which caused a pickle error. The data apparently gets pickled before being passed to Django Q, and that doesn't work for this object. There might be a way to convert the file into a picklable format, but since I already need the file on the filesystem to invoke pyffmpeg on it, in the view I simply write the data to a file (in chunks, to avoid loading the whole file into memory at once) with
with open(filepath, 'wb') as f:
    for chunk in self.request.data['file'].chunks():
        f.write(chunk)
Normally in the ViewSet I would call serializer.save() at the end, but for transcoding I don't, since the new object gets saved inside the Django-Q function once the transcoding is done. There I create it like this (UploadedFile being from django.core.files.uploadedfile and AlbumEntry being my own model that I want to create an instance of):
with open(target_path, 'rb') as f:
    file = UploadedFile(
        file=f,
        name=target_name,
        content_type=data['file_type'] + "/" + data['target_ext'],
    )
    entry = AlbumEntry(
        file=file,
        # ... other model fields here
    )
    entry.save()
To return a defined response from the ViewSet even though the object hasn't been created yet, I had to override the create() method in addition to the perform_create() method (where I did all the handling). For this I copied the code from the parent class and changed it slightly to return a specific response depending on the return value of perform_create() (which previously didn't return anything):
def create(self, request, *args, **kwargs):
    serializer = self.get_serializer(data=request.data)
    serializer.is_valid(raise_exception=True)
    taskid = self.perform_create(serializer)
    if taskid:
        return HttpResponse(json.dumps({'taskid': taskid, 'status': 'transcoding'}),
                            status=status.HTTP_201_CREATED)
    headers = self.get_success_headers(serializer.data)
    return Response(serializer.data, status=status.HTTP_201_CREATED, headers=headers)
So perform_create() returns a task ID for transcode jobs and None otherwise, and the corresponding response is sent here.
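The overridden perform_create() itself isn't shown here; a minimal sketch of its shape, with needs_transcoding() and build_task_payload() as hypothetical helpers, could look like this:
from django_q.tasks import async_task

def perform_create(self, serializer):
    # Hypothetical sketch: dispatch to Django-Q only when the upload needs
    # transcoding; otherwise fall back to the normal synchronous save.
    if needs_transcoding(self.request.data['file']):  # hypothetical helper
        data = build_task_payload(self.request)  # hypothetical helper
        return async_task("apps.myapp.services.transcode_video", data)
    serializer.save()
    return None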
Last but not least, there was the problem of the frontend not knowing when the transcoding is done. So I built a simple view to get a task by ID:
import json

from django.http import HttpResponse
from django_q.models import Task
from rest_framework import authentication, permissions
from rest_framework.decorators import api_view, authentication_classes, permission_classes

@api_view(['GET'])
@authentication_classes([authentication.SessionAuthentication])
@permission_classes([permissions.IsAuthenticated])
def get_task(request, task_id):
    task = Task.get_task(task_id)
    if not task:
        return HttpResponse(json.dumps({
            'success': False
        }))
    return HttpResponse(json.dumps({
        'id': task.id,
        'result': task.result,
        # ...some more data to return
    }))
You can see that I return a fixed response when the task is not found. This is my workaround, since by default the Task object only gets created once the task is finished. For my purpose it is OK to just assume that it is still running. A comment in this GitHub issue of Django Q suggests that to get an up-to-date Task object you would need to write your own Task model and implement it so that it regularly checks Django Q for the task status. I didn't want to do that.
I also put the result in the response, so that my frontend can poll the task regularly by its task ID; when the transcoding is finished, the result contains the ID of the created model object in the database. As soon as my frontend sees this, it loads the object's content.

How can I avoid a timeout in Django?

I am creating a site that downloads videos from other sites and converts them to GIF on request.
The problem is that it takes too long to download and convert videos.
This causes a 504 timeout error.
How can I avoid timeout?
Currently I start the download with Celery when a request comes in, and while the download runs, Django redirects right away:
def post(self, request):
    form = URLform(request.POST)
    ctx = {'form': form}
    # ...
    t = downloand_video.delay(data)
    return redirect('gif_to_mp4:home')
But this prevents transferring the file to the user, because the Celery task cannot return a file or a response.
How can I send the file to the user?
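One way to close this loop (a sketch, under the assumption that the Celery task returns the path of the finished GIF) is a second view that the frontend polls by task ID and that serves the file once the task is ready:
import json

from celery.result import AsyncResult
from django.http import FileResponse, HttpResponse

def fetch_gif(request, task_id):
    # Hypothetical polling view; assumes the task returns the finished file's path.
    result = AsyncResult(task_id)
    if not result.ready():
        return HttpResponse(json.dumps({'state': result.state}))  # frontend keeps polling
    return FileResponse(open(result.get(), 'rb'), as_attachment=True)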

Serving images asynchronously using django and celery?

I have a Django app that serves images when a certain page is loaded. The images are stored on S3, and I retrieve them using boto, sending the image content as an HttpResponse with the appropriate content type.
The problem is that this is a blocking call. Sometimes it takes a long time (a few seconds for an image of a few hundred KB) to retrieve an image and serve it to the client.
I tried converting this process to a Celery task (async, non-blocking), but I am not sure how to send back the data (images) once they are done downloading. Just returning an HttpResponse from a Celery task does not work. I found docs related to HTTP callback tasks in the old Celery docs here, but this is not supported in newer Celery versions.
So, should I use polling in the JS? (I have used Celery tasks in other parts of my website, but all of them are socket-based.) Or is this even the right way to approach the problem?
Code:
Django view code that fetches the images from S3 using boto3 (in views.py):
from django.views.decorators.csrf import csrf_protect, ensure_csrf_cookie

@csrf_protect
@ensure_csrf_cookie
def getimg(request, public_hash):
    if request.user.is_authenticated:
        query = img_table.objects.filter(public_hash=public_hash)
    else:
        query = img_table.objects.filter(public_hash=public_hash, is_public=True)
    if query.exists():
        item_dir = construct_s3_path(s3_map_thumbnail_folder, public_hash)
        if check(s3, s3_bucket, item_dir):  # checks if the file exists
            item = s3.Object(s3_bucket, item_dir)
            item_content = item.get()['Body'].read()
            return HttpResponse(item_content, content_type="image/png", status=200)
        else:  # if no image is found, return a blank image
            blank = Image.new('RGB', (1000, 1000), (255, 255, 255))
            response = HttpResponse(content_type="image/jpeg")
            blank.save(response, "JPEG")
            return response
    else:  # if the image corresponding to the hash is not found in the db
        return render(request, 'core/404.html')
I call the above Django view in a page like this:
<img src='/api/getimg/123abc' alt='img'>
In urls.py I have:
url(r'^api/getimg/(?P<public_hash>[a-zA-Z0-9]{6})$', views.getimg, name='getimg')
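A related non-blocking option (a sketch, assuming the same s3_bucket and construct_s3_path() helpers as above): redirect the img request to a short-lived pre-signed S3 URL, so the browser fetches the object directly instead of streaming it through Django:
import boto3
from django.shortcuts import redirect

s3_client = boto3.client('s3')

def getimg_presigned(request, public_hash):
    # Hypothetical variant of the view above; skips auth checks for brevity.
    url = s3_client.generate_presigned_url(
        'get_object',
        Params={'Bucket': s3_bucket,
                'Key': construct_s3_path(s3_map_thumbnail_folder, public_hash)},
        ExpiresIn=300,  # URL validity in seconds
    )
    return redirect(url)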

Faking a Streaming Response in Django to Avoid Heroku Timeout

I have a Django app that uses django-wkhtmltopdf to generate PDFs on Heroku. Some of the responses exceed the 30-second timeout. Because this is a proof of concept running on the free tier, I'd prefer not to tear apart what I have to move to a worker/poll process. My current view looks like this:
def dispatch(self, request, *args, **kwargs):
    do_custom_stuff()
    return super(MyViewClass, self).dispatch(request, *args, **kwargs)
Is there a way I can override the dispatch method of the view class to fake a streaming response, like this or with the Empty Chunking approach mentioned here, sending an empty response until the PDF is rendered? Sending an empty byte restarts the timeout clock, giving plenty of time to send the PDF.
I solved a similar problem using Celery, something like this.
import json

def start_long_process_view(request, pk):
    task = do_long_processing_stuff.delay()
    return HttpResponse(json.dumps({'task': task.id}))
Then you can have a second view that can check the task state.
import json
from celery.result import AsyncResult

def check_long_process(request, task_id):
    result = AsyncResult(task_id)
    return HttpResponse(json.dumps({'state': result.state}))
Finally, using JavaScript you can fetch the status right after the task is started. Updating every half second will be more than enough to give your users good feedback.
If you think Celery is too much, there are lighter alternatives that will work just great: https://djangopackages.org/grids/g/workers-queues-tasks/

Models.py not getting loaded on startup with uWSGI

I have a system where I need full dynamic control of URLs, both before and after the request.
I am using signals for this. For the pre-request signal (the one I'm having trouble with), I have a piece of middleware like the following: it sends the signal, lets each receiver check whether the current request.path applies to it, and goes with the first response it gets. This normally works fine and is fairly elegant:
class PreRouteMiddleWare(object):
    def process_request(self, request):
        url = request.path.strip('/')
        if url == '':
            url = '/'
        pre_routes = pre_route.send(sender=request, url=url)
        for receiver, response in pre_routes:
            if response:
                return response
        return None
Now, to register something that happens "pre" the Django routing stack, I do something like this in the app's models.py:
from django.dispatch import receiver

@receiver(pre_route)
def try_things(sender, url, **kwargs):
    try:
        thing = Thing.objects.get(url=url)
        from myapp.views import myview
        return myview(sender, some_args)
    except Thing.DoesNotExist:
        return False
Which also works great on my dev server.
However, the problem arises in production, where I use uWSGI. I start uWSGI (from upstart) like this:
sudo /usr/local/bin/uwsgi --emperor '/srv/*/uwsgi.ini' --enable-threads --single-interpreter
And my uwsgi.ini looks like this:
[uwsgi]
socket = /srv/new/uwsgi.sock
module = wsgi:app
chdir = /srv/new/myapp
virtualenv = /srv/new
env = DJANGO_SETTINGS_MODULE=myapp.settings
uid = wsgi_new
gid = www-data
chmod = 770
processes = 2
What seems to be happening is that each uWSGI process/thread only loads models.py on its first request, meaning that the first request for each process fails to connect the signals. So I have n requests (where n is the number of processes) fail completely, because models.py is not loaded at startup (as it is in development).
Am I configuring uWSGI wrong? Is there a better way to force signals to be connected at startup?
Django actually loads things lazily. Using the development server gives a false sense of security about how things will work in a real WSGI server, because the loading of the management commands by the development server forces a lot of early initialisation that doesn't occur with a production server.
You might read:
http://blog.dscpl.com.au/2010/03/improved-wsgi-script-for-use-with.html
which explains the issue as it occurs in mod_wsgi. The same thing happens with uWSGI.
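A minimal sketch of forcing that early initialisation in current Django (an aside, not part of the original answer; "myapp" and its signals module are placeholders): AppConfig.ready() runs at startup in every worker, so importing the signal module there connects the receivers before the first request.
# myapp/apps.py
from django.apps import AppConfig

class MyAppConfig(AppConfig):
    name = 'myapp'

    def ready(self):
        # Importing the module executes its @receiver decorators.
        from . import signals  # noqa: F401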
Ok, turns out that I needed to make my middleware hook process_view as opposed to process_request:
class PreRouteMiddleWare(object):
    def process_view(self, request, *args, **kwargs):
        url = request.path.strip('/')
        if url == '':
            url = '/'
        pre_routes = pre_route.send(sender=request, url=url)
        for receiver, response in pre_routes:
            if response:
                return response
        return None
And now it works great!