How do I make excel spreadsheets downloadable in Django? - django

I'm writing a web application that generates reports from a local database. I want to generate an Excel spreadsheet and immediately have the user download it. However, when I return the file via HttpResponse, the downloaded file cannot be opened. If I open the same file from storage on the server, it opens perfectly fine.
This uses Django 2.1 (for database reasons I'm not on 2.2) and I'm generating the file with xlrd. Another Excel spreadsheet will also need to be generated and downloaded later using the openpyxl library (the two libraries serve very distinct purposes IMO).
The spreadsheet is not very large (5x6 columns x rows).
I've looked at similar Stack Overflow questions and followed their instructions. Specifically, I'm talking about this answer:
https://stackoverflow.com/a/36394206/6411417
As you can see in my code, the logic is nearly the same, and yet I cannot open the downloaded Excel spreadsheets. The only difference is that my file name is produced when the file is generated and returned in the file_path variable.
def make_lrm_summary_file(request):
    file_path = make_lrm_summary()
    if os.path.exists(file_path):
        with open(file_path, 'rb') as fh:
            response = HttpResponse(fh.read(), content_type="application/vnd.ms-excel")
            response['Content-Disposition'] = f'inline; filename="{ os.path.basename(file_path) }"'
            return response
    raise Http404
Again, the file is properly generated and stored on my server, but the downloaded Excel file cannot be opened. Specifically, I get the error message:
EXCEL.EXE - Application Error | The application was unable to start correctly (0x0000005). Click OK to close the application.
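For comparison, a minimal sketch of serving the same file with Django's FileResponse (available in Django 2.1), which handles streaming and the Content-Disposition header; make_lrm_summary here is the question's own helper:

import os
from django.http import FileResponse, Http404

def make_lrm_summary_file(request):
    file_path = make_lrm_summary()  # helper from the question; writes the spreadsheet to disk
    if not os.path.exists(file_path):
        raise Http404
    # as_attachment=True sets Content-Disposition: attachment with the given filename
    return FileResponse(open(file_path, 'rb'),
                        as_attachment=True,
                        filename=os.path.basename(file_path))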

Related

How to pass InMemoryUploadedFile as a file?

The user records audio, the audio gets saved into an audio Blob and sent to the backend. I want to take the audio file and send it to the OpenAI Whisper API.
files = request.FILES.get('audio')
audio = whisper.load_audio(files)
I've tried different ways to send the audio file, but none of them seemed to work, and I don't understand how it should be sent. I would prefer not to save the file. I want the user-recorded audio sent to the Whisper API from the backend.
Edit*
The answer by AKX seems to work, but now there is another error.
Edit 2*
He has edited his answer and everything works perfectly now. Thanks a lot to @AKX!
load_audio() requires a file on disk, so you'll need to cater to it – but you can use a temporary file that's automagically deleted outside the with block. (On Windows, you may need to use delete=False because of sharing permission reasons.)
import os
import tempfile

file = request.FILES.get('audio')

with tempfile.NamedTemporaryFile(suffix=os.path.splitext(file.name)[1], delete=False) as f:
    for chunk in file.chunks():
        f.write(chunk)
    f.seek(0)
    try:
        audio = whisper.load_audio(f.name)
    finally:
        os.unlink(f.name)
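As a rough usage sketch (not part of AKX's answer), wired into a Django view with the open-source whisper package, assuming its load_model()/transcribe() API:

import os
import tempfile

import whisper
from django.http import JsonResponse

model = whisper.load_model("base")  # model size is an arbitrary choice here

def transcribe_audio(request):
    file = request.FILES.get('audio')
    if file is None:
        return JsonResponse({'error': 'no audio uploaded'}, status=400)
    # Write the upload to a named temporary file so load_audio() can read it from disk
    with tempfile.NamedTemporaryFile(suffix=os.path.splitext(file.name)[1], delete=False) as f:
        for chunk in file.chunks():
            f.write(chunk)
    try:
        audio = whisper.load_audio(f.name)
        result = model.transcribe(audio)
    finally:
        os.unlink(f.name)
    return JsonResponse({'text': result['text']})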

When using Django's Default Storage should/can you close() an opened file?

When using Django's DefaultStorage it's possible to open and read a file something like this:
from django.core.files.storage import default_storage
file = default_storage.open("dir/file.txt", mode="rb")
data = file.read()
When using Python's own open() function, it's best to close() the file afterwards, or to use a with open("dir/file.txt") as file: construction.
But reading the docs for Django's Storage classes, and browsing the source, I don't see a close() equivalent.
So my questions are:
Should a file opened with Django's Default Storage be closed?
If so, how?
If not, why isn't it necessary?
You don't see a close method because you are looking at the Storage class. The open method of the Storage class returns an instance of django.core.files.base.File [Source code], which basically wraps the Python file object and also has a close method that closes the file (methods like read, etc. are inherited from FileProxyMixin).
Generally, when you open a file you should close it. This is the same with Django, which is also emphasised in the documentation:
Closing files is especially important when accessing file fields in a loop over a large number of objects. If files are not manually closed after accessing them, the risk of running out of file descriptors may arise. This may lead to the following error:
OSError: [Errno 24] Too many open files
But there are a few instances where you shouldn't close the file, mostly when you are passing it to some function / method / object that will read it. For example, if you create a FileResponse object you shouldn't close the file, as Django will close it by itself:
The file will be closed automatically, so don't open it with a context manager.
To complete your example code, you would close the file like this:
from django.core.files.storage import default_storage
file = default_storage.open("dir/file.txt", mode="rb")
data = file.read()
file.close()
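Since django.core.files.base.File also implements the context-manager protocol, the same thing can be written with a with block so the file is closed automatically (a small sketch, not from the original answer):

from django.core.files.storage import default_storage

with default_storage.open("dir/file.txt", mode="rb") as file:
    data = file.read()
# file.close() is called automatically when the with block exits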

Why is setting a django FileField from existing file on the same partition slow?

In my Django application I have to deal with huge files. Instead of uploading them via the web app, users may place them in a folder (called .dump) on a Samba share and then choose the file in the Django app to create a new model instance from it. The view looks roughly like this:
class AddDumpedMeasurement(View):
    def get(self, request, *args, **kwargs):
        filename = request.GET.get('filename', None)
        dump_dir = os.path.join(settings.MEDIA_ROOT, settings.MEASUREMENT_DATA_DUMP_PATH)
        in_file = os.path.join(dump_dir, filename)
        if isfile(in_file):
            try:
                with open(in_file, 'rb') as f:
                    object = NCFile.objects.create(sample=sample, created_by=request.user, file=File(f))
                    return JsonResponse(data={'redirect': object.get_absolute_url()})
            except:
                return JsonResponse(data={'error': 'Couldn\'t read file'}, status=400)
        else:
            return JsonResponse(data={'error': 'File not found'}, status=400)
As MEDIA_ROOT and .dump are on the same Samba share (which is mounted by the web server), why is moving the file to its new location so slow? I would have expected it to be almost instantaneous. Is it because I open() it and stream the bytes to the file object? If so, is there a better way to move the file to its correct destination and create the model instance?
Using a temporary file as a placeholder and then replacing it with the original file allows one to use os.rename, which is fast.
import os
from os.path import isfile
from tempfile import NamedTemporaryFile
from django.conf import settings
from django.core.files import File

# Create the model instance with a small temporary placeholder file
tmp_file = NamedTemporaryFile()
object = NCFile.objects.create(..., file=File(tmp_file))
tmp_file.close()

# Remove the placeholder, then move the dumped file into place with a cheap rename
if isfile(object.file.path):
    os.remove(object.file.path)
new_relative_path = os.path.join(os.path.dirname(object.file.name), filename)
new_relative_path = object.file.storage.get_available_name(new_relative_path)
os.rename(in_file, os.path.join(settings.MEDIA_ROOT, new_relative_path))
object.file.name = new_relative_path
object.save()
Is it because I open() it and stream the bytes to the file object?
I would argue that it is. A simple move operation on a file system object just updates a record in the file system's internal database; that would indeed be almost instantaneous.
Opening a local file and reading it chunk by chunk is effectively a copy operation, which can be slow depending on the file size. Additionally, you are doing this at a very high level, while an OS copy operation happens at a much lower level.
But that's not the real cause of the problem. You have said the files are on a Samba share, which I presume means you have mounted a remote folder locally. So when you read the file in question you are actually fetching it over the network, which is slower than a disk read. Then when you write the destination file, you are writing data over the network again, an operation that's slower than a disk write.

How to zip or tar a static folder without writing anything to the filesystem in python?

I know about this question, but you can't write to the filesystem on App Engine (shutil and zipfile require creating files).
So basically I need to archive something like /base/nacl using zip or tar, and write the output to the web browser requesting the page (the output will never exceed 32 MB).
It just happened that I had to solve the exact same problem tonight :) This worked for me:
import StringIO
import tarfile

fd = StringIO.StringIO()
with tarfile.open(mode="w:gz", fileobj=fd) as tgz:
    tgz.add('dir_to_download')

self.response.headers['Content-Type'] = 'application/octet-stream'
self.response.headers['Content-Disposition'] = 'attachment; filename="archive.tgz"'
self.response.write(fd.getvalue())
Key points:
used StringIO to fake a file in memory
used fileobj to pass the fake file object directly to tarfile.open() (gzip.GzipFile() also supports this if you prefer gzip over tarfile)
set headers to present the response as a downloadable file
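The snippet above is Python 2 (StringIO and a webapp2-style self.response). A rough Python 3 / Django equivalent of the same in-memory approach, assuming io.BytesIO and a plain Django view, might look like this:

import io
import tarfile

from django.http import HttpResponse

def download_archive(request):
    # Build the tar.gz entirely in memory
    buf = io.BytesIO()
    with tarfile.open(mode="w:gz", fileobj=buf) as tgz:
        tgz.add('dir_to_download')

    response = HttpResponse(buf.getvalue(), content_type='application/octet-stream')
    response['Content-Disposition'] = 'attachment; filename="archive.tgz"'
    return response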

Boto: small file download works but large file doesnt

I have a script that works very well: it presents the user with a list of files stored in an S3 bucket, and when they select one, the file is downloaded and something is then done with it.
This method works on files up to 600 MB, but when the user chooses a file that is 2 GB, I get a Boto exception stating that the file is being used by another process.
listname = self.list_ctrl.GetItemText(i)
conn = boto.connect_s3(access_key, secret_key)
bucket = conn.get_bucket('data')
key = bucket.get_key(listname)
key.get_contents_to_filename(key.name)
It is really puzzling, as it works great for smaller files.
Any ideas what may be causing it to fail?
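Not from the original thread, but one commonly suggested approach for large objects is to let boto3's managed transfer stream the download in chunks instead of buffering it; a minimal sketch, assuming boto3 and reusing the question's access_key, secret_key and listname:

import boto3

# download_file() performs a chunked, multipart transfer under the hood,
# so the whole 2 GB object is never held in memory at once.
s3 = boto3.client('s3',
                  aws_access_key_id=access_key,
                  aws_secret_access_key=secret_key)
s3.download_file('data', listname, listname)  # bucket name, key, local filename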