Python: writing file and using buffer - django

I'm using Django to generate a personalized file, but doing so writes a file to disk, which is a poor use of space.
This is how I do it right now:
with open(filename, 'wb') as f:
    pdf.write(f)  # pdf is an object from the PyPDF2 library
with open(filename, 'rb') as f:
    return send_file(data=f, filename=filename)  # send_file builds an HttpResponse that downloads the file data
So the code above leaves a file on disk.
The easy fix would be to delete the file after it has been downloaded, but I remember using stream objects in Java to handle this case.
Is it possible to do the same in Python?
EDIT:
import os
from django.http import HttpResponse

def send_file(data, filename, mimetype=None, force_download=False):
    disposition = 'attachment' if force_download else 'inline'
    filename = os.path.basename(filename)
    response = HttpResponse(data, content_type=mimetype or 'application/octet-stream')
    response['Content-Disposition'] = '%s; filename="%s"' % (disposition, filename)
    return response

Without knowing the exact details of the pdf.write and send_file functions, I expect in both cases they will take an object that conforms to the BinaryIO interface. So, you could try using a BytesIO to store the content in an in-memory buffer, rather than writing out to a file:
import io

with io.BytesIO() as buf:
    pdf.write(buf)
    buf.seek(0)  # rewind so the reader starts at the beginning
    return send_file(data=buf, filename=filename)
Depending on the exact nature of the above-mentioned functions, YMMV.
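As a minimal sketch of the in-memory approach using only the standard library: `write_report` below is a hypothetical stand-in for `pdf.write`, assumed only to accept a writable binary stream. It shows why the `seek(0)` matters and that nothing is written to disk.

```python
import io


def write_report(stream):
    """Hypothetical stand-in for pdf.write(): writes bytes to any binary stream."""
    stream.write(b"%PDF-1.4\n")
    stream.write(b"example payload\n")


def build_payload():
    """Render into memory and return the bytes; no file touches the disk."""
    with io.BytesIO() as buf:
        write_report(buf)
        buf.seek(0)        # rewind: the position is at the end after writing
        return buf.read()  # read before the with-block closes the buffer


payload = build_payload()
```

The bytes are read inside the `with` block because the buffer is closed, and its contents discarded, as soon as the block exits.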

Related

Django, Default file icon is missing after download of file

I have written code for downloading a file through an API. As far as I can see it works: the file size is the same. But the downloaded file no longer has its default file icon. I am pretty new at this, so maybe I am doing something wrong. I read the file the way I would a standard text file and save it the same way, with the binary option. How can the files be the same size while something still seems to be missing from the downloaded one? Is there a better way to download files?
This is the code on the server:
file_location = 'static/File.pkg'
try:
    with open(file_location, 'rb') as f:
        filex_data = f.read()
    response = HttpResponse(filex_data, content_type='application/octet-stream')
    response['Content-Disposition'] = 'attachment; filename="File.pkg"'
    return response
This is the code on my local computer:
url = 'http://myServer/waprfile/'
x = requests.get(url, data=data, headers=headers)
f = open("TheNewFile.pgk", "ab")
f.write(x.content)
f.close()
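Two details in the client snippet may be worth checking. It saves the file as `.pgk` while the server file is `.pkg`, and file managers typically pick an icon from the extension. It also opens the file in `"ab"` mode, which appends to any existing file instead of replacing it. The sketch below is not a confirmed diagnosis; it only shows one way to save the bytes with `"wb"` and verify with a checksum that the saved file is byte-identical. `save_download`, `sha256_of`, and the sample bytes are illustrative names and data, not part of the original code.

```python
import hashlib
import os
import tempfile


def save_download(content, path):
    """Write downloaded bytes with 'wb' so an existing file is replaced, not appended to."""
    with open(path, "wb") as f:
        f.write(content)


def sha256_of(path):
    """Return the hex SHA-256 digest of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


# Stand-in bytes for x.content from the snippet above.
content = b"\x00\x01example package bytes\xff"

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "TheNewFile.pkg")
    save_download(content, target)
    save_download(content, target)  # a second run overwrites instead of doubling the file
    size = os.path.getsize(target)
    matches = sha256_of(target) == hashlib.sha256(content).hexdigest()
```

Comparing the digest of the saved file with the digest of the file on the server is a stronger check than comparing sizes alone.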

Pandas closing django-like file, gives ValueError: I/O operation on closed file when uploading

In my process I need to upload a file to Django like this:
newFile = request.FILES['file']
Then, in another big function, I open it with pandas:
data = pandas.read_csv(data_file, engine = 'python', header=headers_row, encoding = 'utf-8-sig')
And then I need to save the upload:
uploaded_file = Uploaded_file(file = newFile, retailer = ret, date = date)
But randomly (about half the time) I get a ValueError: I/O operation on closed file.
Is there any solution to this? Is it possible to open the file again, or to make a copy of it, use pandas on one copy and upload the other?
I tried the latter, but I'm not sure of the implications of going this route:
from io import BytesIO
output = BytesIO(newFile.file.read())
For now it works, but I'd appreciate any input on this.
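The copy approach can be sketched with the standard library alone. Here `csv` stands in for pandas, a `BytesIO` stands in for `request.FILES['file']`, and `split_upload` is a hypothetical helper. The point is that closing one copy leaves the other readable.

```python
import csv
import io


def split_upload(uploaded):
    """Read the upload once and return two independent in-memory copies."""
    raw = uploaded.read()
    return io.BytesIO(raw), io.BytesIO(raw)


# Stand-in for request.FILES['file'].
upload = io.BytesIO(b"sku,qty\nA1,3\nB2,5\n")

for_parsing, for_saving = split_upload(upload)

# Parse one copy, then close it, as a parser might do when it finishes.
text = io.TextIOWrapper(for_parsing, encoding="utf-8-sig", newline="")
rows = list(csv.reader(text))
text.close()  # also closes for_parsing

# The second copy is unaffected and still readable from the start.
saved_bytes = for_saving.read()

try:
    for_parsing.read()
    closed_error = None
except ValueError as exc:
    closed_error = str(exc)
```

The main implication is memory: the whole upload is held in RAM, once per copy, so this suits small files better than very large ones.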

Flask response to excel file giving corrupt excel file

I have a website with a button. When it's clicked, the site writes a number of pandas DataFrames into an Excel file and returns that file automatically as a download.
It seems to work, except that when I open the file it appears to be corrupted: Excel asks whether some of the tabs should be recovered. I'm using the code below. Any suggestions about what could be causing this are appreciated.
import io
from flask.helpers import make_response
from pandas.io.excel import ExcelWriter
output = io.BytesIO()
writer = ExcelWriter(output)
dfs = [df1, df2, ...]
tab_names = ['tab1', 'tab2', ...]
for df, tab_name in zip(dfs, tab_names):
    df.to_excel(writer, tab_name)
writer.close()
resp = make_response(output.getvalue())
resp.headers['Content-Disposition'] = 'attachment; filename=output.xlsx'
resp.headers["Content-type"] = "text/csv"
return resp
You'll need to add
output.seek(0)
after you close the writer.
You might also find it easier to write
return send_file(output, attachment_filename="output.xlsx", as_attachment=True)
(after importing send_file from flask; in Flask 2.0 and later the attachment_filename parameter is named download_name)
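An .xlsx file is a ZIP container, so the effect of closing the writer and of the buffer position can be illustrated with `zipfile` as a stand-in for `ExcelWriter` (an assumption made to keep the sketch dependency-free). It shows that the bytes are only complete after `close()`, that `getvalue()` ignores the position, and that `read()` needs `seek(0)`.

```python
import io
import zipfile

output = io.BytesIO()
archive = zipfile.ZipFile(output, "w")
archive.writestr("sheet1.txt", "a,b\n1,2\n")

# Before close() the archive has no central directory yet.
valid_before_close = zipfile.is_zipfile(io.BytesIO(output.getvalue()))

archive.close()  # writes the central directory into the buffer

# getvalue() returns everything, wherever the position is.
data = output.getvalue()
valid_after_close = zipfile.is_zipfile(io.BytesIO(data))

# read() starts at the current position, which is the end after writing.
read_without_seek = output.read()
output.seek(0)
read_after_seek = output.read()
```

So `seek(0)` matters when the buffer object itself is handed to something that calls `read()`, as `send_file` does, while `getvalue()` returns the full contents either way.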

How to download data from azure-storage using get_blob_to_stream

I have some files in my Azure Storage account and need to download them using get_blob_to_stream. It returns an azure.storage.blob.models.Blob object, so I couldn't download the file with the code below.
def download(request):
    file_name = request.POST['tmtype']
    fp = open(file_name, 'wb')
    generator = block_blob_service.list_blobs(container_name)
    for blob in generator:
        print(blob.name)
        if blob.name == file_name:
            blob = block_blob_service.get_blob_to_stream(container_name, blob.name, fp, max_connections=2)
    response = HttpResponse(blob, content_type="image/png")
    response['Content-Disposition'] = "attachment; filename=" + file_name
    return response
You can actually use the get_blob_to_path method. Below is an example in Python:
from azure.storage.blob import BlockBlobService
bb = BlockBlobService(account_name='', account_key='')
container_name = ""
blob_name_to_download = "test.txt"
file_path ="/home/Adam/Downloaded_test.txt"
bb.get_blob_to_path(container_name, blob_name_to_download, file_path, open_mode='wb', snapshot=None, start_range=None, end_range=None, validate_content=False, progress_callback=None, max_connections=2, lease_id=None, if_modified_since=None, if_unmodified_since=None, if_match=None, if_none_match=None, timeout=None)
This example will download a blob named "test.txt" from a container to the file_path "/home/Adam/Downloaded_test.txt". You can also keep the same name if you'd like. You can find more samples, including this one, at https://github.com/adamsmith0016/Azure-storage
If you want to use get_blob_to_stream, you can download with the code below:
import io

with io.open(file_path, 'wb') as file:
    blob = block_blob_service.get_blob_to_stream(
        container_name=container_name,
        blob_name=blob_name,
        stream=file,
        max_connections=2)
Just note that the file content will be streamed to the file rather than the returned blob object. The blob.content should be None. That is by design. See https://github.com/Azure/azure-storage-python/issues/538.
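To make that behaviour concrete without the Azure SDK, here is a sketch using hypothetical stand-ins: `FakeBlobService` and `FakeBlob` are invented for illustration and only mimic the described contract, where bytes go to the stream and the returned object carries no content. `download_to_memory` is likewise an illustrative helper.

```python
import io


class FakeBlob:
    """Hypothetical stand-in for the Blob object returned by the SDK."""
    def __init__(self, name):
        self.name = name
        self.content = None  # content is not populated when streaming


class FakeBlobService:
    """Hypothetical stand-in that mimics get_blob_to_stream for this sketch."""
    def __init__(self, blobs):
        self._blobs = blobs

    def get_blob_to_stream(self, container_name, blob_name, stream):
        stream.write(self._blobs[(container_name, blob_name)])
        return FakeBlob(blob_name)


def download_to_memory(service, container_name, blob_name):
    """Stream a blob into memory and return (metadata object, bytes)."""
    buf = io.BytesIO()
    blob = service.get_blob_to_stream(container_name, blob_name, buf)
    return blob, buf.getvalue()


service = FakeBlobService({("images", "logo.png"): b"\x89PNG fake bytes"})
blob, data = download_to_memory(service, "images", "logo.png")
```

In a view like the one in the question, it would be `data` (the bytes taken from the stream), not `blob`, that gets passed to HttpResponse.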

django return file over HttpResponse - file is not served correctly

I want to return some files in a HttpResponse and I'm using the following function. The file that is returned always has a filesize of 1kb and I do not know why. I can open the file, but it seems that it is not served correctly. Thus I wanted to know how one can return files with django/python over a HttpResponse.
#login_required
def serve_upload_files(request, file_url):
    import os.path
    import mimetypes
    mimetypes.init()
    try:
        file_path = settings.UPLOAD_LOCATION + '/' + file_url
        fsock = open(file_path,"r")
        #file = fsock.read()
        #fsock = open(file_path,"r").read()
        file_name = os.path.basename(file_path)
        file_size = os.path.getsize(file_path)
        print "file size is: " + str(file_size)
        mime_type_guess = mimetypes.guess_type(file_name)
        if mime_type_guess is not None:
            response = HttpResponse(fsock, mimetype=mime_type_guess[0])
        response['Content-Disposition'] = 'attachment; filename=' + file_name
    except IOError:
        response = HttpResponseNotFound()
    return response
Edit:
The bug is actually not a bug ;-)
This solution is working in production on an apache server, thus the source is ok.
While writing this question I tested it locally with the Django development server and wondered why it did not work. A friend told me this issue can arise if the MIME types are not set in the server, but he was not sure that this is the problem. One thing is for sure: it has something to do with the server.
Could it be that the file contains some non-ascii characters that render ok in production but not in development?
Try reading the file as binary:
fsock = open(file_path,"rb")
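A small standard-library sketch of why binary mode is the safer choice for serving files: text mode may translate newlines or try to decode the bytes, while `"rb"` returns them untouched. It also shows that `mimetypes.guess_type` returns a tuple, so it is the first element rather than the tuple that should be checked for None. The file names and sample bytes are illustrative.

```python
import mimetypes
import os
import tempfile

original = b"caf\xc3\xa9\r\n\x00\xff binary tail"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.pdf")
    with open(path, "wb") as f:
        f.write(original)

    # Binary mode returns the stored bytes unchanged.
    with open(path, "rb") as f:
        served = f.read()

# guess_type returns a (type, encoding) tuple, so the tuple itself is never None.
known_type, _ = mimetypes.guess_type("sample.pdf")
unknown_type, _ = mimetypes.guess_type("sample.zzunknownext")
content_type = unknown_type or "application/octet-stream"
```

Falling back to `application/octet-stream` when the guess is None mirrors what the `send_file` helper in the first question does.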
Try passing the fsock iterator as a parameter to HttpResponse(), rather than to its write() method which I think expects a string.
response = HttpResponse(fsock, mimetype=...)
See http://docs.djangoproject.com/en/dev/ref/request-response/#passing-iterators
Also, I'm not sure you want to call close on your file before returning response. Having played around with this in the shell (I've not tried this in an actual Django view), it seems that the response doesn't access the file until the response itself is read. Trying to read a HttpResponse created using a file that is now closed results in a ValueError: I/O operation on closed file.
So, you might want to leave fsock open, and let the garbage collector deal with it after the response is read.
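One alternative to relying on the garbage collector, sketched here under the assumption that the response accepts any iterator of byte chunks, is a generator that owns the file. The file stays open while chunks are being consumed and is closed when iteration finishes. `iter_file_chunks` is an illustrative helper, not a Django API.

```python
import os
import tempfile


def iter_file_chunks(path, chunk_size=8192):
    """Yield a file's bytes in chunks; the file closes when iteration ends."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "upload.bin")
    payload = bytes(range(256)) * 100  # 25,600 bytes
    with open(path, "wb") as f:
        f.write(payload)

    chunks = list(iter_file_chunks(path, chunk_size=4096))
```

Because the `open` call only runs once the first chunk is requested, a missing file surfaces during iteration rather than when the generator is created, so any existence check needs to happen before the response is built.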
Try disabling "django.middleware.gzip.GZipMiddleware" from your MIDDLEWARE_CLASSES in settings.py
I had the same problem, and after I looked around the middleware folder, this middleware seemed guilty to me and removing it did the trick for me.