Flask response to excel file giving corrupt excel file - flask

I have a website with a button. When it's clicked, it writes a number of pandas dataframes into an Excel file and returns that file automatically as a download.
It seems to work OK, except that when I open the file, it appears to be corrupted: Excel asks whether some of the tabs should be recovered. I'm using the code below. Any suggestions about what could be causing this are appreciated.
import io
from flask.helpers import make_response
from pandas.io.excel import ExcelWriter
output = io.BytesIO()
writer = ExcelWriter(output)
dfs = [df1,df2....]
tab_names = ['tab1','tab2',....]
for df, tab_name in zip(dfs, tab_names):
    df.to_excel(writer, sheet_name=tab_name)
writer.close()
resp = make_response(output.getvalue())
resp.headers['Content-Disposition'] = 'attachment; filename=output.xlsx'
resp.headers["Content-type"] = "text/csv"
return resp

You'll need to add
output.seek(0)
after you close the writer.
You might also find it easier to write
return send_file(output, attachment_filename="output.xlsx", as_attachment=True)
(after importing send_file from flask; since Flask 2.0 the attachment_filename argument is named download_name)
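To see why the stream position matters, here is a minimal sketch that uses only the standard library. It writes a small zip archive into a BytesIO as a stand-in for the ExcelWriter step (an .xlsx workbook is itself a zip container); the helper name build_archive and the file name inside the archive are made up for illustration. Code that reads the buffer from its current position, as a file-sending helper would be expected to do, sees nothing until seek(0) is called, whereas getvalue() returns the full contents regardless of the position.

```python
import io
import zipfile


def build_archive() -> io.BytesIO:
    """Write a small zip archive into memory (stand-in for an .xlsx workbook)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("sheet1.txt", "hello")
    return buffer


buffer = build_archive()

# After writing, the position sits at the end of the data.
position_after_write = buffer.tell()

# Reading from the current position therefore yields nothing.
unread = buffer.read()

# Rewinding makes the whole archive readable again.
buffer.seek(0)
full = buffer.read()

# getvalue() ignores the position and always returns everything.
snapshot = buffer.getvalue()
```

So a buffer passed as a file object should be rewound first, while a response built from getvalue() already carries all the bytes.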

Related

Pandas closing django-like file, gives ValueError: I/O operation on closed file when uploading

In my process I need to upload a file to Django as:
newFile = request.FILES['file']
Then, in another big function, I open it with pandas:
data = pandas.read_csv(data_file, engine = 'python', header=headers_row, encoding = 'utf-8-sig')
and then I need to upload it:
uploaded_file = Uploaded_file(file = newFile, retailer = ret, date = date)
But randomly (about half the time) I get a ValueError: I/O operation on closed file.
Is there a solution to this? Is it possible to open the file again, or to make a copy of it, use pandas on one and upload the other?
I tried the latter, but I'm not sure of the implications of going this route:
from io import BytesIO
output = BytesIO(newFile.file.read())
For now it works, but I'd appreciate any input on this.
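A minimal sketch of the copy approach, using only the standard library. It rests on an assumption that the question does not confirm: that the parser wraps the upload in a text layer and that closing this layer also closes the underlying file. The sample bytes and variable names are hypothetical, and csv.reader stands in for pandas.read_csv.

```python
import csv
import io

# Stand-in for the uploaded file's binary stream (hypothetical sample data,
# starting with a UTF-8 byte order mark to match encoding='utf-8-sig').
upload = io.BytesIO(b"\xef\xbb\xbfname,qty\nwidget,3\n")

# Read the bytes once, then give each consumer its own independent buffer.
payload = upload.read()
for_parsing = io.BytesIO(payload)
for_saving = io.BytesIO(payload)

# Closing a text wrapper also closes the binary buffer underneath it.
text = io.TextIOWrapper(for_parsing, encoding="utf-8-sig", newline="")
rows = list(csv.reader(text))
text.close()

parsing_closed = for_parsing.closed

# The second copy is unaffected and can still be stored.
saving_closed = for_saving.closed
saved_bytes = for_saving.read()
```

The main implication of this route is memory: every copy holds the whole file, which is fine for small uploads but worth reconsidering for large ones.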

Django download excel file results in corrupted Excel file

I am trying to export a Pandas dataframe from a Django app as an Excel file. I currently export it to a CSV file, and this works fine except for one problem. When the user opens the CSV file in Excel, a string that looks like a number (for example a cell with a value of '111,1112' or '123E345', which is intended to be a string) ends up showing as an error or in exponent notation, even if I make sure that the Pandas column is not numeric.
This is how I export to CSV:
response = HttpResponse(content_type='text/csv')
filename = 'aFileName'
response['Content-Disposition'] = 'attachment; filename="' + filename + '"'
df.to_csv(response, encoding='utf-8', index=False)
return response
To export with content type EXCEL, I saw several references where the following approach was recommended:
with BytesIO() as b:
    # Use the BytesIO object as the filehandle.
    writer = pd.ExcelWriter(b, engine='xlsxwriter')
    df.to_excel(writer, sheet_name='Sheet1')
    writer.save()
    return HttpResponse(b.getvalue(), content_type='application/vnd.ms-excel')
When I try this, it appears to export something, but if I try to open the file in Excel (Office 2016), I get an Excel message that the file is corrupted. Generally the size is a few KB at most.
Please advise what could be wrong with the 2nd approach that is causing a bad export. I am using Django 2.2.1, Pandas 0.25.1
Thank you
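One thing that can be checked without Excel is whether the exported bytes match the declared type. The xlsxwriter engine produces the zip-based .xlsx format, whose MIME type is application/vnd.openxmlformats-officedocument.spreadsheetml.sheet, while application/vnd.ms-excel is the type for the legacy .xls format. Whether that mismatch is what causes the message here is an assumption. The sketch below, with the made-up helper name guess_excel_mime, inspects the leading bytes of a payload:

```python
import io
import zipfile

ZIP_MAGIC = b"PK\x03\x04"                        # zip local file header (.xlsx)
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # OLE2 compound file (.xls)

XLSX_MIME = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
XLS_MIME = "application/vnd.ms-excel"


def guess_excel_mime(data: bytes):
    """Return the MIME type suggested by the leading bytes, or None.

    A zip signature only shows the container type; it does not prove
    that the archive is a valid workbook.
    """
    if data.startswith(ZIP_MAGIC):
        return XLSX_MIME
    if data.startswith(OLE2_MAGIC):
        return XLS_MIME
    return None


# Build a tiny zip in memory as a stand-in for an exported .xlsx file.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as archive:
    archive.writestr("placeholder.txt", "not a real workbook")

zip_guess = guess_excel_mime(sample.getvalue())
ole_guess = guess_excel_mime(OLE2_MAGIC + b"\x00" * 8)
csv_guess = guess_excel_mime(b"a,b\n1,2\n")
```

If the bytes look like a zip container, sending them with the .xlsx type and a Content-Disposition filename ending in .xlsx keeps the name, the type, and the content consistent.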

xlwt cannot format number to date

Why can't the following code format 44000 as a date in Excel? It shows up in the xls file as the original number no matter what I try.
Things I have tried:
Different format string, none works. I copy them from source file so no mistake here
Check style object with breakpoint, it gets the correct num_format_str
quote or un-quote the number
I am using Mac Preview to open the xls file if that's relevant.
import xlwt
book = xlwt.Workbook(encoding='utf8')
sheet = book.add_sheet('sheet 1')
style = xlwt.easyxf(num_format_str="M/D/YY")
sheet.write(1, 1, 44000, style=style)
response = HttpResponse(mimetype='application/vnd.ms-excel')
response['Content-Disposition'] = 'attachment; filename=test.xls'
book.save(response)
return response
There is no problem with the code. The problem is with Mac Preview. I opened the file in Excel on Windows and 44000 shows as a date.
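For reference, the cell still stores the number 44000; the number format only changes how a spreadsheet application displays it. A minimal sketch of the conversion, assuming the default 1900 date system and a serial number of 61 or more (earlier serials are shifted by Excel's nonexistent 29 February 1900); the helper name excel_serial_to_date is made up for illustration:

```python
from datetime import date, timedelta

# Day zero that makes serial numbers from 1 March 1900 onward line up
# with real calendar dates in the 1900 date system.
EXCEL_EPOCH = date(1899, 12, 30)


def excel_serial_to_date(serial: int) -> date:
    """Convert a whole-number Excel serial (>= 61) to a calendar date."""
    return EXCEL_EPOCH + timedelta(days=serial)


converted = excel_serial_to_date(44000)
```

With these assumptions 44000 corresponds to 18 June 2020, which the "M/D/YY" format would display as 6/18/20 in an application that honours the format.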

Django to serve generated excel file

I looked at the various questions similar to mine, but I could not find a fix for my problem.
In my code, I want to serve a freshly generated Excel file that resides in my app directory, in a folder named files:
excelFile = ExcelCreator.ExcelCreator("test")
excelFile.create()
response = HttpResponse(content_type='application/vnd.ms-excel')
response['Content-Disposition'] = 'attachment; filename="test.xls"'
return response
So when I click on the button that runs this part of the code, it sends the user an empty file. Looking at my code, I can understand that behavior, because I never point to that file in my response...
I saw some people use the file wrapper (whose purpose I don't quite understand). So I tried this:
response = HttpResponse(FileWrapper(excelFile.file),content_type='application/vnd.ms-excel')
But then, I receive the error message from server : A server error occurred. Please contact the administrator.
Thanks for helping me in my Django quest, I'm getting better with all of your precious advices!
First, you need to understand how this works. You are getting an empty file because that is exactly what your code builds:
response = HttpResponse(content_type='application/vnd.ms-excel')
response['Content-Disposition'] = 'attachment; filename="test.xls"'
HttpResponse receives the content of the response as its first argument; take a look at its constructor:
def __init__(self, content='', mimetype=None, status=None, content_type=None):
So you need to create the response with the content you want, in this case the content of your .xls file.
You can use any method to do that, just be sure the content is there.
Here is a sample:
import io
# an .xls file is binary data, so use a bytes buffer
output = io.BytesIO()
# read your content and put it in output var
out_content = output.getvalue()
output.close()
response = HttpResponse(out_content, content_type='application/vnd.ms-excel')
response['Content-Disposition'] = 'attachment; filename="test.xls"'
I would recommend you use:
python manage.py runserver
to run your application from the command line. From here you will see the console output of your application and any exceptions that are thrown as it runs. This may provide a quick resolution to your problem.

django return file over HttpResponse - file is not served correctly

I want to return some files in a HttpResponse and I'm using the following function. The file that is returned always has a filesize of 1kb and I do not know why. I can open the file, but it seems that it is not served correctly. Thus I wanted to know how one can return files with django/python over a HttpResponse.
#login_required
def serve_upload_files(request, file_url):
    import os.path
    import mimetypes
    mimetypes.init()
    try:
        file_path = settings.UPLOAD_LOCATION + '/' + file_url
        fsock = open(file_path,"r")
        #file = fsock.read()
        #fsock = open(file_path,"r").read()
        file_name = os.path.basename(file_path)
        file_size = os.path.getsize(file_path)
        print "file size is: " + str(file_size)
        mime_type_guess = mimetypes.guess_type(file_name)
        if mime_type_guess is not None:
            response = HttpResponse(fsock, mimetype=mime_type_guess[0])
        response['Content-Disposition'] = 'attachment; filename=' + file_name
    except IOError:
        response = HttpResponseNotFound()
    return response
Edit:
The bug is actually not a bug ;-)
This solution works in production on an Apache server, so the source is OK.
While writing this question I tested it locally with the Django development server and wondered why it did not work. A friend of mine told me that this issue could arise if the MIME types are not set on the server, but he was not sure that this was the problem. One thing is for sure: it has something to do with the server.
Could it be that the file contains some non-ascii characters that render ok in production but not in development?
Try reading the file as binary:
fsock = open(file_path,"rb")
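A small sketch of the difference, written for Python 3 and using a temporary file with made-up bytes. The question's code is Python 2, where text mode on Windows translates line endings rather than decoding, so the failure looks different there, but the remedy is the same: binary mode returns the bytes exactly as stored.

```python
import os
import tempfile

# Hypothetical binary payload: bytes that are not valid UTF-8, plus a CRLF.
payload = b"\xd0\xcf\x11\xe0\r\n\x00\x01"

with tempfile.NamedTemporaryFile(delete=False, suffix=".xls") as tmp:
    tmp.write(payload)
    path = tmp.name

try:
    # Binary mode: the content comes back byte for byte.
    with open(path, "rb") as fsock:
        binary_content = fsock.read()

    # Text mode: Python 3 tries to decode the bytes and fails on this data.
    try:
        with open(path, "r", encoding="utf-8") as fsock:
            fsock.read()
        text_mode_failed = False
    except UnicodeDecodeError:
        text_mode_failed = True
finally:
    os.remove(path)
```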
Try passing the fsock iterator as a parameter to HttpResponse(), rather than to its write() method which I think expects a string.
response = HttpResponse(fsock, mimetype=...)
See http://docs.djangoproject.com/en/dev/ref/request-response/#passing-iterators
Also, I'm not sure you want to call close on your file before returning response. Having played around with this in the shell (I've not tried this in an actual Django view), it seems that the response doesn't access the file until the response itself is read. Trying to read a HttpResponse created using a file that is now closed results in a ValueError: I/O operation on closed file.
So, you might want to leave fsock open, and let the garbage collector deal with it after the response is read.
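The lazy-read behavior described above can be reproduced without Django. In this sketch the generator iter_chunks is a made-up stand-in for a response body that pulls data from the file only when it is consumed, and a BytesIO plays the role of fsock:

```python
import io


def iter_chunks(fileobj, size=4):
    """Yield the file's content in chunks, reading only when iterated."""
    while True:
        chunk = fileobj.read(size)
        if not chunk:
            break
        yield chunk


# Case 1: the file is closed before the lazy body is consumed.
source = io.BytesIO(b"spreadsheet bytes")
lazy_body = iter_chunks(source)
source.close()
try:
    b"".join(lazy_body)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Case 2: the content is read eagerly, so closing afterwards is harmless.
source = io.BytesIO(b"spreadsheet bytes")
eager_body = source.read()
source.close()
```

Reading the content into memory before closing sidesteps the problem, at the cost of holding the whole file in memory at once.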
Try disabling "django.middleware.gzip.GZipMiddleware" from your MIDDLEWARE_CLASSES in settings.py
I had the same problem, and after I looked around the middleware folder, this middleware seemed guilty to me and removing it did the trick for me.