Django csv output in Excel - django

Hi I have a simple view which returns a csv file of a queryset which is generated from a mysql db using utf-8 encoding:
def export_csv(request):
    ...
        response = HttpResponse(mimetype='text/csv')
        response['Content-Disposition'] = 'attachment; filename=search_results.csv'
        writer = csv.writer(response, dialect=csv.excel)
        for item in query_set:
            writer.writerow(smart_str(item))
        return response
    return render(request, 'search_results.html', context)
This works fine as a CSV file, and can be opened in text editors, LibreOffice etc. without problem.
However, I need to supply a file which can be opened in MS Excel in Windows without errors. If I have strings with latin characters in the queryset such as 'Española' then the output in Excel is 'Española'.
I tried this blogpost but it didn't help. I also know about the xlwt package, but I am curious whether there is a way of correcting the output using the CSV method I have at the moment.
Any help much appreciated.

It looks like there is no uniform solution for all versions of Excel.
Your best bet might be to go with openpyxl, but this is rather complicated and requires
separate handling of downloads for Excel users, which is not optimal.
Try adding a byte order mark (0xEF, 0xBB, 0xBF) at the beginning of the file. See microsoft-excel-mangles-diacritics-in-csv-files
There is another similar post.
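The byte order mark suggestion above can be sketched with the standard library alone. This is a minimal sketch, not a tested fix for every Excel version: the helper name `rows_to_csv_with_bom` and the sample row are made up for illustration, and in a Django view the resulting bytes would be written to the response instead of being kept in a variable.

```python
import codecs
import csv
import io


def rows_to_csv_with_bom(rows):
    """Serialise rows as CSV and prepend the UTF-8 byte order mark."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer, dialect=csv.excel).writerows(rows)
    # codecs.BOM_UTF8 is the three bytes 0xEF, 0xBB, 0xBF.
    return codecs.BOM_UTF8 + buffer.getvalue().encode("utf-8")


# Hypothetical sample data, standing in for the queryset.
payload = rows_to_csv_with_bom([["Española", "México"]])
```

Decoding the payload with the `utf-8-sig` codec strips the mark again, which is a quick way to check that the text itself is unchanged.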

You might give python-unicodecsv a go. It replaces the Python csv module, which doesn't handle Unicode too gracefully.
Put the unicodecsv folder somewhere you can import it, or install it via setup.py.
Import it into your view file, e.g.:
import unicodecsv as csv

I found out there are 3 things to do for Excel to open unicode csv files properly:
Use utf-16-le charset
Insert utf-16 byte order mark to the beginning of exported file
Use tabs instead of commas in csv
So, this should make it work in Python 3.7 and Django 2.2:
import codecs
...
def export_csv(request):
    ...
    response = HttpResponse(content_type='text/csv', charset='utf-16-le')
    response['Content-Disposition'] = 'attachment; filename=search_results.csv'
    response.write(codecs.BOM_UTF16_LE)
    writer = csv.writer(response, dialect='excel-tab')
    for item in query_set:
        writer.writerow(smart_str(item))
    return response

Related

Django PIPE youtube-dl to view for download

TL;DR: I want to pipe the output of youtube-dl to the user's browser on a button click, without having to save the video on my server's disk.
So I'm trying to have a "download" button on a page (django backend) where the user is able to download the video they're watching.
I am using the latest version of youtube-dl.
In my download view I have this piece of code:
with youtube_dl.YoutubeDL(ydl_opts) as ydl:
    file = ydl.download([f"https://clips.twitch.tv/{pk}"])
And it works, to some extent. It does download the file to my machine, but I am not sure how to let users download the file.
I thought of a few ways to achieve this, but the only one that really works for me would be a way to pipe the download to user(client) without needing to store any video on my disk. I found this issue on the same matter, but I am not sure how to make it work. I successfully piped the download to stdout using ydl_opts = {'outtmpl': '-'}, but I'm not sure how to pipe that to my view's response. One of the responses from a maintainer mentions a subprocess.Popen, I looked it up but couldn't make out how it should be implemented in my case.
I did a workaround.
I download the file with a specific name, I return the view with HttpResponse, with force-download content-type, and then delete the file using python.
It's not what I originally had in mind, but it's the second best solution that I could come up with. I will select this answer as accepted solution until a Python wizard gives a solution to the original question.
The code that I have right now:
def download_clip(request, pk):
    ydl_opts = {
        'outtmpl': f"{pk}.mp4"
    }
    with youtube_dl.YoutubeDL(ydl_opts) as ydl:
        ydl.download([f"https://clips.twitch.tv/{pk}"])
    path = f"{pk}.mp4"
    file_path = os.path.join(path)
    if os.path.exists(file_path):
        with open(file_path, 'rb') as fh:
            response = HttpResponse(fh.read(), content_type="application/force-download")
            response['Content-Disposition'] = 'inline; filename=' + os.path.basename(file_path)
            os.remove(file_path)
            return response
    raise Http404
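For the original piping question, one possible direction is a generator that reads a subprocess's stdout in chunks, so nothing is written to disk. This is only a sketch under assumptions: the helper name `stream_command` is made up, the demo runs a stand-in Python command so it works anywhere, and the real command (something like `youtube-dl -o - <url>`) is not exercised here.

```python
import subprocess
import sys


def stream_command(cmd, chunk_size=8192):
    """Yield the stdout of cmd in chunks without saving it to disk."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    try:
        while True:
            chunk = proc.stdout.read(chunk_size)
            if not chunk:
                break
            yield chunk
    finally:
        # Runs on normal completion and when the consumer stops early.
        proc.stdout.close()
        proc.wait()


# Stand-in command (assumption): prints 20000 bytes to stdout.
demo_cmd = [sys.executable, "-c", "import sys; sys.stdout.write('x' * 20000)"]
total = sum(len(chunk) for chunk in stream_command(demo_cmd))
```

In a Django view, a generator like this could be handed to `StreamingHttpResponse` together with a `Content-Disposition` header; that wiring is not tested here.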

read text file content with python at zapier

I have problems getting the content of a txt-file into a Zapier
object using https://zapier.com/help/code-python/. Here is the code I am
using:
with open('file', 'r') as content_file:
    content = content_file.read()
I'd be glad if you could help me with this. Thanks for that!
David here, from the Zapier Platform team.
Your code as written doesn't work because the first argument for the open function is the filepath. There's no file at the path 'file', so you'll get an error. You access the input via the input_data dictionary.
That being said, the input is a url, not a file. You need to use urllib to read that url. I found the answer here.
I've got a working copy of the code like so:
import urllib2 # the lib that handles the url stuff
result = []
data = urllib2.urlopen(input_data['file'])
for line in data: # file lines are iterable
    result.append(line) # keep each line, or parse, etc.
return {'lines': result}
The key takeaway is that you need to return a dictionary from the function, so make sure you somehow squish your file into one.
Let me know if you've got any other questions!
@xavid, did you test this in Zapier?
It fails miserably because urllib2 doesn't exist in the Zapier Python environment.
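If the environment runs Python 3 (an assumption, the thread does not say), the standard-library counterpart of `urllib2.urlopen` is `urllib.request.urlopen`. The sketch below wraps it in a made-up helper, `read_lines`, and uses a `data:` URL so it runs offline; in Zapier the URL would come from `input_data['file']` as in the answer above.

```python
import urllib.request


def read_lines(url):
    """Fetch url and return its lines wrapped in a dictionary."""
    with urllib.request.urlopen(url) as data:
        text = data.read().decode("utf-8")
    return {"lines": text.splitlines()}


# Offline stand-in for the real file URL (assumption).
output = read_lines("data:text/plain,first%0Asecond")
```

Returning a dictionary keeps the shape the answer describes.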

pyPdf Splitting Large PDF fails after splitting 150-152 pages of the PDF

I have a function that takes in PDF file path as input and splits it into separate pages as shown below:
import os,time
from pyPdf import PdfFileReader, PdfFileWriter
def split_pages(file_path):
    print("Splitting the PDF")
    temp_path = os.path.join(os.path.abspath(__file__), "temp_"+str(int(time.time())))
    if not os.path.exists(temp_path):
        os.makedirs(temp_path)
    inputpdf = PdfFileReader(open(file_path, "rb"))
    if inputpdf.getIsEncrypted():
        inputpdf.decrypt('')
    for i in xrange(inputpdf.numPages):
        output = PdfFileWriter()
        output.addPage(inputpdf.getPage(i))
        with open(os.path.join(temp_path,'%s.pdf'% i),"wb") as outputStream:
            output.write(outputStream)
It works for small files, but when the PDF has more than 152 pages it only splits the first pages (0-151) and then stops. It also eats up all of the system's memory before I kill it.
Please let me know what I'm doing wrong, where the problem occurs, and how to correct it.
It seems the problem is with pyPdf itself. I switched to PyPDF2 and it worked.

Django: exporting model data to excel file messes up character set

I'm trying to export model data to a Microsoft Excel file type (.xls) by using this view:
def generate_spreadsheet(request):
    alumnos = Alumno.objects.all()
    response = render_to_response("spreadsheet.html", {'alumnos': alumnos})
    filename = "alumnoss.xls"
    response['Content-Disposition'] = 'attachment; filename='+filename
    response['Content-Type'] = 'application/vnd.ms-excel; charset=utf-16'
    return response
As you can see, I define the character set as utf-16 which should include all of the extra characters like áéíóú, etc. But when I open the excel document, instead of reading
Vélez
you read:
VÃ©lez
Any help would be appreciated :)
You can set the charset used for rendering by defining DEFAULT_CHARSET in your settings.py file:
http://docs.djangoproject.com/en/1.3/ref/settings/#default-charset
render_to_response() is probably writing in utf-8, not in utf-16.
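The effect of a charset mismatch can be reproduced in plain Python. This is an illustration under an assumption: it decodes UTF-8 bytes with the Windows code page `cp1252`, which is one plausible way such garbling arises, not a statement of what Excel does internally. The helper name `misread_utf8` is made up.

```python
def misread_utf8(text, wrong_encoding="cp1252"):
    """Encode text as UTF-8, then decode the bytes with the wrong codec."""
    return text.encode("utf-8").decode(wrong_encoding)


garbled = misread_utf8("Vélez")

# Encoding with the charset the header announces keeps bytes and label in sync.
utf16_bytes = "Vélez".encode("utf-16")
round_trip = utf16_bytes.decode("utf-16")
```

The two bytes that UTF-8 uses for `é` are shown as two separate characters when read with a single-byte code page, which is why the name grows by one character.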

Serving a dynamically generated MS Excel files using django and xlwt fails in Internet Explorer

I am trying to use xlwt to create MS-Excel files from the contents of the database on my django site.
I have seen several solutions here on stackoverflow, in particular this link: django excel xlwt
and this django snippet: http://djangosnippets.org/snippets/2233/
These examples work in Firefox, but not in Internet Explorer. Instead of getting prompted to open or save a file, a bunch of wingding junk appears on the screen. It seems that IE thinks the response is HTML.
Here is my view function:
def exportexcel(request):
    from xlwt import Workbook
    wb = Workbook()
    ws = wb.add_sheet('Sheetname')
    ws.write(0, 0, 'Firstname')
    ws.write(0, 1, 'Surname')
    ws.write(1, 0, 'Hans')
    ws.write(1, 1, 'Muster')
    fname = 'testfile.xls'
    response = HttpResponse(mimetype="application/ms-excel")
    response['Content-Disposition'] = 'attachment; filename=%s' % fname
    wb.save(response)
    return response
I am seeing this behavior in IE 8.
Any suggestions as to why this isn't working in Internet Explorer?
Thanks.
The mimetype you're using, application/ms-excel, is invalid for .xls files.
The standard one is application/vnd.ms-excel.
See "Setting mime type for excel document" for more information.
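One way to double-check a MIME type rather than typing it from memory is Python's standard `mimetypes` module, which maps file extensions to registered types. A small sketch, assuming the default type map has not been overridden on the system:

```python
import mimetypes

# guess_type returns a (type, encoding) tuple based on the file extension.
xls_type, _ = mimetypes.guess_type("testfile.xls")
```

The resulting string could then be passed to the response instead of a hand-written value.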