Django - writing a PDF created with xhtml2pdf to server's disk - django

I'm trying to write a script that will save a pdf created by xhtml2pdf directly to the server, without doing the usual route of prompting the user to download it to their computer. Documents() is the Model I am trying to save to, and the new_project and output_filename variables are set elsewhere.
html = render_to_string(template, RequestContext(request, context)).encode('utf8')
result = open(output_filename, "wb")
pdf = CreatePDF(src=html, dest=results, path = "", encoding = 'UTF-8', link_callback=link_callback) #link callback was originally set to link_callback, defined below
result.close()
if not pdf.err:
new_doc=Documents()
new_doc.project=new_project
new_doc.posted_by=old_mess[0].from_user_fk.username
new_doc.documents = result
new_doc.save()
With this configuration when it reaches new_doc.save() I get the error: 'file' object has no attribute '_committed'
Does anyone know how I can fix this? Thanks!

After playing around with it I found a working solution. The issue was I was not creating the new Document while result (the pdf) was still open.
"+" needed to be added to open() so that the pdf file was available for reading and writing, and not just writing.
Note that this does save the pdf in a different folder first (Files). If that is not the desired outcome for your application you will need to delete it.
html = render_to_string(template, RequestContext(request, context)).encode('utf8')
results = StringIO()
result = open("Files/"+output_filename, "w+b")
pdf = CreatePDF(src=html, dest=results, path = "", encoding = 'UTF-8', link_callback=link_callback) #link callback was originally set to link_callback, defined below
if not pdf.err:
result.write(results.getvalue())
new_doc=Documents()
new_doc.project=new_project
new_doc.documents.save(output_filename, File(result))
new_doc.posted_by=old_mess[0].from_user_fk.username
new_doc.save()
result.close()

Related

Openpyxl Django

Trying to automate modifying an Ecxel file using openpyxl lib and below code works:
def modify_excel_view(request):
wb = openpyxl.load_workbook('C:\\my_dir\\filename.xlsx')
sh1 = wb['Sheet1']
row = sh1.max_row
column = sh1.max_column
for i in range(2, row + 1):
# ...some_code...
wb.save('C:\\my_dir\\new_filename.xlsx')
return render(request, 'my_app/convert.html', {'wb': wb})
But how to implement a similar method without static (hardcoding) path to file and a filename? Like choosing xlsx file from modal window?
Saving modified copy on a desktop by default? (Maybe like 'ReportLab library': response.write(pdf)=>return response. And it saving on the desktop by default.)
Thank you in advance!
You can send the file path and name as POST parameters and extract them in the views.py function using request.POST.get('file_path').
Just add a line to save the copy to the desktop before returning.

Django PIPE youtube-dl to view for download

TL;DR: I want to pipe the output of youtube-dl to the user's browser on a button click, without having to save the video on my server's disk.
So I'm trying to have a "download" button on a page (django backend) where the user is able to download the video they're watching.
I am using the latest version of youtube-dl.
In my download view I have this piece of code:
with youtube_dl.YoutubeDL(ydl_opts) as ydl:
file = ydl.download([f"https://clips.twitch.tv/{pk}"])
And it works, to some extend. It does download the file to my machine, but I am not sure how to allow users to download the file.
I thought of a few ways to achieve this, but the only one that really works for me would be a way to pipe the download to user(client) without needing to store any video on my disk. I found this issue on the same matter, but I am not sure how to make it work. I successfully piped the download to stdout using ydl_opts = {'outtmpl': '-'}, but I'm not sure how to pipe that to my view's response. One of the responses from a maintainer mentions a subprocess.Popen, I looked it up but couldn't make out how it should be implemented in my case.
I did a workaround.
I download the file with a specific name, I return the view with HttpResponse, with force-download content-type, and then delete the file using python.
It's not what I originally had in mind, but it's the second best solution that I could come up with. I will select this answer as accepted solution until a Python wizard gives a solution to the original question.
The code that I have right now:
def download_clip(request, pk):
ydl_opts = {
'outtmpl': f"{pk}.mp4"
}
with youtube_dl.YoutubeDL(ydl_opts) as ydl:
ydl.download([f"https://clips.twitch.tv/{pk}"])
path = f"{pk}.mp4"
file_path = os.path.join(path)
if os.path.exists(file_path):
with open(file_path, 'rb') as fh:
response = HttpResponse(fh.read(), content_type="application/force-download")
response['Content-Disposition'] = 'inline; filename=' + os.path.basename(file_path)
os.remove(file_path)
return response
raise Http404

Directing Output Paths of Altered Files

How can I direct the destination of the output file to my db?
My models.py is structured like so:
class Model(models.Model):
char = models.CharField(max_length=50, null=False, blank=False)
file = models.FileField(upload_to=upload_location, null=True, blank=True)
I have the user enter a value for 'char', and then the value of 'char' is printed on to a file. The process of successfully printing onto the file is working, however, the file is outputting to my source directory.
My goal is to have the output file 'pdf01.pdf' output to my db and be represented as 'file' so that the admin can read it.
Much of the information in the Dango docs has been focussed on directing the path of objects imported by the user directly, not on files that have been created internally. I have been reading mostly from these docs:
Models-Fields
Models
File response objects
Outputting PDFs
I have seen it recommend to write to a buffer, not a file, then save the buffer contents to my db however I haven't been able to find many examples of how to do that relevant to my situation online.
Perhaps there is a relevant gap in my knowledge regarding buffers and BytesIO? Here is the function I have been using to alter the pdf, I have been using BytesIO to temporarily store files throughout the process but have not been able to figure out how to use it to direct the output anywhere specific.
can = canvas.Canvas(BytesIO(), pagesize=letter)
can.drawString(10, 10, char)
can.save()
BytesIO().seek(0)
text_pdf = PdfFileReader(BytesIO())
base_file = PdfFileReader(open("media/01.pdf", "rb"))
page = base_file.getPage(0)
page.mergePage(text_pdf.getPage(0))
PdfFileWriter().addPage(page)
PdfFileWriter().write(open("pdf01.pdf", "wb")
FileField does not store files directly in the database. Files get uploaded in a location on the filesystem determined by the upload_to argument. Only some metadata are stored in the DB, including the path of the file in your filesystem.
If you want to have the contents of the files in the database, you could create a new File model that includes a BinaryField to store the data and a CharField to store the URL from which the file can be fetched. To feed the data of PdfFileWriter to the binary field of Django, perhaps the most appropriate would be to use BytesIO.
I found this workaround to direct the file to a desired location (in this case both my media_cdn folder and also output it to an admin.)
I set up an admin action to perform the function that outputs the file so the admin will have access to both the output version in the form of both an HTTP response and through the media_cdn storage.
Hope this helps anyone who struggles with the same problem.
#admin.py
class edit_and_output():
def output:
author = Account.email
#alter file . . .
with open('media_cdn/account/{0}.pdf'.format(author), 'wb') as out_file:
output.write(out_file)
response = HttpResponse(content_type='application/pdf')
response['Content-Disposition'] = 'attachment;filename="{0}.pdf"'.format(author)
output.write(response)

Save pdf from django-wkhtmltopdf to server (instead of returning as a response)

I'm writing a Django function that takes some user input, and generates a pdf for the user. However, the process for generating the pdf is quite intensive, and I'll get a lot of repeated requests so I'd like to store the generated pdfs on the server and check if they already exist before generating them.
The problem is that django-wkhtmltopdf (which I'm using for generation) is meant to return to the user directly, and I'm not sure how to store it on the file.
I have the following, which works for returning a pdf at /pdf:
urls.py
urlpatterns = [
url(r'^pdf$', views.createPDF.as_view(template_name='site/pdftemplate.html', filename='my_pdf.pdf'))
]
views.py
class createPDF(PDFTemplateView):
filename = 'my_pdf.pdf'
template_name = 'site/pdftemplate.html'
So that works fine to create a pdf. What I'd like is to call that view from another view and save the result. Here's what I've got so far:
#Create pdf
pdf = createPDF.as_view(template_name='site/pdftemplate.html', filename='my_pdf.pdf')
pdf = pdf(request).render()
pdfPath = os.path.join(settings.TEMP_DIR,'temp.pdf')
with open(pdfPath, 'w') as f:
f.write(pdf.content)
This creates temp.pdf and is about the size I'd expect but the file isn't valid (it renders as a single completely blank page).
Any suggestions?
Elaborating on the previous answer given: to generate a pdf file and save to disk do this anywhere in your view:
...
context = {...} # build your context
# generate response
response = PDFTemplateResponse(
request=self.request,
template=self.template_name,
filename='file.pdf',
context=context,
cmd_options={'load-error-handling': 'ignore'})
# write the rendered content to a file
with open("file.pdf", "wb") as f:
f.write(response.rendered_content)
...
I have used this code in a TemplateView class so request and template fields were set like that, you may have to set it to whatever is appropriate in your particular case.
Well, you need to take a look to the code of wkhtmltopdf, first you need to use the class PDFTemplateResponse in wkhtmltopdf.views to get access to the rendered_content property, this property get us access to the pdf file:
response = PDFTemplateResponse(
request=<your_view_request>,
template=<your_template_to_render>,
filename=<pdf_filename.pdf>,
context=<a_dcitionary_to_render>,
cmd_options={'load-error-handling': 'ignore'})
Now you could use the rendered_content property to get access to the pdf file:
mail.attach('pdf_filename.pdf', response.rendered_content, 'application/pdf')
In my case I'm using this pdf to attach to an email, you could store it.

How to process an uploaded KML file in GeoDjango

I wrote a cmd line routine to import a kml file into a geoDjango application, which works fine when you feed it a locally saved KML file path (using the datasource object).
Now I am writing a web file upload dialog, to achieve the same thing. This is the beginning of the code that I have, problem is, that the GDAL DataSource object does not seem to understand Djangos UploadedFile format. It is held in memory and not a file path as expected.
What would be the best strategy to convert the UploadedFile to a normal file, and access this through a path? I dont want to keep the file after processing.
def createFeatureSet(request):
if request.method == 'POST':
inMemoryFile = request.FILES['myfile']
name = inMemoryFile.name
POSTGIS_SRID = 900913
ds = DataSource(inMemoryFile) #This line doesnt work!!!
for layer in ds:
if layer.geom_type in (OGRGeomType('Point'), OGRGeomType('Point25D'), OGRGeomType('MultiPoint'), OGRGeomType('MultiPoint25D')):
layerGeomType = OGRGeomType('MultiPoint').django
elif layer.geom_type in (OGRGeomType('LineString'),OGRGeomType('LineString25D'), OGRGeomType('MultiLineString'), OGRGeomType('MultiLineString25D')):
layerGeomType = OGRGeomType('MultiLineString').django
elif layer.geom_type in (OGRGeomType('Polygon'), OGRGeomType('Polygon25D'), OGRGeomType('MultiPolygon'), OGRGeomType('MultiPolygon25D')):
layerGeomType = OGRGeomType('MultiPolygon').django
DataSource is a wrapper around GDAL's C API and needs an actual file. You'll need to write your upload somewhere on the disk, for insance using a tempfile. Then you can pass the file to DataSource.
Here is a suggested solution using a tempfile. I put the processing code in its own function which is now called.
f = request.FILES['myfile']
temp = tempfile.NamedTemporaryFile(delete=False)
temp.write(f.read())
temp.close()
createFeatureSet(temp.name, source_SRID= 900913)