Directing Output Paths of Altered Files - django

How can I direct the destination of the output file to my db?
My models.py is structured like so:
class Model(models.Model):
    char = models.CharField(max_length=50, null=False, blank=False)
    file = models.FileField(upload_to=upload_location, null=True, blank=True)
I have the user enter a value for 'char', and then the value of 'char' is printed on to a file. The process of successfully printing onto the file is working, however, the file is outputting to my source directory.
My goal is to have the output file 'pdf01.pdf' output to my db and be represented as 'file' so that the admin can read it.
Much of the information in the Django docs is focused on directing the path of files uploaded by the user, not on files that have been created internally. I have been reading mostly from these docs:
Models-Fields
Models
File response objects
Outputting PDFs
I have seen it recommended to write to a buffer rather than a file and then save the buffer contents to my db; however, I haven't been able to find many examples online of how to do that in my situation.
Perhaps there is a gap in my knowledge regarding buffers and BytesIO? Here is the function I have been using to alter the pdf. I use BytesIO to temporarily store files throughout the process, but I have not been able to figure out how to use it to direct the output anywhere specific.
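from io import BytesIO
from PyPDF2 import PdfFileReader, PdfFileWriter
from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas

# draw the text onto an in-memory PDF (one shared buffer, not a new BytesIO() each time)
packet = BytesIO()
can = canvas.Canvas(packet, pagesize=letter)
can.drawString(10, 10, char)
can.save()
packet.seek(0)
text_pdf = PdfFileReader(packet)
# merge the text page onto the first page of the base file
base_file = PdfFileReader(open("media/01.pdf", "rb"))
page = base_file.getPage(0)
page.mergePage(text_pdf.getPage(0))
output = PdfFileWriter()
output.addPage(page)
output.write(open("pdf01.pdf", "wb"))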

FileField does not store files directly in the database. Files get uploaded in a location on the filesystem determined by the upload_to argument. Only some metadata are stored in the DB, including the path of the file in your filesystem.
If you want to have the contents of the files in the database, you could create a new File model that includes a BinaryField to store the data and a CharField to store the URL from which the file can be fetched. To feed the data of PdfFileWriter to the binary field of Django, perhaps the most appropriate would be to use BytesIO.
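As a rough illustration of the buffer idea, the sketch below collects a writer's output in a BytesIO instead of a file on disk. `FakeWriter` and `render_to_bytes` are hypothetical names used only for this example (they are not part of PyPDF2 or Django); the only assumption is that the real writer exposes a `write(stream)` method, as `PdfFileWriter` does in the question.

```python
from io import BytesIO


class FakeWriter:
    """Hypothetical stand-in for any object with a write(stream) method."""

    def __init__(self, payload):
        self.payload = payload

    def write(self, stream):
        stream.write(self.payload)


def render_to_bytes(writer):
    """Let the writer fill an in-memory buffer and return the raw bytes."""
    buffer = BytesIO()
    writer.write(buffer)
    return buffer.getvalue()


pdf_bytes = render_to_bytes(FakeWriter(b"%PDF-1.4 example"))
print(len(pdf_bytes))
```

The resulting bytes could then be assigned to a BinaryField, or wrapped in django.core.files.base.ContentFile and passed to the FileField's save(name, content) method so that the file is stored under upload_to.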

I found this workaround to direct the file to a desired location (in this case, my media_cdn folder) and also output it to an admin.
I set up an admin action to run the function that outputs the file, so the admin has access to the output both as an HTTP response and through the media_cdn storage.
Hope this helps anyone who struggles with the same problem.
# admin.py
class edit_and_output():
    def output(self):
        author = Account.email
        # alter file . . .
        # write the result (the PDF writer object, here named output) to media storage
        with open('media_cdn/account/{0}.pdf'.format(author), 'wb') as out_file:
            output.write(out_file)
        # also send it back to the admin as a download
        response = HttpResponse(content_type='application/pdf')
        response['Content-Disposition'] = 'attachment;filename="{0}.pdf"'.format(author)
        output.write(response)
        return response

Related

Storing a .html report with a data model

I am new to Django, and I am looking for the best approach to the following problem.
I have an application that produces two reports. One is a JSON blob, so I store it in psql with a data model that uses JSONField.
The second report is a .html file.
The .html file will be generated multiple times a day so the first thing that came to mind was storing it in the db.
I need to be able to pull the report as well so it can be displayed to the user in the UI.
I created a test data model using TextField:
class TestResultsHTML(models.Model):
    name = models.CharField(max_length=200)
    report = models.TextField()
It makes it into the DB without a problem; however, when I attempt to retrieve it I can't seem to get the actual report:
In [3]: html_results = TestResultsHTML.objects.get(id=4)
In [4]: html_results.name
Out[4]: 'b0f5c336-867a-44a3-a5ef-6297bf6042cf'
In [5]: html_results.report
Out[5]: "<_io.TextIOWrapper name='report.html' mode='r' encoding='UTF-8'>"
I was expecting that .report would return the actual contents of the file. The file itself is 1800+ lines.
Is this a good approach or is this not the intended use of TextField?
A TextField stores only the string you assign to it, not a file; the value in Out[5] is the string form of a file object, so the object rather than its contents was saved. Django has a FileField for files. It saves the file to a certain location/folder, and the object saved in the DB essentially stores that location, which you can then access later. Something like this:
class TestResultsHTML(models.Model):
    name = models.CharField(max_length=200)
    file_loc = models.FileField(upload_to=upload_location)
Then open the file at a later date with something like this:
with open(html_results.file_loc.path, 'r') as f:
    report = f.read()
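To make the cause of the `Out[5]` value concrete, here is a small sketch of the difference between the string form of an open file object and its contents. It uses a throwaway temp file; the HTML snippet and variable names are made up for illustration.

```python
import os
import tempfile

# Create a small throwaway report file for the demonstration.
fd, path = tempfile.mkstemp(suffix=".html")
with os.fdopen(fd, "w", encoding="utf-8") as f:
    f.write("<html><body>report</body></html>")

with open(path, "r", encoding="utf-8") as f:
    as_object = str(f)  # what gets stored if the file object itself is assigned
    as_text = f.read()  # the actual contents, which is what a TextField needs

os.remove(path)

print(as_object.startswith("<_io.TextIOWrapper"))
print(as_text)
```

If the report stays in a TextField, assigning the result of read() instead of the file object should be enough for .report to return the HTML.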

Save pdf from django-wkhtmltopdf to server (instead of returning as a response)

I'm writing a Django function that takes some user input, and generates a pdf for the user. However, the process for generating the pdf is quite intensive, and I'll get a lot of repeated requests so I'd like to store the generated pdfs on the server and check if they already exist before generating them.
The problem is that django-wkhtmltopdf (which I'm using for generation) is meant to return the pdf to the user directly, and I'm not sure how to store it in a file instead.
I have the following, which works for returning a pdf at /pdf:
urls.py
urlpatterns = [
    url(r'^pdf$', views.createPDF.as_view(template_name='site/pdftemplate.html', filename='my_pdf.pdf'))
]
views.py
class createPDF(PDFTemplateView):
    filename = 'my_pdf.pdf'
    template_name = 'site/pdftemplate.html'
So that works fine to create a pdf. What I'd like is to call that view from another view and save the result. Here's what I've got so far:
# Create pdf
pdf = createPDF.as_view(template_name='site/pdftemplate.html', filename='my_pdf.pdf')
pdf = pdf(request).render()
pdfPath = os.path.join(settings.TEMP_DIR,'temp.pdf')
with open(pdfPath, 'w') as f:
    f.write(pdf.content)
This creates temp.pdf and is about the size I'd expect but the file isn't valid (it renders as a single completely blank page).
Any suggestions?
Elaborating on the previous answer given: to generate a pdf file and save to disk do this anywhere in your view:
...
context = {...}  # build your context
# generate response
response = PDFTemplateResponse(
    request=self.request,
    template=self.template_name,
    filename='file.pdf',
    context=context,
    cmd_options={'load-error-handling': 'ignore'})
# write the rendered content to a file
with open("file.pdf", "wb") as f:
    f.write(response.rendered_content)
...
I have used this code in a TemplateView class so request and template fields were set like that, you may have to set it to whatever is appropriate in your particular case.
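Since the goal in the question is to avoid regenerating a PDF that already exists, the write step can be wrapped in a small check-then-generate helper. This is only a sketch: `get_or_create_pdf` and the `generate` callable are hypothetical names, and `generate` stands in for whatever returns the PDF bytes (for example, reading `response.rendered_content`). Note that the file is opened in binary mode ('wb'), because PDF content is bytes.

```python
import os
import tempfile


def get_or_create_pdf(path, generate):
    """Return PDF bytes from `path`, calling `generate()` only on a cache miss."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return f.read()
    data = generate()
    # Binary mode matters: text mode would reject or mangle raw PDF bytes.
    with open(path, "wb") as f:
        f.write(data)
    return data


calls = []


def fake_generate():
    """Stand-in for the expensive rendering step."""
    calls.append(1)
    return b"%PDF-1.4\n\xe2\xe3\xcf\xd3 example"


cache_dir = tempfile.mkdtemp()
pdf_path = os.path.join(cache_dir, "my_pdf.pdf")
first = get_or_create_pdf(pdf_path, fake_generate)
second = get_or_create_pdf(pdf_path, fake_generate)
print(len(calls))
```

The second call reads the stored file instead of invoking the generator again. A real cache would also need a file name derived from the user input, which is left out here.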
Well, you need to take a look at the code of wkhtmltopdf. First, use the PDFTemplateResponse class from wkhtmltopdf.views to get access to the rendered_content property, which gives us access to the pdf file:
response = PDFTemplateResponse(
    request=<your_view_request>,
    template=<your_template_to_render>,
    filename=<pdf_filename.pdf>,
    context=<a_dictionary_to_render>,
    cmd_options={'load-error-handling': 'ignore'})
Now you could use the rendered_content property to get access to the pdf file:
mail.attach('pdf_filename.pdf', response.rendered_content, 'application/pdf')
In my case I'm using this pdf to attach to an email, you could store it.

Django rest Framework, change filename of ImageField

I have an API endpoint with Django Rest Framework to upload an image.
class MyImageSerializer(serializers.ModelSerializer):
    image = serializers.ImageField(source='image')
I can upload images, but they are saved with the filename sent by the client, which can result in collisions. I would like instead to upload the file to my CDN with a timestamp filename.
Generating the filename is not the problem, just saving the image with it.
Does anyone know how to do that?
Thanks.
If your image is of type ImageField from Django, then you don't really have to do anything, not even declare it in your serializer like you did. It's enough to add it to the fields attribute, and Django will handle collisions. This means Django will append _index to each new file name that would otherwise collide, so if you upload a file named 'my_pic.jpg' 5 times, you will actually have files 'my_pic.jpg', 'my_pic_1.jpg', 'my_pic_2.jpg', 'my_pic_3.jpg', 'my_pic_4.jpg' on your server.
Now, this is done by Django's FileSystemStorage implementation, but if you want it to append a timestamp to your filename, all you have to do is write a storage class where you override the get_available_name(name) method. Example:
import datetime
import os
from django.core.files.storage import FileSystemStorage

class MyFileSystemStorage(FileSystemStorage):
    def get_available_name(self, name, max_length=None):
        ''' name is the current file name '''
        stamp = datetime.datetime.now().strftime('%Y-%m-%d-%H-%M-%S')
        # put the stamp before the extension so 'my_pic.jpg' stays a .jpg
        root, ext = os.path.splitext(name)
        return '{0}_{1}{2}'.format(root, stamp, ext)
And the image field in your model:
image = models.ImageField(upload_to='your upload dir', storage=MyFileSystemStorage())
Important update
As of August 20, 2014 this is no longer an issue, since Django found a vulnerability related to this behaviour (thanks @mlissner for pointing it out). Important excerpt:
We’ve remedied the issue by changing the algorithm for generating file
names if a file with the uploaded name already exists.
Storage.get_available_name() now appends an underscore plus a random 7
character alphanumeric string (e.g. "_x3a1gho"), rather than iterating
through an underscore followed by a number (e.g. "_1", "_2", etc.).
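The naming scheme described in the excerpt can be sketched in plain Python as follows. This is an illustration of the idea only, not Django's actual implementation; `available_name` and its `exists` parameter are hypothetical names chosen for the example.

```python
import os
import random
import string


def available_name(name, exists, suffix_length=7):
    """Return `name`, or a variant with a random suffix if `name` is taken.

    `exists` is any callable that reports whether a candidate name is in use.
    """
    root, ext = os.path.splitext(name)
    alphabet = string.ascii_lowercase + string.digits
    candidate = name
    while exists(candidate):
        suffix = "".join(random.choice(alphabet) for _ in range(suffix_length))
        candidate = "{0}_{1}{2}".format(root, suffix, ext)
    return candidate


taken = {"my_pic.jpg"}
new_name = available_name("my_pic.jpg", taken.__contains__)
print(new_name)
```

The suffix is inserted before the extension, and a name that is not already taken is returned unchanged.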

How to process an uploaded KML file in GeoDjango

I wrote a command-line routine to import a KML file into a GeoDjango application, which works fine when you feed it a locally saved KML file path (using the DataSource object).
Now I am writing a web file upload dialog to achieve the same thing. This is the beginning of the code that I have. The problem is that the GDAL DataSource object does not seem to understand Django's UploadedFile format: it is held in memory and is not a file path as expected.
What would be the best strategy to convert the UploadedFile to a normal file and access it through a path? I don't want to keep the file after processing.
def createFeatureSet(request):
    if request.method == 'POST':
        inMemoryFile = request.FILES['myfile']
        name = inMemoryFile.name
        POSTGIS_SRID = 900913
        ds = DataSource(inMemoryFile)  # This line doesnt work!!!
        for layer in ds:
            if layer.geom_type in (OGRGeomType('Point'), OGRGeomType('Point25D'), OGRGeomType('MultiPoint'), OGRGeomType('MultiPoint25D')):
                layerGeomType = OGRGeomType('MultiPoint').django
            elif layer.geom_type in (OGRGeomType('LineString'), OGRGeomType('LineString25D'), OGRGeomType('MultiLineString'), OGRGeomType('MultiLineString25D')):
                layerGeomType = OGRGeomType('MultiLineString').django
            elif layer.geom_type in (OGRGeomType('Polygon'), OGRGeomType('Polygon25D'), OGRGeomType('MultiPolygon'), OGRGeomType('MultiPolygon25D')):
                layerGeomType = OGRGeomType('MultiPolygon').django
DataSource is a wrapper around GDAL's C API and needs an actual file. You'll need to write your upload somewhere on the disk, for instance using a tempfile. Then you can pass the file path to DataSource.
Here is a suggested solution using a tempfile. I put the processing code in its own function which is now called.
f = request.FILES['myfile']
temp = tempfile.NamedTemporaryFile(delete=False)
temp.write(f.read())
temp.close()
createFeatureSet(temp.name, source_SRID= 900913)
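Because the question asks not to keep the file after processing, and delete=False leaves the temp file on disk, a cleanup step is worth adding. The sketch below shows one way to do that with the standard library only. `process_via_temp_file` and `fake_process` are hypothetical names; in a Django view the chunks would presumably come from `request.FILES['myfile'].chunks()` and the processing callable would be the function that builds the DataSource from the path.

```python
import os
import tempfile


def process_via_temp_file(chunks, process, suffix=".kml"):
    """Write `chunks` to a temporary file, call `process(path)`, then delete it."""
    temp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    try:
        for chunk in chunks:
            temp.write(chunk)
        temp.close()  # close first so other code can open the path
        return process(temp.name)
    finally:
        temp.close()  # harmless if already closed
        os.remove(temp.name)


seen = {}


def fake_process(path):
    """Stand-in for the GDAL step; records the path and returns the file size."""
    seen["path"] = path
    return os.path.getsize(path)


size = process_via_temp_file([b"<kml>", b"</kml>"], fake_process)
print(size, os.path.exists(seen["path"]))
```

The try/finally ensures the file is removed even if the processing step raises.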

Django upload file into specific directory that depends on the POST URI

I'd like to store uploaded files into a specific directory that depends on the URI of the POST request. Perhaps, I'd also like to rename the file to something fixed (the name of the file input for example) so I have an easy way to grep the file system, etc. and also to avoid possible security problems.
What's the preferred way to do this in Django?
Edit: I should clarify that I'd be interested in possibly doing this as a file upload handler to avoid writing a large file twice to the file system.
Edit2: I suppose one can just 'mv' the tmp file to a new location. That's a cheap operation if on the same file system.
Fixed olooney's example. It is working now:
@csrf_exempt
def upload_video_file(request):
    folder = 'tmp_dir2/'  # request.path.replace("/", "_")
    uploaded_filename = request.FILES['file'].name
    BASE_PATH = '/home/'
    # create the folder if it doesn't exist.
    try:
        os.mkdir(os.path.join(BASE_PATH, folder))
    except OSError:
        pass
    # save the uploaded file inside that folder.
    full_filename = os.path.join(BASE_PATH, folder, uploaded_filename)
    fout = open(full_filename, 'wb+')
    file_content = ContentFile(request.FILES['file'].read())
    try:
        # Iterate through the chunks.
        for chunk in file_content.chunks():
            fout.write(chunk)
        fout.close()
        html = "<html><body>SAVED</body></html>"
        return HttpResponse(html)
    except Exception:
        html = "<html><body>NOT SAVED</body></html>"
        return HttpResponse(html)
Django gives you total control over where (and if) you save files. See: http://docs.djangoproject.com/en/dev/topics/http/file-uploads/
The below example shows how to combine the URL and the name of the uploaded file and write the file out to disk:
def upload(request):
    folder = request.path.replace("/", "_")
    uploaded_filename = request.FILES['file'].name
    # create the folder if it doesn't exist (BASE_PATH is defined elsewhere).
    try:
        os.mkdir(os.path.join(BASE_PATH, folder))
    except OSError:
        pass
    # save the uploaded file inside that folder.
    full_filename = os.path.join(BASE_PATH, folder, uploaded_filename)
    fout = open(full_filename, 'wb+')
    # Iterate through the chunks of the uploaded file, not of the output file.
    for chunk in request.FILES['file'].chunks():
        fout.write(chunk)
    fout.close()
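The same folder-per-URL idea can be tried outside Django with a standard-library sketch. `save_upload` is a hypothetical helper, not a Django API. Two choices here differ from the example above and are assumptions of mine: leading and trailing slashes are stripped before the path is turned into a folder name, and os.path.basename() is applied to the client-supplied file name so it cannot point outside the target folder.

```python
import os
import tempfile


def save_upload(base_path, request_path, filename, chunks):
    """Write `chunks` to base_path/<folder>/<filename> and return the full path."""
    folder = request_path.strip("/").replace("/", "_") or "root"
    # basename() drops any directory parts a client may have put in the name.
    safe_name = os.path.basename(filename)
    target_dir = os.path.join(base_path, folder)
    os.makedirs(target_dir, exist_ok=True)  # no error if it already exists
    full_path = os.path.join(target_dir, safe_name)
    with open(full_path, "wb") as fout:
        for chunk in chunks:
            fout.write(chunk)
    return full_path


base = tempfile.mkdtemp()
saved = save_upload(base, "/videos/upload/", "../clip.mp4", [b"abc", b"def"])
print(os.path.relpath(saved, base))
```

Using os.makedirs(..., exist_ok=True) replaces the try/except around os.mkdir, and the with block closes the output file even if a write fails.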
Edit: How to do this with a FileUploadHandler? I traced down through the code, and it seems you need to do four things to repurpose the TemporaryFileUploadHandler to save outside of FILE_UPLOAD_TEMP_DIR:
1. Extend TemporaryUploadedFile and override __init__() to pass a different directory through to NamedTemporaryFile. It can use the try/mkdir/except/pass pattern I showed above.
2. Extend TemporaryFileUploadHandler and override new_file() to use the above class.
3. Also extend __init__() to accept the directory where you want the folder to go.
4. Dynamically add the request handler, passing through a directory determined from the URL:
request.upload_handlers = [ProgressBarUploadHandler(request.path.replace('/', '_'))]
While non-trivial, it's still easier than writing a handler from scratch: In particular, you won't have to write a single line of error-prone buffered reading. Steps 3 and 4 are necessary because FileUploadHandlers are not passed request information by default, I believe, so you'll have to tell it separately if you want to use the URL somehow.
I can't really recommend writing a custom FileUploadHandler for this. It's really mixing layers of responsibility. Relative to the speed of uploading a file over the internet, doing a local file copy is insignificant. And if the file's small, Django will just keep it in memory without writing it out to a temp file. I have a bad feeling that you'll get all this working and find you can't even measure the performance difference.