Django SuspiciousFileOperation

I have a model that contains a FileField:
class Foo(models.Model):
    fileobj = models.FileField(upload_to="bar/baz")
I am generating a file, and saving it in /tmp/ as part of the save method. This file then needs to be set as the "fileobj" of the model instance.
Currently, I'm trying this:
with open(
    f"/tmp/{self.number}.pdf", "r"
) as h:
    self.fileobj = File(h)
Unfortunately, this fails with django.core.exceptions.SuspiciousFileOperation, because the file exists outside of the Django project.
I've tried reading the docs, but they didn't help much. Does Django take a file and, upon assigning it to a FileField, move it to the media directory, or do I need to put it there manually before attaching it to the model instance? If the latter, what is the point of "upload_to"?

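You can use an InMemoryUploadedFile object like this:
import io
import os
from django.core.files.uploadedfile import InMemoryUploadedFile
with open(path, 'rb') as h:
    # Arguments: file, field_name, name, content_type, size, charset
    f = InMemoryUploadedFile(io.BytesIO(h.read()), 'fileobj',
                             'name.pdf', 'application/pdf',
                             os.path.getsize(path), None)
self.fileobj = f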
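A likely cause of the exception, offered as an assumption rather than a confirmed diagnosis: Django's File wrapper takes its name from the wrapped file object when no name is passed, so a handle opened from /tmp carries an absolute path as its name, and the storage layer rejects names that resolve outside the media root. The sketch below uses only the standard library to show where that absolute name comes from and how to derive a relative one. The helper name relative_upload_name is hypothetical.

```python
import os
import tempfile


def relative_upload_name(file_obj):
    """Return only the final path component of an open file's name."""
    return os.path.basename(file_obj.name)


# Create a throwaway PDF-like file to stand in for the generated document.
with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
    tmp.write(b"%PDF-1.4\n")
    tmp_path = tmp.name

try:
    with open(tmp_path, "rb") as h:
        opened_name = h.name                  # absolute path inside the temp directory
        safe_name = relative_upload_name(h)   # file name only, no directory part
finally:
    os.unlink(tmp_path)

print(os.path.isabs(opened_name), os.path.isabs(safe_name))
```

With Django, the relative name would then be passed explicitly, for example self.fileobj.save(safe_name, File(h), save=False) using a handle opened in binary mode; upload_to then decides the subdirectory under the media root where the content is written.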

Related

Filling MS Word Template from Django

I found some python docs relating to docxtpl at this link:
https://docxtpl.readthedocs.io/en/latest/
I followed the instruction and entered the code found at this site into a view and created the associated URL. When I go to the URL I would like for a doc to be generated - but I get an error that no HTTP response is being returned. I understand I am not defining one, but I am a bit confused about what HTTP response I need to define (I am still very new to this). The MS word template that I have saved is titled 'template.docx'.
Any help would be greatly appreciated!
VIEWS.PY
def doc_test(request):
    doc = DocxTemplate("template.docx")
    context = { 'ultimate_consignee' : "World company" }
    doc.render(context)
    doc.save("generated_doc.docx")
I would like accessing this view to generate the doc, where the variables are filled with what is defined in the context above.
Gist: Read the contents of the file and return the data in an HTTP response.
First of all, you'll have to save the file in memory so that it's easier to read. Instead of saving to a file name like doc.save("generated_doc.docx"), you'll need to save it to a file-like object.
Then read the contents of this file-like object and return it in an HTTP response.
import io
from django.http import HttpResponse
from docxtpl import DocxTemplate
def doc_test(request):
    doc = DocxTemplate("template.docx")
    # ... your other code ...
    doc_io = io.BytesIO() # create a file-like object
    doc.save(doc_io) # save data to file-like object
    doc_io.seek(0) # go to the beginning of the file-like object
    response = HttpResponse(doc_io.read())
    # Content-Disposition header makes a file downloadable
    response["Content-Disposition"] = "attachment; filename=generated_doc.docx"
    # Set the appropriate Content-Type for docx file
    response["Content-Type"] = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
    return response
Note: This code may or may not work because I haven't tested it. But the general principle remains the same i.e. read the contents of the file and return it in an HTTP response with appropriate headers.
So if this code doesn't work, perhaps because the package you're using doesn't support writing to file-like objects, it would be a good idea to ask the package's author, or file an issue on their GitHub, about how to read the contents of the file.
Here is a more concise solution:
import os
from io import BytesIO
from django.http import FileResponse
from docxtpl import DocxTemplate
def downloadWord(request, pk):
    context = {'first_name' : 'xxx', 'sur_name': 'yyy'}
    byte_io = BytesIO()
    # BASE_PATH is assumed to be defined elsewhere (the directory holding the template)
    tpl = DocxTemplate(os.path.join(BASE_PATH, 'template.docx'))
    tpl.render(context)
    tpl.save(byte_io)
    byte_io.seek(0)
    return FileResponse(byte_io, as_attachment=True, filename=f'generated_{pk}.docx')
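Both answers rewind the buffer with seek(0) before handing it to the response. A minimal sketch of why, using only the standard library: a .docx file is a ZIP container, so zipfile stands in here for the template engine (an assumption made to keep the example self-contained; docxtpl itself is not used).

```python
import io
import zipfile

# Write a tiny ZIP archive into memory, the same way a document would be saved.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("word/document.xml", "<w:document/>")

position_after_save = buffer.tell()   # the cursor sits at the end of the data
unrewound = buffer.read()             # so reading now yields nothing
buffer.seek(0)                        # rewind to the start
rewound = buffer.read()               # now the whole archive comes back
whole = buffer.getvalue()             # getvalue() ignores the cursor entirely
```

If only the bytes are needed, getvalue() avoids the rewind; if the buffer itself is passed on as a file-like object, as with FileResponse, the seek(0) is what makes the content readable.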

How to associate a generated file with a Django model

I want to create a file and associate it with the FileField of my model. Here's my simplified attempt:
#instantiate my form with the POST data
form = CSSForm(request.POST)
#generate a css object from a ModelForm
css = form.save(commit=False)
#generate some css:
css_string = "body {color: #a9f;}"
#create a css file:
filename = "myfile.css"
#try to write the file and associate it with the model
with open(filename, 'wb') as f:
    df = File(f) #create django File object
    df.write(css_string)
    css.css_file = df
css.save()
The call to save() throws a "seek of closed file" exception. If I move the save() to the with block, it produces an unsupported operation "read". At the moment, the files are being created in my media directory, but are empty. If I just render the css_string with the HttpResponse then I see the expected css.
The docs don't seem to have an example on how to link a generated file and a database field. How do I do this?
A Django FileField accepts either a django.core.files.File, which wraps a file object, or a django.core.files.base.ContentFile, which is built directly from a string of content. Since you already have the file content as a string, ContentFile sounds like the way to go (I couldn't test it, but it should work):
from django.core.files.base import ContentFile
# create an in memory instance
css = form.save(commit=False)
# file content as string
css_string = "body {color: #a9f;}"
# create ContentFile instance
css_file = ContentFile(css_string)
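# assign the file to the FileField; FieldFile.save() also saves the model
# instance by default (save=True), so the css.save() below is redundant but harmless
css.css_file.save('myfile.css', css_file)
css.save()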
See the Django documentation on FileField for details.
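The two errors reported in the question can be reproduced without Django, which suggests (as an assumption about the cause, not a trace through Django's code) that saving the field needs to seek and read the wrapped file. A handle opened with 'wb' cannot be read, and a handle used after its with block is already closed. A minimal sketch:

```python
import io
import os
import tempfile

fd, path = tempfile.mkstemp(suffix=".css")
os.close(fd)

errors = []
try:
    with open(path, "wb") as f:
        f.write(b"body {color: #a9f;}")
        try:
            f.read()                  # write-only handle: cannot be read back
        except io.UnsupportedOperation as exc:
            errors.append(type(exc).__name__)
    try:
        f.seek(0)                     # the with block has already closed f
    except ValueError as exc:
        errors.append(type(exc).__name__)
finally:
    os.unlink(path)

print(errors)
```

ContentFile sidesteps both problems because the content lives in memory and stays readable until the field has been saved.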

Django: Copy FileFields

I'm trying to copy a file using a hardlink, where the file is stored as a Django FileField. I'd like to use a hardlink to save space and copy time (no changes are expected to be made to the original file or copy). However, I'm getting some odd errors when I try to call new_file.save() from the snippet below.
AttributeError: 'file' object has no attribute '_committed'
My thinking is that after making the hardlink, I can just open the linked file and store it in the new File instance's FileField. Am I missing a step here or something?
models.py
class File(models.Model):
    stored_file = models.FileField()
elsewhere.py
import os
original_file = File.objects.get(id=1)
original_file_path = original_file.file.path
new_file = File()
new_file_path = '/path/to/new/file'
os.makedirs(os.path.realpath(os.path.dirname(new_file_path)))
os.link(original_file_path, new_file_path)
new_file.stored_file = file(new_file_path)
new_file.save()
There is no need to create a hard link; just reuse the existing file reference:
new_file = File(stored_file=original_file.stored_file)
new_file.save()
Update:
If you want to assign a specific file to a FileField or ImageField, you could simply do one of the following:
new_file = File(stored_file=new_file_path)
# or
new_file = File()
new_file.stored_file = new_file_path
# or
from django.core.files.base import File as DjangoFile  # aliased to avoid clashing with the File model
# from django.core.files.images import ImageFile # for ImageField
new_file.stored_file = DjangoFile(open(new_file_path, 'rb'))
The field accepts either a path string or a Django File() instance; the code in your question passes a plain Python file() object, which is why it is rejected.
I think I solved this issue, but not sure why it works. I wrapped the file object in a "DjangoFile" class (I imported as DjangoFile to avoid clashing with my previously defined File model).
from django.core.files.base import File as DjangoFile
...
new_file.stored_file = DjangoFile(file(new_file_path))
new_file.save()
This approach seemed to save the file OK.
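For reference, the hard-link step from the question can be checked on its own with the standard library. This is a sketch with made-up file names inside a temporary directory, and it assumes the filesystem supports hard links (source and target must be on the same filesystem). It also passes exist_ok=True to os.makedirs, since the call in the question raises an error when the directory already exists.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    original = os.path.join(root, "original.bin")
    with open(original, "wb") as fh:
        fh.write(b"payload")

    copy_path = os.path.join(root, "copies", "linked.bin")
    # exist_ok avoids an error when the target directory is already present.
    os.makedirs(os.path.dirname(copy_path), exist_ok=True)
    os.link(original, copy_path)      # second directory entry, same data

    same_file = os.path.samefile(original, copy_path)
    link_count = os.stat(original).st_nlink
    with open(copy_path, "rb") as fh:
        linked_bytes = fh.read()
```

Once the link exists, it still has to be attached to the model through a Django File wrapper or a storage-relative path, as the answers above describe.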

How does one use magic to verify file type in a Django form clean method?

I have written an email form class in Django with a FileField. I want to check the type of the uploaded file by inspecting its MIME type, and then limit the accepted file types to PDF, Word, and OpenOffice documents.
To this end, I have installed python-magic and would like to check file types as follows per the specs for python-magic:
mime = magic.Magic(mime=True)
file_mime_type = mime.from_file('address/of/file.txt')
However, recently uploaded files lack addresses on my server. I also do not know of any method of the mime object akin to "from_file_content" that checks for the mime type given the content of the file.
What is an effective way to use magic to verify file types of uploaded files in Django forms?
Stan described a good variant that uses a buffer. Unfortunately, the weakness of that method is that it reads the whole file into memory. Another option is to use a temporarily stored file:
import tempfile
import magic
with tempfile.NamedTemporaryFile() as tmp:
    for chunk in form.cleaned_data['file'].chunks():
        tmp.write(chunk)
    tmp.flush()  # make sure buffered data is on disk before magic reads the file by name
    print(magic.from_file(tmp.name, mime=True))
Also, you might want to check the file size:
if form.cleaned_data['file'].size < ...:
    print(magic.from_buffer(form.cleaned_data['file'].read()))
else:
    # store to disk (the code above)
    pass
Additionally:
Whether the name can be used to open the file a second time, while the named temporary file is still open, varies across platforms (it can be so used on Unix; it cannot on Windows NT or later).
So you might want to handle it like so:
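import os
import tempfile
import magic
tmp = tempfile.NamedTemporaryFile(delete=False)
try:
    for chunk in form.cleaned_data['file'].chunks():
        tmp.write(chunk)
    tmp.close()  # flush and release the handle so the file can be reopened by name
    print(magic.from_file(tmp.name, mime=True))
finally:
    tmp.close()  # harmless if already closed; needed if an error occurred above
    os.unlink(tmp.name)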
Also, you might want to seek(0) after read():
if hasattr(f, 'seek') and callable(f.seek):
    f.seek(0)
Where uploaded data is stored
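The temporary-file handling above can be wrapped in a small context manager so that cleanup always happens in one place. This is a sketch using only the standard library; the name chunks_on_disk is hypothetical, and the literal byte chunks stand in for what an upload's chunks() method would yield.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def chunks_on_disk(chunks):
    """Write an iterable of byte chunks to a temp file and yield its path."""
    tmp = tempfile.NamedTemporaryFile(delete=False)
    try:
        for chunk in chunks:
            tmp.write(chunk)
        tmp.close()               # flush and release the handle before reuse
        yield tmp.name
    finally:
        tmp.close()               # harmless if already closed
        os.unlink(tmp.name)


with chunks_on_disk([b"%PDF-", b"1.4\n"]) as path:
    with open(path, "rb") as fh:
        data = fh.read()
    existed_inside = os.path.exists(path)
existed_after = os.path.exists(path)
```

In a form, the call would look like chunks_on_disk(form.cleaned_data['file'].chunks()), with magic.from_file(path, mime=True) inside the with block.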
Why not try something like this in your view:
m = magic.Magic()
m.from_buffer(request.FILES['my_file_field'].read())
Or use request.FILES in place of form.cleaned_data if django.forms.Form is really not an option.
mime = magic.Magic(mime=True)
attachment = form.cleaned_data['attachment']
if hasattr(attachment, 'temporary_file_path'):
    # the file is stored temporarily on disk, so we can get its full path
    mime_type = mime.from_file(attachment.temporary_file_path())
else:
    # the file is held in memory
    mime_type = mime.from_buffer(attachment.read())
Also, you might want to seek(0) after read():
if hasattr(f, 'seek') and callable(f.seek):
    f.seek(0)
Example from Django code. Performed for image fields during validation.
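When python-magic is not available, a much coarser check is to compare the leading bytes against a few known signatures and then rewind, following the same read-then-seek(0) pattern. The sketch below is a hand-rolled stand-in, not python-magic, and the names SIGNATURES and sniff are hypothetical. Note that the ZIP signature only identifies the container, so it cannot tell a .docx or .odt file from any other ZIP archive.

```python
import io

# Leading-byte signatures; each is a well-known container or format marker.
SIGNATURES = (
    (b"%PDF-", "pdf"),
    (b"PK\x03\x04", "zip-container"),                 # .docx and .odt are ZIP archives
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole2"),   # legacy .doc files
)


def sniff(file_obj, header_size=8):
    """Classify a seekable binary file by its first bytes, restoring the cursor."""
    start = file_obj.tell()
    try:
        header = file_obj.read(header_size)
    finally:
        file_obj.seek(start)
    for prefix, label in SIGNATURES:
        if header.startswith(prefix):
            return label
    return None


upload = io.BytesIO(b"%PDF-1.7\nrest of the document")
kind = sniff(upload)
remaining = upload.read()     # still the full content, because sniff() rewound
```

Because only a few bytes are read, this works the same way for in-memory uploads and for uploads spooled to disk.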
You can use the django-safe-filefield package to validate that an uploaded file's extension matches its MIME type.
from django import forms
from safe_filefield.forms import SafeFileField
class MyForm(forms.Form):
    attachment = SafeFileField(
        allowed_extensions=('xls', 'xlsx', 'csv')
    )
In case you're handling a file upload and concerned only about images,
Django will set content_type for you (or rather for itself?):
from django.forms import ModelForm
from django.core.files import File
from django.db import models
class MyPhoto(models.Model):
    photo = models.ImageField(upload_to=photo_upload_to, max_length=1000)
class MyForm(ModelForm):
    class Meta:
        model = MyPhoto
        fields = ['photo']
photo = MyPhoto.objects.first()
photo = File(open('1.jpeg', 'rb'))
form = MyForm(files={'photo': photo})
if form.is_valid():
    print(form.instance.photo.file.content_type)
It doesn't rely on content type provided by the user. But
django.db.models.fields.files.FieldFile.file is an undocumented
property.
Actually, initially content_type is set from the request, but when
the form gets validated, the value is updated.
Regarding non-images, doing request.FILES['name'].read() seems okay to me.
First, that's what Django does. Second, files larger than 2.5 MB by default
are stored on a disk. So let me point you at the other answer
here.
For the curious, here's the stack trace that leads to updating
content_type:
django.forms.forms.BaseForm.is_valid: self.errors
django.forms.forms.BaseForm.errors: self.full_clean()
django.forms.forms.BaseForm.full_clean: self._clean_fields()
django.forms.forms.BaseForm._clean_fields: field.clean()
django.forms.fields.FileField.clean: super().clean()
django.forms.fields.Field.clean: self.to_python()
django.forms.fields.ImageField.to_python

Processing file uploads before object is saved

I've got a model like this:
class Talk(BaseModel):
    title = models.CharField(max_length=200)
    mp3 = models.FileField(upload_to = u'talks/', max_length=200)
    seconds = models.IntegerField(blank = True, null = True)
I want to validate before saving that the uploaded file is an MP3, like this:
def is_mp3(path_to_file):
    from mutagen.mp3 import MP3
    audio = MP3(path_to_file)
    return not audio.info.sketchy
Once I'm sure I've got an MP3, I want to save the length of the talk in the seconds attribute, like this:
audio = MP3(path_to_file)
self.seconds = audio.info.length
The problem is, before saving, the uploaded file doesn't have a path (see this ticket, closed as wontfix), so I can't process the MP3.
I'd like to raise a nice validation error so that ModelForms can display a helpful error ("You idiot, you didn't upload an MP3" or something).
Any idea how I can go about accessing the file before it's saved?
p.s. If anyone knows a better way of validating files are MP3s I'm all ears - I also want to be able to mess around with ID3 data (set the artist, album, title and probably album art, so I need it to be processable by mutagen).
You can access the file data in request.FILES while in your view.
I think the best way is to bind the uploaded files to a form, override the form's clean method, get the UploadedFile object from cleaned_data, and validate it any way you like. Then override the save method, populate your model instance with information about the file, and save it.
A cleaner way to get the file before it is saved is like this:
from django.core.exceptions import ValidationError
# this goes in your Model class
def clean(self):
    try:
        f = self.mp3.file  # the file in memory
    except ValueError:
        raise ValidationError("A File is needed")
    f.__class__  # <class 'django.core.files.uploadedfile.InMemoryUploadedFile'>
    processfile(f)
And if we need a path, the answer is in this other question.
You could follow the technique used by ImageField where it validates the file header and then seeks back to the start of the file.
class ImageField(FileField):
    # ...
    def to_python(self, data):
        f = super(ImageField, self).to_python(data)
        # ...
        # We need to get a file object for Pillow. We might have a path or we might
        # have to read the data into memory.
        if hasattr(data, 'temporary_file_path'):
            file = data.temporary_file_path()
        else:
            if hasattr(data, 'read'):
                file = BytesIO(data.read())
            else:
                file = BytesIO(data['content'])
        try:
            # ...
        except Exception:
            # Pillow doesn't recognize it as an image.
            six.reraise(ValidationError, ValidationError(
                self.error_messages['invalid_image'],
                code='invalid_image',
            ), sys.exc_info()[2])
        if hasattr(f, 'seek') and callable(f.seek):
            f.seek(0)
        return f
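The same dispatch can be pulled out into a small helper for non-image files. This is a sketch that mirrors the excerpt's branching rather than reproducing Django's code: source_for_inspection and FakeDiskUpload are hypothetical names, and the fake class only imitates the temporary_file_path() method of an upload spooled to disk.

```python
import io


def source_for_inspection(upload):
    """Return a filesystem path if the upload is on disk, else an in-memory copy."""
    if hasattr(upload, "temporary_file_path"):
        return upload.temporary_file_path()
    data = upload.read()
    if hasattr(upload, "seek") and callable(upload.seek):
        upload.seek(0)            # leave the upload readable for later saving
    return io.BytesIO(data)


class FakeDiskUpload:
    """Hypothetical stand-in for an upload spooled to a temporary file."""

    def temporary_file_path(self):
        return "/tmp/upload.mp3"


disk_source = source_for_inspection(FakeDiskUpload())
memory_upload = io.BytesIO(b"ID3\x03\x00")
memory_source = source_for_inspection(memory_upload)
```

A validator that needs a real path for the in-memory case would still have to write the buffer to a temporary file first.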