Import files into a collection from CLI (django command) - django

I'm trying to build a django command to upload files and create associated pages for them.
My docs are PDF files, and my problem is to automatically "upload" those files into the right target "media" directory, without explicitly copying them in my command script from the 'docs repository' to the directory defined by MEDIA_ROOT.
I've tried to use:
Code
f = File(open(file_path, 'r'))
# models.OfficeDocument is an inheritor of BaseDocument class
new_document, created = models.OfficeDocument.objects.get_or_create(
    title=title,
    collection=collection,
    file=f)
Error
SuspiciousFileOperation: The joined path (<my_local_path>) is located outside of the base path component (<MEDIA_ROOT path>)
Wagtail tells me that the file is not in the right directory (it is not inside MEDIA_ROOT).
How can I do that?

You are trying to save the documents with their files in their current location, without copying them to the new location first. You could add some code to copy the files to the right location, but it seems reasonable to mimic what the Wagtail documents add view does here: instantiate a model form, validate it, and call its save method. That handles saving the document, because the file field has upload_to configured.
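Try:
from wagtail.wagtailcore.models import get_root_collection_id
from wagtail.wagtaildocs.forms import get_document_form
from wagtail.wagtaildocs.models import get_document_model
# Same setup as the Wagtail documents add view
Document = get_document_model()
DocumentForm = get_document_form(Document)
collection = get_root_collection_id()
user = some_user_you_will_attribute_these_to
doc = Document(uploaded_by_user=user)
upload_dict = {
    'title': some_title,
    'collection': collection,
    'tags': '',
}
# Files are passed separately, keyed by field name.
# A management command has no request, so pass the chosen user directly.
form = DocumentForm(upload_dict, {'file': f}, instance=doc, user=user)
if form.is_valid():
    form.save()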
You might find that the file object needs to be something like a wrapped Django UploadedFile instance instead; see https://docs.djangoproject.com/en/1.11/ref/files/uploads/
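The error in the question comes from a containment check: the file name is joined onto MEDIA_ROOT and the result is rejected if it ends up outside that directory. Below is a minimal standard-library sketch of that kind of check. It is an illustration only, not Django's actual implementation, and `is_inside` and the example paths are hypothetical.

```python
import os


def is_inside(base, name):
    """Return True if joining `name` onto `base` stays within `base`."""
    base = os.path.abspath(base)
    # os.path.join discards `base` entirely when `name` is absolute,
    # which is the situation when a File wraps an absolute source path.
    candidate = os.path.abspath(os.path.join(base, name))
    return os.path.commonpath([base, candidate]) == base


# A relative name stays under the media root.
print(is_inside("/srv/media", "documents/report.pdf"))
# An absolute source path escapes it and would be rejected.
print(is_inside("/srv/media", "/home/me/docs/report.pdf"))
```

Giving the wrapped file only a base name, for example `File(fh, name=os.path.basename(file_path))`, keeps the joined path inside the media directory.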

Related

Django SuspiciousFileOperation

I have a model that contains a FileField:
class Foo(models.Model):
fileobj = models.FileField(upload_to="bar/baz")
I am generating a file, and saving it in /tmp/ as part of the save method. This file then needs to be set as the "fileobj" of the model instance.
Currently, I'm trying this:
with open(
    f"/tmp/{self.number}.pdf", "r"
) as h:
    self.fileobj = File(h)
Unfortunately, this fails with django.core.exceptions.SuspiciousFileOperation, because the file exists outside of the Django project.
I've tried reading the docs, but they didn't help much. Does Django take a file and, upon assigning it to a FileField, move it to the media directory, or do I need to put it there manually before attaching it to the model instance? If the latter, what is the point of "upload_to"?
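You can use an InMemoryUploadedFile object like this:
import io
import os
from django.core.files.uploadedfile import InMemoryUploadedFile
with open(path, 'rb') as h:
    # Arguments: file, field_name, name, content_type, size, charset
    f = InMemoryUploadedFile(io.BytesIO(h.read()), 'fileobj',
                             'name.pdf', 'application/pdf',
                             os.path.getsize(path), None)
    self.fileobj = f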

How to associate a generated file with a Django model

I want to create a file and associate it with the FileField of my model. Here's my simplified attempt:
#instantiate my form with the POST data
form = CSSForm(request.POST)
#generate a css object from a ModelForm
css = form.save(commit=False)
#generate some css:
css_string = "body {color: #a9f;}"
#create a css file:
filename = "myfile.css"
#try to write the file and associate it with the model
with open(filename, 'wb') as f:
    df = File(f)  # create django File object
    df.write(css_string)
    css.css_file = df
css.save()
The call to save() throws a "seek of closed file" exception. If I move the save() to the with block, it produces an unsupported operation "read". At the moment, the files are being created in my media directory, but are empty. If I just render the css_string with the HttpResponse then I see the expected css.
The docs don't seem to have an example on how to link a generated file and a database field. How do I do this?
A Django FileField accepts either a django.core.files.File, which wraps an open file object, or a django.core.files.base.ContentFile, which is built from a string of content. Since you already have the file content as a string, ContentFile sounds like the way to go (I couldn't test it, but it should work):
from django.core.files.base import ContentFile
# create an in memory instance
css = form.save(commit=False)
# file content as string
css_string = "body {color: #a9f;}"
# create ContentFile instance
css_file = ContentFile(css_string)
# assign the file to the FileField
css.css_file.save('myfile.css', css_file)
css.save()
See the Django documentation on FileField for details.
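The two errors described in the question can be reproduced without Django, which may help explain why an in-memory object is the simpler route. This is a small sketch using only the standard library; the temporary file location is arbitrary.

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "myfile.css")

with open(path, "wb") as f:
    try:
        f.read()  # a handle opened with 'wb' is write-only
    except io.UnsupportedOperation as exc:
        read_error = exc

# Leaving the with block closes the handle, so any later access fails.
try:
    f.seek(0)
except ValueError as exc:
    closed_error = exc

print(type(read_error).__name__, "/", closed_error)
```

Saving inside the with block hits the first problem, and saving after it hits the second. A ContentFile never depends on an open handle to a file on disk, so neither applies.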

Django reupload images when changed upload_to

I have changed upload_to attribute of my models image field. How can I re-upload all images to new paths?
So, I think 're-uploading' is the wrong way to think about it -- re-uploading the images will still leave the old ones lying around, which (depending on how many images you have) could be a massive waste of space. One way to do this instead would be by the following two step process:
1) Move the files manually, on your server, to the new upload_to location via whatever method is appropriate for your OS. This could probably all be done with one mv command on Linux, if that's what you're hosting on.
2) If you just changed the upload_to attribute, and didn't change the MEDIA_ROOT setting or anything else, what you need to change is the ImageField's name property. An ImageField's name property is usually your upload_to string joined with the image's filename (this is then appended to MEDIA_URL to form the image's URL, or to MEDIA_ROOT to form the actual file path). So you could update the models in your Django shell by typing something like this:
import os
from my_app.models import MyModel
newpath = 'your/new/upload_to/'
for obj in MyModel.objects.all():
    # keep only the file name and prepend the new upload_to prefix
    image_name = os.path.split(obj.my_img_field.name)[1]
    obj.my_img_field.name = newpath + image_name
    obj.save()
You can check to see if everything worked properly by calling obj.my_img_field.url and seeing if that points where it should.
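The renaming step in point 2 can be isolated into a small pure function, which makes it easy to check before touching the database. This is a sketch only: `retarget_name` is a hypothetical helper, and it uses `posixpath` on the assumption that stored names use forward slashes.

```python
import posixpath


def retarget_name(old_name, new_upload_to):
    """Return the stored name moved under `new_upload_to`, keeping the file name."""
    filename = posixpath.basename(old_name)
    return posixpath.join(new_upload_to, filename)


print(retarget_name("old/uploads/cat.jpg", "your/new/upload_to/"))
```

Inside the loop above, the assignment would then read `obj.my_img_field.name = retarget_name(obj.my_img_field.name, newpath)`.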
Here's a little snippet that I made when I needed to do this on many models and didn't want to do it at the OS level.
It has to be modified for use with strftime patterns in upload_to, though.
import logging
import os
import shutil
from django.conf import settings
logger = logging.getLogger(__name__)
models = (YourModel1, YourModel2)
for Model in models:
    for field in Model._meta.get_fields():
        # only file fields have an upload_to attribute
        if not hasattr(field, 'upload_to'):
            continue
        for instance in Model.objects.all():
            f = getattr(instance, field.name)
            if not f:
                continue
            if field.upload_to not in str(f):
                filename = os.path.basename(f.name)
                new_path = os.path.join(field.upload_to, filename)
                os.makedirs(
                    os.path.join(
                        settings.MEDIA_ROOT,
                        field.upload_to
                    ),
                    exist_ok=True
                )
                try:
                    shutil.move(
                        os.path.join(settings.MEDIA_ROOT, f.name),
                        os.path.join(settings.MEDIA_ROOT, new_path)
                    )
                    setattr(instance, field.name, new_path)
                except FileNotFoundError as e:
                    logger.error("Not found {}".format(field.name))
                    logger.error(str(e))
                else:
                    # save only when the file was actually moved
                    instance.save()
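For upload_to values that contain strftime placeholders, the directory has to be expanded before it is joined with the file name. A possible sketch of that modification follows; `expand_upload_to` and `new_name` are hypothetical helpers, and which date to use is an assumption the caller has to settle, since the original upload date may not be recorded.

```python
import datetime
import posixpath


def expand_upload_to(upload_to, when):
    """Expand strftime placeholders such as %Y or %m in an upload_to string."""
    return when.strftime(upload_to)


def new_name(upload_to, old_name, when):
    """Build the new stored name under the expanded directory."""
    folder = expand_upload_to(upload_to, when)
    return posixpath.join(folder, posixpath.basename(old_name))


when = datetime.datetime(2020, 3, 15)
print(new_name("files/%m-%Y/", "old/report.pdf", when))
```

In the snippet above, the result of `expand_upload_to` would replace the raw `field.upload_to` both in the membership test and when building `new_path`.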

How does one use magic to verify file type in a Django form clean method?

I have written an email form class in Django with a FileField. I want to check the uploaded file for its type via checking its mimetype. Subsequently, I want to limit file types to pdfs, word, and open office documents.
To this end, I have installed python-magic and would like to check file types as follows per the specs for python-magic:
mime = magic.Magic(mime=True)
file_mime_type = mime.from_file('address/of/file.txt')
However, recently uploaded files lack addresses on my server. I also do not know of any method of the mime object akin to "from_file_content" that checks for the mime type given the content of the file.
What is an effective way to use magic to verify file types of uploaded files in Django forms?
Stan described a good variant using a buffer. Unfortunately, the weakness of that method is that it reads the whole file into memory. Another option is to use a temporary file:
import tempfile
import magic
with tempfile.NamedTemporaryFile() as tmp:
    for chunk in form.cleaned_data['file'].chunks():
        tmp.write(chunk)
    tmp.flush()  # make sure buffered data is on disk before reading by name
    print(magic.from_file(tmp.name, mime=True))
Also, you might want to check the file size:
if form.cleaned_data['file'].size < ...:
    print(magic.from_buffer(form.cleaned_data['file'].read()))
else:
    # store to disk (the code above)
Additionally:
Whether the name can be used to open the file a second time, while the named temporary file is still open, varies across platforms (it can be so used on Unix; it cannot on Windows NT or later).
So you might want to handle it like so:
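import os
import tempfile
import magic
tmp = tempfile.NamedTemporaryFile(delete=False)
try:
    for chunk in form.cleaned_data['file'].chunks():
        tmp.write(chunk)
    # close first so the file can be reopened by name on Windows
    tmp.close()
    print(magic.from_file(tmp.name, mime=True))
finally:
    tmp.close()  # harmless if already closed
    os.unlink(tmp.name)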
Also, you might want to seek(0) after read():
if hasattr(f, 'seek') and callable(f.seek):
    f.seek(0)
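The pieces above (size check, in-memory sniffing, temporary file, and rewinding) can be combined into one helper. The following is a sketch under stated assumptions: `detect` and `FakeUpload` are hypothetical, `from_buffer` and `from_file` stand in for `magic.from_buffer` and `magic.from_file`, and `limit` is a byte threshold the caller has to choose.

```python
import io
import os
import tempfile


def detect(upload, limit, from_buffer, from_file):
    """Sniff small uploads in memory and large ones through a temp file."""
    if upload.size < limit:
        result = from_buffer(upload.read())
        upload.seek(0)  # rewind so later code can read the upload again
        return result
    tmp = tempfile.NamedTemporaryFile(delete=False)
    try:
        for chunk in upload.chunks():
            tmp.write(chunk)
        tmp.close()  # flush and release the handle before reopening by name
        return from_file(tmp.name)
    finally:
        tmp.close()  # harmless if already closed
        os.unlink(tmp.name)


class FakeUpload(io.BytesIO):
    """Hypothetical stand-in for an uploaded file, for demonstration only."""

    @property
    def size(self):
        return len(self.getvalue())

    def chunks(self, chunk_size=4):
        self.seek(0)
        while True:
            data = self.read(chunk_size)
            if not data:
                break
            yield data


def head_from_file(path):
    """Stand-in for magic.from_file: return the first four bytes."""
    with open(path, "rb") as handle:
        return handle.read(4)


small = FakeUpload(b"%PDF-1.4 tiny")
print(detect(small, 1024, lambda data: data[:4], head_from_file))
```

With python-magic installed, the two stand-ins would be replaced by functions that call `magic.from_buffer(data, mime=True)` and `magic.from_file(path, mime=True)`.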
Where uploaded data is stored
Why not try something like this in your view:
m = magic.Magic()
m.from_buffer(request.FILES['my_file_field'].read())
Or use request.FILES in place of form.cleaned_data if django.forms.Form is really not an option.
mime = magic.Magic(mime=True)
attachment = form.cleaned_data['attachment']
if hasattr(attachment, 'temporary_file_path'):
    # file is temporarily on disk, so we can get its full path
    mime_type = mime.from_file(attachment.temporary_file_path())
else:
    # file is in memory
    mime_type = mime.from_buffer(attachment.read())
Also, you might want to seek(0) after read():
if hasattr(f, 'seek') and callable(f.seek):
    f.seek(0)
Example from Django code. Performed for image fields during validation.
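Once a MIME type has been detected, restricting uploads to PDF, Word, and OpenOffice documents comes down to an allow-list check. This is a minimal sketch: `ALLOWED_MIME_TYPES` and `is_allowed` are hypothetical names, and the exact strings libmagic reports for office formats may differ by version, so the set should be verified against real sample files.

```python
ALLOWED_MIME_TYPES = {
    "application/pdf",
    "application/msword",  # legacy .doc
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document",  # .docx
    "application/vnd.oasis.opendocument.text",  # .odt
}


def is_allowed(mime_type):
    """Return True if the detected MIME type is on the allow-list."""
    return mime_type.strip().lower() in ALLOWED_MIME_TYPES


print(is_allowed("application/pdf"), is_allowed("image/png"))
```

In a form's clean method, a False result would typically lead to raising `forms.ValidationError`.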
You can use the django-safe-filefield package to validate that an uploaded file's extension matches its MIME type.
from django import forms
from safe_filefield.forms import SafeFileField
class MyForm(forms.Form):
    attachment = SafeFileField(
        allowed_extensions=('xls', 'xlsx', 'csv')
    )
In case you're handling a file upload and concerned only about images,
Django will set content_type for you (or rather for itself?):
from django.forms import ModelForm
from django.core.files import File
from django.db import models
class MyPhoto(models.Model):
    photo = models.ImageField(upload_to=photo_upload_to, max_length=1000)
class MyForm(ModelForm):
    class Meta:
        model = MyPhoto
        fields = ['photo']
photo = MyPhoto.objects.first()
photo = File(open('1.jpeg', 'rb'))
form = MyForm(files={'photo': photo})
if form.is_valid():
    print(form.instance.photo.file.content_type)
It doesn't rely on content type provided by the user. But
django.db.models.fields.files.FieldFile.file is an undocumented
property.
Actually, initially content_type is set from the request, but when
the form gets validated, the value is updated.
Regarding non-images, doing request.FILES['name'].read() seems okay to me.
First, that's what Django does. Second, files larger than 2.5 MB are by default
stored on disk. So let me point you at the other answer
here.
For the curious, here's the stack trace that leads to updating
content_type:
django.forms.forms.BaseForm.is_valid: self.errors
django.forms.forms.BaseForm.errors: self.full_clean()
django.forms.forms.BaseForm.full_clean: self._clean_fields()
django.forms.forms.BaseForm._clean_fields: field.clean()
django.forms.fields.FileField.clean: super().clean()
django.forms.fields.Field.clean: self.to_python()
django.forms.fields.ImageField.to_python

Use a dynamic destination folder for uploaded files in Django

I would like to dynamically create the destination of my uploaded files.
But it seems that the 'upload_to' option is only available for model fields, not for form fields. So the following code is wrong.
class MyForm(forms.Form):
    fichier = forms.FileField(upload_to='files/%m-%Y/')
In the view handling the uploaded file, the destination is static. How can I make it dynamic?
Thank you.
class YourFileModel(models.Model):
    def upload_path(self, name):
        name = do_sth_with_name(name)
        folder = generate_folder_name(self.id, self.whatever_field)
        return 'uploads/' + folder + '/' + name
    file = models.FileField(upload_to=upload_path)
Edit after comment:
def handle_uploaded_file(file):
    # generate dynamic path
    # save file to that path
There is an example here: http://docs.djangoproject.com/en/dev/topics/http/file-uploads/#handling-uploaded-files
If the form is a ModelForm, override its save() method:
class YourForm(forms.ModelForm):
    fichier = forms.FileField()
    def save(self):
        if self.cleaned_data['fichier']:
            file = handle_uploaded_file(self.cleaned_data['fichier'])
        return super(YourForm, self).save()
If the form is not a ModelForm, call the upload handler in your view:
def your_view(request):
    #####
    if form.is_valid():
        file = handle_uploaded_file(form.cleaned_data['fichier'])
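The body of handle_uploaded_file is left open above. One possible way to fill it in is sketched below, assuming a month-and-year folder like the 'files/%m-%Y/' pattern from the question. The extra `base_dir` and `when` parameters and the `FakeUpload` class are assumptions added for illustration; a real upload object provides `name` and `chunks()` itself.

```python
import datetime
import os
import tempfile


def handle_uploaded_file(upload, base_dir, when=None):
    """Write an upload's chunks under base_dir/files/<MM-YYYY>/ and return the path."""
    when = when or datetime.date.today()
    folder = os.path.join(base_dir, "files", when.strftime("%m-%Y"))
    os.makedirs(folder, exist_ok=True)
    # Keep only the base name so a crafted name cannot escape the folder.
    destination = os.path.join(folder, os.path.basename(upload.name))
    with open(destination, "wb") as out:
        for chunk in upload.chunks():
            out.write(chunk)
    return destination


class FakeUpload:
    """Hypothetical stand-in exposing the two attributes used above."""

    def __init__(self, name, data):
        self.name = name
        self._data = data

    def chunks(self):
        yield self._data


path = handle_uploaded_file(
    FakeUpload("notes.txt", b"hello"), tempfile.mkdtemp(), datetime.date(2020, 3, 15)
)
print(path)
```

In a project, `base_dir` would most likely be `settings.MEDIA_ROOT`.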
Instead of a string, supply a callable, i.e. a function that takes the model instance and the original filename and returns the desired path. See the FileField docs for specifics. One thing they don't say (at least I can't find it in the docs) is that if the returned filename starts with '/' it is treated as an absolute path; otherwise it is relative to your MEDIA_ROOT directory. Note that with the default file system storage, an absolute path that falls outside MEDIA_ROOT is rejected with SuspiciousFileOperation.
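A callable of that shape can be written and checked without a database. The sketch below assumes the instance has a `created` date attribute, which is a hypothetical field chosen for illustration; a plain object stands in for the model instance.

```python
import datetime
import posixpath
from types import SimpleNamespace


def upload_path(instance, filename):
    """Return a storage-relative path such as files/03-2020/<filename>."""
    folder = instance.created.strftime("files/%m-%Y")
    return posixpath.join(folder, posixpath.basename(filename))


# A plain object stands in for a model instance here.
fake_instance = SimpleNamespace(created=datetime.date(2020, 3, 15))
print(upload_path(fake_instance, "report.pdf"))
```

On a model, the function would be passed as `models.FileField(upload_to=upload_path)`. Returning a relative path keeps the file under MEDIA_ROOT.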