Django Rest Framework read file upload - django

I need to read the contents of a CSV file and save them into a model.
    # MODEL
    class FileUpload(models.Model):
        datafile = models.FileField(upload_to=file_path_name)
    # SIGNAL TO READ THE FILEUPLOAD INSTANCE
    @receiver(post_save, sender=FileUpload)
    def fileupload_post_save(sender, instance, *args, **kwargs):
        with open(instance.datafile, 'rb') as f:
            reader = csv.DictReader(f, delimiter='\t')
            for row in reader:
                print row
The serializer:
    # SERIALIZER
    class FileUploadSerializer(serializers.ModelSerializer):
        class Meta:
            model = FileUpload
When I upload the file, this error appears:
Got a `TypeError` when calling `FileUpload.objects.create()`.
This may be because you have a writable field on the serializer class that is not a valid argument to `FileUpload.objects.create()`. You may need to make the field read-only, or override the FileUploadSerializer.create() method to handle this correctly.
Original exception text was: coercing to Unicode: need string or buffer, FieldFile found.
Shouldn't open() be able to open the file behind this FileField?
Does anyone have a better idea for parsing this file? Should I upload the file and then read it, or could I read it before saving? Thanks!!

This is the solution. The uploaded file has to be taken from the request data and passed directly to DictReader:
    if serializer.is_valid():
        data = self.request.data.get('datafile')
        reader = csv.DictReader(data, delimiter='\t')
        for row in reader:
            print row['customer']

FieldFile is the data stored on a FileField. If you're looking to open it using the Python open method, you should instead be calling FieldFile.open(). The error is coming from within your post-save signal handler, because open expects the name of a file and you are passing in a FieldFile.
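A minimal sketch of the parsing step on Python 3, where csv.DictReader needs text while an uploaded file yields bytes. The helper name read_tsv_rows and the UTF-8 encoding are assumptions, not part of the original code. It accepts any binary file-like object with read() and seek(), which should cover a FieldFile or the uploaded file taken from request.data. It reads the whole file into memory, so it suits small uploads only.

```python
import csv
import io


def read_tsv_rows(binary_file, encoding="utf-8"):
    """Return the rows of a tab-separated file as a list of dicts.

    binary_file is any object with read() and seek() that yields bytes
    (hypothetical helper; the encoding is an assumption).
    """
    binary_file.seek(0)  # start from the beginning in case it was read before
    text = binary_file.read().decode(encoding)
    binary_file.seek(0)  # rewind so later code can read the file again
    reader = csv.DictReader(io.StringIO(text, newline=""), delimiter="\t")
    return list(reader)
```

In a view this could be called as read_tsv_rows(request.data["datafile"]) once the serializer has validated, and in a signal handler as read_tsv_rows(instance.datafile).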

Related

Writing in memory zip file to django FileField

I'm trying to read files from a FileField, put them all into a zip, and save that zip to another FileField.
I'm trying to avoid using a temp file, but it seems that I might have to.
Here is what I've got so far:
    def generate_codified_batch(modeladmin, request, queryset):
        for batch in queryset:
            pieces = Pieces.objects.filter(batch=batch)
            mem_zip = InMemoryZipFile(file_name=batch.name)
            for piece in pieces:
                # read each piece from disk and add it to the in-memory zip
                with open(piece.file.path, 'rb') as in_file:
                    data = in_file.read()
                extension = piece.file_name.rsplit(".")[-1]
                mem_zip.append(
                    filename_in_zip=f'/{piece.folder_assigned}/{piece.period}/{piece.codification}.{extension}',
                    file_contents=data)
            files_codified = ContentFile(mem_zip.data)
            Batches.objects.filter(pk=batch.id).update(file_codified=files_codified)
InMemoryZipFile is a class from this package: https://bitbucket.org/ruamel/std.zipfile/src/faa2c8fc9e0072f57857078059ded42192af5435/init.py?at=default&fileviewer=file-view-default#init.py-57
Only the last two lines matter:
    files_codified = ContentFile(mem_zip.data)
    Batches.objects.filter(pk=batch.id).update(file_codified=files_codified)
mem_zip.data is a property of InMemoryZipFile and returns a bytes object
(from the InMemoryZipFile class):
    self.in_memory_data = StringIO()
    @property
    def data(self):
        return self.in_memory_data.getvalue()
I cannot for the life of me figure out how to read from that bytes object and pass it to FileField.
To assign an in-memory file to a model's FileField, you can use InMemoryUploadedFile or, even easier, its subclass SimpleUploadedFile.
Also, you should not use a QuerySet's update() method, because it only performs the database query and doesn't call the model's save() method, which is what saves the file to disk.
So in your code do this:
    files_codified = SimpleUploadedFile.from_dict({
        'content': mem_zip.data,
        'filename': batch.name + ".zip",
        'content-type': 'application/zip'})
    batch.file_codified = files_codified
    batch.save()
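If the third-party InMemoryZipFile class is not a requirement, the zip bytes can also be produced with the standard library alone. The following is a sketch under that assumption; build_zip_bytes is a made-up helper name, and the mapping of archive names to contents stands in for the pieces read from disk.

```python
import io
import zipfile


def build_zip_bytes(files):
    """Pack a {name_in_archive: bytes} mapping into a zip held in memory.

    Hypothetical helper: it replaces InMemoryZipFile with zipfile + BytesIO.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name_in_archive, contents in files.items():
            archive.writestr(name_in_archive, contents)
    # The archive is complete once the ZipFile context has been closed.
    return buffer.getvalue()
```

The returned bytes can then be wrapped in ContentFile or SimpleUploadedFile and assigned to the field before calling save() on the instance.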

How to use validators on FileField content

In my model, I want to use a validator to analyze the content of a file. What I cannot figure out is how to access the content in order to parse it, since the file has not yet been saved (which is good) when the validators run.
I don't understand how to get the data from the value passed to the validator into a file (I assume I should use tempfile) so that I can open it and evaluate the data.
Here's a simplified example. In my real code, I want to open the file and evaluate it with csv.
In models.py:
    class ValidateFile(object):
        ....
        def __call__(self, value):
            # value is the FieldFile object, but it's not saved
            # I believe I need to do something like:
            temp_file = tempfile.TemporaryFile()
            temp_file.write(value.read())
            # Check the data in temp_file
            ....
    class MyItems(models.Model):
        data = models.FileField(upload_to=get_upload_path,
                                validators=[FileExtensionValidator(allowed_extensions=['csv']),
                                            ValidateFile()])
Thanks for the help!
Take a look at how this is done in the ImageField implementation.
Your ValidateFile class could then be something like this:
    from io import BytesIO
    class ValidateFile(object):
        def __call__(self, value):
            if value is None:
                # do something when None
                return None
            if hasattr(value, 'temporary_file_path'):
                file = value.temporary_file_path()
            else:
                if hasattr(value, 'read'):
                    file = BytesIO(value.read())
                else:
                    file = BytesIO(value['content'])
            # Now validate your file
No need for tempfile:
The value passed to a FileField validator is an instance of FieldFile, as already mentioned by the OP.
Under the hood, the FieldFile instance might already use a tempfile.NamedTemporaryFile, or it might wrap an in-memory file, but you need not worry about that:
To "evaluate the data" you can simply treat the FieldFile instance as any Python file object.
For example, you could iterate over it:
    def my_filefield_validator(value):
        # note that value is a FieldFile instance
        for line in value:
            ...  # do something with line
The documentation says:
In addition to the API inherited from File such as read() and write(), FieldFile includes several methods that can be used to interact with the underlying file: ...
and the FieldFile class provides
... a wrapper around the result of the Storage.open() method, which may be a File object, or it may be a custom storage’s implementation of the File API.
An example of such an underlying file implementation is InMemoryUploadedFile.
Also from the docs:
The File class is a thin wrapper around a Python file object with some Django-specific additions
Also note: class-based validators vs function-based validators
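As an illustration of treating the value as a plain file object, here is a sketch of a function-based validator that checks the CSV header and then rewinds the file. The required column names are hypothetical, UTF-8 is assumed, and the fallback to ValueError exists only so that the snippet runs where Django is not installed.

```python
import csv
import io

try:
    from django.core.exceptions import ValidationError
except ImportError:  # fallback so the sketch also runs without Django
    ValidationError = ValueError

REQUIRED_COLUMNS = {"customer", "amount"}  # hypothetical column names


def validate_csv_header(value):
    """Raise ValidationError if the first line lacks a required column."""
    value.seek(0)
    first_line = value.readline().decode("utf-8")  # assumes UTF-8 content
    value.seek(0)  # rewind so the file can still be saved afterwards
    header = next(csv.reader(io.StringIO(first_line)), [])
    missing = REQUIRED_COLUMNS - set(header)
    if missing:
        raise ValidationError(
            "Missing columns: %s" % ", ".join(sorted(missing)))
```

It would be attached like any other validator, for example validators=[validate_csv_header] on the FileField.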

using csvimpoter in django

I want to import an entire CSV file into a model without reading the file row by row. Please help me by providing an example model and source code for the import.
If you're opening the file from disk, you can wrap your file object in django.core.files.File and pass it to the save method of the model field you're saving it to:
    from django.core.files import File
    # open the file in a with block so it is closed after saving
    with open("sample.csv", "rb") as csv_file:
        my_model_instance.my_file_field.save("sample.csv", File(csv_file))
See https://docs.djangoproject.com/en/1.3/ref/files/file/#additional-methods-on-files-attached-to-objects
If you're dealing with an uploaded file from request.FILES, you can assign it directly to your model instance's FileField:
    my_model_instance.my_file_field = request.FILES["csvfile"]
    my_model_instance.save()
Don't forget enctype="multipart/form-data" on the form or request.FILES will be empty.

Django form validation, clean(), and file upload

Can someone illuminate me as to exactly when an uploaded file is actually written to the location returned by "upload_to" in the FileField, in particular with regards to the order of field, model, and form validation and cleaning?
Right now I have a "clean" method on my model which assumes the uploaded file is in place, so it can do some validation on it. It looks like the file isn't yet saved, and may just be held in a temporary location or in memory. If that is the case, how do I "open" it or find a path to it if I need to execute some external process/program to validate the file?
Thanks,
Ian
Form cleaning has nothing to do with actually saving the file, or with saving any other data for that matter. The file isn't saved until you run the save() method of the model instance (note that if you use ModelName.objects.create(), save() is called for you automatically).
The bound form will contain an open File object, so you should be able to do any validation on that object directly. For example:
    form = MyForm(request.POST, request.FILES)
    if form.is_valid():
        file_object = form.cleaned_data['myFile']
        # run any validation on file_object, or define a clean_myFile() method
        # that will be run automatically when you call form.is_valid()
        model_inst = MyModel(my_file=file_object,
                             # assign other attributes here....
                             )
        model_inst.save()  # file is saved to disk here
What do you need to do on it? If your validation will work without a temporary file, you can access the data by calling read() on what your file field returns.
    def clean_field(self):
        _file = self.cleaned_data.get('filefield')
        contents = _file.read()
If you do need it on the disk, you know where to go from here :) write it to a temporary location and do some magic on it!
Or write it as a custom form field. This is the basic idea of how I verify an MP3 file using the 'mutagen' library.
Notes:
First check the file size; only if the size is acceptable, write the file to the temporary location.
The file is written to the temporary location specified in settings, checked to be an MP3, and then deleted.
The code:
    from django import forms
    import os
    from mutagen.mp3 import MP3, HeaderNotFoundError, InvalidMPEGHeader
    from django.conf import settings
    class MP3FileField(forms.FileField):
        def clean(self, *args, **kwargs):
            cleaned = super(MP3FileField, self).clean(*args, **kwargs)
            tmp_file = args[0]
            if tmp_file.size > 6600000:
                raise forms.ValidationError("File is too large.")
            # assumes FILE_UPLOAD_TEMP_DIR is set in settings
            file_path = os.path.join(settings.FILE_UPLOAD_TEMP_DIR, tmp_file.name)
            with open(file_path, 'wb+') as destination:
                for chunk in tmp_file.chunks():
                    destination.write(chunk)
            try:
                audio = MP3(file_path)
                if audio.info.length > 300:
                    raise forms.ValidationError("MP3 is too long.")
            except (HeaderNotFoundError, InvalidMPEGHeader):
                raise forms.ValidationError("File is not valid MP3 CBR/VBR format.")
            finally:
                os.remove(file_path)  # always delete the temporary copy
            # return the cleaned value, not the args tuple
            return cleaned
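When a validator needs a real path on disk, for instance to hand the upload to an external program, the write-then-delete steps can be kept in one place with a context manager. This is a sketch using only the standard library; temporary_copy is a made-up name, and it assumes the uploaded object supports read() and seek().

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def temporary_copy(uploaded, suffix=""):
    """Yield the path of a temporary on-disk copy of a file-like object.

    The copy is removed when the with block exits, even after an error.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as destination:
            uploaded.seek(0)
            shutil.copyfileobj(uploaded, destination)
        uploaded.seek(0)  # rewind so the upload can still be saved later
        yield path
    finally:
        os.remove(path)
```

Inside a clean() method it could be used as "with temporary_copy(tmp_file, suffix='.mp3') as path: audio = MP3(path)", which removes the need for the repeated os.remove() calls.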

Processing file uploads before object is saved

I've got a model like this:
    class Talk(BaseModel):
        title = models.CharField(max_length=200)
        mp3 = models.FileField(upload_to = u'talks/', max_length=200)
        seconds = models.IntegerField(blank = True, null = True)
I want to validate before saving that the uploaded file is an MP3, like this:
    def is_mp3(path_to_file):
        from mutagen.mp3 import MP3
        audio = MP3(path_to_file)
        return not audio.info.sketchy
Once I'm sure I've got an MP3, I want to save the length of the talk in the seconds attribute, like this:
    audio = MP3(path_to_file)
    self.seconds = audio.info.length
The problem is, before saving, the uploaded file doesn't have a path (see this ticket, closed as wontfix), so I can't process the MP3.
I'd like to raise a nice validation error so that ModelForms can display a helpful error ("You idiot, you didn't upload an MP3" or something).
Any idea how I can go about accessing the file before it's saved?
p.s. If anyone knows a better way of validating files are MP3s I'm all ears - I also want to be able to mess around with ID3 data (set the artist, album, title and probably album art, so I need it to be processable by mutagen).
You can access the file data in request.FILES while in your view.
I think the best way is to bind the uploaded files to a form, override the form's clean method, get the UploadedFile object from cleaned_data, validate it any way you like, then override the save method, populate your model instance with information about the file, and save it.
A cleaner way to get the file before it is saved is this:
    from django.core.exceptions import ValidationError
    # this goes in your model class
    def clean(self):
        try:
            f = self.mp3.file  # the file in memory
        except ValueError:
            raise ValidationError("A file is needed")
        f.__class__  # <class 'django.core.files.uploadedfile.InMemoryUploadedFile'>
        processfile(f)
And if we need a path, the answer is in this other question.
You could follow the technique used by ImageField where it validates the file header and then seeks back to the start of the file.
    class ImageField(FileField):
        # ...
        def to_python(self, data):
            f = super(ImageField, self).to_python(data)
            # ...
            # We need to get a file object for Pillow. We might have a path or we might
            # have to read the data into memory.
            if hasattr(data, 'temporary_file_path'):
                file = data.temporary_file_path()
            else:
                if hasattr(data, 'read'):
                    file = BytesIO(data.read())
                else:
                    file = BytesIO(data['content'])
            try:
                # ...
            except Exception:
                # Pillow doesn't recognize it as an image.
                six.reraise(ValidationError, ValidationError(
                    self.error_messages['invalid_image'],
                    code='invalid_image',
                ), sys.exc_info()[2])
            if hasattr(f, 'seek') and callable(f.seek):
                f.seek(0)
            return f
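The same check-the-header-then-rewind idea can be sketched for MP3 uploads. This is a rough heuristic only, not a substitute for mutagen: it assumes an MP3 stream starts either with an ID3v2 tag (the bytes "ID3") or directly with an MPEG frame sync (eleven set bits). looks_like_mp3 is a made-up name.

```python
def looks_like_mp3(uploaded):
    """Cheap header check that restores the original file position.

    Heuristic sketch: True if the data starts with an ID3v2 tag or an
    MPEG frame sync; real validation still needs a proper parser.
    """
    position = uploaded.tell()
    header = uploaded.read(3)
    uploaded.seek(position)  # seek back, as ImageField does after its check
    if header == b"ID3":
        return True
    return (len(header) >= 2
            and header[0] == 0xFF
            and (header[1] & 0xE0) == 0xE0)
```

In a model's clean() this could run on self.mp3.file before handing the file to mutagen, raising ValidationError when it returns False.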