Writing in memory zip file to django FileField - django

I'm trying to read files from a FileField, put them all into a zip, and save that zip to another FileField.
I'm trying to avoid using a temp file, but it seems that I might have to.
Here is what I've got so far:
def generate_codified_batch(modeladmin, request, queryset):
    for batch in queryset:
        pieces = Pieces.objects.filter(batch=batch)
        mem_zip = InMemoryZipFile(file_name=batch.name)
        for piece in pieces:
            # Read each piece from disk; the with block closes the file.
            with open(piece.file.path, 'rb') as in_file:
                data = in_file.read()
            extension = piece.file_name.rsplit(".")[-1]
            mem_zip.append(
                filename_in_zip=f'/{piece.folder_assigned}/{piece.period}/{piece.codification}.{extension}',
                file_contents=data)
        files_codified = ContentFile(mem_zip.data)
        Batches.objects.filter(pk=batch.id).update(file_codified=files_codified)
InMemoryZipFile is a class from this package: https://bitbucket.org/ruamel/std.zipfile/src/faa2c8fc9e0072f57857078059ded42192af5435/init.py?at=default&fileviewer=file-view-default#init.py-57
Only the last two lines are important:
files_codified = ContentFile(mem_zip.data)
Batches.objects.filter(pk=batch.id).update(file_codified=files_codified)
mem_zip.data is a property of InMemoryZipFile and returns a bytes object
(from the InMemoryZipFile class):
self.in_memory_data = StringIO()
@property
def data(self):
    return self.in_memory_data.getvalue()
I cannot for the life of me figure out how to read from that bytes object and pass it to FileField.

To assign an in-memory file to a FileField of a model, you can use InMemoryUploadedFile or, even easier, its subclass SimpleUploadedFile.
Also, you should not use a QuerySet's update() method here, because it only runs the database query and doesn't call the model's save() method, which is what saves the file to disk.
So in your code do this:
from django.core.files.uploadedfile import SimpleUploadedFile
files_codified = SimpleUploadedFile.from_dict({
    'content': mem_zip.data,
    'filename': batch.name + ".zip",
    'content-type': 'application/zip'})
batch.file_codified = files_codified
batch.save()
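If the third-party zip helper is not a requirement, the archive bytes can also be produced with the standard library alone. The following is a minimal sketch, not the asker's code: `build_zip_bytes` and the member names are hypothetical, and it only covers building the archive in memory.

```python
import io
import zipfile


def build_zip_bytes(files):
    """Return the bytes of a zip archive built entirely in memory.

    `files` maps archive member names to their bytes content.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    # The archive has to be closed (end of the with block) before the
    # buffer holds a complete zip file.
    return buffer.getvalue()


zip_bytes = build_zip_bytes({"a/one.txt": b"first", "b/two.txt": b"second"})
```

In a Django project the resulting bytes could then be wrapped in ContentFile and stored with something like batch.file_codified.save(batch.name + ".zip", ContentFile(zip_bytes)), which writes the file through the field's storage and saves the instance.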

Related

Django Rest Framework read file upload

I need to read the contents of a csv file and save into a model.
# MODEL
class FileUpload(models.Model):
    datafile = models.FileField(upload_to=file_path_name)
# SIGNAL TO READ THE FILEUPLOAD INSTANCE
@receiver(post_save, sender=FileUpload)
def fileupload_post_save(sender, instance, *args, **kwargs):
    with open(instance.datafile, 'rb') as f:
        reader = csv.DictReader(f, delimiter='\t')
        for row in reader:
            print row
The serializer file:
# SERIALIZER
class FileUploadSerializer(serializers.ModelSerializer):
    class Meta:
        model = FileUpload
When I upload the file, this error appears:
Got a `TypeError` when calling `FileUpload.objects.create()`.
This may be because you have a writable field on the serializer class that is not a valid argument to `FileUpload.objects.create()`. You may need to make the field read-only, or override the FileUploadSerializer.create() method to handle this correctly.
Original exception text was: coercing to Unicode: need string or buffer, FieldFile found.
Shouldn't open() be able to open the file behind this FileField instance?
Does anyone have a better idea for parsing this file? Should I upload the file and then read it, or could I read it before saving? Thanks!!
This is the solution. It's necessary to pass the uploaded file from the request directly to DictReader:
if serializer.is_valid():
    data = self.request.data.get('datafile')
    reader = csv.DictReader(data, delimiter='\t')
    for row in reader:
        print row['customer']
FieldFile is the data stored on a FileField. If you're looking to open it using the Python open method, you should instead be calling FieldFile.open(). The error is coming from within your post-save signal handler, because open expects the name of a file and you are passing in a FieldFile.
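The snippets above are Python 2. In Python 3 the csv module expects text, while an uploaded file yields bytes, so the stream needs a decoding wrapper first. Here is a minimal sketch under those assumptions: `read_tsv_rows` is a hypothetical helper, the UTF-8 encoding is assumed, and an io.BytesIO object stands in for the uploaded file.

```python
import csv
import io


def read_tsv_rows(binary_file, encoding="utf-8"):
    """Parse a binary file-like object as tab-delimited CSV and return dict rows."""
    text_file = io.TextIOWrapper(binary_file, encoding=encoding, newline="")
    reader = csv.DictReader(text_file, delimiter="\t")
    rows = list(reader)
    # Detach so the wrapper does not close the underlying file when it is
    # garbage collected.
    text_file.detach()
    return rows


upload = io.BytesIO(b"customer\tamount\nacme\t10\n")  # stand-in for the upload
rows = read_tsv_rows(upload)
```

With a saved model instance, the file-like object would come from the field itself (for example after calling instance.datafile.open('rb')) rather than from the built-in open().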

Django REST Framework FileField Data in JSON

In Django REST Framework (DRF), how do I support de-Serializing base64 encoded binary data?
I have a model:
class MyModel(Model):
data = models.FileField(...)
and I want to be able to send this data base64-encoded rather than having to use multipart form data or a file upload. Looking at the parsers, only FileUploadParser and MultiPartParser seem to parse out the files.
I would like to be able to send this data in something like JSON (i.e. send the binary data in the data rather than in the files):
{
    'data':'...'
}
I solved it by creating a new Parser:
def get_B64_JSON_Parser(fields):
    class Impl(parsers.JSONParser):
        media_type = 'application/json+b64'
        def parse(self, *args, **kwargs):
            ret = super(Impl, self).parse(*args, **kwargs)
            for field in fields:
                ret[field] = SimpleUploadedFile(name=field, content=ret[field].decode('base64'))
            return ret
    return Impl
which I then use in the view:
class TestModelViewSet(viewsets.ModelViewSet):
    parser_classes = [get_B64_JSON_Parser(('data_file',)),]
This is an old question, but for those looking for an up-to-date solution, there is a plugin for DRF (drf_base64) that handles this situation. It allows reading files encoded as base64 strings in the JSON request.
So given a model like:
class MyModel(Model):
    data = models.FileField(...)
and an expected JSON like:
{
    "data": " ....",
    ...
}
The (de)serialization can be handled just by importing from the drf_base64 modules instead of from DRF itself.
from drf_base64.serializers import ModelSerializer
from .models import MyModel
# Named MyModelSerializer so it does not shadow the imported model.
class MyModelSerializer(ModelSerializer):
    class Meta:
        model = MyModel
Just remember that it is possible to get a base64-encoded file in JavaScript with the FileReader API.
There's probably something clever you can do at the serialiser level but the first thing that comes to mind is to do it in the view.
Step 1: Write the file. Something like:
fh = open("/path/to/media/folder/fileToSave.ext", "wb")
fh.write(fileData.decode('base64'))
fh.close()
Step 2: Set the file on the model. Something like:
instance = self.get_object()
instance.file_field.name = 'folder/fileToSave.ext' # `file_field` was `data` in your example
instance.save()
Note the absolute path at Step 1 and the path relative to the media folder at Step 2.
This should at least get you going.
Ideally you'd specify this as a serialiser field and get validation and auto-assignment to the model instance for free. But that seems complicated at first glance.
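Note that the .decode('base64') calls in this thread only work on Python 2. On Python 3 the equivalent goes through the base64 module. A minimal sketch follows; `decode_base64_field` is a hypothetical helper name, and strict validation is a design choice made here, not something the answers require.

```python
import base64
import binascii


def decode_base64_field(encoded):
    """Decode a base64 string into bytes, raising ValueError on malformed input."""
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        return base64.b64decode(encoded, validate=True)
    except binascii.Error as exc:
        raise ValueError("not valid base64") from exc


encoded = base64.b64encode(b"\x00\x01binary").decode("ascii")
decoded = decode_base64_field(encoded)
```

The decoded bytes can then be written to disk as in Step 1, or passed as the content of a SimpleUploadedFile as in the parser approach.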

How to validate contents of a CSV file using Django forms

I have a web app that needs to do the following:
Present a form to request a client side file for CSV import.
Validate the data in the CSV file or ask for another filename.
At one point, I was doing the CSV data validation in the view, after the form.is_valid() call from getting the filename (i.e. I have the imported CSV file into memory in a dictionary using csv.DictReader). After running into problems trying to pass errors back to the original form, I'm now trying to validate the CONTENTS of the CSV file in the form's clean() method.
I'm currently stumped on how to access the in memory file from clean() as the request.FILES object isn't valid. Note that I have no problems presenting the form to the client browser and then manipulating the resulting CSV file. The real issue is how to validate the contents of the CSV file - if I assume the data format is correct I can import it to my models. I'll post my forms.py file to show where I currently am after moving the code from the view to the form:
forms.py
import csv
from django import forms
from io import TextIOWrapper
class CSVImportForm(forms.Form):
    filename = forms.FileField(label='Select a CSV file to import:',)
    def clean(self):
        cleaned_data = super(CSVImportForm, self).clean()
        f = TextIOWrapper(request.FILES['filename'].file, encoding='ASCII')
        result_csvlist = csv.DictReader(f)
        # first line (only) contains additional information about the event
        # let's validate that against its form definition
        event_info = next(result_csvlist)
        f_eventinfo = ResultsForm(event_info)
        if not f_eventinfo.is_valid():
            raise forms.ValidationError("Error validating 1st line of data (after header) in CSV")
        return cleaned_data
class ResultsForm(forms.Form):
    RESULT_CHOICES = (('Won', 'Won'),
                      ('Lost', 'Lost'),
                      ('Tie', 'Tie'),
                      ('WonByForfeit', 'WonByForfeit'),
                      ('LostByForfeit', 'LostByForfeit'))
    Team1 = forms.CharField(min_length=10, max_length=11)
    Team2 = forms.CharField(min_length=10, max_length=11)
    Result = forms.ChoiceField(choices=RESULT_CHOICES)
    Score = forms.CharField()
    Event = forms.CharField()
    Venue = forms.CharField()
    Date = forms.DateField()
    Div = forms.CharField()
    Website = forms.URLField(required=False)
    TD = forms.CharField(required=False)
I'd love input on what's the "best" method to validate the contents of an uploaded CSV file and present that information back to the client browser!
I assume the place where you want to access the file is this line inside the clean method:
f = TextIOWrapper(request.FILES['filename'].file, encoding='ASCII')
You can't use that line because request doesn't exist there, but you can access your form's fields, so you can try this instead:
f = TextIOWrapper(self.cleaned_data.get('filename'), encoding='ASCII')
Since you call super().clean() on the first line of your method, that should work. Then, if you want to add a custom error message to your form, you can do it like this:
from django.forms.utils import ErrorList  # django.forms.util in older Django versions
errors = form._errors.setdefault("filename", ErrorList())
errors.append(u"CSV file incorrect")
Hope it helps.
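The check on the first data row can be tried outside Django with the standard library. This is a rough sketch, not the asker's form: `first_row_errors` and `REQUIRED_COLUMNS` are hypothetical, only the Result column is checked against the choices from ResultsForm, and io.BytesIO stands in for the uploaded file.

```python
import csv
import io

REQUIRED_COLUMNS = {"Team1", "Team2", "Result"}  # hypothetical subset of the columns
VALID_RESULTS = {"Won", "Lost", "Tie", "WonByForfeit", "LostByForfeit"}


def first_row_errors(binary_file, encoding="ascii"):
    """Return a list of error messages for the header and first data row."""
    text_file = io.TextIOWrapper(binary_file, encoding=encoding, newline="")
    try:
        reader = csv.DictReader(text_file)
        missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
        if missing:
            return ["missing columns: " + ", ".join(sorted(missing))]
        first_row = next(reader, None)
        if first_row is None:
            return ["no data rows after the header"]
        result = first_row.get("Result") or ""
        if result not in VALID_RESULTS:
            return ["invalid Result: " + result]
        return []
    finally:
        # Leave the underlying file open for whoever reads it next.
        text_file.detach()


ok = first_row_errors(io.BytesIO(b"Team1,Team2,Result\nA,B,Won\n"))
bad = first_row_errors(io.BytesIO(b"Team1,Team2,Result\nA,B,Draw\n"))
```

Inside a form's clean() method, a non-empty list could be turned into a forms.ValidationError so the messages show up next to the upload field.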

Django form validation, clean(), and file upload

Can someone illuminate me as to exactly when an uploaded file is actually written to the location returned by "upload_to" in the FileField, in particular with regards to the order of field, model, and form validation and cleaning?
Right now I have a "clean" method on my model which assumes the uploaded file is in place, so it can do some validation on it. It looks like the file isn't yet saved, and may just be held in a temporary location or in memory. If that is the case, how do I "open" it or find a path to it if I need to execute some external process/program to validate the file?
Thanks,
Ian
Form cleaning has nothing to do with actually saving the file, or with saving any other data for that matter. The file isn't saved until you run the save() method of the model instance (note that if you use ModelName.objects.create(), this save() method is called for you automatically).
The bound form will contain an open File object, so you should be able to do any validation on that object directly. For example:
form = MyForm(request.POST, request.FILES)
if form.is_valid():
    file_object = form.cleaned_data['myFile']
    # run any validation on the file_object, or define a clean_myFile() method
    # that will be run automatically when you call form.is_valid()
    model_inst = MyModel(my_file=file_object,
                         # assign other attributes here....
                         )
    model_inst.save()  # file is saved to disk here
What do you need to do with it? If your validation will work without a temporary file, you can access the data by calling read() on what your file field returns.
def clean_field(self):
    _file = self.cleaned_data.get('filefield')
    contents = _file.read()
If you do need it on the disk, you know where to go from here :) write it to a temporary location and do some magic on it!
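For the on-disk case, the standard library's tempfile module can pick a safe, unique path. A minimal sketch, assuming the upload is available as an iterable of bytes chunks; `write_chunks_to_temp_file` is a hypothetical helper, and the two-item list stands in for the chunks of an uploaded file.

```python
import os
import tempfile


def write_chunks_to_temp_file(chunks, suffix=""):
    """Write an iterable of bytes chunks to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    with os.fdopen(fd, "wb") as destination:
        for chunk in chunks:
            destination.write(chunk)
    return path


path = write_chunks_to_temp_file([b"ID3", b"\x00" * 4], suffix=".mp3")
try:
    # Run the external validation against `path` here.
    with open(path, "rb") as handle:
        header = handle.read(3)
finally:
    # Always clean up, even if validation raises.
    os.remove(path)
```

Putting the cleanup in a finally block avoids repeating os.remove() on every error path.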
Or write it as a custom form field. This is the basic idea of how I verify an MP3 file using the 'mutagen' library.
Notes:
First check the file size; if the size is acceptable, write the file to a temporary location.
The file is written to the temporary location specified in settings, checked as an MP3, and then deleted.
The code:
from django import forms
import os
from mutagen.mp3 import MP3, HeaderNotFoundError, InvalidMPEGHeader
from django.conf import settings
class MP3FileField(forms.FileField):
    def clean(self, *args, **kwargs):
        super(MP3FileField, self).clean(*args, **kwargs)
        tmp_file = args[0]
        if tmp_file.size > 6600000:
            raise forms.ValidationError("File is too large.")
        # Copy the upload to a temporary path so mutagen can open it by name.
        file_path = getattr(settings,'FILE_UPLOAD_TEMP_DIR')+'/'+tmp_file.name
        destination = open(file_path, 'wb+')
        for chunk in tmp_file.chunks():
            destination.write(chunk)
        destination.close()
        try:
            audio = MP3(file_path)
            if audio.info.length > 300:
                os.remove(file_path)
                raise forms.ValidationError("MP3 is too long.")
        except (HeaderNotFoundError, InvalidMPEGHeader):
            os.remove(file_path)
            raise forms.ValidationError("File is not valid MP3 CBR/VBR format.")
        os.remove(file_path)
        # clean() must return the cleaned value, not the args tuple.
        return tmp_file

Processing file uploads before object is saved

I've got a model like this:
class Talk(BaseModel):
    title = models.CharField(max_length=200)
    mp3 = models.FileField(upload_to = u'talks/', max_length=200)
    seconds = models.IntegerField(blank = True, null = True)
I want to validate before saving that the uploaded file is an MP3, like this:
def is_mp3(path_to_file):
    from mutagen.mp3 import MP3
    audio = MP3(path_to_file)
    return not audio.info.sketchy
Once I'm sure I've got an MP3, I want to save the length of the talk in the seconds attribute, like this:
audio = MP3(path_to_file)
self.seconds = audio.info.length
The problem is, before saving, the uploaded file doesn't have a path (see this ticket, closed as wontfix), so I can't process the MP3.
I'd like to raise a nice validation error so that ModelForms can display a helpful error ("You idiot, you didn't upload an MP3" or something).
Any idea how I can go about accessing the file before it's saved?
p.s. If anyone knows a better way of validating files are MP3s I'm all ears - I also want to be able to mess around with ID3 data (set the artist, album, title and probably album art, so I need it to be processable by mutagen).
You can access the file data in request.FILES while in your view.
I think the best way is to bind the uploaded files to a form, override the form's clean method, get the UploadedFile object from cleaned_data, validate it any way you like, then override the save method, populate your model instance with information about the file, and save it.
A cleaner way to get the file before it is saved is like this:
from django.core.exceptions import ValidationError
# this goes in your model class
def clean(self):
    try:
        f = self.mp3.file  # the file in memory
    except ValueError:
        raise ValidationError("A file is needed")
    f.__class__  # this is <class 'django.core.files.uploadedfile.InMemoryUploadedFile'>
    processfile(f)
And if we need a path, the answer is in this other question.
You could follow the technique used by ImageField where it validates the file header and then seeks back to the start of the file.
class ImageField(FileField):
    # ...
    def to_python(self, data):
        f = super(ImageField, self).to_python(data)
        # ...
        # We need to get a file object for Pillow. We might have a path or we might
        # have to read the data into memory.
        if hasattr(data, 'temporary_file_path'):
            file = data.temporary_file_path()
        else:
            if hasattr(data, 'read'):
                file = BytesIO(data.read())
            else:
                file = BytesIO(data['content'])
        try:
            # ...
        except Exception:
            # Pillow doesn't recognize it as an image.
            six.reraise(ValidationError, ValidationError(
                self.error_messages['invalid_image'],
                code='invalid_image',
            ), sys.exc_info()[2])
        if hasattr(f, 'seek') and callable(f.seek):
            f.seek(0)
        return f
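The same read-then-rewind idea can be applied to an MP3 upload. This is only a rough heuristic sketch and not a replacement for mutagen: `looks_like_mp3` is a hypothetical helper that accepts either an ID3v2 tag or an MPEG frame sync at the very start of the file, so it will miss files with leading junk.

```python
import io


def looks_like_mp3(file_obj):
    """Peek at the first bytes of a file-like object, then rewind it."""
    header = file_obj.read(3)
    file_obj.seek(0)  # rewind so later readers see the whole file
    if header.startswith(b"ID3"):
        return True
    # An MPEG audio frame starts with 11 set bits: 0xFF plus the top
    # three bits of the next byte.
    return len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0


tagged = io.BytesIO(b"ID3\x03\x00\x00")
is_tagged_mp3 = looks_like_mp3(tagged)
```

Because the function seeks back to the start, the same file object can still be handed to the storage backend or to a stricter parser afterwards.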