Django REST Framework FileField Data in JSON - django

In Django REST Framework (DRF), how do I support de-Serializing base64 encoded binary data?
I have a model:
class MyModel(Model):
data = models.FileField(...)
and I want to be able to send this data as base64 encoded rather than having to multi-part form data or a "File Upload". Looking at the Parsers, only FileUploadParser and MultiPartParser seem to parse out the files.
I would like to be able to send this data in something like JSON (ie send the binary data in the data rather than the files:
{
'data':'...'
}

I solved it by creating a new Parser:
def get_B64_JSON_Parser(fields):
class Impl(parsers.JSONParser):
media_type = 'application/json+b64'
def parse(self, *args, **kwargs):
ret = super(Impl, self).parse(*args, **kwargs)
for field in fields:
ret[field] = SimpleUploadedFile(name=field, content=ret[field].decode('base64'))
return ret
return Impl
which I then use in the View:
class TestModelViewSet(viewsets.ModelViewSet):
parser_classes = [get_B64_JSON_Parser(('data_file',)),]

This is an old question, but for those looking for an up-to-date solution, there is a plugin for DRF (drf_base64) that handles this situation. It allows reading files encoded as base64 strings in the JSON request.
So given a model like:
class MyModel(Model):
data = models.FileField(...)
and an expected json like:
{
"data": " ....",
...
}
The (des) serialization can be handled just importing from drf_base modules instead of the drf itself.
from drf_base64.serializers import ModelSerializer
from .models import MyModel
class MyModel(ModelSerializer):
class Meta:
model = MyModel
Just remember that is posible to get a base64 encoded file in javascript with the FileReader API.

There's probably something clever you can do at the serialiser level but the first thing that comes to mind is to do it in the view.
Step 1: Write the file. Something like:
fh = open("/path/to/media/folder/fileToSave.ext", "wb")
fh.write(fileData.decode('base64'))
fh.close()
Step 2: Set the file on the model. Something like:
instance = self.get_object()
instance.file_field.name = 'folder/fileToSave.ext' # `file_field` was `data` in your example
instance.save()
Note the absolute path at Step 1 and the path relative to the media folder at Step 2.
This should at least get you going.
Ideally you'd specify this as a serialiser field and get validation and auto-assignment to the model instance for free. But that seems complicated at first glance.

Related

convert multiple files from base64 to pdf in memory

I am receiving several files in base64 format and I need to upload them to aws s3 in pdf format but so far I have tried everything and I still can't do it, is there any way to convert them to pdf without creating a file?
i'm using django rest framwork
"balance":"base64String",
"stateOfCashflow":"base64String",
"financialStatementAudit":"base64String",
"managementReport":"base64String",
"certificateOfStockOwnership":"base64String",
"rentDeclaration":"base64String",
I solved it on my own, I found a library called drf_extra_fields which does just what I need.
in the serializer is necesary to use base64filefield which takes the string in base 64 and transforms it into pdf behind the scenes
from drf_extra_fields.fields import Base64FileField
import PyPDF2
import io
class PDFBase64File(Base64FileField):
ALLOWED_TYPES = ['pdf']
def get_file_extension(self, filename, decoded_file):
try:
PyPDF2.PdfFileReader(io.BytesIO(decoded_file))
except PyPDF2.utils.PdfReadError as e:
print(e)
else:
return 'pdf'
class PDFSerializer(serializers.ModelSerializer):
pdf = PDFBase64File()
class Meta:
model = pdf
fields = "__all__"

How to retrieve a .wav file through POST in django and store it in a data model?

I am learning VXML and Django. I am trying to find out how to cleanly retrieve a recording from some voice-xml (vxml) browser and pass it to the server side where I use django to further handle the passed information. Then I want to store the file somewhere in a .wav file to replay it later. I have the following code snippets:
In the VXML file:
<record name="recording" />
[here i record the recording]
<filled>
<submit next="/url/" method="post" namelist="recording"/>
</filled>
In the urls.py of django, I would have
url(r'^url$', view.index, name='index')
The views.index definition
def index(request):
_recording = [..retrieve .wav from request here]
_modelObject = ModelObject(recording= _recording)
_modelObject.save() #store recording in some database
return render(request, 'genericfile.xml', content_type='text/xml')
In the model.py I'd guess I would have a class like:
from django.db import model
class ModelObject(model.Models)
recording = [declare type of .wav file here]
How would I go about completing the steps in the [..] in a clean manner?
I didn't work with vxml before but look like you want to store both .xml format and .wav format.
So here is my solution in this case:
from django.db import model
class ModelObject(model.Models)
# Define a text filed or anything that can store long string
# of _recording var above.
recording = models.TextField()
def save(self, *args, **kwargs):
if self.recording:
# Convert vxml to wav and store to a file
pass
super(ModelObject, self).save(*args, **kwargs)
#property
def recording_wav(self):
if not self.recording:
return None
return 'path/to/file.wav'
Remember use post_delete signal to remove file.wav once an instance of ModelObject is deleted.

How does one use magic to verify file type in a Django form clean method?

I have written an email form class in Django with a FileField. I want to check the uploaded file for its type via checking its mimetype. Subsequently, I want to limit file types to pdfs, word, and open office documents.
To this end, I have installed python-magic and would like to check file types as follows per the specs for python-magic:
mime = magic.Magic(mime=True)
file_mime_type = mime.from_file('address/of/file.txt')
However, recently uploaded files lack addresses on my server. I also do not know of any method of the mime object akin to "from_file_content" that checks for the mime type given the content of the file.
What is an effective way to use magic to verify file types of uploaded files in Django forms?
Stan described good variant with buffer. Unfortunately the weakness of this method is reading file to the memory. Another option is using temporary stored file:
import tempfile
import magic
with tempfile.NamedTemporaryFile() as tmp:
for chunk in form.cleaned_data['file'].chunks():
tmp.write(chunk)
print(magic.from_file(tmp.name, mime=True))
Also, you might want to check the file size:
if form.cleaned_data['file'].size < ...:
print(magic.from_buffer(form.cleaned_data['file'].read()))
else:
# store to disk (the code above)
Additionally:
Whether the name can be used to open the file a second time, while the named temporary file is still open, varies across platforms (it can be so used on Unix; it cannot on Windows NT or later).
So you might want to handle it like so:
import os
tmp = tempfile.NamedTemporaryFile(delete=False)
try:
for chunk in form.cleaned_data['file'].chunks():
tmp.write(chunk)
print(magic.from_file(tmp.name, mime=True))
finally:
os.unlink(tmp.name)
tmp.close()
Also, you might want to seek(0) after read():
if hasattr(f, 'seek') and callable(f.seek):
f.seek(0)
Where uploaded data is stored
Why no trying something like that in your view :
m = magic.Magic()
m.from_buffer(request.FILES['my_file_field'].read())
Or use request.FILES in place of form.cleaned_data if django.forms.Form is really not an option.
mime = magic.Magic(mime=True)
attachment = form.cleaned_data['attachment']
if hasattr(attachment, 'temporary_file_path'):
# file is temporary on the disk, so we can get full path of it.
mime_type = mime.from_file(attachment.temporary_file_path())
else:
# file is on the memory
mime_type = mime.from_buffer(attachment.read())
Also, you might want to seek(0) after read():
if hasattr(f, 'seek') and callable(f.seek):
f.seek(0)
Example from Django code. Performed for image fields during validation.
You can use django-safe-filefield package to validate that uploaded file extension match it MIME-type.
from safe_filefield.forms import SafeFileField
class MyForm(forms.Form):
attachment = SafeFileField(
allowed_extensions=('xls', 'xlsx', 'csv')
)
In case you're handling a file upload and concerned only about images,
Django will set content_type for you (or rather for itself?):
from django.forms import ModelForm
from django.core.files import File
from django.db import models
class MyPhoto(models.Model):
photo = models.ImageField(upload_to=photo_upload_to, max_length=1000)
class MyForm(ModelForm):
class Meta:
model = MyPhoto
fields = ['photo']
photo = MyPhoto.objects.first()
photo = File(open('1.jpeg', 'rb'))
form = MyForm(files={'photo': photo})
if form.is_valid():
print(form.instance.photo.file.content_type)
It doesn't rely on content type provided by the user. But
django.db.models.fields.files.FieldFile.file is an undocumented
property.
Actually, initially content_type is set from the request, but when
the form gets validated, the value is updated.
Regarding non-images, doing request.FILES['name'].read() seems okay to me.
First, that's what Django does. Second, files larger than 2.5 Mb by default
are stored on a disk. So let me point you at the other answer
here.
For the curious, here's the stack trace that leads to updating
content_type:
django.forms.forms.BaseForm.is_valid: self.errors
django.forms.forms.BaseForm.errors: self.full_clean()
django.forms.forms.BaseForm.full_clean: self._clean_fields()
django.forms.forms.BaseForm._clean_fiels: field.clean()
django.forms.fields.FileField.clean: super().clean()
django.forms.fields.Field.clean: self.to_python()
django.forms.fields.ImageField.to_python

How to limit file types on file uploads for ModelForms with FileFields?

My goal is to limit a FileField on a Django ModelForm to PDFs and Word Documents. The answers I have googled all deal with creating a separate file handler, but I am not sure how to do so in the context of a ModelForm. Is there a setting in settings.py I may use to limit upload file types?
Create a validation method like:
def validate_file_extension(value):
if not value.name.endswith('.pdf'):
raise ValidationError(u'Error message')
and include it on the FileField validators like this:
actual_file = models.FileField(upload_to='uploaded_files', validators=[validate_file_extension])
Also, instead of manually setting which extensions your model allows, you should create a list on your setting.py and iterate over it.
Edit
To filter for multiple files:
def validate_file_extension(value):
import os
ext = os.path.splitext(value.name)[1]
valid_extensions = ['.pdf','.doc','.docx']
if not ext in valid_extensions:
raise ValidationError(u'File not supported!')
Validating with the extension of a file name is not a consistent way. For example I can rename a picture.jpg into a picture.pdf and the validation won't raise an error.
A better approach is to check the content_type of a file.
Validation Method
def validate_file_extension(value):
if value.file.content_type != 'application/pdf':
raise ValidationError(u'Error message')
Usage
actual_file = models.FileField(upload_to='uploaded_files', validators=[validate_file_extension])
An easier way of doing it is as below in your Form
file = forms.FileField(widget=forms.FileInput(attrs={'accept':'application/pdf'}))
Django since 1.11 has a FileExtensionValidator for this purpose:
class SomeDocument(Model):
document = models.FileFiled(validators=[
FileExtensionValidator(allowed_extensions=['pdf', 'doc'])])
As #savp mentioned, you will also want to customize the widget so that users can't select inappropriate files in the first place:
class SomeDocumentForm(ModelForm):
class Meta:
model = SomeDocument
widgets = {'document': FileInput(attrs={'accept': 'application/pdf,application/msword'})}
fields = '__all__'
You may need to fiddle with accept to figure out exactly what MIME types are needed for your purposes.
As others have mentioned, none of this will prevent someone from renaming badstuff.exe to innocent.pdf and uploading it through your form—you will still need to handle the uploaded file safely. Something like the python-magic library can help you determine the actual file type once you have the contents.
For a more generic use, I wrote a small class ExtensionValidator that extends Django's built-in RegexValidator. It accepts single or multiple extensions, as well as an optional custom error message.
class ExtensionValidator(RegexValidator):
def __init__(self, extensions, message=None):
if not hasattr(extensions, '__iter__'):
extensions = [extensions]
regex = '\.(%s)$' % '|'.join(extensions)
if message is None:
message = 'File type not supported. Accepted types are: %s.' % ', '.join(extensions)
super(ExtensionValidator, self).__init__(regex, message)
def __call__(self, value):
super(ExtensionValidator, self).__call__(value.name)
Now you can define a validator inline with the field, e.g.:
my_file = models.FileField('My file', validators=[ExtensionValidator(['pdf', 'doc', 'docx'])])
I use something along these lines (note, "pip install filemagic" is required for this...):
import magic
def validate_mime_type(value):
supported_types=['application/pdf',]
with magic.Magic(flags=magic.MAGIC_MIME_TYPE) as m:
mime_type=m.id_buffer(value.file.read(1024))
value.file.seek(0)
if mime_type not in supported_types:
raise ValidationError(u'Unsupported file type.')
You could probably also incorporate the previous examples into this - for example also check the extension/uploaded type (which might be faster as a primary check than magic.) This still isn't foolproof - but it's better, since it relies more on data in the file, rather than browser provided headers.
Note: This is a validator function that you'd want to add to the list of validators for the FileField model.
I find that the best way to check the type of a file is by checking its content type. I will also add that one the best place to do type checking is in form validation. I would have a form and a validation as follows:
class UploadFileForm(forms.Form):
file = forms.FileField()
def clean_file(self):
data = self.cleaned_data['file']
# check if the content type is what we expect
content_type = data.content_type
if content_type == 'application/pdf':
return data
else:
raise ValidationError(_('Invalid content type'))
The following documentation links can be helpful:
https://docs.djangoproject.com/en/3.1/ref/files/uploads/ and https://docs.djangoproject.com/en/3.1/ref/forms/validation/
I handle this by using a clean_[your_field] method on a ModelForm. You could set a list of acceptable file extensions in settings.py to check against in your clean method, but there's nothing built-into settings.py to limit upload types.
Django-Filebrowser, for example, takes the approach of creating a list of acceptable file extensions in settings.py.
Hope that helps you out.

How do you serialize a model instance in Django?

There is a lot of documentation on how to serialize a Model QuerySet but how do you just serialize to JSON the fields of a Model Instance?
You can easily use a list to wrap the required object and that's all what django serializers need to correctly serialize it, eg.:
from django.core import serializers
# assuming obj is a model instance
serialized_obj = serializers.serialize('json', [ obj, ])
If you're dealing with a list of model instances the best you can do is using serializers.serialize(), it gonna fit your need perfectly.
However, you are to face an issue with trying to serialize a single object, not a list of objects. That way, in order to get rid of different hacks, just use Django's model_to_dict (if I'm not mistaken, serializers.serialize() relies on it, too):
from django.forms.models import model_to_dict
# assuming obj is your model instance
dict_obj = model_to_dict( obj )
You now just need one straight json.dumps call to serialize it to json:
import json
serialized = json.dumps(dict_obj)
That's it! :)
To avoid the array wrapper, remove it before you return the response:
import json
from django.core import serializers
def getObject(request, id):
obj = MyModel.objects.get(pk=id)
data = serializers.serialize('json', [obj,])
struct = json.loads(data)
data = json.dumps(struct[0])
return HttpResponse(data, mimetype='application/json')
I found this interesting post on the subject too:
http://timsaylor.com/convert-django-model-instances-to-dictionaries
It uses django.forms.models.model_to_dict, which looks like the perfect tool for the job.
There is a good answer for this and I'm surprised it hasn't been mentioned. With a few lines you can handle dates, models, and everything else.
Make a custom encoder that can handle models:
from django.forms import model_to_dict
from django.core.serializers.json import DjangoJSONEncoder
from django.db.models import Model
class ExtendedEncoder(DjangoJSONEncoder):
def default(self, o):
if isinstance(o, Model):
return model_to_dict(o)
return super().default(o)
Now use it when you use json.dumps
json.dumps(data, cls=ExtendedEncoder)
Now models, dates and everything can be serialized and it doesn't have to be in an array or serialized and unserialized. Anything you have that is custom can just be added to the default method.
You can even use Django's native JsonResponse this way:
from django.http import JsonResponse
JsonResponse(data, encoder=ExtendedEncoder)
It sounds like what you're asking about involves serializing the data structure of a Django model instance for interoperability. The other posters are correct: if you wanted the serialized form to be used with a python application that can query the database via Django's api, then you would wan to serialize a queryset with one object. If, on the other hand, what you need is a way to re-inflate the model instance somewhere else without touching the database or without using Django, then you have a little bit of work to do.
Here's what I do:
First, I use demjson for the conversion. It happened to be what I found first, but it might not be the best. My implementation depends on one of its features, but there should be similar ways with other converters.
Second, implement a json_equivalent method on all models that you might need serialized. This is a magic method for demjson, but it's probably something you're going to want to think about no matter what implementation you choose. The idea is that you return an object that is directly convertible to json (i.e. an array or dictionary). If you really want to do this automatically:
def json_equivalent(self):
dictionary = {}
for field in self._meta.get_all_field_names()
dictionary[field] = self.__getattribute__(field)
return dictionary
This will not be helpful to you unless you have a completely flat data structure (no ForeignKeys, only numbers and strings in the database, etc.). Otherwise, you should seriously think about the right way to implement this method.
Third, call demjson.JSON.encode(instance) and you have what you want.
If you want to return the single model object as a json response to a client, you can do this simple solution:
from django.forms.models import model_to_dict
from django.http import JsonResponse
movie = Movie.objects.get(pk=1)
return JsonResponse(model_to_dict(movie))
If you're asking how to serialize a single object from a model and you know you're only going to get one object in the queryset (for instance, using objects.get), then use something like:
import django.core.serializers
import django.http
import models
def jsonExample(request,poll_id):
s = django.core.serializers.serialize('json',[models.Poll.objects.get(id=poll_id)])
# s is a string with [] around it, so strip them off
o=s.strip("[]")
return django.http.HttpResponse(o, mimetype="application/json")
which would get you something of the form:
{"pk": 1, "model": "polls.poll", "fields": {"pub_date": "2013-06-27T02:29:38.284Z", "question": "What's up?"}}
.values() is what I needed to convert a model instance to JSON.
.values() documentation: https://docs.djangoproject.com/en/3.0/ref/models/querysets/#values
Example usage with a model called Project.
Note: I'm using Django Rest Framework
from django.http import JsonResponse
#csrf_exempt
#api_view(["GET"])
def get_project(request):
id = request.query_params['id']
data = Project.objects.filter(id=id).values()
if len(data) == 0:
return JsonResponse(status=404, data={'message': 'Project with id {} not found.'.format(id)})
return JsonResponse(data[0])
Result from a valid id:
{
"id": 47,
"title": "Project Name",
"description": "",
"created_at": "2020-01-21T18:13:49.693Z",
}
I solved this problem by adding a serialization method to my model:
def toJSON(self):
import simplejson
return simplejson.dumps(dict([(attr, getattr(self, attr)) for attr in [f.name for f in self._meta.fields]]))
Here's the verbose equivalent for those averse to one-liners:
def toJSON(self):
fields = []
for field in self._meta.fields:
fields.append(field.name)
d = {}
for attr in fields:
d[attr] = getattr(self, attr)
import simplejson
return simplejson.dumps(d)
_meta.fields is an ordered list of model fields which can be accessed from instances and from the model itself.
Here's my solution for this, which allows you to easily customize the JSON as well as organize related records
Firstly implement a method on the model. I call is json but you can call it whatever you like, e.g.:
class Car(Model):
...
def json(self):
return {
'manufacturer': self.manufacturer.name,
'model': self.model,
'colors': [color.json for color in self.colors.all()],
}
Then in the view I do:
data = [car.json for car in Car.objects.all()]
return HttpResponse(json.dumps(data), content_type='application/json; charset=UTF-8', status=status)
Use list, it will solve problem
Step1:
result=YOUR_MODELE_NAME.objects.values('PROP1','PROP2').all();
Step2:
result=list(result) #after getting data from model convert result to list
Step3:
return HttpResponse(json.dumps(result), content_type = "application/json")
Use Django Serializer with python format,
from django.core import serializers
qs = SomeModel.objects.all()
serialized_obj = serializers.serialize('python', qs)
What's difference between json and python format?
The json format will return the result as str whereas python will return the result in either list or OrderedDict
To serialize and deserialze, use the following:
from django.core import serializers
serial = serializers.serialize("json", [obj])
...
# .next() pulls the first object out of the generator
# .object retrieves django object the object from the DeserializedObject
obj = next(serializers.deserialize("json", serial)).object
All of these answers were a little hacky compared to what I would expect from a framework, the simplest method, I think by far, if you are using the rest framework:
rep = YourSerializerClass().to_representation(your_instance)
json.dumps(rep)
This uses the Serializer directly, respecting the fields you've defined on it, as well as any associations, etc.
It doesn't seem you can serialize an instance, you'd have to serialize a QuerySet of one object.
from django.core import serializers
from models import *
def getUser(request):
return HttpResponse(json(Users.objects.filter(id=88)))
I run out of the svn release of django, so this may not be in earlier versions.
ville = UneVille.objects.get(nom='lihlihlihlih')
....
blablablab
.......
return HttpResponse(simplejson.dumps(ville.__dict__))
I return the dict of my instance
so it return something like {'field1':value,"field2":value,....}
how about this way:
def ins2dic(obj):
SubDic = obj.__dict__
del SubDic['id']
del SubDic['_state']
return SubDic
or exclude anything you don't want.
This is a project that it can serialize(JSON base now) all data in your model and put them to a specific directory automatically and then it can deserialize it whenever you want... I've personally serialized thousand records with this script and then load all of them back to another database without any losing data.
Anyone that would be interested in opensource projects can contribute this project and add more feature to it.
serializer_deserializer_model
Let this is a serializers for CROPS, Do like below. It works for me, Definitely It will work for you also.
First import serializers
from django.core import serializers
Then you can write like this
class CropVarietySerializer(serializers.Serializer):
crop_variety_info = serializers.serialize('json', [ obj, ])
OR you can write like this
class CropVarietySerializer(serializers.Serializer):
crop_variety_info = serializers.JSONField()
Then Call this serializer inside your views.py
For more details, Please visit https://docs.djangoproject.com/en/4.1/topics/serialization/
serializers.JSONField(*args, **kwargs) and serializers.JSONField() are same. you can also visit https://www.django-rest-framework.org/api-guide/fields/ for JSONField() details.