Django Elasticsearch AWS httplib UnicodeDecodeError - django

I'm trying to set up Elasticsearch with Django (without Haystack).
Everything works perfectly locally.
But when I try to use the elasticsearch-py client with IAM-based authentication on AWS, I get this error:
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 827, in _send_output
    msg += message_body
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 132: ordinal not in range(128)
I tried to use aws-es-connection and requests-aws4auth but I get the same error.
It only works when I allow open access on AWS and use only elasticsearch-py like so
from elasticsearch import Elasticsearch, RequestsHttpConnection
ES_CLIENT = Elasticsearch(
    ['search-domain-xxx.us-east-1.es.amazonaws.com'],
    connection_class=RequestsHttpConnection
)
But I want something more secure ...
I think it is a utf-8/unicode/str problem but I can't manage to resolve it :(
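One way to read the traceback: in Python 2, `msg += message_body` mixes a unicode string with a byte string, so the interpreter implicitly decodes the bytes as ASCII and stops at the first non-ASCII byte. The sketch below reproduces the same failure explicitly, in Python 3 syntax. The sample text and the helper name `first_non_ascii` are made up for illustration and do not come from the question.

```python
# Illustrative only: "café" is a made-up sample, not data from the question.
body = "café".encode("utf-8")  # UTF-8 bytes; "é" becomes the two bytes 0xc3 0xa9


def first_non_ascii(data):
    """Return (position, byte) of the first byte ASCII cannot decode, or None."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError as exc:
        return exc.start, data[exc.start]
    return None


position, byte = first_non_ascii(body)
print(position, hex(byte))
```

Byte 0xc3 is the lead byte of many accented Latin characters in UTF-8, which would be consistent with the request body containing accented text.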

I finally managed to fix the bug by using a custom serializer:
import json
from elasticsearch import Elasticsearch, RequestsHttpConnection, serializer, compat, exceptions
class JSONSerializerPython2(serializer.JSONSerializer):
    """Override elasticsearch library serializer to ensure it encodes utf characters during json dump.
    See original at: https://github.com/elastic/elasticsearch-py/blob/master/elasticsearch/serializer.py#L42
    A description of how ensure_ascii encodes unicode characters to ensure they can be sent across the wire
    as ascii can be found here: https://docs.python.org/2/library/json.html#basic-usage
    """
    def dumps(self, data):
        # don't serialize strings
        if isinstance(data, compat.string_types):
            return data
        try:
            # ensure_ascii=True escapes non-ASCII characters as \uXXXX
            return json.dumps(data, default=self.default, ensure_ascii=True)
        except (ValueError, TypeError) as e:
            raise exceptions.SerializationError(data, e)
and then pass the serializer to elasticsearch
from elasticsearch import Elasticsearch
es = Elasticsearch(..., serializer=JSONSerializerPython2())
I found the solution here
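To illustrate what `ensure_ascii=True` changes, here is a minimal sketch using only the standard `json` module. The document content and the `is_ascii` helper are made up for the example.

```python
import json

# Hypothetical document; any non-ASCII text behaves the same way.
doc = {"title": "Crème brûlée"}

escaped = json.dumps(doc, ensure_ascii=True)   # non-ASCII characters become \uXXXX escapes
literal = json.dumps(doc, ensure_ascii=False)  # non-ASCII characters are kept as-is


def is_ascii(text):
    """True when every character fits in 7-bit ASCII."""
    return all(ord(ch) < 128 for ch in text)


print(is_ascii(escaped), is_ascii(literal))
```

Both strings parse back to the same document; the escaped form just never puts a non-ASCII byte on the wire, which is presumably why the implicit ASCII decode in httplib stops failing.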

If someone can find the reason behind this error, I would be immensely grateful. Things I have tried with no luck:
encode things to 'utf-8' before writing (result: no errors thrown, es index not even created)
something like the code-snippet below (result: no errors thrown, es index not even created)
def convert_unicode_to_str(item):
    if isinstance(item, basestring):
        return str(item)
    .....
    .....
[Note: This is a bit of a hack, but it gets the job done]
So, sys.setdefaultencoding(..) is not available in the sys module's namespace. This is solved by calling the reload function on sys.
Now, prior to writing to ES, setting the default encoding to 'utf-8' solves this problem. Just to be sure that nothing else breaks as a result of this, I think it is a good idea to reset the default encoding to what it was before the switch.
import sys
reload(sys)
# keep track of what the old encoding was
old_encoding = sys.getdefaultencoding()
# set the default encoding to `utf-8`
sys.setdefaultencoding('utf-8')
#################
## write to es ##
#################
# reset the state of the world to what it was
sys.setdefaultencoding(old_encoding)

Extending Debosmit Ray's solution for django-haystack.
Add to search_indexes:
import sys
def set_default_utf8(method):
    """
    Set utf-8 as default encoding, perform `method`,
    then set the old encoding back.
    """
    def wrapper(*args, **kwargs):
        reload(sys)
        old_encoding = sys.getdefaultencoding()
        sys.setdefaultencoding('utf-8')
        try:
            return method(*args, **kwargs)
        finally:
            # restore the old encoding even if `method` raises
            sys.setdefaultencoding(old_encoding)
    return wrapper
class EntryIndex(SearchIndex):
    text = CharField(document=True, use_template=True)
    ...
    @set_default_utf8
    def update(self, using=None):
        """Overrides entire index update to use UTF-8."""
        super(EntryIndex, self).update(using)
    @set_default_utf8
    def update_object(self, instance, using=None, **kwargs):
        """Overrides index update for a single object using UTF-8."""
        super(EntryIndex, self).update_object(instance, using, **kwargs)

Related

google-ml-engine custom prediction routine error responses

I have a custom prediction routine in google-ml-engine. Works very well.
I now am doing input checking on the instance data, and want to return error responses from my predict routine.
The example: https://cloud.google.com/ai-platform/prediction/docs/custom-prediction-routines
Raises exceptions on input errors, etc. However, when this happens the response body always has {'error': Prediction failed: unknown error}. I can see the correct errors are being logged in google cloud console, but the https response is always the same unknown error.
My question is:
How to make the Custom prediction routine return a proper error code and error message string?
Instead of returning a prediction, I can return an error string/code in the prediction, but it ends up in the prediction part of the response, which seems hacky and doesn't get any of the Google errors (e.g. those based on instance size).
root:test_deployment.py:35 {'predictions': {'error': "('Instance does not include required sensors', 'occurred at index 0')"}}
What's the best way to do this?
Thanks!
David
Please take a look at the following code, I created a _validate function inside predict and use a custom Exception class.
Basically, I validate instances, before I call the model predict method and handle the exception.
There may be some overhead to the response time when doing this validation, which you need to test for your use case.
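from googleapiclient import discovery
requests = [
    "god this episode sucks",
    "meh, I kinda like it",
    "what were the writer thinking, omg!",
    "omg! what a twist, who would'v though :o!",
    99999  # not a string, so validation rejects it
]
# Assumption: the body wraps the instances, as the AI Platform predict API expects.
request_data = {'instances': requests}
api = discovery.build('ml', 'v1')
# Target a specific version ...
parent = 'projects/{}/models/{}/versions/{}'.format(PROJECT, MODEL_NAME, VERSION_NAME)
# ... or the model's default version (this assignment overrides the one above).
parent = 'projects/{}/models/{}'.format(PROJECT, MODEL_NAME)
response = api.projects().predict(body=request_data, name=parent).execute()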
{'predictions': [{'Error code': 1, 'Message': 'Invalid instance type'}]}
Custom Prediction class:
import os
import pickle
import numpy as np
import logging
from datetime import date
import tensorflow.keras as keras
class CustomModelPredictionError(Exception):
    def __init__(self, code, message='Error found'):
        self.code = code
        self.message = message  # you could add more args
    def __str__(self):
        return str(self.message)
def isstr(s):
    return isinstance(s, str) or isinstance(s, bytes)
def _validate(instances):
    for instance in instances:
        if not isstr(instance):
            raise CustomModelPredictionError(1, 'Invalid instance type')
    return instances
class CustomModelPrediction(object):
    def __init__(self, model, processor):
        self._model = model
        self._processor = processor
    def _postprocess(self, predictions):
        labels = ['negative', 'positive']
        return [
            {
                "label": labels[int(np.round(prediction))],
                "score": float(np.round(prediction, 4))
            } for prediction in predictions]
    def predict(self, instances, **kwargs):
        try:
            instances = _validate(instances)
        except CustomModelPredictionError as c:
            return [{"Error code": c.code, "Message": c.message}]
        else:
            preprocessed_data = self._processor.transform(instances)
            predictions = self._model.predict(preprocessed_data)
            labels = self._postprocess(predictions)
            return labels
    @classmethod
    def from_path(cls, model_dir):
        model = keras.models.load_model(
            os.path.join(model_dir, 'keras_saved_model.h5'))
        with open(os.path.join(model_dir, 'processor_state.pkl'), 'rb') as f:
            processor = pickle.load(f)
        return cls(model, processor)
Complete code in this notebook.
If it is still relevant to you, I found a way by using google internal libraries (not sure if it would be endorsed by Google though).
AI platform custom prediction wrapping code only returns custom error message if the Exception thrown is a specific one from their internal library.
It might also not be super reliable as you would have very little control in case Google wants to change it.
class Predictor(object):
    def predict(self, instances, **kwargs):
        # Your prediction code here
        # This is an internal google library, it should be available at prediction time.
        from google.cloud.ml.prediction import prediction_utils
        raise prediction_utils.PredictionError(0, "Custom error message goes here")
    @classmethod
    def from_path(cls, model_dir):
        # Your logic to load the model here
        pass
You would get the following message in your HTTP response
Prediction failed: Custom error message goes here

Google Cloud Storage giving ServiceUnavailable: 503 exception Backend Error

I'm trying to upload a file to Google Cloud Storage Bucket. While making it public, intermittently I'm getting this exception from Google. This error comes almost once in 20 uploads.
google.api_core.exceptions.ServiceUnavailable: 503 GET https://www.googleapis.com/storage/v1/b/bucket_name/o/folder_name%2FPolicy-APP-000456384.2019-05-16-023805.pdf/acl: Backend Error
I'm using python3 and have tried updating the version of google-cloud-storage to 1.15.0 but it didn't help.
import logging
from datetime import datetime
import six
from google.cloud import storage
logger = logging.getLogger(__name__)
class GoogleStorageHelper:
    def __init__(self, project_name):
        self.client = storage.Client(project=project_name)
    def upload_file(self, bucket_name, file, file_name, content_type, blob_name, is_stream):
        safe_file_name = self.get_safe_filename(file_name)
        bucket = self.client.bucket(bucket_name)
        blob = bucket.blob(safe_file_name)
        if is_stream:
            blob.upload_from_string(file, content_type=content_type)
        else:
            blob.upload_from_filename(file, content_type=content_type)
        blob.make_public()  # Getting error here
        url = blob.public_url
        if isinstance(url, six.binary_type):
            url = url.decode('utf-8')
        logger.info('File uploaded, URL: {}'.format(url))
        return url
    @staticmethod
    def get_safe_filename(file_name):
        basename, extension = file_name.rsplit('.', 1)
        return '{0}.{1}.{2}'.format(basename, datetime.now().strftime('%Y-%m-%d-%H%M%S'), extension)
Have you faced this kind of problem and solved it? Or have any ideas to fix this issue?
This is a recently reported known issue with the Python make_public() method on GCS. The GCS team is working on it.
As a quick mitigation, I'd suggest enabling retries. This documentation could be helpful in setting up a retry handling strategy.
This one is a bit tricky. I ran into the same issue and found that the Python API client doesn't enable retries for the upload_from_string() method.
All upload_from_string() does is call upload_from_file(), which accepts a retry parameter, but upload_from_string() never passes one along:
def upload_from_string(self,
                       data,
                       content_type="text/plain",
                       client=None,
                       predefined_acl=None):
    data = _to_bytes(data, encoding="utf-8")
    string_buffer = BytesIO(data)
    self.upload_from_file(
        file_obj=string_buffer,
        size=len(data),
        content_type=content_type,
        client=client,
        predefined_acl=predefined_acl,
    )
You can hack the upload_from_string() method by using the upload_from_file() implementation, adding retries:
from google.cloud._helpers import _to_bytes
from io import BytesIO
from google.cloud.storage import Blob
def upload_from_string(
    data, file_path, bucket, client, content_type, num_retries
):
    data = _to_bytes(data, encoding="utf-8")
    string_buffer = BytesIO(data)
    blob = Blob(file_path, bucket)
    blob.upload_from_file(
        file_obj=string_buffer,
        size=len(data),
        client=client,
        num_retries=num_retries,
        content_type=content_type
    )
To handle this error gracefully and wait as suggested by the 503 docs, note that these errors inherit from GoogleAPICallError, therefore can be parsed for the error code:
from google.api_core.exceptions import GoogleAPICallError
try:
    blob.upload_from_filename(YOUR_UPLOAD_PARAMETERS)
except GoogleAPICallError as e:
    if e.code == 503:
        print(f'GCP storage unavailable: {e}')
        ...  # handle the error gracefully, or simply ignore
    else:
        raise
Additionally, you may use the retry.Retry as suggested in the doc:
blob.upload_from_filename(YOUR_UPLOAD_PARAMETERS, retry=retry.Retry())
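If you would rather control the retry loop yourself, the general idea is exponential backoff around the failing call. This is only a sketch of the pattern, not the Google client's implementation: `TransientError` and `call_with_backoff` are hypothetical names, and in real code you would catch the library's 503 exception instead.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a retryable error such as a 503."""


def call_with_backoff(func, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on TransientError with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return func()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # base_delay, then 2x, 4x, ...
```

The `sleep` parameter exists so the delay can be swapped out in tests. A call might look like `call_with_backoff(lambda: blob.make_public())`, assuming the lambda translates the 503 into `TransientError`.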

Django body encoding vs slack-api secret

I am following the instruction from this page. I am building a slack slash command handling server and I can't rebuild the signature to validate slash request authenticity.
Here is the code snippet from my Django application (the view uses the Django REST framework APIView):
@property
def x_slack_req_ts(self):
    if self.xsrts is not None:
        return self.xsrts
    self.xsrts = str(self.request.META['HTTP_X_SLACK_REQUEST_TIMESTAMP'])
    return self.xsrts
@property
def x_slack_signature(self):
    if self.xss is not None:
        return self.xss
    self.xss = self.request.META['HTTP_X_SLACK_SIGNATURE']
    return self.xss
@property
def base_message(self):
    if self.bs is not None:
        return self.bs
    self.bs = ':'.join(["v0", self.x_slack_req_ts, self.raw.decode('utf-8')])
    return self.bs
@property
def encoded_secret(self):
    return self.app.signing_secret.encode('utf-8')
@property
def signed(self):
    if self.non_base is not None:
        return self.non_base
    hashed = hmac.new(self.encoded_secret, self.base_message.encode('utf-8'), hashlib.sha256)
    self.non_base = "v0=" + hashed.hexdigest()
    return self.non_base
This is within a class where self.raw = request.body (the Django request body) and self.app.signing_secret is a string holding the appropriate Slack secret. It doesn't work: self.non_base yields an inaccurate value.
Now if I open an interactive python repl and do the following:
>>> import hmac
>>> import hashlib
>>> secret = "8f742231b10e8888abcd99yyyzzz85a5"
>>> ts = "1531420618"
>>> msg = "token=xyzz0WbapA4vBCDEFasx0q6G&team_id=T1DC2JH3J&team_domain=testteamnow&channel_id=G8PSS9T3V&channel_name=foobar&user_id=U2CERLKJA&user_name=roadrunner&command=%2Fwebhook-collect&text=&response_url=https%3A%2F%2Fhooks.slack.com%2Fcommands%2FT1DC2JH3J%2F397700885554%2F96rGlfmibIGlgcZRskXaIFfN&trigger_id=398738663015.47445629121.803a0bc887a14d10d2c447fce8b6703c"
>>> ref_signature = "v0=a2114d57b48eac39b9ad189dd8316235a7b4a8d21a10bd27519666489c69b503"
>>> base = ":".join(["v0", ts, msg])
>>> hashed = hmac.new(secret.encode(), base.encode(), hashlib.sha256)
>>> hashed.hexdigest()
'a2114d57b48eac39b9ad189dd8316235a7b4a8d21a10bd27519666489c69b503'
You will recognise the referenced link example. If I use the values from my django app with one of MY examples, it works within the repl but doesn't within the django app.
MY QUESTION: I believe this is caused by the self.raw.decode() encoding not being consistent with the printout I extracted to copy/paste in the repl. Has anyone encountered that issue and what is the fix? I tried a few random things with the urllib.parse library... How can I make sure that the request.body encoding is consistent with the example from flask with get_data() (as suggested by the doc in the link)?
UPDATE: I defined a custom parser:
from django.http import QueryDict
from rest_framework.parsers import BaseParser
class SlashParser(BaseParser):
    """
    Parser for form data.
    """
    media_type = 'application/x-www-form-urlencoded'
    def parse(self, stream, media_type=None, parser_context=None):
        """
        Parses the incoming bytestream as a URL encoded form,
        and returns the resulting QueryDict.
        """
        parser_context = parser_context or {}
        request = parser_context.get('request')
        raw_data = stream.read()
        data = QueryDict(raw_data, encoding='utf-8')
        setattr(data, 'raw_body', raw_data)  # setting a 'body' alike custom attr with raw POST content
        return data
I tested this based on this question: the raw_body from the custom parser generates exactly the same hashed signature as the normal "body", but again, copy-pasting the values into the REPL to test outside DRF works. Pretty sure it's an encoding problem, but I'm completely at a loss...
I found the problem, which is very frustrating.
It turns out that the signing secret was stored in a string that was too short and was missing its trailing characters, which obviously resulted in bad hashing of the message.
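For anyone debugging the same thing, the check itself can be written directly against the raw body bytes, which sidesteps any decode/encode round trip. This is a minimal sketch under the scheme described in the question ("v0", timestamp, and body joined by colons, then HMAC-SHA256); the function names are made up for illustration.

```python
import hashlib
import hmac


def compute_signature(signing_secret, timestamp, raw_body):
    """Build the 'v0=<hex digest>' value from the secret, timestamp, and raw body bytes."""
    base = b"v0:" + timestamp.encode("utf-8") + b":" + raw_body
    digest = hmac.new(signing_secret.encode("utf-8"), base, hashlib.sha256).hexdigest()
    return "v0=" + digest


def is_valid_signature(signing_secret, timestamp, raw_body, received_signature):
    """Compare in constant time to avoid leaking how many characters matched."""
    expected = compute_signature(signing_secret, timestamp, raw_body)
    return hmac.compare_digest(expected, received_signature)
```

In a Django view, `raw_body` would be `request.body` and the other two values would come from the request headers. Since a truncated secret produces a valid-looking but wrong digest, logging the length of the stored secret is a cheap sanity check.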

stop django from automatically unicodifing POST stuff

I upload some data to a django view. Client:
import json
import urllib2
from poster.encode import multipart_encode
def upload_data(upload_url, data, filename):
    print "Uploading %d bytes to server, file=%s..." % (len(data), filename)
    datagen, headers = multipart_encode({filename: data})
    request = urllib2.Request(upload_url, datagen, headers)
    # Actually do the request, and get the response
    try:
        resp_f = urllib2.urlopen(request, timeout=120)
    except urllib2.URLError:
        return None
    res = resp_f.read()
    resp_f.close()
    return res
#...
def foo(self, event_dicts_td):
    event_dicts_td_json = json.dumps(event_dicts_td)
    res = upload_data(self.upload_url, event_dicts_td_json.encode('utf8').encode('zlib'), "event_dicts_td.json.gz")
The view:
def my_view(request):
    event_dicts_td_json_gz = request.POST.get('event_dicts_td.json.gz')
    if not event_dicts_td_json_gz:
        return HttpResponse("fail")
    print type(event_dicts_td_json_gz), repr(event_dicts_td_json_gz[:10])
    event_dicts_td_json_gz = event_dicts_td_json_gz.encode("utf8")
    print type(event_dicts_td_json_gz), repr(event_dicts_td_json_gz[:10])
    event_dicts_td_json = event_dicts_td_json_gz.decode("zlib").decode("utf8")
    return HttpResponse("it still failed")
The output:
<type 'unicode'> u'x\ufffd\ufffd]s\ufffd\u0192\ufffd\ufffd\n'
<type 'str'> 'x\xef\xbf\xbd\xef\xbf\xbd]s\xef'
This is not acceptable. I just need the raw bytes. I'm not uploading unicode - I'm uploading raw bytes - and I want those raw bytes back. I don't know how it's trying to decode it into unicode - apparently not using utf8 cause zlib was unable to decompress the data. (It was unable to decompress it even when I didn't try to do an .encode("utf8") before zlibbing-it, that was just a test.)
How do I make django not unicodify the POST variables? Or, if it does, how do I undo it?
You can undo this.
Try to use smart_str from django.utils.encoding:
from django.utils.encoding import smart_str
event_dicts_td_json_gz = smart_str( event_dicts_td_json_gz )
View the docs here please: https://docs.djangoproject.com/en/dev/ref/unicode/#useful-utility-functions
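One caveat worth checking before relying on any re-encoding: the printed output above already contains U+FFFD replacement characters, which suggests some bytes were dropped during decoding. The sketch below (Python 3, made-up payload) shows why that kind of decode cannot be reversed, and one possible workaround, which is to send the compressed bytes as Base64 text. Base64 is my suggestion here, not something from the question.

```python
import base64
import zlib

original = b'{"event": "example"}'  # made-up payload
compressed = zlib.compress(original)

# Decoding arbitrary bytes as text swaps every invalid sequence for U+FFFD,
# so re-encoding does not give the original bytes back.
as_unicode = compressed.decode("utf-8", errors="replace")
lossy = as_unicode.encode("utf-8") != compressed

# Base64 turns the bytes into plain ASCII text that survives form decoding.
as_text = base64.b64encode(compressed).decode("ascii")
restored = zlib.decompress(base64.b64decode(as_text))

print(lossy, restored == original)
```

The cost is roughly a third more data on the wire, which may or may not matter for your payload sizes.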

Only accept a certain file type in FileField, server-side

How can I restrict FileField to only accept a certain type of file (video, audio, pdf, etc.) in an elegant way, server-side?
One very easy way is to use a custom validator.
In your app's validators.py:
def validate_file_extension(value):
    import os
    from django.core.exceptions import ValidationError
    ext = os.path.splitext(value.name)[1]  # [0] returns path+filename
    valid_extensions = ['.pdf', '.doc', '.docx', '.jpg', '.png', '.xlsx', '.xls']
    if ext.lower() not in valid_extensions:
        raise ValidationError('Unsupported file extension.')
Then in your models.py:
from .validators import validate_file_extension
... and use the validator for your model field:
class Document(models.Model):
    file = models.FileField(upload_to="documents/%Y/%m/%d", validators=[validate_file_extension])
See also: How to limit file types on file uploads for ModelForms with FileFields?.
Warning
To secure your code execution environment against malicious media files:
Use Exif libraries to properly validate the media files.
Separate your media files from your application code execution environment.
If possible, use solutions like S3, GCS, Minio, or anything similar.
When loading media files on the client side, use client-native methods (for example, loading media files insecurely in a browser may cause execution of "crafted" JavaScript code).
Django 1.11 added a FileExtensionValidator for model fields; the docs are here: https://docs.djangoproject.com/en/dev/ref/validators/#fileextensionvalidator.
An example of how to validate a file extension:
from django.core.validators import FileExtensionValidator
from django.db import models
class MyModel(models.Model):
    pdf_file = models.FileField(
        upload_to="foo/", validators=[FileExtensionValidator(allowed_extensions=["pdf"])]
    )
Note that this method is not safe. Citation from Django docs:
Don’t rely on validation of the file extension to determine a file’s
type. Files can be renamed to have any extension no matter what data
they contain.
There is also a new validate_image_file_extension (https://docs.djangoproject.com/en/dev/ref/validators/#validate-image-file-extension) for validating image extensions (using Pillow).
A few people have suggested using python-magic to validate that the file actually is of the type you are expecting to receive. This can be incorporated into the validator suggested in the accepted answer:
import os
import magic
from django.core.exceptions import ValidationError
def validate_is_pdf(file):
    valid_mime_types = ['application/pdf']
    file_mime_type = magic.from_buffer(file.read(1024), mime=True)
    if file_mime_type not in valid_mime_types:
        raise ValidationError('Unsupported file type.')
    valid_file_extensions = ['.pdf']
    ext = os.path.splitext(file.name)[1]
    if ext.lower() not in valid_file_extensions:
        raise ValidationError('Unacceptable file extension.')
This example only validates a pdf, but any number of mime-types and file extensions can be added to the arrays.
Assuming you saved the above in validators.py you can incorporate this into your model like so:
from myapp.validators import validate_is_pdf
class PdfFile(models.Model):
    file = models.FileField(upload_to='pdfs/', validators=(validate_is_pdf,))
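One detail to watch with any content-sniffing validator: reading from the upload moves the file position, so whatever reads the file next starts where the validator stopped unless the stream is rewound. Here is a dependency-free sketch of the same idea that checks the PDF signature bytes and restores the position afterwards. `looks_like_pdf` and `PDF_MAGIC` are illustrative names, and this is far cruder than python-magic.

```python
import io

PDF_MAGIC = b"%PDF-"  # PDF files start with this signature


def looks_like_pdf(file_obj):
    """Check the leading bytes for the PDF signature, then rewind the stream."""
    start = file_obj.tell()
    try:
        header = file_obj.read(len(PDF_MAGIC))
    finally:
        file_obj.seek(start)  # leave the stream where we found it for later readers
    return header == PDF_MAGIC


upload = io.BytesIO(b"%PDF-1.7\n...rest of file...")
print(looks_like_pdf(upload), upload.tell())
```

Inside a Django validator you would raise ValidationError when this returns False.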
You can use the below to restrict filetypes in your Form
file = forms.FileField(widget=forms.FileInput(attrs={'accept':'application/pdf'}))
There's a Django snippet that does this:
import os
from django import forms
class ExtFileField(forms.FileField):
    """
    Same as forms.FileField, but you can specify a file extension whitelist.
    >>> from django.core.files.uploadedfile import SimpleUploadedFile
    >>>
    >>> t = ExtFileField(ext_whitelist=(".pdf", ".txt"))
    >>>
    >>> t.clean(SimpleUploadedFile('filename.pdf', 'Some File Content'))
    >>> t.clean(SimpleUploadedFile('filename.txt', 'Some File Content'))
    >>>
    >>> t.clean(SimpleUploadedFile('filename.exe', 'Some File Content'))
    Traceback (most recent call last):
    ...
    ValidationError: [u'Not allowed filetype!']
    """
    def __init__(self, *args, **kwargs):
        ext_whitelist = kwargs.pop("ext_whitelist")
        self.ext_whitelist = [i.lower() for i in ext_whitelist]
        super(ExtFileField, self).__init__(*args, **kwargs)
    def clean(self, *args, **kwargs):
        data = super(ExtFileField, self).clean(*args, **kwargs)
        filename = data.name
        ext = os.path.splitext(filename)[1]
        ext = ext.lower()
        if ext not in self.ext_whitelist:
            raise forms.ValidationError("Not allowed filetype!")
        # return the cleaned file so the form keeps the upload
        return data
#-------------------------------------------------------------------------
if __name__ == "__main__":
    import doctest, datetime
    doctest.testmod()
First, create a file named formatChecker.py inside the app that has the model with the FileField you want to restrict to a certain file type.
This is your formatChecker.py:
from django.db.models import FileField
from django.forms import forms
from django.template.defaultfilters import filesizeformat
from django.utils.translation import ugettext_lazy as _
class ContentTypeRestrictedFileField(FileField):
    """
    Same as FileField, but you can specify:
    * content_types - list containing allowed content_types. Example: ['application/pdf', 'image/jpeg']
    * max_upload_size - a number indicating the maximum file size allowed for upload.
        2.5MB - 2621440
        5MB - 5242880
        10MB - 10485760
        20MB - 20971520
        50MB - 52428800
        100MB - 104857600
        250MB - 262144000
        500MB - 524288000
    """
    def __init__(self, *args, **kwargs):
        self.content_types = kwargs.pop("content_types")
        self.max_upload_size = kwargs.pop("max_upload_size")
        super(ContentTypeRestrictedFileField, self).__init__(*args, **kwargs)
    def clean(self, *args, **kwargs):
        data = super(ContentTypeRestrictedFileField, self).clean(*args, **kwargs)
        file = data.file
        try:
            content_type = file.content_type
            if content_type in self.content_types:
                if file._size > self.max_upload_size:
                    raise forms.ValidationError(_('Please keep filesize under %s. Current filesize %s') % (filesizeformat(self.max_upload_size), filesizeformat(file._size)))
            else:
                raise forms.ValidationError(_('Filetype not supported.'))
        except AttributeError:
            # files without a content_type (e.g. already stored files) are not re-checked
            pass
        return data
Second. In your models.py, add this:
from formatChecker import ContentTypeRestrictedFileField
Then instead of using 'FileField', use this 'ContentTypeRestrictedFileField'.
Example:
class Stuff(models.Model):
    title = models.CharField(max_length=245)
    handout = ContentTypeRestrictedFileField(upload_to='uploads/', content_types=['video/x-msvideo', 'application/pdf', 'video/mp4', 'audio/mpeg'], max_upload_size=5242880, blank=True, null=True)
Those are the things you have to do when you want to only accept a certain file type in FileField.
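The max_upload_size values are byte counts, and the 2.5MB and 5MB entries imply the convention 1 MB = 1024 * 1024 bytes. Rather than hard-coding the number, a tiny helper can compute it. `megabytes` and `MAX_UPLOAD_SIZE` are names I made up for this sketch.

```python
def megabytes(n):
    """Convert megabytes (1 MB = 1024 * 1024 bytes) to a whole number of bytes."""
    return int(n * 1024 * 1024)


MAX_UPLOAD_SIZE = megabytes(5)
print(MAX_UPLOAD_SIZE)
```

You could then pass `max_upload_size=megabytes(5)` to the field instead of the literal.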
After checking the accepted answer, I decided to share a tip based on the Django documentation. There is already a built-in validator for file extensions, so you don't need to write your own custom function to check whether a file extension is allowed.
https://docs.djangoproject.com/en/3.0/ref/validators/#fileextensionvalidator
Warning
Don’t rely on validation of the file extension to determine a file’s
type. Files can be renamed to have any extension no matter what data
they contain.
I think combining the ExtFileField that Dominic Rodger specified in his answer with python-magic, which Daniel Quinn mentioned, is the best way to go. If someone is smart enough to change the extension, at least you will catch them with the headers.
You can define a list of accepted mime types in settings and then define a validator which uses python-magic to detect the mime-type and raises ValidationError if the mime-type is not accepted. Set that validator on the file form field.
The only problem is that sometimes the mime type is application/octet-stream, which could correspond to different file formats. Has anyone overcome this issue?
Additionally, I will extend this class with some extra behaviour.
class ContentTypeRestrictedFileField(forms.FileField):
    ...
    widget = None
    ...
    def __init__(self, *args, **kwargs):
        ...
        self.widget = forms.ClearableFileInput(attrs={'accept': kwargs.pop('accept', None)})
        super(ContentTypeRestrictedFileField, self).__init__(*args, **kwargs)
When we create an instance with the parameter accept=".pdf,.txt", the browser's file-picker dialog shows only files with those extensions by default.
Just a minor tweak to @Thismatters' answer, since I can't comment. According to the README of python-magic:
recommend using at least the first 2048 bytes, as less can produce incorrect identification
So reading 2048 bytes of the file instead of 1024 when detecting the mime type should give a more accurate result, hence:
def validate_extension(file):
    valid_mime_types = ["application/pdf", "image/jpeg", "image/png", "image/jpg"]
    file_mime_type = magic.from_buffer(file.read(2048), mime=True)  # Changed from 1024 to 2048
    if file_mime_type not in valid_mime_types:
        raise ValidationError("Unsupported file type.")
    valid_file_extensions = [".pdf", ".jpeg", ".png", ".jpg"]
    ext = os.path.splitext(file.name)[1]
    if ext.lower() not in valid_file_extensions:
        raise ValidationError("Unacceptable file extension.")