How to validate contents of a CSV file using Django forms - django

I have a web app that needs to do the following:
Present a form to request a client side file for CSV import.
Validate the data in the CSV file or ask for another filename.
At one point, I was doing the CSV data validation in the view, after the form.is_valid() call that gets the filename (i.e. I had already read the imported CSV file into memory as dictionaries using csv.DictReader). After running into problems trying to pass errors back to the original form, I'm now trying to validate the CONTENTS of the CSV file in the form's clean() method.
I'm currently stumped on how to access the in-memory file from clean(), because the request.FILES object isn't available there. Note that I have no problems presenting the form to the client browser and then manipulating the resulting CSV file. The real issue is how to validate the contents of the CSV file; if I assume the data format is correct, I can import it into my models. I'll post my forms.py file to show where I currently am after moving the code from the view to the form:
forms.py
import csv
from django import forms
from io import TextIOWrapper

class CSVImportForm(forms.Form):
    filename = forms.FileField(label='Select a CSV file to import:',)

    def clean(self):
        cleaned_data = super(CSVImportForm, self).clean()
        f = TextIOWrapper(request.FILES['filename'].file, encoding='ASCII')
        result_csvlist = csv.DictReader(f)
        # first line (only) contains additional information about the event
        # let's validate that against its form definition
        event_info = next(result_csvlist)
        f_eventinfo = ResultsForm(event_info)
        if not f_eventinfo.is_valid():
            raise forms.ValidationError("Error validating 1st line of data (after header) in CSV")
        return cleaned_data

class ResultsForm(forms.Form):
    RESULT_CHOICES = (('Won', 'Won'),
                      ('Lost', 'Lost'),
                      ('Tie', 'Tie'),
                      ('WonByForfeit', 'WonByForfeit'),
                      ('LostByForfeit', 'LostByForfeit'))
    Team1 = forms.CharField(min_length=10, max_length=11)
    Team2 = forms.CharField(min_length=10, max_length=11)
    Result = forms.ChoiceField(choices=RESULT_CHOICES)
    Score = forms.CharField()
    Event = forms.CharField()
    Venue = forms.CharField()
    Date = forms.DateField()
    Div = forms.CharField()
    Website = forms.URLField(required=False)
    TD = forms.CharField(required=False)
I'd love input on what's the "best" method to validate the contents of an uploaded CSV file and present that information back to the client browser!

I assume the line where you want to access the file is this one, inside the clean method:
f = TextIOWrapper(request.FILES['filename'].file, encoding='ASCII')
You can't use that line because request doesn't exist inside the form, but you can access your form's fields, so you can try this instead:
f = TextIOWrapper(self.cleaned_data.get('filename'), encoding='ASCII')
Since you call super().clean() on the first line of your method, that should work. Then, if you want to add a custom error message to your form, you can do it like this:
from django.forms.utils import ErrorList  # this module was named django.forms.util in older Django releases
errors = form._errors.setdefault("filename", ErrorList())
errors.append(u"CSV file incorrect")
Hope it helps.
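One way to keep the row checks testable outside Django is to put them in a plain function that takes the binary file and returns a list of error strings. The following is only a sketch under assumptions: `validate_csv`, `REQUIRED_COLUMNS` and `ALLOWED_RESULTS` are hypothetical names, only a subset of the ResultsForm columns is checked, and UTF-8 is assumed instead of ASCII.

```python
import csv
import io

# Hypothetical subset of the columns defined on ResultsForm.
REQUIRED_COLUMNS = ("Team1", "Team2", "Result")
ALLOWED_RESULTS = {"Won", "Lost", "Tie", "WonByForfeit", "LostByForfeit"}


def validate_csv(binary_file, encoding="utf-8"):
    """Return a list of error strings; an empty list means the file passed."""
    errors = []
    wrapper = io.TextIOWrapper(binary_file, encoding=encoding, newline="")
    try:
        reader = csv.DictReader(wrapper)
        # Reading fieldnames consumes the header line.
        missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
        if missing:
            return ["Missing column(s): " + ", ".join(missing)]
        # The header is line 1, so the first data row is line 2.
        for line_no, row in enumerate(reader, start=2):
            if row["Result"] not in ALLOWED_RESULTS:
                errors.append("Line %d: unknown Result %r" % (line_no, row["Result"]))
    except UnicodeDecodeError:
        errors.append("File is not valid %s text" % encoding)
    finally:
        # Detach so the wrapper does not close the underlying upload.
        wrapper.detach()
    return errors
```

Inside the form's clean() method, each returned message could then be attached to the field, for example with self.add_error('filename', message) on Django 1.7 or later, so the errors show up next to the upload field when the form is re-rendered.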

Related

"form.populate_by returns" ERROR:'list' object has no attribute

I am creating a view function to edit the database using a WTForms form. I want to populate the form with information held in the database, supplied by a different form. My problem is the query that provides the details.
I have read the manual https://wtforms.readthedocs.io/en/stable/crash_course.html
and the following question Python Flask-WTF - use same form template for add and edit operations
but my query does not seem to supply the correct format of data
database model:
class Sensors(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    sensorID = db.Column(db.String, unique=True)
    name = db.Column(db.String(30), unique=True)
form model:
class AddSensorForm(FlaskForm):
    sensorID = StringField('sensorID', validators=[DataRequired()])
    sensorName = StringField('sensorName', validators=[DataRequired()])
    submit = SubmitField('Register')
view function:
@bp.route('/sensors/editsensor/<int:id>', methods=('GET', 'POST'))
@login_required
def editsensor(id):
    edit = [(s.sensorID, s.sensorName) for s in db.session.\
        query(Sensors).filter_by(id=id).all()]
    form = AddSensorForm(obj=edit)
    form.populate_obj(edit)
    if form.validate_on_submit():
        sensors = Sensors(sensorID=form.sensorID.data, sensorName=form.sensorNa$
        db.session.add(sensors)
        db.session.commit()
shell code for query:
from homeHeating import db
from homeHeating import create_app
app = create_app()
app.app_context().push()
def editsensor(id):
    edit = [(s.sensorID, s.sensorName) for s in db.session.query(Sensors).filter_by(id=id).all()]
    print(edit)
editsensor(1)
[('28-0000045680fde', 'Boiler input')]
I expect the two form fields to be populated with the information for the sensor identified by its 'id',
but I get this error:
File "/home/pi/heating/homeHeating/sensors/sensors.py", line 60, in editsensor
    form.populate_obj(edit)
File "/home/pi/heating/venv/lib/python3.7/site-packages/wtforms/form.py", line 96, in populate_obj
    field.populate_obj(obj, name)
File "/home/pi/heating/venv/lib/python3.7/site-packages/wtforms/fields/core.py", line 330, in populate_obj
    setattr(obj, name, self.data)
AttributeError: 'list' object has no attribute 'sensorID'
The error seems to indicate that it wants two parts for each field in "field.populate_obj(obj, name)"; mine provides only the column data but not the column name, "sensorID".
If I comment out the line "edit = ..." with a #, there are no error messages and the form is returned, but the fields are empty. I want the form to be returned with the information from the database filled in, so that I can modify the name or the sensorID and then update the database.
I hope that this is clear
Warm regards
paul.
ps I have followed the instruction so the ERROR statement is only the part after "field.populate_by".
You are trying to pass a 1-item list to your form.
Typically, when you are selecting a single record based on the primary key of your model, use Query.get() instead of Query.filter(...).all()[0].
Furthermore, you need to pass the request data to your form to validate it on submit, and also to pre-fill the fields when the form reports errors.
Form.validate_on_submit returns True only if your request method is POST and your form passes validation; it is the step where your form tells you "the user provided syntactically correct information, now you may do more checks and I may populate an existing object with the data provided to me".
You also need to handle cases where the form is being displayed to the user for the first time.
@bp.route('/sensors/editsensor/<int:id>', methods=('GET', 'POST'))
@login_required
def editsensor(id):
    obj = Sensors.query.get(id) or Sensors()
    form = AddSensorForm(request.form, obj=obj)
    if form.validate_on_submit():
        form.populate_obj(obj)
        db.session.add(obj)
        db.session.commit()
        # return response or redirect here
        return redirect(...)
    else:
        # either the form has errors, or the user is displaying it for
        # the first time (GET)
        return render_template('sensors.html', form=form, obj=obj)
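To see why the list fails: as the last frame of the traceback shows, populate_obj ends up calling setattr(obj, name, data) once per form field. Below is a plain-Python imitation of that behaviour, not WTForms itself; `FakeSensor` and `populate` are hypothetical stand-ins. It also hints that the form field names should match the model attribute names (the question's form uses sensorName while the model column is name), since both pre-filling from obj and populate_obj work by attribute name.

```python
class FakeSensor:
    """Hypothetical stand-in for the Sensors model (plain attributes only)."""

    def __init__(self, sensorID="", name=""):
        self.sensorID = sensorID
        self.name = name


def populate(obj, form_data):
    """Imitate populate_obj: one setattr per form field name."""
    for field_name, value in form_data.items():
        setattr(obj, field_name, value)


form_data = {"sensorID": "28-0000045680fde", "name": "Boiler input"}

# A single object accepts the attributes.
sensor = FakeSensor()
populate(sensor, form_data)

# A list of tuples does not, which is the failure seen in the traceback.
try:
    populate([("28-0000045680fde", "Boiler input")], form_data)
except AttributeError as exc:
    list_error = str(exc)
```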

Populate model with metadata of file uploaded through django admin

I have two models, Foto and FotoMetadata. Foto has just one property called upload, which is a file upload field. FotoMetadata has a few properties and should receive metadata from the photo uploaded to Foto. This can be done manually in the admin interface, but I want to do it automatically, i.e. when a photo is uploaded through the admin interface, FotoMetadata is filled in automatically.
In my models.py I have a few classes, including Foto and FotoMetadata:
class Foto(models.Model):
    upload = models.FileField(upload_to="fotos")

    def __str__(self):
        return '%s' %(self.upload)

class FotoMetadata(models.Model):
    image_formats = (
        ('RAW', 'RAW'),
        ('JPG', 'JPG'),
    )
    date = models.DateTimeField()
    camera = models.ForeignKey(Camera, on_delete=models.PROTECT)
    format = models.CharField(max_length=8, choices=image_formats)
    exposure = models.CharField(max_length=8)
    fnumber = models.CharField(max_length=8)
    iso = models.IntegerField()
    foto = models.OneToOneField(
        Foto,
        on_delete=models.CASCADE,
        primary_key=True,
    )
When I log in to the admin site, I have an upload form for Foto, and this works fine. My problem is that I can't insert metadata into FotoMetadata on the fly. I wrote a function that parses the photo and gives me a dictionary with the info I need. The function is called GetExif and lives in a file called getexif.py. This is a simplified version of it:
def GetExif(foto):
    # Open image file for reading (binary mode)
    f = open(foto, 'rb')
    # Parse file
    ...
    <parsing code>
    ...
    f.close()
    # create dictionary to receive data
    meta = {}
    meta['date'] = str(tags['EXIF DateTimeOriginal'].values)
    meta['fnumber'] = str(tags['EXIF FNumber'])
    meta['exposure'] = str(tags['EXIF ExposureTime'])
    meta['iso'] = str(tags['EXIF ISOSpeedRatings'])
    meta['camera'] = str(tags['Image Model'].values)
    return meta
So, basically, what I'm trying to do is use this function in admin.py to automatically populate FotoMetadata when uploading a photo to Foto, but I really couldn't figure out how to do it. Does anyone have a clue?
Edit 24/03/2016
Ok, after a lot more failures, I'm trying to use save_model in admin.py:
from django.contrib import admin
from .models import Autor, Camera, Lente, Foto, FotoMetadata
from fotomanager.local.getexif import GetExif

admin.site.register(Autor)
admin.site.register(Camera)
admin.site.register(Lente)
admin.site.register(FotoMetadata)

class FotoAdmin(admin.ModelAdmin):
    def save_model(self, request, obj, form, change):
        # populate the model
        obj.save()
        # get metadata
        metadados = GetExif(obj.upload.url)
        # Create instance of FotoMetadata
        fotometa = FotoMetadata()
        # FotoMetadata.id = Foto.id
        fotometa.foto = obj.pk
        # save exposure
        fotometa.exposure = metadados['exposure']

admin.site.register(Foto, FotoAdmin)
I thought it would work, or that I would have problems saving data to the model, but actually I got stuck before that. I got this error:
Exception Type: FileNotFoundError
Exception Value:
[Errno 2] No such file or directory: 'http://127.0.0.1:8000/media/fotos/IMG_8628.CR2'
Exception Location: /home/ricardo/Desenvolvimento/fotosite/fotomanager/local/getexif.py in GetExif, line 24
My GetExif function can't read the file; however, the address is right! If I copy and paste it into my browser, it downloads the file. I'm trying to figure out a way to correct the address, or to pass the internal path, or to pass the actual file to the function instead of its path. I'm also thinking about a different way to access the file in the GetExif() function. Any idea how to solve it?
Solution
I solved the problem above! By reading the FileField source, I found a property called path, which solves the problem. I also made a few other modifications and the code is working. The FotoAdmin class in admin.py now looks like this:
class FotoAdmin(admin.ModelAdmin):
    def save_model(self, request, obj, form, change):
        # populate the model
        obj.save()
        # get metadata
        metadados = GetExif(obj.upload.path)
        # Create instance of FotoMetadata
        fotometa = FotoMetadata()
        # FotoMetadata.id = Foto.id
        fotometa.foto = obj
        # set and save exposure
        fotometa.exposure = metadados['exposure']
        fotometa.save()
I also had to set null=True at some properties in models.py and everything is working as it should.
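The FileNotFoundError came from handing a URL to open(): upload.url is the address a browser uses, while upload.path is the location on disk. The other option mentioned above, passing the file itself instead of its path, could look roughly like the sketch below. It assumes the parser only needs a readable, seekable binary stream; `open_binary` and `read_header` are hypothetical names and are not part of Django or of the original getexif.py.

```python
import contextlib
import os


@contextlib.contextmanager
def open_binary(source):
    """Yield a binary file object for a filesystem path or an already-open file."""
    if isinstance(source, (str, bytes, os.PathLike)):
        # A path: open it ourselves and close it when done.
        with open(source, "rb") as f:
            yield f
    else:
        # An open file (for example an uploaded file object): rewind, use, rewind.
        source.seek(0)
        try:
            yield source
        finally:
            source.seek(0)


def read_header(source, size=16):
    """Return the first `size` bytes; a real parser would read the EXIF tags here."""
    with open_binary(source) as f:
        return f.read(size)
```

With a helper like this, the same parsing function could be called with obj.upload.path after obj.save(), or with the uploaded file object before anything has been written to disk.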
I guess you want to use the post_save signal
(read: Django signals).
Connect a handler to the post_save signal, so that after you save a Foto you have a hook to do other things; in your case, parse the photo metadata and create a FotoMetadata instance.
Also, if you want to save the Foto only if the metadata step succeeds, or under any other condition, you can use the pre_save signal and save the Foto only after the metadata was saved.

How does one use magic to verify file type in a Django form clean method?

I have written an email form class in Django with a FileField. I want to check the uploaded file for its type via checking its mimetype. Subsequently, I want to limit file types to pdfs, word, and open office documents.
To this end, I have installed python-magic and would like to check file types as follows per the specs for python-magic:
mime = magic.Magic(mime=True)
file_mime_type = mime.from_file('address/of/file.txt')
However, recently uploaded files lack addresses on my server. I also do not know of any method of the mime object akin to "from_file_content" that checks for the mime type given the content of the file.
What is an effective way to use magic to verify file types of uploaded files in Django forms?
Stan described a good variant using a buffer. Unfortunately, the weakness of that method is that it reads the whole file into memory. Another option is to use a temporarily stored file:
import tempfile
import magic
with tempfile.NamedTemporaryFile() as tmp:
    for chunk in form.cleaned_data['file'].chunks():
        tmp.write(chunk)
    tmp.flush()  # make sure buffered data is on disk before reading it by name
    print(magic.from_file(tmp.name, mime=True))
Also, you might want to check the file size:
if form.cleaned_data['file'].size < ...:
    print(magic.from_buffer(form.cleaned_data['file'].read()))
else:
    # store to disk (the code above)
Additionally:
Whether the name can be used to open the file a second time, while the named temporary file is still open, varies across platforms (it can be so used on Unix; it cannot on Windows NT or later).
So you might want to handle it like so:
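import os
tmp = tempfile.NamedTemporaryFile(delete=False)
try:
    for chunk in form.cleaned_data['file'].chunks():
        tmp.write(chunk)
    tmp.close()  # flush and release the handle so the file can be reopened by name
    print(magic.from_file(tmp.name, mime=True))
finally:
    tmp.close()  # harmless if already closed; must happen before unlink on Windows
    os.unlink(tmp.name)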
Also, you might want to seek(0) after read():
if hasattr(f, 'seek') and callable(f.seek):
    f.seek(0)
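If python-magic is not available, a rough alternative for the document families in the question is to compare the first bytes of the upload against well-known file signatures and then rewind. This is a simplified stand-in, not what python-magic does internally; `sniff_document_type` and the `SIGNATURES` table are names made up for this sketch. Note that the ZIP signature only says the file is a ZIP container (which .docx and .odt files are), not which format is inside.

```python
import io

# Leading bytes of common document containers (lookup table made up for this sketch).
SIGNATURES = (
    (b"%PDF-", "pdf"),
    (b"PK\x03\x04", "zip-container"),  # .docx, .odt and other ZIP-based formats
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole2-container"),  # legacy .doc
)


def sniff_document_type(f):
    """Return a coarse type label from the first bytes of a binary file, or None."""
    head = f.read(8)
    # Rewind so later code (saving, further validation) sees the whole file.
    if hasattr(f, "seek") and callable(f.seek):
        f.seek(0)
    for signature, label in SIGNATURES:
        if head.startswith(signature):
            return label
    return None


# Usage with an in-memory stand-in for an uploaded file.
upload = io.BytesIO(b"%PDF-1.4 example bytes")
kind = sniff_document_type(upload)
```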
Where uploaded data is stored
Why not try something like this in your view:
m = magic.Magic()
m.from_buffer(request.FILES['my_file_field'].read())
Or use request.FILES in place of form.cleaned_data if django.forms.Form is really not an option.
mime = magic.Magic(mime=True)
attachment = form.cleaned_data['attachment']
if hasattr(attachment, 'temporary_file_path'):
    # file is temporary on the disk, so we can get full path of it.
    mime_type = mime.from_file(attachment.temporary_file_path())
else:
    # file is on the memory
    mime_type = mime.from_buffer(attachment.read())
Also, you might want to seek(0) after read():
if hasattr(f, 'seek') and callable(f.seek):
    f.seek(0)
Example from Django code. Performed for image fields during validation.
You can use the django-safe-filefield package to validate that an uploaded file's extension matches its MIME type.
from django import forms
from safe_filefield.forms import SafeFileField

class MyForm(forms.Form):
    attachment = SafeFileField(
        allowed_extensions=('xls', 'xlsx', 'csv')
    )
In case you're handling a file upload and concerned only about images,
Django will set content_type for you (or rather for itself?):
from django.forms import ModelForm
from django.core.files import File
from django.db import models

class MyPhoto(models.Model):
    photo = models.ImageField(upload_to=photo_upload_to, max_length=1000)

class MyForm(ModelForm):
    class Meta:
        model = MyPhoto
        fields = ['photo']

photo = MyPhoto.objects.first()
photo = File(open('1.jpeg', 'rb'))
form = MyForm(files={'photo': photo})
if form.is_valid():
    print(form.instance.photo.file.content_type)
It doesn't rely on content type provided by the user. But
django.db.models.fields.files.FieldFile.file is an undocumented
property.
Actually, initially content_type is set from the request, but when
the form gets validated, the value is updated.
Regarding non-images, doing request.FILES['name'].read() seems okay to me.
First, that's what Django does. Second, files larger than 2.5 MB are stored
on disk by default. So let me point you to the other answer
here.
For the curious, here's the stack trace that leads to updating
content_type:
django.forms.forms.BaseForm.is_valid: self.errors
django.forms.forms.BaseForm.errors: self.full_clean()
django.forms.forms.BaseForm.full_clean: self._clean_fields()
django.forms.forms.BaseForm._clean_fields: field.clean()
django.forms.fields.FileField.clean: super().clean()
django.forms.fields.Field.clean: self.to_python()
django.forms.fields.ImageField.to_python

Storing user's avatar upon registration

I have an extended UserProfile for registering new users. My user_created function connects to signals sent upon registering basic User instance and creates new UserProfile with extended fields from my form. Here's the code :
from registration.signals import user_registered
from accounts.forms import ExtendedRegistrationForm
import accounts
from accounts.models import UserProfile

def user_created(sender, user, request, **kwargs):
    form = ExtendedRegistrationForm(request.POST, request.FILES)
    data = UserProfile(user=user)
    data.is_active = False
    data.first_name = form.data['first_name']
    data.last_name = form.data['last_name']
    data.pid = form.data['pid']
    data.image = form.data['image']
    data.street = form.data['street']
    data.number = form.data['number']
    data.code = form.data['code']
    data.city = form.data['city']
    data.save()

user_registered.connect(user_created)
The problem is that this form has an image field for the avatar. As you can see from the code, I'm getting data from the form's data dictionary. But apparently the ImageField does not send its data with the POST request (I'm getting MultiValueDictKeyError at /user/register/, Key 'image' not found in <QueryDict...), so I can't get it from data[].
If the usual variables are inside 'data', where should I look for files ? Or is the problem more complicated ? Strange thing is that my form doesn't have attribute cleaned_data... I was using dmitko's method here : http://dmitko.ru/?p=546&lang=en . My :
forms : http://paste.pocoo.org/show/230754/
models : http://paste.pocoo.org/show/230755/
You should be validating the form before using it, which will create the "cleaned_data" attribute you're used to. Just check form.is_valid() and the "cleaned_data" attribute will be available, and should contain the file.
The form's "data" attribute is going to be whatever you passed in as its first initialization argument (in this case, request.POST), and files are stored separately in the "files" attribute (whatever you pass in as the second argument, in this case, request.FILES). You don't want to access the form's "data" or "files" attributes directly; if you do, you're just reading data straight from the request and not getting any benefit from using forms.
Are you sure the <form enctype="..."> attribute is set to multipart/form-data? Otherwise the browser is not able to upload the file data.

Processing file uploads before object is saved

I've got a model like this:
class Talk(BaseModel):
    title = models.CharField(max_length=200)
    mp3 = models.FileField(upload_to = u'talks/', max_length=200)
    seconds = models.IntegerField(blank = True, null = True)
I want to validate before saving that the uploaded file is an MP3, like this:
def is_mp3(path_to_file):
    from mutagen.mp3 import MP3
    audio = MP3(path_to_file)
    return not audio.info.sketchy
Once I'm sure I've got an MP3, I want to save the length of the talk in the seconds attribute, like this:
audio = MP3(path_to_file)
self.seconds = audio.info.length
The problem is, before saving, the uploaded file doesn't have a path (see this ticket, closed as wontfix), so I can't process the MP3.
I'd like to raise a nice validation error so that ModelForms can display a helpful error ("You idiot, you didn't upload an MP3" or something).
Any idea how I can go about accessing the file before it's saved?
p.s. If anyone knows a better way of validating files are MP3s I'm all ears - I also want to be able to mess around with ID3 data (set the artist, album, title and probably album art, so I need it to be processable by mutagen).
You can access the file data in request.FILES while in your view.
I think the best way is to bind the uploaded files to a form, override the form's clean method, get the UploadedFile object from cleaned_data, validate it any way you like, then override the save method, populate your model instance with information about the file, and save it.
A cleaner way to get the file before it is saved is like this:
from django.core.exceptions import ValidationError

# this goes in your Model class
def clean(self):
    try:
        f = self.mp3.file  # the file in memory
    except ValueError:
        raise ValidationError("A File is needed")
    f.__class__  # this prints <class 'django.core.files.uploadedfile.InMemoryUploadedFile'>
    processfile(f)
And if we need a path, the answer is in this other question.
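For libraries that only accept a path (as MP3(path_to_file) is used in the question), one option is to copy the upload's chunks to a temporary file just for the duration of the check. A sketch, assuming the upload exposes an iterable of byte chunks (Django's uploaded files provide chunks()); `temporary_copy` is a hypothetical helper name:

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def temporary_copy(chunks, suffix=""):
    """Write byte chunks to a temporary file and yield its path; delete it afterwards."""
    tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    try:
        for chunk in chunks:
            tmp.write(chunk)
        # Close first so the data is flushed and the file can be reopened
        # by name on every platform, including Windows.
        tmp.close()
        yield tmp.name
    finally:
        tmp.close()  # harmless if already closed
        os.unlink(tmp.name)
```

In a clean method this might be used as `with temporary_copy(upload.chunks(), suffix='.mp3') as path:` followed by the mutagen check on `path`, raising ValidationError when the check fails.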
You could follow the technique used by ImageField where it validates the file header and then seeks back to the start of the file.
class ImageField(FileField):
    # ...
    def to_python(self, data):
        f = super(ImageField, self).to_python(data)
        # ...
        # We need to get a file object for Pillow. We might have a path or we might
        # have to read the data into memory.
        if hasattr(data, 'temporary_file_path'):
            file = data.temporary_file_path()
        else:
            if hasattr(data, 'read'):
                file = BytesIO(data.read())
            else:
                file = BytesIO(data['content'])
        try:
            # ...
        except Exception:
            # Pillow doesn't recognize it as an image.
            six.reraise(ValidationError, ValidationError(
                self.error_messages['invalid_image'],
                code='invalid_image',
            ), sys.exc_info()[2])
        if hasattr(f, 'seek') and callable(f.seek):
            f.seek(0)
        return f