Validation Error with Multiple File Uploads in Django via Ajax - django

I have a view to which I am trying to submit multiple Ajax uploads via raw POST data (e.g. as an octet-stream). The requests are sent one right after another, so they are processed in parallel. The problem is that Django treats only the last request as valid. For example, if I submit 5 files, the first four fail with:
Upload a valid image. The file you uploaded was either not an image or a corrupted image.
And the last one works fine.
I'm guessing this occurs because the requests somehow overlap, so the image isn't completely loaded before the form attempts to validate it?
My upload view:
def upload(request):
    form = UploadImageForm(request.POST, request.FILES)
    print form
    if form.is_valid():
        # ..process image..
And my upload image form:
class UploadImageForm(forms.Form):
    upload = forms.ImageField()
To submit the requests I'm using the html5uploader js pretty much right out of the box.

On a different note, have you tried https://github.com/blueimp/jQuery-File-Upload/ ? It is a pretty good non-Flash file uploader with a progress bar.
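One way to test the guess in the question (that an upload has not fully arrived when the form validates it) is to compare the declared Content-Length with the number of bytes actually received. The following is a minimal diagnostic sketch, not a fix; the helper name `describe_upload` is hypothetical, and it assumes a WSGI-style `META` dict such as Django's `request.META`.

```python
def describe_upload(meta, body):
    """Compare the declared Content-Length with the bytes actually received.

    `meta` is a WSGI-style dict (for example Django's request.META) and
    `body` is the raw request body as bytes.
    """
    declared = meta.get("CONTENT_LENGTH")
    try:
        declared = int(declared)
    except (TypeError, ValueError):
        # Header missing or not a number.
        declared = None
    received = len(body)
    return {
        "declared": declared,
        "received": received,
        "complete": declared == received,
    }


# Hypothetical usage inside the view, before request.POST or request.FILES
# is touched (Django refuses to expose the raw body after the stream has
# been parsed as a multipart form):
#     print(describe_upload(request.META, request.body))
```

If every request reports `complete` as true, truncated bodies are unlikely to be the cause, and the way the uploader encodes each file would be the next thing to inspect.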

Related

google cloud storage images differ for authenticated url and public url

I can't seem to understand how it is possible that, for GCS, the authenticated URL shows a different image than the public URL.
I'm uploading the images via a Python Django script:
def upload_to_cloud(blob_name, file_obj):
    file_type = imghdr.what(file_obj)
    blob_name = str(blob_name) + '.' + file_type # concatenate string to create 'file_name.format'
    stats = storage.Blob(bucket=bucket, name=blob_name).exists(client) # check if logo with the same reg.nr exists
    if stats is True: # if exists then delete before uploading new logo
        storage.Blob(bucket=bucket, name=blob_name).delete()
    blob = bucket.blob(blob_name)
    blob.upload_from_file(file_obj=file_obj, content_type=f'image/{file_type}')
    path = blob.public_url
    return path

class CompanyProfile(SuccessMessageMixin, UpdateView): # TODO why company logo differs from the one in ads_list?
    model = Company
    form_class = CompanyProfileCreationForm

    def form_valid(self, form):
        """
        Check if user uploaded a new logo. If yes
        then upload the new logo to google cloud
        """
        if 'logo' in self.request.FILES:
            blob_name = self.request.user.company.reg_nr # get company registration number
            file_obj = self.request.FILES['logo'] # store uploaded file in variable
            form.instance.logo_url = upload_to_cloud(blob_name, file_obj) # update company.logo_url with path to uploaded file
            company = Company.objects.get(pk=self.request.user.company.pk)
            company.save()
            return super().form_valid(form)
        else:
            return super().form_valid(form)
Any ideas on what I'm doing wrong and how this is even possible? The file that I actually uploaded is the one under the authenticated URL. The file under the public URL is one that I uploaded for a different blob.
EDIT
I'm adding screenshots of the two images because, after some time, the images become the same, as they should be. Some people were confused by this and commented that the images are the same after all.
Public URL
Authenticated URL
Note that a caching issue on my side is ruled out: I sent the public URL to a friend, and he also saw the HTML text image, although the image at the authenticated URL (the correct image) was a light bulb. He also noted that the URL preview in FB Messenger showed the light bulb image, but when he actually opened the URL, the HTML text image appeared.
The problem occurs whenever a file is uploaded with the same blob name. It happens regardless of whether the blob is overwritten by GCS or whether I first call the blob delete function and then create a new file with the same name as the deleted blob.
In general the same object will be served by storage.googleapis.com and storage.cloud.google.com.
The only exception is if there is some caching (either in your browser, in a proxy, with Cloud CDN or in GCS). If you read the object via storage.cloud.google.com before uploading a new version, then reading after by storage.cloud.google.com may serve the old version while storage.googleapis.com returns the new one. Caching can also be location dependent.
If you can't tolerate an hour of caching, set the object's Cache-Control metadata to no-cache.
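As a rough illustration of that advice, the sketch below sets the Cache-Control metadata on the blob before uploading it. The function name `upload_with_no_cache` is hypothetical; `bucket` is assumed to be a google-cloud-storage `Bucket`, and the calls used (`bucket.blob`, the `cache_control` property, `upload_from_file`, `public_url`) are the same kind the question's code already relies on.

```python
def upload_with_no_cache(bucket, blob_name, file_obj, content_type):
    """Upload file_obj to blob_name and ask caches not to reuse stale copies.

    `bucket` is assumed to be a google.cloud.storage Bucket (or any object
    with the same blob()/upload_from_file() interface).
    """
    blob = bucket.blob(blob_name)
    # Set before uploading so the value is sent as object metadata.
    blob.cache_control = "no-cache"
    blob.upload_from_file(file_obj, content_type=content_type)
    return blob.public_url
```

Objects uploaded earlier keep whatever Cache-Control they were stored with, so copies that are already cached may still be served until they expire.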

Serving images asynchronously using django and celery?

I have a django app that serves images when a certain page is loaded. The images are stored on S3 and I retrieve them using boto and send the image content as an HttpResponse with the appropriate content type.
The problem here is that this is a blocking call. Sometimes it takes a long time (a few seconds for an image of a few hundred KB) to retrieve the images and serve them to the client.
I tried converting this process to a Celery task (async, non-blocking), but I am not sure how I can send back the data (images) once they have finished downloading. Just returning an HttpResponse from a Celery task does not work. I found documentation on HTTP callback tasks in an old version of the Celery docs, but this is not supported in newer Celery versions.
So, should I use polling in the js? (I have used celery tasks in other parts of my website, but all of them are socket based) or is this even the right way to approach the problem?
Code:
Django views code that fetches the images (from S3 using boto3): (in views.py)
#csrf_protect
#ensure_csrf_cookie
def getimg(request, public_hash):
    if request.user.is_authenticated:
        query = img_table.objects.filter(public_hash=public_hash)
    else:
        query = img_table.objects.filter(public_hash=public_hash, is_public=True)
    if query.exists():
        item_dir = construct_s3_path(s3_map_thumbnail_folder, public_map_hash)
        if check(s3, s3_bucket, item_dir): #checks if file exists
            item = s3.Object(s3_bucket, item_dir)
            item_content = item.get()['Body'].read()
            return HttpResponse(item_content, content_type="image/png",status=200)
        else: #if no image found, return a blank image
            blank = Image.new('RGB', (1000,1000), (255,255,255))
            response = HttpResponse(content_type="image/jpeg")
            blank.save(response, "JPEG")
            return response
    else: #if request image corresponding to hash is not found in db
        return render(request, 'core/404.html')
I call the above django view in a page like this:
<img src='/api/getimg/123abc' alt='img'>
In urls.py I have:
url(r'^api/getimg/(?P<public_hash>[a-zA-Z0-9]{6})$', views.getimg, name='getimg')
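On the question of whether a Celery task is the right approach at all: one alternative sometimes used for this situation is to keep the permission check in the view but redirect the browser to a short-lived presigned S3 URL, so the image bytes never pass through Django. The sketch below only builds such a URL. The helper name `presigned_image_url` and the 300-second lifetime are assumptions for illustration; `s3_client` is assumed to be a boto3 S3 client (with the resource used in the question, that would be `s3.meta.client`).

```python
def presigned_image_url(s3_client, bucket, key, expires_in=300):
    """Return a temporary URL from which the browser can fetch the object.

    `s3_client` is assumed to be a boto3 S3 client; `expires_in` is the
    URL lifetime in seconds (300 is an arbitrary example value).
    """
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )


# Hypothetical usage inside getimg(), after the database check has passed:
#     url = presigned_image_url(s3.meta.client, s3_bucket, item_dir)
#     return HttpResponseRedirect(url)
```

The trade-off is that anyone who obtains the URL can fetch the image until it expires, so this only fits if a short exposure window is acceptable.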

Image not displayed after Django form save

I have a formset (more specifically a generic inline formset) whose forms have an ImageField. After working around what seems to be a few Django bugs, everything is working fine, except that when I fill the blank form with a new image and click save, the page refreshes after the POST request and the new image is not there. The other fields of the form I just saved are there and are properly filled, but the image URL doesn't show up. The database record is saved correctly and if I simply refresh the page, the image shows up correctly. But I can't figure out how to return from the POST with all the newly saved information without having to refresh the page an extra time.
In case it's relevant, for this ImageField I am using a custom Storage class which handles saving the image in a remote server through an API.
A workaround that solves the problem but which in my opinion shouldn't be necessary:
class ProductImagesView(View):
    ...
    def post(self, request, id):
        product = get_object_or_404(Product.objects.by_id(id))
        image_formset = ProductImageInlineFormset(
            request.POST, request.FILES, instance=product)
        if image_formset.is_valid():
            image_formset.save()
            image_formset = ProductImageInlineFormset(instance=product) # Workaround
        return render(...)
You can find more details about my code in this other question:
Django BaseGenericInlineFormSet forms not inheriting FormSet instance as form instance related_object
Any idea why this is happening? Am I doing something wrong or is this a Django bug? Thanks.
UPDATE
One thing I forgot to say: the formset that shows up after saving not only has a blank Image field for the newly created Image object, it also doesn't have the extra blank form for new records that should be there. It also only appears after a refresh.
(I pass extra=1 to the generic_inlineformset_factory):
ProductImageInlineFormset = generic_inlineformset_factory(
    Image, form=ProductImageForm, extra=1)

Django FileField in model, S3 Storage + Boto, 400 Bad Request

I'm developing a web application in Django, and one of its features is adding new articles with a photo.
My Article model class contains models.FileField. I use S3BotoStorage as DEFAULT_FILE_STORAGE (Amazon S3).
Users can add a photo to an article in two ways:
1) Upload a photo from disk (by using input type=file)
2) Paste URL to existing photo online
If the user chooses option 1), everything works. I get the uploaded file in the view from the request.FILES dictionary and assign it to the FileField of the Article object. The photo is uploaded to S3.
But when the user pastes a URL to a photo, the first thing I have to do in the view is download this photo. I do it with this function:
def downloadPhotoFromURL(url):
    try:
        img = urllib.urlretrieve(url)[0]
        return img
    except Exception:
        return None
Then I save this image to FileField in model, so my whole code responsible for downloading image, and uploading it to S3 is like this:
articleImg = downloadPhotoFromURL(url)
f = File(open(articleImg), 'rb')
newArticle.image.save('tmp', f)
In this situation, I cannot upload it to S3 and after 2 minutes I'm receiving BotoServerError: 400 Bad Request. Unfortunately I don't have any other information why this request is bad. Any idea what can be going wrong? By the time I saved an image to model, I had saved a model, so model exists when I try to upload photo to S3.
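One detail in the snippet above may be worth checking: 'rb' is passed as the second argument to File (where it is taken as the file's name) rather than to open, so the downloaded file is opened in the default text mode. Whether this is what triggers the 400 response is not established here. The sketch below shows the download step with the Python 3 standard library and the file opened in binary mode; the function name `download_photo_from_url` and the `.img` suffix are illustrative choices.

```python
import os
import tempfile
import urllib.request


def download_photo_from_url(url):
    """Download url into a temporary file and return its path, or None on failure."""
    fd, path = tempfile.mkstemp(suffix=".img")
    os.close(fd)
    try:
        urllib.request.urlretrieve(url, path)
        return path
    except Exception:
        # Clean up the empty temporary file if the download failed.
        os.remove(path)
        return None


# Hypothetical usage in the view, with the mode given to open() instead of File():
#     path = download_photo_from_url(url)
#     if path is not None:
#         with open(path, "rb") as fh:
#             newArticle.image.save("tmp", File(fh))
```

Checking for None before opening the file also avoids a confusing error when the download itself fails.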

Django Drama: (Drag-n-Drop File Uploader in Multi-part Form) + (Bind File to Form / Model & Save to Postgres)

There are a few tiny related questions buried in here, but they really point to one big, hairy best practice question. This is kind of a tough feature to implement because it's supposed to do a couple tricky things at once...
drag-and-drop multi-file uploader (via Javascript)
multi-page form (page one: upload and associate files with an existing document model; page two: update and save file/document objects and meta-data to database)
...and I haven't found a pre-existing code sample or implementation anywhere. (Depending on one's approach, it could sweep off the table or automagically answer all the related/embedded/follow-on questions.) Bottom-line, the purpose of this post is to answer this question: What's the most elegant approach which minimizes the intervening questions/problems?
I'm using this implementation of a drag-and-drop JQuery File Uploader in Django to upload files...
https://github.com/miki725/Django-jQuery-File-Uploader-Integration-demo
The solution I link to above saves files on the filesystem, of course, but in batches per upload session, via creating a directory for each batch of files, and then assigning a UUID to each of those directories. Each uniquely named directory on the filesystem contains files uploaded during that particular upload session. That means any sort of database storage method first has to tease apart and iterate over all the files in the filesystem directory created for each upload session by this solution.
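The "tease apart and iterate" step can be sketched with the standard library alone. This assumes the layout described above, one directory per upload session named after its UUID under a common upload root; the function name `iter_batch_files` and the parameter names are hypothetical.

```python
from pathlib import Path


def iter_batch_files(upload_root, batch_uid):
    """Yield the files saved for one upload session, in name order.

    Assumes the layout <upload_root>/<batch_uid>/<uploaded files>.
    """
    batch_dir = Path(upload_root) / str(batch_uid)
    if not batch_dir.is_dir():
        # Unknown or already cleaned-up batch: nothing to yield.
        return
    for path in sorted(batch_dir.iterdir()):
        if path.is_file():
            yield path
```

Each yielded path could then be opened and bound to whatever form or model the second step uses.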
Note: the JQuery solution linked to above doesn't use a form (in forms.py) inside the app directory. The form is hardcoded into the template, which is already a bit of a bummer...'cause now I also have to find a nice way to bind each of the above files in each batch to a form.
I think the simplest--albeit perhaps least performant solution--is to create two views, for two forms...to save each file to the database in the view on the first page, and then update the database on the second page. Here's the direction I'm presently rolling in:
IN THE TEMPLATE...
...uploader javascripts in header...
<form action="{% url my_upload_handler %}" method="POST" enctype="multipart/form-data">
    <input type="file" name="files[]" multiple>
</form>
IN VIEWS.PY...
def my_upload_handler_0r_form_part_one(request):
    # POST (in the upload handler; request triggered by an upload action)
    if request.method == 'POST':
        if not ("f" in request.GET.keys()):
            ...validators and exception handling...
            ...response_data, which is a dict...
            uid = request.POST[u"uid"]
            file = request.FILES[u'files[]']
            filename = os.path.join(temp_path, str(uuid.uuid4()) + file.name)
            destination = open(filename, "wb+")
            for chunk in file.chunks():
                destination.write(chunk)
            destination.close()
            response_data = simplejson.dumps([response_data])
            response_type = "application/json"
            # return the data to the JQuery uploader plugin...
            return HttpResponse(response_data, mimetype=response_type)
    # GET (in the same upload handler)
    else:
        return render_to_response('my_first_page_template.html',
                                  { <---NO 'form':form HERE
                                      'uid': uuid.uuid4(),
                                  },
                                  context_instance = RequestContext(request))

def form_part_two(request):
    #here I need to retrieve and update stuff uploaded on first page
    return render_to_response('my_second_page_template.html',
                              {},
                              context_instance = RequestContext(request))
This view for the first page leverages the JQuery uploader, which works great for multi-file uploads per session and does what it's supposed to do. However, as hinted above, the view, as an upload handler, is only the first page in what needs to be a two page form. On page two, the end user would subsequently need to retrieve each uploaded file, attach additional data to the files they just uploaded on page one, and re-save to the database.
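The chunk-writing part of the handler above can be pulled into a small helper so that the file is always closed, even if a write fails. This is a sketch under assumptions: the name `save_chunks` is hypothetical, and `chunks` stands for any iterable of byte strings, such as the result of the uploaded file's `chunks()` method.

```python
import os
import uuid


def save_chunks(chunks, batch_dir, original_name):
    """Write an iterable of byte chunks to a uniquely named file in batch_dir."""
    os.makedirs(batch_dir, exist_ok=True)
    # Keep only the final path component so a crafted name cannot escape batch_dir.
    safe_name = os.path.basename(original_name)
    filename = os.path.join(batch_dir, "%s_%s" % (uuid.uuid4().hex, safe_name))
    # The with-block closes the file even if a write raises.
    with open(filename, "wb") as destination:
        for chunk in chunks:
            destination.write(chunk)
    return filename


# Hypothetical usage in the handler:
#     filename = save_chunks(file.chunks(), temp_path, file.name)
```

Returning the final path makes it straightforward to record it against the batch for the second page.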
I've tried to make this work as a two-part form via various solutions, including form wizards and/or generic class based views...following examples mainly enabling data persistence via the session. These solutions get rather thorny very quickly.
In summary, I need to...
upload multiple files in a uniquely identified batch (via drag and drop)
tease apart and iterate over each batch of uploaded files
bind each file in the batch to a form and associate it with an existing document model
submit / save all of these files at once to the database
retrieve each of those files on the following page/template of a potentially new form
update metadata for each file
resubmit / save all of those files at once to the database
So...you can see how all of the above compounds the complexity of a simple file upload, and increases the complexity of providing the feature, by involving related questions like:
forms.py: how best to bind each file to a form
models.py: how to associate each file with a pre-existing document model
views.py how to save each file in accordance with pre-existing document model in Postgres in the first page; update and save each document in the second page
...and, again, I'd like to do all of that without a form wizard, and without class-based views. (CBVs, especially, for this use case elude me a bit.) In other words: I'm looking for advice leading toward the most bulletproof and easy to read/understand solution possible. If it causes multiple hits to the database, that's fine by me. (If saving a file to the database seems anti best practice, please see this other post: Storing file content in DB.)
Might I be able to just create a separate view for two forms, and subclass a standard upload form, like so...
In forms.py...
class FileUploadForm(forms.Form):
    files = forms.FileField(widget=forms.ClearableFileInput(attrs={'name':'files[]', 'multiple':'multiple'}))
    #how to iterate over files in list or batch of files here...?
    file = forms.FileField()
    file = forms.FileField()

    def clean_file(self):
        data = self.cleaned_data["file"]
        # read, parse, and create `data_dict` from file...
        # subclass pre-existing UploadModelForm
        form = UploadModelForm(data_dict)
        if form.is_valid():
            self.instance = form.save(commit=False)
        else:
            raise forms.ValidationError
        return data
...and then refactor the earlier upload handler above with something like...
In views.py, substituting the following for present upload handler...
def view_for_form_one(request):
    ...
    # the aforementioned upload handler logic, plus...
    ...
    form = FileUploadForm(request.POST, request.FILES)
    if form.is_valid():
        form.save()
    else:
        # display errors
        pass
    ...

def view_for_form_two(request):
    # update and commit all data here
    ...?
In general, with this type of problem, I like to create a single page with one <form> on it, but with multiple sections that the user progresses through with JavaScript.
Breaking a form into a multi-part, wizard-style form series is much easier with javascript, especially if the data it produces is dynamic in nature.
If you absolutely must break it out into multiple pages, I would advise you to set up your app to be able to save the data into the database at the end of each step.
You can do that by making the metadata which the user adds at step 2 a nullable field, or even moving the metadata to a separate model.