Need help setting up django-filetransfers - django

My setup is: Django 1.3/Python 2.7.2/Win Server 2008 R2/IIS 7.5/MS SQL Server 2008 R2. I am developing an application whose main function is to analyze uploaded files and produce a report.
Reading over the documentation for django-filetransfers, I believe this is a solution to a problem I've been trying to solve for a while (i.e. form-based file uploads completely block all Django responses until the file-transfer finishes...horror for even moderate-sized files).
The documentation talks about piping uploads to S3 or Blobstore, and that might be what I end up doing eventually, but during development I thought maybe I could just set up my own "poor-man's S3" on a server that I control. This would basically just be another Django instance (or possibly a simple ASP.NET app) whose sole purpose is to receive uploaded files. This sounds like it should be possible with django-filetransfers and would solve the problem of Django responsiveness (???).
But I am missing some bits of understanding how this works in general, as well as some specifics. Maybe an example will help: let's say I have MyMainDjangoServer and MyFileUploadServer. MyMainDjangoServer will serve the views, including the upload form. MyFileUploadServer will "catch" the uploaded files. My questions/confusion are as follows:
My upload form will contain additional fields beyond just the file(s)...do I understand correctly that MyMainDjangoServer will somehow still get that form data, minus the file data (basically: request.POST), and the file data gets shunted over to MyFileUploadServer? How does this work? Will MyMainDjangoServer still block during the upload to MyFileUploadServer?
I assume that what I would need to do on MyFileUploadServer is have a view/URL that handles the form request and sucks out the request.FILES data. What else needs to happen? What happens to the rest of the form data?
How would I set up my settings.py for this scenario? The django-filetransfers examples seem to assume either S3 or GAE/Blobstore but maybe I am missing some basics.
Any advice/answers appreciated...this is a confusing and frustrating area of Django for me.

"MyMainDjangoServer will somehow still get that form data, minus the file data (basically: request.POST), and the file data gets shunted over to MyFileUploadServer? How does this work? Will MyMainDjangoServer still block during the upload to MyFileUploadServer?"
I know the GAE Blobstore, presumably S3 as well, handles this by requiring you to give it a success_url. In your case that would be the url on MyMainDjangoServer where your file receiving view on MyFileUploadServer would re-post the non-files form data to once the upload is complete.
Have a look at the create_upload_url method here: https://developers.google.com/appengine/docs/python/blobstore/functions
You need to recreate this functionality in some form (see below).
"How would I set up my settings.py for this scenario?"
You'd need to create your own filetransfers backend which would be a file with a prepare_upload function in it.
You can see the App Engine one here:
https://github.com/django-nonrel/djangoappengine/blob/develop/storage.py
The prepare_upload method just wraps the GAE create_upload_url method mentioned above.
So in your settings.py you'd have something like:
PREPARE_UPLOAD_BACKEND = 'myapp.filetransfers_backend.prepare_upload'
(i.e. the import path to your prepare_upload function)
For the rest you can start with the ones provided by filetransfers already:
SERVE_FILE_BACKEND = 'filetransfers.backends.url.serve_file'
# if you need it:
PUBLIC_DOWNLOAD_URL_BACKEND = 'filetransfers.backends.url.public_download_url'
These rely on the file_field.url being set (see Django docs) and since your files will be on a separate server you probably need to look into writing a custom storage backend for Django too. (the S3 and GAE cases assume you're using the custom Django storage backends from here)

Related

Simple file upload to Django via HTTP request

I'm really confused with the process of uploading a file (image or pdf in my case) to a Django app programmatically (via HTTP POST or PUT request). I've read the doc but I must admit it hasn't helped me get through. The most confusing part to me is the template (and form) : why do I need to specify a template and form in the view ?
I just want to send a file to a directory, and I'd like to know what exactly is needed in order to do so on the Django part as well as on the request part (content-type... ?)
I'd be really grateful to anyone able to show me some direction here..
You don't say what doc you're reading, so it's hard for us to tell what you mean. But if you're planning on doing a programmatic upload, you don't need a template, of course. You do however need some code that accepts the POST and processes the upload: you can do that with a form, or simply access the data in request.FILES and do what you want with it yourself.
Edit It's true that that page doesn't make any reference to uploading programmatically, because most people's use cases are uploading through the browser, via a form. But the page does explain how to handle uploaded files, which is the only bit that you need.

How do web apps receive/process uploaded images?

I can't seem to wrap my head around it. How do websites create forms for image uploads, store, and process it for display on their sites? My current situation is with django storing images.
You place a <form> on your page. It can have different inputs in it, like text, numbers, etc. There is also file input. Then the user submits the form and all data is sent to the webserver. The webserver, basically, receives the file as an array of bytes. You can read more about it here.
In case of Django you'll receive this data in request.FILES. You'll get name, size and other data of the file selected by user. You can then save it on disk or in your database. Read about Django-specific things here.
In general (in a very resumed/simplistic way):
A web page renders a form with an widget in it.
When you tell the browser what's the file you want to upload, it prepare an HTTP request to the webserver in which it stores the binary data of the file.
The request is sent to the web server through a TCP connection.
The server takes the request, see the file an decides what to do with it (depending on configuration files)
Later, the program handling the request, ask the global variables of the server (or something like that) what the heck it did with it files
The server give it the address where it was stored
The app do the rest of the job with the file uploaded :)
In Django:
Three first step are the same
The server ask django what to do with the file
Depending on settings.py the file is uploaded to memory or to a temporary file in disk
The application handles the file accessing it to request.FILES dictionary in your view
When you have it, you can pass it to your models, forms or do whatever you want.
If you want to go deeper with Django file uploads, this section of the documentation is really good. You can take a look at it.
Hope this helps!

More Blobstore upload woes with standard Django

I'm implementing an image upload feature for my Django app (plain Django 1.4 , NOT the non-rel version) running on Google App Engine. The uploaded image is wrapped in a Django model which allows the user to add attributes like a caption and search tags.
The upload is performed by creating a Blobstore upload url through the function call blobstore.create_upload_url(url). The function argument is the url to which the Bobstore redirects when the upload is complete. I want this to be the url of the default Django form handler that performs the save/update of the model that wraps the image so I don't have to duplicate default Django behaviour for form validation, error reporting and database update.
I tried supplying reverse('admin:module_images_add') to create_upload_url() but this doesn't work as it throws an [Errno 30] Read-only file system exception. I presume this originates from the default Django form handler again trying to upload the file the standard Django way but then hits the brick wall of Google App Engine not allowing access to the file system.
At the moment, the only way I can see to get this working without duplicating code is by strictly separating processes: one for defining an image model instance and the second for uploading the actual image. Not very intuitive.
See also this question and answer which I posted earlier.
Any suggestions on how to get this working using one form and reusing Django default form handlers?
EDIT:
I've been reading up on decorators (I'm relatively new to Python) and from what I read, decorators appear to able to modify the behaviour of existing Python code. Would it be possible to change the runtime behaviour of the existing form handler to solve the above using a decorator? I obviously have to (1) develop the decorator and (2) attach it to the default handler. I'm not sure if (2) is possible as it has to be done runtime. I cannot patch the Django code running on GAE...
Well, I finally managed to get this working. Here's what I did in case anyone runs into this as well:
(1) I removed the ImageFile attribute from my model. It ended up causing Django to try and do a file upload from the file system which is not allowed in GAE.
(2) I added a Blobstore key to my model which is basically the key to the GAE BlobStore blob and is required to be able to serve the image at a later stage. On a side note: this attribute has limited length using the GAE SDK but is considerably longer in GAE production. I ended up defining a TextField for it.
(3) Use storage.py with Daniel Roseman's adaption from this question and add the BlobstoreFileUploadHandler to the file handlers in your SETTINGS.PY. It will ensure that the Blobstore key is there in the request for you to save with your model.
(4) I created a custom admin form which contains an ImageField named "image". This is required as it allows you to pick a file. The ImageField is actually "virtual" as its only purpose on the form is to allow me to pick a file for uploading. This is crucial as per (1).
(5) I overwrote render_change_form() method of my ModelAdmin class which will prepare a Blobstore upload url. The upload url has two versions: one for adding new images and one saving changes to existing. Upload urls are passed to the template via the context object.
(6) I modified the change_form.html to include the Blobstore upload url from (5) as the form's action.
(7) I overwrote the save_model() method of my ModelAdmin:
def save_model(self, request, obj, form, change):
if request.FILES.has_key("blobkey"):
blob_key = request.FILES["blobkey"].blobstore_info._BlobInfo__key
obj.blobstore_key = blob_key
super(PhotoFeatureAdmin, self).save_model(request, obj, form, change)
This allows me to retrieve the blob key as set by the upload handler and set it as a property of my model.
For deletion of image models, I added a special function which is triggered by the delete signal of the model. This will keep the Blobstore in sync with the image models in the app.
That's it. The above allows to upload images to the blob store of GAE where each blob is neatly wrapped in a Django model object which admin users can maintain. The good thing is that there's no need to duplicate standard Django behaviour and the model object of the image can easily be extended with attributes in the future.
Final word: in my opinion the support for blobs in plain Django on GAE is currently very poor considering the above. It should be much easier to achieve this, without having to rely on Django non-rel code and a rather long list of modifications; alternatively Google should state something about this in their developer documents. Unless I missed something, this is undocumented territory.

Django: control access to "static" files

Ok, I know that serving media files through Django is a not recommended. However, I'm in a situation where I'd like to serve "static" files using fine-grained access control through Django models.
Example: I want to serve my movie library to myself over the web. I'm often travelling and I'd like to be able to view any of my movies wherever I am, provided I have internet access. So I rip my DVDs, upload them to my server and build this simple Django application coupled with some embeddable video player.
To avoid any legal repercussions, I'd like to ensure that only logged-on users with the proper permissions (i.e. myself and people living in the same household, which can, like me, access the real DVDs at their convenience), but denies it to other users (i.e. people who posted comments on my blog) and returns an HTTP 404.
Now, serving these files directly using Apache and mod_wsgi is rather troublesome because when an HTTP request for the media files (i.e. http://video.mywebsite.com/my-favorite-movie/) comes in, I need to validate against my user database that the person at the other end has the proper permissions.
Question: can I achieve this effect without serving the media files directly through a Django view? What are my options?
One thing I did think of is to write a simple script that takes a session ID and a video's slug and returns some boolean indicating if the user may (or may not) access the video file. Then, somehow request mod_wsgi to execute this script before accessing the requested URL and return an HTTP 404 if the script failed. However, I don't have a clue if this is even possible.
Edit: Posting this question clarified some of my ideas for search and I've come across mod_python's file wrapper extension. Does anyone have enough experience with that to validate that it is a viable solution?
Yes, you can hook into Django's authentication from Apache. See this how-to:
Authenticating against Django’s user database from Apache

Some basic questions about Django, Pyjamas and Clean URLs

I am farily new to the topic, but I am trying to combine both Django and Pyjamas. What would be the smart way to combine the two? I am not asking about communication, but rather about the logical part.
Should I just put all the Pyjamas generated JS in the base of the domain, say http://www.mysite.com/something and setup Django on a subdirectory, or even subdomain, so all the JSON calls will go for http://something.mysite.com/something ?
As far as I understand now in such combination theres not much point to create views in Django?
Is there some solution for clean urls in Pyjamas, or that should be solved on some other level? How? Is it a standard way to pass some arguments as GET parameteres in a clean url while calling a Pyjamas generated JS?
You should take a look at the good Django With Pyjamas Howto.
I've managed to get the following to work, but it's not ideal. Full disclosure: I haven't figured out how to use the django's template system to get stuff into the pyjamas UI elements, and I have not confirmed that this setup works with django's authentication system. The only thing I've confirmed is that this gets the pyjamas-generated page to show up. Here's what I did.
Put the main .html file generated by pyjamas in django's "templates" directory and serve it from your project the way you'd serve any other template.
Put everything else in django's "static" files directory.
Make the following changes to the main .html file generated by pyjamas: in the head section find the meta element with name="pygwt:module" and change the content="..." attribute to content="/static/..." where "/static/" is the static page URL path you've configured in django; in the body section find the script element with src="bootstrap.js" and replace the attribute with src="/static/bootstrap.js".
You need to make these edits manually each time you regenerate the files with pyjamas. There appears to be no way to tell pyjamas to use a specific URL prefix when generating together its output. Oh well, pyjamas' coolness makes up for a lot.
acid, I'm not sure this is as much an answer as you would hope but I've been looking for the same answers as you have.
As far as I can see the most practical way to do it is with an Apache server serving Pyjamas output and Django being used as simply a service API for JSONrpc calls and such.
On a side note I am starting to wonder if Django is even the best option for this considering using it simply for this feature is not utilizing most of it's functionality.
The issue so far as I have found with using Django to serve Pyjamas output as Django Views/Templates is that Pyjamas loads as such
Main html page loads "bootstrap.js" and depending on the browser used bootstrap.js will load the appropriate app page. Even if you appropriately setup the static file links using the Django templating language to reference and load "bootstrap.js", I can't seem to do the same for bootstrap.js referencing each individual app page.
This leaves me sad since I do so love the "cruftless URLS" feature of Django.