Migrating Django app to use online media storage - django

I have a Django app running on a server. Currently user uploads are stored on the server filesystem.
I already know how to set up S3 storage. What is the best way to migrate existing uploads to S3 without breaking the API and having existing uploads still available?
These files are served to the front end in two ways:
Directly through a /media/path/to/upload endpoint:
/media/<path> django.views.static.serve
Through a viewset:
/api/v1/users/<user_pk>/items/<item_pk>/attachments/<pk>/ project.app.views.ItemAttachmentsViewSet
Does the following make sense:
Change the storage backend
Go through all model objects
Save each model object to get the files uploaded
Have /media/path go to a new view that will serve the files similar to how ItemAttachmentsViewSet does it.
? Or is there a better way?

The procedure outlined in the question was what I ended up doing, with the exception of step 4 which turned out to be unnecessary.

Related

Where should I store user uploaded pics and files

I am working on a django app to store user pics and photos.
What is the optimal approach to store individual user media.
File Sizes are no more than 5MB.
The data is persistent.
The approach i have in mind is:
On form data submission, Upload it to an FTP server using django-storages.
Store the url and fetch it via http later for user.
How to save upload files to another server
I have seen the answers and I don't know what type of queue needs to be used.
you'd usually save the file locally and then latter upload it to some cloud service asynchronously, preferably using something like django-celery
see this answer

django internal file sharing with privacy

I am trying to write a job board / application system and I need the ability for clients to upload a CV and then share it with employers but I can't figure out the best way to do this. The CV needs to be kept private except for who it is shared with and there needs to be the ability for clients to update the cv after submitting it to an employer.
Is there a django app that does this already, or how would I go about setting up the privacy, file sharing etc so that the files can be copied and still private to just those shared with?
Use Apache's x-sendfile, for an example see: Having Django serve downloadable files
Store the files in a private folder. Django authorizes the request and let Apache serve the file using the x-sendfile header.
Use S3, and django-storages.
Upload the CV to S3, with the file set as private.
Create a view which will fetch a given CV from the S3 bucket, producing an "expiring URL", or that will just fetch the raw data from S3 and pass it through to the user through a view.
The file's privacy is completely controlled this way.
You could also do this by storing the uploaded file outside of your projects STATICs directory (which is assumed to be publicly accessible), and doing step 3 for that.
Or, if you want to make a DBA's head explode, store the CV as a BLOB in the database and use a view in the same way.

Restricting access to static files in Django/Nginx

I am building a system that allows users to generate a documents and then download them. The documents are PDFs (not that it matters for the sake of this question) and when they are generated I store them on my local file system that the web server is running on with uuid file names
c7d43358-7532-4812-b828-b10b26694f0f.pdf
but I know "security through obscurity" is not the right solution ...
I want to restrict access to they files on a per account basis if possible. One thing I think I could do is upload them to S3 and provide a signed URL, but I want to avoid that for now if possible.
I am using Nginx/Django/Gunicorn/EC2/S3
What are some other solutions?
If you are serving small files, you can indeed use Django to serve them directly, writing the file into the HttpResponse object.
If you're serving large files however, you might want to leave that task to your webserver, you can use the X-Accel-Redirect header on Nginx (and X-Sendfile for Apache & Lighttpd) to have your webserver serve the file for you.
You can find more information about the header itself in Nginx's documentation here, and you could find some inspiration as to how to use that in Django here.
Once you're done sending files through Django views, enforcing user authentication should be pretty straightfoward using Django's auth framework.
How about enforcing user==owner at the view level, preventing access to the files, storing them as FileFields, and only retrieving the file if that condition is met.
e.g. You could use the #login_required decorator on the view to allow access only if logged in. This could be refined using request.user to check against the owner of the file. The User Auth section of the Django documentation is likely to be helpful here.
The other option, as you mention is via S3 itself, generating urls within Django which have a querystring allowing an authenticated user access to download a particular s3 object with a time limit. Details on that can be found at the s3 documentation. A similar question has been asked before here on SO.
I've used django-private-files with great success, it enforces protection at the view level and uses differente backends to do the actual file transfer.

Need help setting up django-filetransfers

My setup is: Django 1.3/Python 2.7.2/Win Server 2008 R2/IIS 7.5/MS SQL Server 2008 R2. I am developing an application whose main function is to analyze uploaded files and produce a report.
Reading over the documentation for django-filetransfers, I believe this is a solution to a problem I've been trying to solve for a while (i.e. form-based file uploads completely block all Django responses until the file-transfer finishes...horror for even moderate-sized files).
The documentation talks about piping uploads to S3 or Blobstore, and that might be what I end up doing eventually, but during development I thought maybe I could just set up my own "poor-man's S3" on a server that I control. This would basically just be another Django instance (or possibly a simple ASP.NET app) whose sole purpose is to receive uploaded files. This sounds like it should be possible with django-filetransfers and would solve the problem of Django responsiveness (???).
But I am missing some bits of understanding how this works in general, as well as some specifics. Maybe an example will help: let's say I have MyMainDjangoServer and MyFileUploadServer. MyMainDjangoServer will serve the views, including the upload form. MyFileUploadServer will "catch" the uploaded files. My questions/confusion are as follows:
My upload form will contain additional fields beyond just the file(s)...do I understand correctly that MyMainDjangoServer will somehow still get that form data, minus the file data (basically: request.POST), and the file data gets shunted over to MyFileUploadServer? How does this work? Will MyMainDjangoServer still block during the upload to MyFileUploadServer?
I assume that what I would need to do on MyFileUploadServer is have a view/URL that handles the form request and sucks out the request.FILES data. What else needs to happen? What happens to the rest of the form data?
How would I set up my settings.py for this scenario? The django-filetransfers examples seem to assume either S3 or GAE/Blobstore but maybe I am missing some basics.
Any advice/answers appreciated...this is a confusing and frustrating area of Django for me.
"MyMainDjangoServer will somehow still get that form data, minus the file data (basically: request.POST), and the file data gets shunted over to MyFileUploadServer? How does this work? Will MyMainDjangoServer still block during the upload to MyFileUploadServer?"
I know the GAE Blobstore, presumably S3 as well, handles this by requiring you to give it a success_url. In your case that would be the url on MyMainDjangoServer where your file receiving view on MyFileUploadServer would re-post the non-files form data to once the upload is complete.
Have a look at the create_upload_url method here: https://developers.google.com/appengine/docs/python/blobstore/functions
You need to recreate this functionality in some form (see below).
"How would I set up my settings.py for this scenario?"
You'd need to create your own filetransfers backend which would be a file with a prepare_upload function in it.
You can see the App Engine one here:
https://github.com/django-nonrel/djangoappengine/blob/develop/storage.py
The prepare_upload method just wraps the GAE create_upload_url method mentioned above.
So in your settings.py you'd have something like:
PREPARE_UPLOAD_BACKEND = 'myapp.filetransfers_backend.prepare_upload'
(i.e. the import path to your prepare_upload function)
For the rest you can start with the ones provided by filetransfers already:
SERVE_FILE_BACKEND = 'filetransfers.backends.url.serve_file'
# if you need it:
PUBLIC_DOWNLOAD_URL_BACKEND = 'filetransfers.backends.url.public_download_url'
These rely on the file_field.url being set (see Django docs) and since your files will be on a separate server you probably need to look into writing a custom storage backend for Django too. (the S3 and GAE cases assume you're using the custom Django storage backends from here)

Django: control access to "static" files

Ok, I know that serving media files through Django is a not recommended. However, I'm in a situation where I'd like to serve "static" files using fine-grained access control through Django models.
Example: I want to serve my movie library to myself over the web. I'm often travelling and I'd like to be able to view any of my movies wherever I am, provided I have internet access. So I rip my DVDs, upload them to my server and build this simple Django application coupled with some embeddable video player.
To avoid any legal repercussions, I'd like to ensure that only logged-on users with the proper permissions (i.e. myself and people living in the same household, which can, like me, access the real DVDs at their convenience), but denies it to other users (i.e. people who posted comments on my blog) and returns an HTTP 404.
Now, serving these files directly using Apache and mod_wsgi is rather troublesome because when an HTTP request for the media files (i.e. http://video.mywebsite.com/my-favorite-movie/) comes in, I need to validate against my user database that the person at the other end has the proper permissions.
Question: can I achieve this effect without serving the media files directly through a Django view? What are my options?
One thing I did think of is to write a simple script that takes a session ID and a video's slug and returns some boolean indicating if the user may (or may not) access the video file. Then, somehow request mod_wsgi to execute this script before accessing the requested URL and return an HTTP 404 if the script failed. However, I don't have a clue if this is even possible.
Edit: Posting this question clarified some of my ideas for search and I've come across mod_python's file wrapper extension. Does anyone have enough experience with that to validate that it is a viable solution?
Yes, you can hook into Django's authentication from Apache. See this how-to:
Authenticating against Django’s user database from Apache