Simple file upload to Django via HTTP request - django

I'm really confused by the process of uploading a file (image or PDF in my case) to a Django app programmatically (via an HTTP POST or PUT request). I've read the docs, but I must admit they haven't helped me get through. The most confusing part to me is the template (and form): why do I need to specify a template and form in the view?
I just want to send a file to a directory, and I'd like to know exactly what is needed to do so on the Django side as well as on the request side (content type...?).
I'd be really grateful to anyone able to point me in the right direction here.

You don't say what doc you're reading, so it's hard for us to tell what you mean. But if you're planning on doing a programmatic upload, you don't need a template, of course. You do however need some code that accepts the POST and processes the upload: you can do that with a form, or simply access the data in request.FILES and do what you want with it yourself.
Edit: It's true that that page doesn't make any reference to uploading programmatically, because most people's use cases are uploading through the browser, via a form. But the page does explain how to handle uploaded files, which is the only bit that you need.
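As a rough sketch of the "do what you want with it yourself" route: the helper below writes an uploaded file to a directory chunk by chunk. It only assumes the object has a `.name` attribute and a `.chunks()` method, which the uploaded-file objects in request.FILES provide. The helper name and the destination directory are made up for illustration.

```python
import os


def save_upload(uploaded, dest_dir):
    """Write an uploaded file to dest_dir and return the path written.

    `uploaded` only needs a `.name` attribute and a `.chunks()` method,
    which Django's uploaded-file objects (the values in request.FILES) have.
    """
    os.makedirs(dest_dir, exist_ok=True)
    # Keep only the final path component so a client-supplied name such as
    # "../../etc/passwd" cannot escape dest_dir.
    safe_name = os.path.basename(uploaded.name.replace("\\", "/"))
    if safe_name in ("", ".", ".."):
        raise ValueError("upload has no usable file name")
    path = os.path.join(dest_dir, safe_name)
    with open(path, "wb") as out:
        for chunk in uploaded.chunks():  # avoids holding the whole file in memory
            out.write(chunk)
    return path
```

In a view this would be called on a POST with something like `save_upload(request.FILES["file"], upload_dir)`, where "file" is an assumed field name. No template is involved. On the client side the request has to be multipart/form-data with a part carrying that same name; with the requests library, `requests.post(url, files={"file": open(path, "rb")})` produces such a request.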

Related

Places where a website stores its data

I have just started web scraping in Python with Requests. This could be a broad question, so I will try to keep it as brief as possible.
I have come across situations where the entire page source can be downloaded with r.content (where r is the response object returned by requests' get call).
Sometimes part of the data is stored in JSON format, in files that can be found by closely observing the GET and POST calls the page makes.
However, I have also found websites where the entire content is in the DOM, but part of it is in neither the page source nor the JSON files.
I am wondering: in how many such places can a website store its data?
(Just the names; I am not asking how to get there.)
For this last type of website, I have observed almost every request made, but couldn't find where the data is.
So is there any other place besides the two mentioned above? Or are those the only two, meaning I am not observing the requests carefully enough?
You may answer in brief bullet points and I can take my study from there.
Thanks in advance.
Let's assume we are talking only about HTML data. A web server could serve you data in many other formats (JSON/XML, etc.).
Please note that what I describe here is a generalisation and, like most generalisations, it has exceptions that do not fit.
Broadly, we could divide the type of data displayed (for the end user) into two categories:
Pre render
Post render
Pre render
The entire HTML page is constructed on the server side and sent to the client. Here, the JS is concerned with user interaction, not with the structure of the data.
We are slowly moving away from this type of structure, but currently a large majority of web pages use it.
Web scraping is relatively easy here, as we can programmatically pull the HTML page and not bother with the JavaScript code that accompanies it.
A combination of requests and beautifulsoup should work in almost all such cases (assuming you can identify the general structure of the document).
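A minimal sketch of the pre-rendered case follows. To stay dependency-free it swaps in the standard library's html.parser for beautifulsoup, and it parses a hard-coded sample page instead of fetching a real one; the sample HTML and the class name are made up.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# Stand-in for the body of a server-rendered page, e.g. requests.get(url).text
SAMPLE_HTML = """
<html><body>
  <ul>
    <li><a href="/item/1">First item</a></li>
    <li><a href="/item/2">Second <b>item</b></a></li>
  </ul>
</body></html>
"""

collector = LinkCollector()
collector.feed(SAMPLE_HTML)
print(collector.links)
```

Because the data is already in the HTML the server sends, no JavaScript needs to run for this to work.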
Post render
Here the HTML page returned from the server is just a "skeleton", with placeholders for the actual data. The data is rendered by the accompanying JS code.
In such cases, if you fetch the source with, for example, requests, you will get an empty shell with no data in it.
If you inspect the calls made by a browser while rendering (Chrome's network tab, Firefox's inspect tool, or the more popular Firebug), you will most likely see AJAX requests that bring back the actual data from the server.
Depending on how the requests are made, you could hit that AJAX endpoint directly and get the data as JSON.
You could use the response.json() method to turn it into Python dicts.
In certain (rare) cases, there would not be an ajax call, but the HTML served from the server will still be a shell. The actual data is part of that file served, but stored as part of the JS code itself. This could be done for a variety of reasons, for example for dynamic data to be sent to static js files, or just to deter simple attempts of scraping the page.
One approach to scraping such pages would be to 'render' the page in a headless browser, which executes the JS code and returns HTML that can be parsed with a parser like beautifulsoup.
beautifulsoup can work with several underlying parsers, one of which is html5lib; note that a parser only helps once you have the rendered HTML, since it does not execute JS itself.
You could also look at selenium, which drives a real browser (mechanize emulates a browser but does not run JS).
Or you could try parsing the JS code yourself, which might be faster.
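For the last option, here is a small sketch of pulling data out of an inline script. It assumes the page assigns a literal that happens to be valid JSON to a known variable; the variable name `window.__DATA__` and the sample page are invented, and a JS object with unquoted keys would not parse this way.

```python
import json
import re

SAMPLE_HTML = """
<html><body>
<div id="app"></div>
<script>
  window.__DATA__ = {"products": [{"name": "Widget", "price": 9.5}], "page": 1};
  render(window.__DATA__);
</script>
</body></html>
"""


def extract_embedded_json(html, variable):
    """Return the JSON value assigned to `variable` in an inline script.

    Finds "variable = " and lets the JSON decoder read exactly one value
    starting there, so nested braces are handled without a fragile regex.
    """
    match = re.search(re.escape(variable) + r"\s*=\s*", html)
    if match is None:
        raise ValueError("variable %r not found" % variable)
    value, _end = json.JSONDecoder().raw_decode(html, match.end())
    return value


data = extract_embedded_json(SAMPLE_HTML, "window.__DATA__")
print(data["products"][0]["name"])
```

This avoids starting a browser at all, which is why it tends to be faster than rendering the page.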
Arriving at a conclusion as to what to use requires careful inspection of how the page is rendered on a browser. Even if you don't see an ajax request, the html that is served by the server need not be how the browser displays it.
A good way to start is by looking at the bare-html that is being served, by either downloading the page via curl or requests.get or simply rendering it in your browser with javascript disabled.
Good luck.

How to send a post request via excel to a RESTful Web Service without using XML?

Here's the deal: I have an Excel sheet that feeds a MySQL table. I already wrote a server-side procedure that receives the sheet, reads it and puts it into the database. Sadly, the sheet and the database table don't have the same structure, so I need a PHP object/script on the server side to transform it. I have an interface to upload the file (the Excel file), so the PHP program can read it...
...but my boss's job isn't to make my life easier, is it? NO! He says it is a lot of work to upload every Excel file through the web interface. So he asked me to add a button to the sheet that he can click after his "job" is done. That would replace the web interface.
But the system itself is an interface that will be sold one day (well, that's the plan!). So I can't just rule out the web interface.
WHAT I'M ASKING IS: Is there a way to send a file (the sheet itself) in a POST straight from the VBA macro, without using XML files, and to name each piece of data I'm sending, as in a form post?
So far, I've found some tutorials and even some SO posts that got me somewhere. But all of them were about XML, and I already have a method that receives an HTTP POST (from a form) and works. I'm aiming to reuse the same method. From my VBA script I'm already able to make the request (not a big deal) and post it. But the server-side script expects a POST coming from a form, so it looks up a field by name. I don't seem to be able to do that from a VBA post. =/
Here's the answer... the first two functions/methods define how to send a file to a web service. You only need the file path and the service URL. It answered even more than I expected. :D
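The functions referred to above are not reproduced here, so as an illustration only, this Python sketch shows what a "form post" with a named file field looks like on the wire. Whatever the client language, the body has to have this multipart/form-data shape for the server to find the file under a field name. The field name "sheet" and the file name are hypothetical.

```python
import uuid


def encode_multipart(field_name, filename, file_bytes, extra_fields=None):
    """Build a multipart/form-data body like the one a browser form sends.

    Returns (content_type_header_value, body_bytes).
    """
    boundary = uuid.uuid4().hex
    lines = []
    # Ordinary named fields, as typed into text inputs of a form.
    for name, value in (extra_fields or {}).items():
        lines += [
            b"--" + boundary.encode(),
            ('Content-Disposition: form-data; name="%s"' % name).encode(),
            b"",
            str(value).encode("utf-8"),
        ]
    # The file part: same header plus a filename and a content type.
    lines += [
        b"--" + boundary.encode(),
        ('Content-Disposition: form-data; name="%s"; filename="%s"'
         % (field_name, filename)).encode(),
        b"Content-Type: application/octet-stream",
        b"",
        file_bytes,
        b"--" + boundary.encode() + b"--",  # closing boundary
        b"",
    ]
    return "multipart/form-data; boundary=" + boundary, b"\r\n".join(lines)


content_type, body = encode_multipart(
    "sheet", "report.xls", b"hello sheet", {"user": "boss"})
print(content_type)
```

The returned string goes into the request's Content-Type header and the bytes into the request body. On the PHP side the file should then show up under `$_FILES['sheet']` and the other field under `$_POST['user']`, the same as with a browser form.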

Temporarily upload profile image in Django during registration?

In a Django application, during registration, I want the user to be able to see the profile image he/she selects, rather than just see a path, as happens when using plain Django forms (for an example of what I want, see Pinterest's registration form). I assume it should involve some AJAX upload, and the image should be stored somewhere temporarily, since the user might choose not to proceed with the registration even after the profile image has been uploaded, in which case the uploaded picture should be deleted.
I was wondering what the best way of handling this is. Any examples out there you can point to?
Thanks!!!
You are correct that an AJAX upload will be needed.
Whether the upload is temporary or permanent, your implementation will not change much. In both cases you will need to upload the image to a directory on your web server. In the temporary case, you can delete it after a short amount of time has passed.
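One possible way to do the "delete it after a short amount of time" part is a small cleanup function run periodically (for example from a scheduled job). This is only a sketch: the directory layout and the age threshold are assumptions, and it presumes that images belonging to completed registrations have been moved out of the temporary directory.

```python
import os
import time


def delete_stale_uploads(tmp_dir, max_age_seconds, now=None):
    """Delete regular files in tmp_dir last modified more than max_age_seconds ago.

    Returns the sorted list of file names that were removed.
    """
    now = time.time() if now is None else now
    with os.scandir(tmp_dir) as it:
        entries = list(it)  # snapshot first, then delete
    removed = []
    for entry in entries:
        if not entry.is_file(follow_symlinks=False):
            continue  # skip sub-directories and symlinks
        if now - entry.stat().st_mtime > max_age_seconds:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

Calling `delete_stale_uploads(tmp_dir, 3600)` would remove previews that were uploaded more than an hour ago and never claimed.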
Here is a Django AJAX uploader: https://github.com/GoodCloud/django-ajax-uploader
Option 1
You can use the HTML5 FileAPI to show a thumbnail of a user-selected image before they upload it.
Option 2
You can upload the file using AJAX and then send back a thumbnail for them to preview

Need help setting up django-filetransfers

My setup is: Django 1.3/Python 2.7.2/Win Server 2008 R2/IIS 7.5/MS SQL Server 2008 R2. I am developing an application whose main function is to analyze uploaded files and produce a report.
Reading over the documentation for django-filetransfers, I believe this is a solution to a problem I've been trying to solve for a while (i.e. form-based file uploads completely block all Django responses until the file-transfer finishes...horror for even moderate-sized files).
The documentation talks about piping uploads to S3 or Blobstore, and that might be what I end up doing eventually, but during development I thought maybe I could just set up my own "poor-man's S3" on a server that I control. This would basically just be another Django instance (or possibly a simple ASP.NET app) whose sole purpose is to receive uploaded files. This sounds like it should be possible with django-filetransfers and would solve the problem of Django responsiveness (???).
But I am missing some bits of understanding how this works in general, as well as some specifics. Maybe an example will help: let's say I have MyMainDjangoServer and MyFileUploadServer. MyMainDjangoServer will serve the views, including the upload form. MyFileUploadServer will "catch" the uploaded files. My questions/confusion are as follows:
My upload form will contain additional fields beyond just the file(s)...do I understand correctly that MyMainDjangoServer will somehow still get that form data, minus the file data (basically: request.POST), and the file data gets shunted over to MyFileUploadServer? How does this work? Will MyMainDjangoServer still block during the upload to MyFileUploadServer?
I assume that what I would need to do on MyFileUploadServer is have a view/URL that handles the form request and sucks out the request.FILES data. What else needs to happen? What happens to the rest of the form data?
How would I set up my settings.py for this scenario? The django-filetransfers examples seem to assume either S3 or GAE/Blobstore but maybe I am missing some basics.
Any advice/answers appreciated...this is a confusing and frustrating area of Django for me.
"MyMainDjangoServer will somehow still get that form data, minus the file data (basically: request.POST), and the file data gets shunted over to MyFileUploadServer? How does this work? Will MyMainDjangoServer still block during the upload to MyFileUploadServer?"
I know the GAE Blobstore (and presumably S3 as well) handles this by requiring you to give it a success_url. In your case that would be the URL on MyMainDjangoServer to which your file-receiving view on MyFileUploadServer re-posts the non-file form data once the upload is complete.
Have a look at the create_upload_url method here: https://developers.google.com/appengine/docs/python/blobstore/functions
You need to recreate this functionality in some form (see below).
"How would I set up my settings.py for this scenario?"
You'd need to create your own filetransfers backend which would be a file with a prepare_upload function in it.
You can see the App Engine one here:
https://github.com/django-nonrel/djangoappengine/blob/develop/storage.py
The prepare_upload method just wraps the GAE create_upload_url method mentioned above.
So in your settings.py you'd have something like:
PREPARE_UPLOAD_BACKEND = 'myapp.filetransfers_backend.prepare_upload'
(i.e. the import path to your prepare_upload function)
For the rest you can start with the ones provided by filetransfers already:
SERVE_FILE_BACKEND = 'filetransfers.backends.url.serve_file'
# if you need it:
PUBLIC_DOWNLOAD_URL_BACKEND = 'filetransfers.backends.url.public_download_url'
These rely on the file_field.url being set (see Django docs) and since your files will be on a separate server you probably need to look into writing a custom storage backend for Django too. (the S3 and GAE cases assume you're using the custom Django storage backends from here)
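To make the custom backend idea a bit more concrete, here is a hedged sketch of what `myapp/filetransfers_backend.py` might contain. The call signature and the `(upload_url, extra_post_data)` return shape are assumptions modeled on the App Engine backend linked above; the upload server address and the `success_url` query parameter are invented, and the receiving view on MyFileUploadServer would have to be written to honour that parameter. It is written for Python 3; on Python 2, `urlencode` lives in the `urllib` module instead.

```python
from urllib.parse import urlencode

# Hypothetical address of the upload view on MyFileUploadServer.
UPLOAD_SERVER_URL = "https://uploads.example.com/receive/"


def prepare_upload(request, url, **kwargs):
    """Return (upload_url, extra_post_data) for the upload form.

    `url` is the view on the main server that should receive the remaining
    form data once the file has been stored (the "success URL").
    """
    # The upload server is a different host, so it needs an absolute URL.
    success_url = request.build_absolute_uri(url)
    query = urlencode({"success_url": success_url})
    return UPLOAD_SERVER_URL + "?" + query, {}
```

The form's action would then point at the returned URL, so the browser sends the file straight to the upload server rather than through the main Django instance.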

Download a generated file and redirect in Django

I have a form with which users submit data to my application, and the response to submitting the form is a download with data depending on what they submitted. Since the submission affects the data in the database I want to redirect from this page to prevent the submission accidentally being made twice.
The only solution I've come across is to save the file on the server and redirect to a page which causes the file to download. However I don't really want to be keeping these files or having to manage them on the server.
Is there a way to download the file and then cause the page to redirect?
Consider also the case when the user's internet connection happens to break during the download. Should the user have a possibility to request the same download again in this case? Then you need to store either the generated file or all data needed to regenerate it anyway.
Presumably there are two parts to your code: one adding their data to your db, and a second generating their download. Could you just do the first part on form submit, and then redirect them (perhaps including some GET parameters) to a second page which reads the db and generates their download?
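The two-step flow suggested above can be sketched without any framework. In this illustration a plain dict stands in for the database, and the function names, the URL and the CSV format are all made up; the point is only that the download is regenerated from stored data, so no file has to be kept on the server.

```python
import csv
import io
import itertools

SUBMISSIONS = {}                 # stands in for the database table
_next_id = itertools.count(1)


def handle_submit(form_data):
    """Step 1 (POST): store the submission, then answer with a redirect URL."""
    submission_id = next(_next_id)
    SUBMISSIONS[submission_id] = dict(form_data)
    return "/download/?id=%d" % submission_id


def handle_download(submission_id):
    """Step 2 (GET): rebuild the file from stored data; nothing is kept on disk."""
    row = SUBMISSIONS[submission_id]
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    keys = sorted(row)
    writer.writerow(keys)
    writer.writerow([row[key] for key in keys])
    return buffer.getvalue()


location = handle_submit({"name": "Ada", "amount": "3"})
print(location)
print(handle_download(1))
```

In a Django view, step 1 would return a redirect to `location` and step 2 would return the generated content in a response with a `Content-Disposition: attachment` header. Reloading the second page repeats only the download, not the database write.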