Can we replace urlopen in this code with requests library? - concurrency

Can we replace urlopen in this concurrent-requests example with the requests library in Python 2.7?
import concurrent.futures
import urllib.request
URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']
# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()
# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))
Thanks!

Yes, you can.
Your code does a simple HTTP GET with a timeout, so the equivalent with requests is:
import requests
def load_url(url, timeout):
    r = requests.get(url, timeout=timeout)
    return r.content
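For reference, here is a minimal sketch of how that function might slot back into the thread-pool code from the question. The helper name fetch_all and its parameters are made up for illustration, and the raise_for_status() call is an addition: urlopen raises on 4xx/5xx responses, while requests only does so when asked. On Python 2.7, concurrent.futures is not in the standard library and would have to come from the futures backport.

```python
import concurrent.futures


def load_url(url, timeout):
    """Fetch one page with requests and return the body as bytes."""
    # Third-party dependency; imported here so fetch_all below can also be
    # used with any other fetch callable.
    import requests
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx, as urlopen would
    return response.content


def fetch_all(urls, fetch=load_url, timeout=60, max_workers=5):
    """Run fetch(url, timeout) for every URL in a thread pool.

    Returns a dict mapping each URL to either the downloaded bytes or the
    exception raised while fetching it (hypothetical helper).
    """
    results = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_url = {executor.submit(fetch, url, timeout): url for url in urls}
        for future in concurrent.futures.as_completed(future_to_url):
            url = future_to_url[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                results[url] = exc
    return results
```

Calling fetch_all(URLS) and checking each value with isinstance(value, Exception) gives the same per-URL success/failure report as the original loop.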

Related

Error sending a file using Django - file turns out empty

This is my views.py file:
from django.http import HttpResponse
def render(request):
    response = HttpResponse(content_type='application/pdf')
    response['Content-Disposition'] = 'attachment; filename="somefilename.pdf"'
    response['X-Sendfile'] = '/files/filename.pdf'
    # path relative to views.py
    return response
When I run the server and request
http://localhost:8080/somestring
I get an empty file called somefilename.pdf. I suspect that there is some crucial part missing in render.
The other parts of this app outside of views.py are correct to my understanding.
Here is the code that solved my problem:
from django.http import HttpResponse
from wsgiref.util import FileWrapper
def render(request):
    response = HttpResponse(FileWrapper(open('file.pdf', 'rb')), content_type='application/pdf')
    response['Content-Disposition'] = 'attachment; filename="somefilename.pdf"'
    return response
The manage.py runserver development server doesn't support X-Sendfile. In production, you need to enable X-Sendfile for your server (e.g. Apache).
You may find the django-sendfile package useful. It has a backend that you can use in development. However, it hasn't had a release in some time, and I found that I had to apply pull request 62 to get Python 3 support.

Stream file from remote url to Django view response

Is there any way to stream a file from a remote URL through a Django response (without downloading the file locally)?
# view.py
def file_recover(request, *args, **kwargs):
    file_url = "http://remote-file-storage.com/file/111"
    return StreamFileFromURLResponse(file_url)
We have file storage (files can be large, 1 GB and more). We can't share the download URL (there are security issues). File streaming can significantly increase download speed by forwarding the download stream to the Django response.
Django has a built-in StreamingHttpResponse class, which should be given an iterator that yields strings as content. In the example below I'm using the requests raw response content:
import requests
from django.http import StreamingHttpResponse
def stream_file(request, *args, **kwargs):
    r = requests.get("http://host.com/file.txt", stream=True)
    resp = StreamingHttpResponse(streaming_content=r.raw)
    # In case you want to force file download in a browser
    # resp['Content-Disposition'] = 'attachment; filename="saving-file-name.txt"'
    return resp
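Since StreamingHttpResponse just iterates over whatever it is given, it can help to control the chunk size explicitly instead of relying on how a file-like object such as r.raw happens to iterate (typically line by line, which gives irregular block sizes for binary data). The following is a minimal sketch of such a generator; iter_chunks is a hypothetical helper name, and an in-memory buffer stands in for the remote body.

```python
import io


def iter_chunks(fileobj, chunk_size=8192):
    """Yield successive blocks of at most chunk_size bytes from a file-like object."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Stand-in for a remote body; in the view this would be r.raw
source = io.BytesIO(b"x" * 20000)
chunks = list(iter_chunks(source, chunk_size=8192))
print([len(c) for c in chunks])  # [8192, 8192, 3616]
```

In the view, this would be passed as streaming_content=iter_chunks(r.raw). The requests library also offers r.iter_content(chunk_size=...) for the same purpose, which additionally handles content decoding.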

python post with auth

OK, here is my code. What I am trying to do is POST to a page that is password protected. Can you have a look at the code below and see where I am going wrong? I am getting
#!/usr/bin/python
import requests, sys, socket, json
from requests.auth import HTTPDigestAuth, HTTPBasicAuth
# 172.168.101.214
params = {'#Generate': 'New'}
response = requests.post('https://TerraceQ.internal.ca/views/Debug_Dump/1', auth=HTTPDigestAuth('user', 'fakepassword'), data=params)
print response.status_code
This worked:
ip = "172.168.99.99"
try:
    response = requests.get('https://' + ip + '/views', auth=HTTPDigestAuth('username', 'password'), verify=False)
except requests.exceptions.SSLError as e:
    # requests wraps SSL failures in its own exception class
    sys.exit('test')

Python urllib2 does not respect timeout

The following two lines of code hang forever:
import urllib2
urllib2.urlopen('https://www.5giay.vn/', timeout=5)
This is with python2.7, and I have no http_proxy or any other env variables set. Any other website works fine. I can also wget the site without any issue. What could be the issue?
If you run
import urllib2
url = 'https://www.5giay.vn/'
urllib2.urlopen(url, timeout=1.0)
wait a few seconds, and then press Ctrl-C to interrupt the program, you'll see
  File "/usr/lib/python2.7/ssl.py", line 260, in read
    return self._sslobj.read(len)
KeyboardInterrupt
This shows that the program is hanging on self._sslobj.read(len).
SSL timeouts raise socket.timeout.
You can control the delay before socket.timeout is raised by calling
socket.setdefaulttimeout(1.0).
For example,
import urllib2
import socket
socket.setdefaulttimeout(1.0)
url = 'https://www.5giay.vn/'
try:
    urllib2.urlopen(url, timeout=1.0)
except IOError as err:
    print('timeout')
% time script.py
timeout
real 0m3.629s
user 0m0.020s
sys 0m0.024s
Note that the requests module succeeds here although urllib2 did not:
import requests
r = requests.get('https://www.5giay.vn/')
How to enforce a timeout on the entire function call:
socket.setdefaulttimeout only affects how long Python waits before an exception is raised if the server has not issued a response.
Neither it nor urlopen(..., timeout=...) enforces a time limit on the entire function call.
To do that, you could use eventlet, as shown here.
If you don't want to install eventlet, you could use multiprocessing from the standard library, though this solution will not scale as well as an asynchronous solution such as the one eventlet provides.
import urllib2
import socket
import multiprocessing as mp
def timeout(t, cmd, *args, **kwds):
    pool = mp.Pool(processes=1)
    result = pool.apply_async(cmd, args=args, kwds=kwds)
    try:
        retval = result.get(timeout=t)
    except mp.TimeoutError as err:
        pool.terminate()
        pool.join()
        raise
    else:
        return retval
def open(url):
    response = urllib2.urlopen(url)
    print(response)
url = 'https://www.5giay.vn/'
try:
    timeout(5, open, url)
except mp.TimeoutError as err:
    print('timeout')
Running this will either succeed or time out after about 5 seconds of wall-clock time.
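The point that a socket timeout bounds each blocking operation rather than the whole call can be checked locally without touching the network. The sketch below assumes Python 3 and uses a made-up drip_server helper that sends one byte every 0.1 seconds; the client has a 1-second timeout, yet the download as a whole runs longer than that without any socket.timeout being raised.

```python
import socket
import threading
import time


def drip_server(listener, pieces, delay):
    """Accept one connection and send `pieces` single bytes, `delay` seconds apart."""
    conn, _ = listener.accept()
    try:
        for _ in range(pieces):
            time.sleep(delay)
            conn.sendall(b".")
    finally:
        conn.close()


# Local server on an OS-assigned port
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]
server = threading.Thread(target=drip_server, args=(listener, 12, 0.1), daemon=True)
server.start()

# Each recv() may block for up to 1 second, but the clock restarts on every call
client = socket.create_connection(("127.0.0.1", port), timeout=1.0)
start = time.monotonic()
received = b""
while True:
    chunk = client.recv(1024)
    if not chunk:  # server closed the connection
        break
    received += chunk
elapsed = time.monotonic() - start

client.close()
server.join()
listener.close()
print("received %d bytes in %.1f s without a timeout" % (len(received), elapsed))
```

A server that keeps trickling data this way can hold a urlopen call open indefinitely, which is why a hard limit needs something outside the socket layer, such as the multiprocessing wrapper above.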

Exporting data from App Engine datastore as a Google Drive spreadsheet

I have an App Engine application in which I record my monthly expenses along with comments or reasons. I would like to export this data to a Google Drive spreadsheet. I use the Django framework.
I have gone through the tutorials provided by Google here.
But they are implemented using webapp2 and jinja. Moreover, the doc for Implementing Using Django seems obsolete to me, since I do not use the Django ORM.
Below is the code sample I use to upload. I apologize if what I paste below is rubbish. Please help.
from django.utils.datastructures import SortedDict
import os
from apiclient.discovery import build
from apiclient.http import MediaFileUpload
from oauth2client.appengine import OAuth2DecoratorFromClientSecrets
decorator = OAuth2DecoratorFromClientSecrets(os.path.join(os.path.dirname(__file__), 'clientSecrets.json'), 'https://www.googleapis.com/auth/drive')
drive_service = build('drive', 'v2')
class Exporter(object):
    serializedObjects = []
    mime_type = 'text/plain'
    fileToExport = None
    request = None
    def __init__(self, serializedObjects, request):
        self.serializedObjects = serializedObjects
        self.request = request
    def createCSV(self):
        import csv
        import StringIO
        stdout = StringIO.StringIO()
        writer = csv.writer(stdout)
        for obj in self.serializedObjects:
            for value in obj.values():
                writer.writerow([value])
        # I will get the csv produced from my datastore objects here.
        # I would like to upload this into a Google Spreadsheet.
        # The child class ExportToSpreadSheet tries to do this.
        return stdout.getvalue()
class ExportToSpreadSheet(Exporter):
    def __init__(self, *args, **kwargs):
        super(ExportToSpreadSheet, self).__init__(*args, **kwargs)
        self.mime_type = 'application/vnd.google-apps.spreadsheet'
    def create(self):
        import datetime
        valueToDrive = self.createCSV()
        media_body = MediaFileUpload(valueToDrive, mimetype=self.mime_type, resumable=True)
        body = {
            'title' : 'MyExpense_%s' % datetime.datetime.now().strftime('%d_%b_%Y_%H_%M_%S'),
            'description' : '',
            'mimeType' : self.mime_type
        }
        self.fileToExport = drive_service.files().insert(body=body, media_body=media_body, convert=True)
        return self.fileToExport
    @decorator.oauth_aware
    def upload(self):
        if decorator.has_credentials():
            self.create()
            self.fileToExport.execute(decorator.http())
            return self.fileToExport
        raise Exception('user does not have the credentials to upload to google drive.')
@decorator.oauth_aware only works on webapp.RequestHandler subclasses. I say this because I got the following error when I ran the code:
INFO 2013-09-19 11:28:04,550 discovery.py:190] URL being requested: https://www.googleapis.com/discovery/v1/apis/drive/v2/rest?userIp=%3A%3A1
ERROR 2013-09-19 11:28:05,670 main.py:13] Exception in request:
Traceback (most recent call last):
  File "/home/dev5/divya/jk/MyApp/ItsMyTuition/SDK/google_appengine/lib/django-1.2/django/core/handlers/base.py", line 100, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "/home/dev5/divya/jk/MyApp/ItsMyTuition/ItsMyTuition/src/tuition/json/ajaxHandler.py", line 27, in mainHandler
    responseValues = funtionToCall(*args)
  File "/home/dev5/divya/jk/MyApp/ItsMyTuition/ItsMyTuition/src/tuition/json/ajaxHandler.py", line 69, in export
    uploadedFile = exporterInstance.upload()
  File "/home/dev5/divya/jk/MyApp/ItsMyTuition/ItsMyTuition/src/oauth2client/appengine.py", line 770, in setup_oauth
    self._create_flow(request_handler)
  File "/home/dev5/divya/jk/MyApp/ItsMyTuition/ItsMyTuition/src/oauth2client/appengine.py", line 734, in _create_flow
    redirect_uri = request_handler.request.relative_url(
AttributeError: 'ExportToSpreadSheet' object has no attribute 'request'
INFO 2013-09-19 11:28:05,777 module.py:593] default: "POST /ajaxCall/export HTTP/1.1" 200 964
Since I am using the Django framework, I cannot get a request handler as they expect.
How can I integrate or do it in my scenario? I would very much appreciate any code samples or relevant links I may have missed.
Moreover, the whole thing happens in an ajax call.
Thanks in advance.
Use mimeType=text/csv and, during the upload, request a conversion from CSV to Spreadsheets:
drive_service.files().insert(body=body, media_body=media_body, convert=True)
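A rough sketch of what the media_body side could look like under that advice, assuming Python 3 and that google-api-python-client is installed. The helper names rows_to_csv and csv_media are made up for illustration. Note that MediaFileUpload expects a path to a file on disk, so CSV text held in memory is wrapped with MediaIoBaseUpload here instead.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize an iterable of rows (each a sequence of values) to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()


def csv_media(csv_text):
    """Wrap in-memory CSV text in an upload object for the Drive client.

    Assumption: the apiclient package (google-api-python-client) is available.
    """
    from apiclient.http import MediaIoBaseUpload  # third-party
    return MediaIoBaseUpload(io.BytesIO(csv_text.encode("utf-8")),
                             mimetype="text/csv", resumable=True)


# Example rows are illustrative only
csv_text = rows_to_csv([["date", "amount"], ["2013-09-19", 12.5]])
print(csv_text)
```

The result of csv_media(csv_text) would then be passed as media_body in the insert call above, with 'mimeType': 'text/csv' in body.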