AWS broken pipe error when uploading files bigger than 1 MB - django

I am a Django newbie, and I inherited a Django back end with little documentation. I am making a request to said server, which is hosted on AWS. To store the files in the request we use S3.
I have found nothing in the Django code that limits the size of file uploads, and I suspect it may be AWS closing the connection because of the file size.
This is the code I use, and below it the error I get whenever the total size of the files is over 1 MB:
import requests

# video and image are base64-encoded strings, built elsewhere
json_dict = {'key_1': 'value_1', 'video': video, 'image': image}
requests.post('https://api.test.whatever.io/v1/register', json=json_dict)
Here video is a video file ('.mov', '.avi', '.mp4', etc.) with base64 encoding, and image is an image file ('.jpg', '.png') with base64 encoding.
And this is the trace I get, ONLY when the total size is over 1 MB:
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/util/ssl_.py:132: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 110, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 56, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 488, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 609, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 473, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', error(32, 'Broken pipe'))
As mentioned previously, I have not found a file-size limit anywhere in the Django code. Any hints on where I should be looking?
I also did not find anything in the AWS S3 policy.

Are you using Nginx to reverse-proxy your HTTP requests? If yes, check this link.
Also check the value set in your Django settings for the upload handlers:
FILE_UPLOAD_MAX_MEMORY_SIZE
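For reference, a minimal sketch of the relevant settings in settings.py; 2621440 bytes (2.5 MB) is Django's documented default, and DATA_UPLOAD_MAX_MEMORY_SIZE only exists from Django 1.10 onward:
# settings.py -- upload-related limits (values shown are illustrative)

# Threshold above which Django streams an uploaded file to a temporary
# file on disk instead of holding it in memory; this is a spill
# threshold, not a hard cap.
FILE_UPLOAD_MAX_MEMORY_SIZE = 2621440  # 2.5 MB, the Django default

# Hard cap on the request body size (Django 1.10+); exceeding it
# raises a SuspiciousOperation error.
DATA_UPLOAD_MAX_MEMORY_SIZE = 5242880  # e.g. 5 MB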

In the end it was the Nginx configuration. Changing the client_max_body_size directive in the nginx.conf file from 1M to 2M did the trick.
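For anyone hitting the same wall, the change looks roughly like this (a sketch; whether the directive belongs in the http block of nginx.conf or in a specific server block depends on your setup):
# nginx.conf -- illustrative sketch
http {
    # Default is 1m; larger request bodies are rejected by Nginx,
    # which the uploading client can see as a broken pipe.
    client_max_body_size 2m;
}
Reload Nginx afterwards (e.g. nginx -s reload) so the new limit takes effect.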

Related

elasticsearch exception SerializationError

From a Python script I am sending data to an Elasticsearch server.
This is how I connect to ES:
es = Elasticsearch('localhost:9200', use_ssl=False, verify_certs=True)
and by using the below code I am able to send all data to my local ES server:
es.index(index='alertnagios', doc_type='nagios', body=jsonvalue)
But when I try to send data to the cloud ES server, the script executes fine and indexes a few documents; after indexing a few documents I get the following error:
Traceback (most recent call last):
File "scriptfile.py", line 78, in <module>
es.index(index='test', doc_type='test123', body=jsonvalue)
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/utils.py", line 73, in _wrapped
return func(*args, params=params, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/__init__.py", line 298, in index
_make_path(index, doc_type, id), params=params, body=body)
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/transport.py", line 342, in perform_request
data = self.deserializer.loads(data, headers.get('content-type'))
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/serializer.py", line 76, in loads
return deserializer.loads(s)
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/serializer.py", line 40, in loads
raise SerializationError(s, e)
elasticsearch.exceptions.SerializationError: (u'order=0></iframe>', JSONDecodeError('No JSON object could be decoded: line 1 column 0 (char 0)',))
The same script works fine when I send data to my localhost ES server; I don't know why it does not work when I send data to the cloud instance.
Please help me.
The problem was resolved by using the bulk indexing method. When indexing to a local server it does not matter much if we index documents one after the other, but when indexing to a cloud instance we have to use bulk indexing to avoid memory and connection issues.
With bulk indexing, all the documents are sent to Elasticsearch in one request, so there is no need to open a connection again and again, and it does not take much time.
Here is my code:
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch('localhost:9200')  # or your cloud ES endpoint

# One bulk action per document to index.
jsonobject = {
    '_index': 'index',
    '_type': 'index123',
    '_source': jsonData,  # the document body built earlier
}
actions = [jsonobject]
helpers.bulk(es, actions, chunk_size=1000, request_timeout=200)
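To get the full benefit with many documents, you would feed helpers.bulk many actions instead of one. A minimal sketch, assuming documents is an iterable of dicts built elsewhere (the index and type names echo the question; the sample data is illustrative):
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch('localhost:9200')  # or your cloud ES endpoint

documents = [{'field': 'value_%d' % i} for i in range(5000)]  # sample data

def generate_actions(docs):
    # Yield one bulk action per document, so the whole set never has
    # to sit in memory as a single list.
    for doc in docs:
        yield {
            '_index': 'alertnagios',
            '_type': 'nagios',
            '_source': doc,
        }

# helpers.bulk consumes the generator and sends the documents in
# chunks of 1000, reusing the same connection for every chunk.
helpers.bulk(es, generate_actions(documents), chunk_size=1000, request_timeout=200)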

Flask-JWT generates error when Debug=False

I am playing around with Flask. I have created an API using Flask-Restful and Flask-JWT. When Debug=True in Flask and I do not send the Authorization header, I get the proper error response from Flask-JWT. However, when Debug=False, the response returned is Internal Server Error, with this stack trace:
[2017-01-19 19:43:10,753] ERROR in app: Exception on /api_0_1/deals [GET]
Traceback (most recent call last):
File "C:\Users\ARFATS~1\Desktop\Dealflow\venv\lib\site-packages\flask\app.py", line 1612, in full_dispatch_request
rv = self.dispatch_request()
File "C:\Users\ARFATS~1\Desktop\Dealflow\venv\lib\site-packages\flask\app.py", line 1598, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "C:\Users\ARFATS~1\Desktop\Dealflow\venv\lib\site-packages\flask_restful\__init__.py", line 477, in wrapper
resp = resource(*args, **kwargs)
File "C:\Users\ARFATS~1\Desktop\Dealflow\venv\lib\site-packages\flask_jwt\__init__.py", line 176, in decorator
_jwt_required(realm or current_app.config['JWT_DEFAULT_REALM'])
File "C:\Users\ARFATS~1\Desktop\Dealflow\venv\lib\site-packages\flask_jwt\__init__.py", line 155, in _jwt_required
headers={'WWW-Authenticate': 'JWT realm="%s"' % realm})
JWTError: Authorization Required. Request does not contain an access token
I would like Flask-JWT to return the same response it gives when Debug=True. However, I cannot use debug on production servers. One way is to use my own jwt_required decorator. Is there any other way? Also, I would be happy to know what I am missing, if anything. Thanks.
You will need to add this setting to your flask app:
app.config['PROPAGATE_EXCEPTIONS'] = True
When debug is true, PROPAGATE_EXCEPTIONS is also set to true by default.
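A minimal sketch of where this goes (the app setup is illustrative):
from flask import Flask

app = Flask(__name__)
app.config['DEBUG'] = False  # production-style configuration

# With debug off, exceptions raised inside flask-restful resources are
# swallowed and turned into a generic 500. Propagating them lets the
# JWTError reach Flask-JWT's error handler, which should return the
# same JSON error response you see with Debug=True.
app.config['PROPAGATE_EXCEPTIONS'] = True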
Perhaps consider checking out flask-jwt-extended instead (https://github.com/vimalloc/flask-jwt-extended); it takes care of PROPAGATE_EXCEPTIONS for you. It aims to replace the abandoned flask-jwt library and adds some conveniences when working with JWTs (such as refresh tokens, easily adding custom data to the JWTs, fresh vs non-fresh tokens, and more). Full disclosure: I'm the author of that extension.
Cheers.

SSL certificates download

I am attempting to use the requests package from Python to access this site: https://egov.uscis.gov/casestatus/landing.do
When I ran this command:
requests.get('https://egov.uscis.gov/casestatus/landing.do')
I got the usual SSL error you get when certificate verification fails.
Reading through Stack Overflow, I adopted one of the suggested solutions: download the certificate (.crt), use openssl to convert it to a .pem file, and then copy the contents of that .pem file to the end of cacert.pem. However, this did not work:
>>> requests.get('https://egov.uscis.gov/casestatus/landing.do')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\Sandra\Anaconda\lib\site-packages\requests\api.py", line 69, in get
return request('get', url, params=params, **kwargs)
File "C:\Users\Sandra\Anaconda\lib\site-packages\requests\api.py", line 50, in request
response = session.request(method=method, url=url, **kwargs)
File "C:\Users\Sandra\Anaconda\lib\site-packages\requests\sessions.py", line 465, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\Sandra\Anaconda\lib\site-packages\requests\sessions.py", line 573, in send
r = adapter.send(request, **kwargs)
File "C:\Users\Sandra\Anaconda\lib\site-packages\requests\adapters.py", line 431, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)
Any pointers as to how I can overcome this without resorting to verify=False?
Also, is there any difference between downloading the file via https://superuser.com/a/97203 and via https://superuser.com/a/176721?
Since I have no issue with requests.get('https://www.google.com'), do other websites place restrictions on the certificate you download?
egov.uscis.gov does not provide a complete chain in its SSL handshake.
You'll need to employ a workaround similar to what's suggested here until the site administrator fixes the certificate chain issue. The intermediate certificate in your case can be obtained from https://ssl-tools.net/certificates/yuox7i-symantec-class-3-secure-server-ca
There are three ways to set up the CA cert bundle:
1. Install certifi ($ pip install certifi) and point requests at its bundle:
>>> import certifi
>>> requests.get(url, verify=certifi.where())
2. Pass the path to a bundle file of your own:
>>> requests.get(url, verify='/path/to/cert_bundle_file')
3. Set the REQUESTS_CA_BUNDLE environment variable:
>>> import os
>>> os.environ['REQUESTS_CA_BUNDLE'] = '/path/to/cert_bundle_file'
>>> requests.get(url)
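If you want to keep verification on while the site's chain is incomplete, one workaround is to build a local bundle that appends the missing intermediate certificate to certifi's defaults. A sketch, assuming you have saved the intermediate from the ssl-tools.net link above as intermediate.pem (the file names are illustrative):
import certifi
import requests

# Concatenate certifi's trusted roots with the manually downloaded
# intermediate certificate into a one-off bundle file.
with open(certifi.where(), 'rb') as f:
    bundle = f.read()
with open('intermediate.pem', 'rb') as f:
    bundle += f.read()
with open('combined_bundle.pem', 'wb') as f:
    f.write(bundle)

# Verification should now succeed without resorting to verify=False.
response = requests.get('https://egov.uscis.gov/casestatus/landing.do',
                        verify='combined_bundle.pem')
print(response.status_code)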

python-requests hanging when downloading a mass amount of files

I am trying to use the python-requests package to download a massive number of files (10k+) from the web, ranging in size from a few KB up to 100 MB each.
My script runs through fine for maybe 3000 files, but then it suddenly hangs.
I Ctrl-C it and see it stuck at:
r = requests.get(url, headers=headers, stream=True)
File "/Library/Python/2.7/site-packages/requests/api.py", line 55, in get
return request('get', url, **kwargs)
File "/Library/Python/2.7/site-packages/requests/api.py", line 44, in request
return session.request(method=method, url=url, **kwargs)
File "/Library/Python/2.7/site-packages/requests/sessions.py", line 456, in request
resp = self.send(prep, **send_kwargs)
File "/Library/Python/2.7/site-packages/requests/sessions.py", line 559, in send
r = adapter.send(request, **kwargs)
File "/Library/Python/2.7/site-packages/requests/adapters.py", line 327, in send
timeout=timeout
File "/Library/Python/2.7/site-packages/requests/packages/urllib3/connectionpool.py", line 493, in urlopen
body=body, headers=headers)
File "/Library/Python/2.7/site-packages/requests/packages/urllib3/connectionpool.py", line 319, in _make_request
httplib_response = conn.getresponse(buffering=True)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1045, in getresponse
response.begin()
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 409, in begin
version, status, reason = self._read_status()
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 365, in _read_status
line = self.fp.readline(_MAXLINE + 1)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 476, in readline
data = self._sock.recv(self._rbufsize)
Here is my Python code that does the download:
basedir = os.path.dirname(filepath)
if not os.path.exists(basedir):
    os.makedirs(basedir)
r = requests.get(url, headers=headers, stream=True)
with open(filepath, 'wb') as f:  # 'wb': iter_content yields bytes
    for chunk in r.iter_content(1024):
        if chunk:
            f.write(chunk)
            f.flush()
I am not sure what went wrong; if anyone has a clue, please share some insights.
Thanks.
This is not a duplicate of the question that @alfasin linked in their comment. Judging by the (limited) traceback you posted, the request itself is hanging (the first line shows it was executing r = requests.get(url, headers=headers, stream=True)).
What you should do is set a timeout and catch the exception that is raised when the request times out. Once you have the URL, try it in a browser or with curl to ensure it responds properly; otherwise remove it from your list of URLs to request. If you find the misbehaving URL, please update your question with it.
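A minimal sketch of that timeout-and-skip approach (the timeout values and helper name are illustrative):
import requests

def download(url, filepath, headers=None):
    # Download one file, skipping URLs that stall or error out.
    try:
        # (connect timeout, read timeout) in seconds; without a timeout,
        # requests will happily wait forever on a dead server.
        r = requests.get(url, headers=headers, stream=True, timeout=(5, 30))
        r.raise_for_status()
    except requests.exceptions.Timeout:
        print('Timed out, skipping: %s' % url)
        return False
    except requests.exceptions.RequestException as e:
        print('Failed (%s), skipping: %s' % (e, url))
        return False
    with open(filepath, 'wb') as f:
        for chunk in r.iter_content(1024):
            if chunk:
                f.write(chunk)
    return True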
I faced a similar situation, and it seems like a bug in the requests package was causing the issue. Upgrading to requests 2.10.0 fixed it for me.
For your reference, the changelog for requests 2.10.0 shows that the embedded urllib3 was updated to version 1.15.1.
And the release history for urllib3 shows that version 1.15.1 included fixes for:
Chunked transfer encoding when requesting with chunked=True. (Issue #790)
Fixed AppEngine handling of transfer-encoding header and bug in Timeout defaults checking. (Issue #763)

Django-allauth - Error Accessing FB User Profile - Max Retries Exceeded

I'm trying to finish the setup of django-allauth for my site (in development).
Using Django==1.6.5 and django-allauth==0.17.0.
After following the documentation, I have been able to get the FB dialog. When I click OK, it hangs on localhost:8000/accounts/facebook/login/token/ for about 2 minutes before returning with an error. The console shows:
Error accessing FB user profile
Traceback (most recent call last):
File "/home/amir/claudius/lib/python2.7/site-packages/allauth/socialaccount/providers/facebook/views.py", line 73, in login_by_token
login = fb_complete_login(request, app, token)
File "/home/amir/claudius/lib/python2.7/site-packages/allauth/socialaccount/providers/facebook/views.py", line 26, in fb_complete_login
params={'access_token': token.token})
File "/home/amir/claudius/lib/python2.7/site-packages/requests/api.py", line 55, in get
return request('get', url, **kwargs)
File "/home/amir/claudius/lib/python2.7/site-packages/requests/api.py", line 44, in request
return session.request(method=method, url=url, **kwargs)
File "/home/amir/claudius/lib/python2.7/site-packages/requests/sessions.py", line 456, in request
resp = self.send(prep, **send_kwargs)
File "/home/amir/claudius/lib/python2.7/site-packages/requests/sessions.py", line 559, in send
r = adapter.send(request, **kwargs)
File "/home/amir/claudius/lib/python2.7/site-packages/requests/adapters.py", line 375, in send
raise ConnectionError(e, request=request)
ConnectionError: HTTPSConnectionPool(host='graph.facebook.com', port=443): Max retries exceeded with url: /me?access_token=CAAUi8RJCRZAkBAPdHFKhckONnLwjOExZCeVXpW39GZAZBLdD5rTsukQqTPi9KP6neMDxtwdhZAQvmzCS92rxR0rIZCNlzenQ2jHiyANvToy6tOWrOh5ZAYFmJFYeYvbXGNc9fuPIa0hAUqGfPzFtZB0tepoxoO7bpt01izuTYBkmS9NJChXaX9iDZAQlDTDvtLTZBvLesjFtSfwp6RusbArRzH (Caused by <class 'socket.error'>: [Errno 101] Network is unreachable)
[26/Jul/2014 06:14:36] "POST /accounts/facebook/login/token/ HTTP/1.1" 200 1205
Does anyone know the cause of this?
Well, I found out from a PHP post by Etienne Rached, of all places, that the Facebook Graph API requires IPv6. Tethering through my mobile phone solved it straight away for development, and this will be a non-issue when our website is deployed.
I hope this will help someone out there.