raise HTTPError(req.get_full_url(), code, msg, hdrs, fp) - python-2.7

I try to read this url - http://malc0de.com/bl/BOOT in Python
import urllib2
threats = urllib2.urlopen("http://malc0de.com/bl/BOOT")
But I got this error:
Traceback (most recent call last):
File "C:\Android\android_workspace\pro2\test.py", line 2, in <module>
threats = urllib2.urlopen("http://malc0de.com/bl/BOOT")
File "C:\Python27\lib\urllib2.py", line 154, in urlopen
return opener.open(url, data, timeout)
File "C:\Python27\lib\urllib2.py", line 437, in open
response = meth(req, response)
File "C:\Python27\lib\urllib2.py", line 550, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Python27\lib\urllib2.py", line 475, in error
return self._call_chain(*args)
File "C:\Python27\lib\urllib2.py", line 409, in _call_chain
result = func(*args)
File "C:\Python27\lib\urllib2.py", line 558, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 403: Forbidden
What can I do to fix it?

This is a HTTP error unrelated to python or urllib. It says that, for some reason, you are not allowed to view this particular page.
It seems to me that the site owner filters access by bots/crawlers, because I can open it in Firefox, but not via urllib. It might filter based on user agent, which may be changed, see Changing user agent on urllib2.urlopen, although this might be bad etiquette.

Related

How to connect to HTTPS through proxy using urllib2 (in Python)

If the website I'm trying to connect to via a proxy is unsecured (HTTP), then I'm able to connect, however if it's secured (HTTPS), then I can't.
The following code works:
import urllib2
proxy_support = urllib2.ProxyHandler({'http':'xxx.xxx.xxx.xx'})
opener = urllib2.build_opener(proxy_support)
urllib2.install_opener(opener)
html = urllib2.urlopen('http://www.example.com').read()
However the code below does not work,
proxy_support = urllib2.ProxyHandler({'https':'xxx.xxx.xxx.xx'})
opener = urllib2.build_opener(proxy_support)
urllib2.install_opener(opener)
html = urllib2.urlopen('https://www.example.com').read()
Instead I get the following traceback:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 154, in urlopen
return opener.open(url, data, timeout)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 431, in open
response = self._open(req, data)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 449, in _open
'_open', req)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 409, in _call_chain
result = func(*args)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1240, in https_open
context=self._context)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1197, in do_open
raise URLError(err)
urllib2.URLError: <urlopen error [Errno 61] Connection refused>
According to https://docs.python.org/2/library/urllib2.html:
Changed in version 2.7.9: cafile, capath, cadefault, and context were added.
This one allowed me to connect to my local HTTPS site that is using a self-signed SSL certificate:
html = urllib2.urlopen('http://www.example.com'),\
context=ssl._https_verify_certificates(False)
I noticed in your traceback the similarities with mine. The code, just like you posted, works on Ubuntu 14.04 (Python 2.7.6) but not in 16.04 (Python 2.7.13) with exception to the last one:
File "/usr/lib/python2.7/urllib2.py", line 154, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python2.7/urllib2.py", line 429, in open
response = self._open(req, data)
File "/usr/lib/python2.7/urllib2.py", line 447, in _open
'_open', req)
File "/usr/lib/python2.7/urllib2.py", line 407, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 1241, in https_open
context=self._context)
File "/usr/lib/python2.7/urllib2.py", line 1198, in do_open
raise URLError(err)
urllib2.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590)>
I'm not sure if this work on your end.

Python Traceback Error

Unable to create a response with this api.I am unable to call the function locu_search('new york'). I get the following error shown below. I am using Komodo as my IDE, this started when I created a new python shell.
import urllib2
import json
local_api = '0d5897aae41eeafbd62ad0815af15cc42b2ed7c0'
def locu_search(query):
api_key = local_api
url = 'https://api.locu.com/v1_0/venue/search/?api_key=' + api_key
locality = query.replace('','%20')
final_url = url + "&locality=" + locality + "&category=restaurant"
json_obj = urllib2.urlopen(final_url)
data = json.load(json_obj)
for item in data['objects']:
print item['name'],item['phone']
locu_search('new york')
The error is listed below:
**Traceback (most recent call last):
File "<console>", line 0, in <module>
File "<console>", line 0, in locu_search
File "c:\python27\lib\urllib2.py", line 154, in urlopen
return opener.open(url, data, timeout)
File "c:\python27\lib\urllib2.py", line 437, in open
response = meth(req, response)
File "c:\python27\lib\urllib2.py", line 550, in http_response
'http', request, response, code, msg, hdrs)
File "c:\python27\lib\urllib2.py", line 475, in error
return self._call_chain(*args)
File "c:\python27\lib\urllib2.py", line 409, in _call_chain
result = func(*args)
File "c:\python27\lib\urllib2.py", line 558, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 400: BAD_REQUEST**
400 Bad Request should give you a headsup about the problem , this is basically due to a malformed request and I strongly suspect the culprit is in th line url = 'https://api.locu.com/v1_0/venue/search/?api_key=' + api_key , check if api_key token is invalid or no longer valid.

celery task is not working on prod server

I am getting:
Traceback (most recent call last):
File "/var/www/virtualenvs/myProj/local/lib/python2.7/site-packages/celery/app/trace.py", line 240, in trace_task
R = retval = fun(*args, **kwargs)
File "/var/www/virtualenvs/myProj/local/lib/python2.7/site-packages/celery/app/trace.py", line 438, in __protected_call__
return self.run(*args, **kwargs)
File "/var/www/myProj/myProj/apps/myProj/tasks.py", line 43, in get_somestuff
str_response = urllib2.urlopen(req)
File "/usr/lib/python2.7/urllib2.py", line 127, in urlopen
return _opener.open(url, data, timeout)
File "/usr/lib/python2.7/urllib2.py", line 407, in open
response = meth(req, response)
File "/usr/lib/python2.7/urllib2.py", line 520, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python2.7/urllib2.py", line 445, in error
return self._call_chain(*args)
File "/usr/lib/python2.7/urllib2.py", line 379, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 528, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 500: Internal Server Error
when I run task in prod server, locally it is working perfectly.
the bit is
urllib2.urlopen(req)
what may I be missing? the url I am reading is available thru browser

403 error during webpage request in python

I'm trying to pull the source from a url in terminal and I'm getting this: As far I can tell everything is in the right place.
response = urllib2.urlopen(url)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 127, in urlopen
return _opener.open(url, data, timeout)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 410, in open
response = meth(req, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 523, in http_response
'http', request, response, code, msg, hdrs)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 448, in error
return self._call_chain(*args)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 382, in _call_chain
result = func(*args)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 531, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 403: Forbidden

Unable to open a URL link with using urllib2 in chrome

I'm able to open the webpage by simply entering the URL link in my chrome browser
But when i move this URL link to below code, it will prompt me the error message:
CODE:
import urllib2
url = 'http://www.klse.info/companies/listed-companies/alphabet/A'
page = urllib2.urlopen(url).read()
ERROR:
File "C:\Python27\lib\urllib2.py", line 127, in urlopen
return _opener.open(url, data, timeout)
File "C:\Python27\lib\urllib2.py", line 410, in open
response = meth(req, response)
File "C:\Python27\lib\urllib2.py", line 523, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Python27\lib\urllib2.py", line 448, in error
return self._call_chain(*args)
File "C:\Python27\lib\urllib2.py", line 382, in _call_chain
result = func(*args)
File "C:\Python27\lib\urllib2.py", line 531, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 403: Forbidden
Anyone have idea with this?
I tried to change the URL link to other link address and it does work.
Is the website set restriction or anything i should take care for?
How to get rid from HTTP Error 403: Forbidden?
Please refer to the below code...
url = 'http://www.klse.info/companies/listed-companies/alphabet/A'
req = urllib2.Request(url, headers={'User-Agent' : "Magic Browser"})
page = urllib2.urlopen(req).read()