So I'm trying to make a discord bot that checks a website and sends a message if the price of an item falls below a certain price. I got it working on my local machine, so I decided to host it on Google Compute Engine so it can run 24/7, but I ran into some issues.
After much testing, I've determined it's because Google Compute Engine doesn't like making aiohttp GET requests. Below is the isolated part of my code that is causing issues on Google Compute Engine, but works fine on my local machine.
import asyncio
from bs4 import BeautifulSoup
import aiohttp

async def myDriver():
    await httpReq()

async def httpReq():
    async with aiohttp.ClientSession() as session:
        async with session.get("https://www.newegg.com/p/N82E16824569005?Item=N82E16824569005&cm_sp=Homepage_BS-_-P1_24-569-005-_-12062020") as page:
            pageContent = await page.text()
            content = BeautifulSoup(pageContent, 'lxml')
            price = content.find("li", {"class": "price-current"}).strong.text.replace(",", "")
            print(price)

asyncio.run(myDriver())
Error:
File "GCEtestAiohttp.py", line 19, in httpReq
price = content.find("li", {"class": "price-current"}).strong.text.replace(",", "") AttributeError: 'NoneType' object has no attribute 'strong'
Notes:
Debian GNU/Linux 10 (buster)
Python 3.7.3
aiohttp 3.6.3
I've tried similar code with the normal requests library on Google Compute Engine, and everything works fine, so I really believe it's an issue with using aiohttp requests.
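A small diagnostic sketch (URL shortened to drop the tracking parameter) that prints the HTTP status and the start of the body, so the responses from both machines can be compared; a non-200 status or a bot-challenge page on Compute Engine would explain why find() returns None there:

import asyncio
import aiohttp

async def diagnose():
    url = "https://www.newegg.com/p/N82E16824569005?Item=N82E16824569005"
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as page:
            # Compare these values between the local machine and GCE.
            print("status:", page.status)
            body = await page.text()
            print(body[:500])  # first 500 characters of the response

asyncio.run(diagnose())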
Related
I'm trying to use Dialogflow's detect_intent in Python and I keep getting:
404 com.google.apps.framework.request.NotFoundException: Agent metadata not found for agentId: ####-####-####-####-####
Here's a snippet of my code:
import os

import google.cloud.dialogflow as dialogflow
from CONFIG import DIALOGFLOW_PROJECT_ID

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = 'credentials/dialogflow.json'

def predict_intent(text, language):
    session_client = dialogflow.SessionsClient()
    session = session_client.session_path(DIALOGFLOW_PROJECT_ID, SESSION_ID)
    text_input = dialogflow.TextInput(text=text, language_code=language)
    query_input = dialogflow.QueryInput(text=text_input)
    response = session_client.detect_intent(session=session, query_input=query_input)  # ERROR
    return response.query_result.intent.display_name
I tried running the function multiple times and some of the calls succeed, but most fail with the exception above.
I can train the bot using the same interface and it works fine.
I'm using Python 3.7 and the following Google Cloud modules: google-api-core==2.0.1, google-auth==2.0.2, google-cloud-dialogflow==2.7.1, googleapis-common-protos==1.53.0.
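Since only some of the calls fail, one stopgap while the root cause is investigated is a simple retry wrapper. This is just a sketch, assuming the intermittent 404 surfaces as google.api_core.exceptions.NotFound and reusing predict_intent from the snippet above:

from google.api_core.exceptions import NotFound

def predict_intent_with_retry(text, language, attempts=3):
    last_error = None
    for attempt in range(attempts):
        try:
            return predict_intent(text, language)
        except NotFound as exc:
            # Intermittent "Agent metadata not found" failure; try again.
            last_error = exc
    raise last_error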
I've got the following code:
import time
from google.protobuf.timestamp_pb2 import Timestamp
from google.cloud import bigquery_datatransfer_v1

def runQuery(parent, requested_run_time):
    client = bigquery_datatransfer_v1.DataTransferServiceClient()
    projectid = '[enter your projectId here]'  # Enter your projectID here
    transferid = '[enter your transferId here]'  # Enter your transferId here
    parent = client.project_transfer_config_path(projectid, transferid)
    start_time = bigquery_datatransfer_v1.types.Timestamp(seconds=int(time.time() + 10))
    response = client.start_manual_transfer_runs(parent, requested_run_time=start_time)
    print(response)
We used it in a few different projects and cases and everything worked fine. Today I deployed another function using this code and keep getting the following error:
AttributeError: 'DataTransferServiceClient' object has no attribute 'project_transfer_config_path'
What am I missing?
Thank you!
You are probably using a newer version (2.0.0 or 2.1.0) of the google-cloud-bigquery-datatransfer client library. In these versions, most utility methods have been removed, one of them being project_transfer_config_path.
You can use the method transfer_config_path of the client to achieve the same result.
I would strongly suggest that you study the Migration Guide to 2.0.0 as there might be other changes that you need to make too.
In case you are using version 2.0.0 and not 2.1.0, I would recommend upgrading to the latest, since there are breaking changes between them; for example, the import paths that were changed in 2.0.0 have been reverted in 2.1.0.
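For reference, a minimal sketch of the equivalent call against the 2.x client, assuming the request-dict calling convention of the newer generated clients; the project and transfer IDs are placeholders:

import time

from google.cloud import bigquery_datatransfer_v1
from google.protobuf.timestamp_pb2 import Timestamp

client = bigquery_datatransfer_v1.DataTransferServiceClient()

# transfer_config_path replaces the removed project_transfer_config_path.
parent = client.transfer_config_path('[your projectId]', '[your transferId]')

start_time = Timestamp(seconds=int(time.time() + 10))

response = client.start_manual_transfer_runs(
    request={"parent": parent, "requested_run_time": start_time}
)
print(response)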
I'm working on a script that checks whether multiple domain names are registered or not. It loops through a list of domains read from a file, and the registration check is done using enom's API. My problem is that the Python 2 code accessing the API:
import urllib2
import xml.etree.ElementTree as ET
...
response = urllib2.urlopen(url)
content = ET.fromstring(response.read())
...
return content.find("RRPCode").text
generates the error 'NoneType' object has no attribute 'text' in nearly 30% of the checks, while the Python 3 code:
import urllib.request
import xml.etree.ElementTree as ET
...
response = urllib.request.urlopen(url)
content = ET.fromstring(response.read())
...
return content.find("RRPCode").text
works fine. I should also mention that the errors returned are random and not related to specific domain names.
What could be the cause of these errors?
I am by the way using Python 2.7.3 and Python 3.2.3 on a VPS running Ubuntu 12.04 server.
Thanks.
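In case it helps the investigation, here is a small sketch of the Python 2 path that guards the find() call and dumps the raw response whenever RRPCode is missing, so the intermittent failures can be inspected (the url variable stands in for the elided request-building code):

import urllib2
import xml.etree.ElementTree as ET

def check_domain(url):
    response = urllib2.urlopen(url)
    raw = response.read()
    content = ET.fromstring(raw)
    node = content.find("RRPCode")
    if node is None:
        # The API probably returned an error document instead of the
        # expected XML; log it for inspection.
        print raw
        return None
    return node.text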
I am facing a bit of a situation.
Scenario: I have a Django REST API running on localhost:8000 and I want to access it from my command line. I have tried the urllib2 and Python requests libraries to talk to the API, but failed (I'm getting a 503 error). When I pass google.com as the URL instead, I get the expected response, so I believe my approach is correct but I'm doing something wrong. Please see the code below:
import urllib, urllib2, httplib

url = 'http://localhost:8000'
httplib.HTTPConnection.debuglevel = 1

print "urllib"
data = urllib.urlopen(url)

print "urllib2"
request = urllib2.Request(url)
opener = urllib2.build_opener()
feeddata = opener.open(request).read()

print "End\n"
Environment:
OS Win7
python v2.7.5
Django==1.6
Markdown==2.3.1
colorconsole==0.6
django-filter==0.7
django-ping==0.2.0
djangorestframework==2.3.10
httplib2==0.8
ipython==1.0.0
jenkinsapi==0.2.14
names==0.3.0
phonenumbers==5.8b1
requests==2.1.0
simplejson==3.3.1
termcolor==1.1.0
virtualenv==1.10.1
Thanks
I had a similar problem, but found that it was the company's proxy that was preventing me from pinging myself (see: 503 Response when trying to use python request on local website). Setting trust_env to False makes requests ignore proxy-related environment variables such as HTTP_PROXY, so the request goes straight to localhost.
Try:
>>> import requests
>>> session = requests.Session()
>>> session.trust_env = False
>>> r = session.get("http://localhost:5000/")
>>> r
<Response [200]>
>>> r.content
'Hello World!'
If you are registering your viewsets with DefaultRouter then your api will appear at:
http://localhost:8000/api/ for an html view of the index
http://localhost:8000/api/.json for a JSON view of the index
http://localhost:8000/api/appname for an html view of the individual resource
http://localhost:8000/api/appname/.json for a JSON view of the individual resource
You can check the response in your browser to make sure your URL is working as you expect.
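For illustration, here is a minimal sketch of that wiring, assuming a hypothetical AppnameViewSet and the Django 1.6-era URLconf style used in the question:

# urls.py (sketch only; AppnameViewSet is a hypothetical viewset name)
from django.conf.urls import include, patterns, url
from rest_framework.routers import DefaultRouter

from appname.views import AppnameViewSet

router = DefaultRouter()
router.register(r'appname', AppnameViewSet)

urlpatterns = patterns('',
    url(r'^api/', include(router.urls)),
)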
I'm learning web scraping and building a simple web app at the moment, and I decided to practice scraping a schedule of classes. Here's a code snippet I'm having trouble with in my application, using Python 2.7.4, Flask, Heroku, BeautifulSoup4, and Requests.
import requests
from bs4 import BeautifulSoup as Soup
url = "https://telebears.berkeley.edu/enrollment-osoc/osc"
code = "26187"
values = dict(_InField1 = "RESTRIC", _InField2 = code, _InField3 = "13D2")
html = requests.post(url, params=values)
soup = Soup(html.content, from_encoding="utf-8")
sp = soup.find_all("div", {"class" : "layout-div"})[2]
print sp.text
This works great locally. It gives me back the string "Computer Science 61A P 001 LEC:" as expected. However, when I tried to run it on Heroku (using heroku run bash and then running python), I got back a 403 Forbidden error.
Am I missing some settings on Heroku? At first I thought it was the school's settings, but then I wondered why it works locally without any trouble... Any explanation/suggestion would be really appreciated! Thank you in advance.
I was having a similar issue: requests were working locally but getting blocked on Heroku. It looks like the issue is that some websites block requests coming from Heroku (which runs on AWS servers). To get around this you can send your requests via a proxy server.
There are a bunch of different add-ons on Heroku to achieve this; I went with Fixie, which has a reasonably sized free tier.
To install:
heroku addons:create fixie:tricycle
Then pull the config into your local environment so you can test locally:
heroku config -s | grep FIXIE_URL >> .env
Then in your Python file you just add a couple of lines:
import os
import requests
from bs4 import BeautifulSoup as Soup

proxyDict = {
    "http": os.environ.get('FIXIE_URL', ''),
    "https": os.environ.get('FIXIE_URL', ''),
}

url = "https://telebears.berkeley.edu/enrollment-osoc/osc"
code = "26187"
values = dict(_InField1="RESTRIC", _InField2=code, _InField3="13D2")

html = requests.post(url, params=values, proxies=proxyDict)
soup = Soup(html.content, from_encoding="utf-8")
sp = soup.find_all("div", {"class": "layout-div"})[2]
print sp.text
Docs for Fixie are here:
https://devcenter.heroku.com/articles/fixie