How can I retrieve the number of Likes or Dislikes of a video using Python?
The entry.rating element will show me:
<ns0:rating xmlns:ns0="http://schemas.google.com/g/2005" average="4.936976" max="5" min="1" numRaters="101501" rel="http://schemas.google.com/g/2005#overall" />
According to developers.google.com, <gd:rating> is a deprecated element, but I don't know how to use the new element, <yt:rating>.
Can someone help me?
Thank you.
You can use the pafy library:
http://np1.github.io/pafy/
import pafy
url = "https://www.youtube.com/watch?v=bMt47wvK6u0"
video = pafy.new(url)
print(video.likes)
print(video.dislikes)
You can use Data API v3 instead.
videos->rate and videos->getRating are your calls.
You can use the Python client library and check out the Python code samples.
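As far as I can tell, videos->rate and videos->getRating deal with the authenticated user's own rating, while the public counts come back from videos->list with part=statistics. A minimal sketch of that route, using only the standard library: it builds the request URL and reads the counts out of an already decoded JSON response. The helper names, the YOUR_API_KEY placeholder, and the sample response are assumptions for illustration, and dislikeCount may be missing from the response because dislike counts are no longer public.

```python
from urllib.parse import urlencode

API_URL = "https://www.googleapis.com/youtube/v3/videos"


def build_stats_url(video_id, api_key):
    """Build a videos.list URL that requests only the statistics part."""
    query = urlencode({"part": "statistics", "id": video_id, "key": api_key})
    return f"{API_URL}?{query}"


def extract_counts(response):
    """Return (likes, dislikes) from a decoded videos.list response.

    Counts arrive as strings; a missing field is returned as None.
    """
    stats = response["items"][0]["statistics"]
    likes = int(stats["likeCount"]) if "likeCount" in stats else None
    dislikes = int(stats["dislikeCount"]) if "dislikeCount" in stats else None
    return likes, dislikes


# Hypothetical response body, shaped like the statistics part of videos.list.
sample = {"items": [{"statistics": {"viewCount": "1000", "likeCount": "42"}}]}

print(build_stats_url("bMt47wvK6u0", "YOUR_API_KEY"))
print(extract_counts(sample))
```

Fetching the URL (for example with requests.get(url).json()) and passing the result to extract_counts would complete the round trip.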
I have a project for which I need to pull some information from www.wikizero.com, because Wikipedia is not accessible in my country. I am trying to scrape https://www.wikizero.com/tr/Mustafa_Kemal_Atat%C3%BCrk. Under <div class="mw-parser-output">, I want the text of the first <p>: "Mustafa Kemal Atatürk[n 2] (1881[n 3] - 10 Kasım 1938), Türk mareşal ve devlet adamı. Ülkesinde monarşinin kaldırılarak cumhuriyetin kurulmasına önderlik etti ve 1923'ten 1938'e kadar cumhurbaşkanı olarak görev yaptı." That is what I want to take from the site, but this is all I can get.
When I run this code:
soup.find("div", attrs={"class":"mw-parser-output"}).select("p:nth-of-type(1)")
it shows me this:
[image]
So how can I get this block? [image]
I'm trying to scrape certain contents from a webpage using Scrapy.
The html element looks like below.
'<p>\n 阪急宝塚線\xa0/\xa0石橋駅\xa0徒歩1分\n (<a href="javascript:void(0);" style="cursor:pointer;" onclick=\'window.open("http://athome.ekiworld.net/?id=athome&to=asso 302 ワンルーム&to_near_station1=25824&to_near_time1=1&to_near_traffic1=徒歩 1 分");return false;\'>電車ルート案内</a>)\n
</p>'
My goal is to extract only this part "阪急宝塚線\xa0/\xa0石橋駅\xa0徒歩1分\n".
I tried to use .re() with response and I thought ^(.+?<a) would work since it succeeded parsing on https://regex101.com/. But on scrapy shell, it doesn't parse anything (gives me []).
Could someone help me with this?
I use Python3/scrapy1.3.0.
Thanks!
import re
text = '''<p>\n 阪急宝塚線\xa0/\xa0石橋駅\xa0徒歩1分\n (<a href="javascript:void(0);" style="cursor:pointer;" onclick=\'window.open("http://athome.ekiworld.net/?id=athome&to=asso 302 ワンルーム&to_near_station1=25824&to_near_time1=1&to_near_traffic1=徒歩 1 分");return false;\'>電車ルート案内</a>)\n
</p>'''
re.search(r'\n.+?\n', text).group()
out:
'\n 阪急宝塚線\xa0/\xa0石橋駅\xa0徒歩1分\n'
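A likely reason ^(.+?<a) returns [] in the Scrapy shell is that . does not match a newline by default, and the element starts with <p> followed by \n, so the pattern can never reach <a. The sketch below shows this on an abbreviated copy of the element (the style and onclick attributes are dropped for brevity, which is my simplification) and then captures the text with re.DOTALL and strips the surrounding whitespace.

```python
import re

# Abbreviated copy of the element from the question.
html = ('<p>\n 阪急宝塚線\xa0/\xa0石橋駅\xa0徒歩1分\n '
        '(<a href="javascript:void(0);">電車ルート案内</a>)\n</p>')

# Without re.DOTALL, '.' stops at the first '\n', so nothing matches.
print(re.findall(r'^(.+?<a)', html))  # prints []

# With re.DOTALL, '.' spans newlines; capture everything between <p> and "(<a".
match = re.search(r'<p>(.+?)\(<a', html, re.DOTALL)
station = match.group(1).strip()
print(repr(station))
```

The same pattern and flag can be applied to the text of the selected <p> element instead of the whole response.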
I'm currently trying to scrape a website. The problem is that the information I need, specifically latitude and longitude, is placed on Google Maps in an iframe.
I'm able to get all the other information I currently need except this. Searching around, and working with import.io tech support, I found that I need to use a specific XPath and regex to pull this information, but the code I found on the site has me lost. Ideally I'd like to pull latitude and longitude separately. This is the code I have to work with.
What are my options? Thank you.
<div class="padding-listItem--sm">
<iframe width="100%" height="310" frameborder="0" allowfullscreen="" src="https://www.google.com/maps/embed/v1/place?q=33.3929503,-111.908652&key=AIzaSyDK08tC4NRubbIiw-xwDR1WEp-YAXX1Mx8" style="border:0"></iframe>
</div>
1) Get the src attribute of the iframe element.
String srcText = driver.findElement(By.tagName("iframe")).getAttribute("src");
2) Parse the url (found in srcText) for the latitude and longitude values.
Regex to find both numbers:
/([-]?\d+\.\d+)/g
when the url is as you specified:
https://www.google.com/maps/embed/v1/place?q=33.3929503,-111.908652&key=AIzaSyDK08tC4NRubbIiw-xwDR1WEp-YAXX1Mx8
The XPath to obtain the iframe source is:
//div[@class='padding-listItem--sm']/iframe/@src
Then you can apply a regex like this one to obtain latitude and longitude
/q=(-?[\d\.]*),(-?[\d\.]*)/g
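Once the src value is in hand, the two numbers can be pulled out in Python as well. A minimal sketch, assuming the q parameter always has the form "lat,lng" as in the URL above; the function name is hypothetical and the API key is replaced with a YOUR_KEY placeholder.

```python
import re
from urllib.parse import urlparse, parse_qs

src = ("https://www.google.com/maps/embed/v1/place"
       "?q=33.3929503,-111.908652&key=YOUR_KEY")


def lat_lng_from_src(url):
    """Read the q=lat,lng query parameter and return two floats."""
    q = parse_qs(urlparse(url).query)["q"][0]
    lat, lng = (float(part) for part in q.split(","))
    return lat, lng


lat, lng = lat_lng_from_src(src)
print(lat, lng)

# The regex from the answer above gives the same two values as strings.
m = re.search(r"q=(-?[\d.]+),(-?[\d.]+)", src)
print(m.group(1), m.group(2))
```

Parsing the query string avoids depending on the order of the URL parameters, while the regex is shorter if the URL shape is fixed.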
I'm trying to make a simple web scraper using Python and the requests library.
r = requests.get("https://nustar.newcastle.edu.au/psp/CS9PRD/EMPLOYEE/HRMS/c/MANAGE_ACADEMIC_RECORDS.STDNT_ACTIVATION.GBL?FolderPath=PORTAL_ROOT_OBJECT.HCSR_RECORDS_AND_REGISTRATION.HCSR_STUDENT_TERM_INFORMATION.HC_STDNT_ACTIVATION_GBL&IsFolder=false&IgnoreParamTempl=FolderPath%2cIsFolder")
I would like to POST a search input into this URL, but I'm struggling to work out how.
This is the search box code from the website:
<input id="STDNT_SRCH_EMPLID" class="PSEDITBOX" type="text" maxlength="11" style="width:140px; " value="" tabindex="13" name="STDNT_SRCH_EMPLID"></input>
I assume I have to somehow change value = "" to value = "foo".
Any help will be appreciated, thanks.
See the requests quickstart.
import requests
value1='foo'
payload = {'STDNT_SRCH_EMPLID': value1} # 'key2': 'value2' and so on (comma delimited)
r = requests.post("http://yourUrl.org/", data=payload)
print(r.text)
Open the network tab in your browser's developer tools, submit the search, and copy the POST request as a curl command.
Then go to curl.trillworks.com and paste the curl command to convert it into a Python POST request.
Inside the generated Python request you can modify the values.
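For orientation, the dictionary passed as data= is sent as a form-encoded body keyed by the input's name attribute, so there is no need to edit value="" in the HTML. A small standard-library sketch of what that body looks like; 'foo' is the placeholder value from the question, and whether the real page also expects hidden form fields is something to check in the copied request.

```python
from urllib.parse import urlencode

# The key is the input's name attribute; the value is the search term.
payload = {"STDNT_SRCH_EMPLID": "foo"}

# This is the form-encoded string a POST with data=payload would carry.
body = urlencode(payload)
print(body)
```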
I have a simple html page that sources a javascript file. The javascript file's only purpose is to write the following...
document.write('<img src="http://www.location.com/image.png">');
Once the information has been written, I need some JavaScript to extract the URL and the image source and return the URL and image location alone.
Any help is appreciated. Thank you in advance!
Check out this answer, which shows you how to do it with jQuery or plain JavaScript.
UPDATE:
If you are able to modify the HTML, why not add a DOM element right after the spot where the image will be inserted, so you have something to hook on to? Then you can use the following jQuery:
var linkDest = $('#Anchor').prev().attr('href');
var imgSrc = $('#Anchor').prev().children().attr('src');
Which you can see in this JSFiddle example