What is AttributeError: object has no attribute 'w3c'? - python-2.7

I am trying to perform drag and drop with the Python Selenium WebDriver bindings, but without success.
I have tried the simple drag-and-drop API, drag and drop by offset, and action chains; none of them worked for me. A few people have mentioned that it works for them. Could someone please guide me here?
from selenium.webdriver.common.action_chains import ActionChains

def test_drag_and_drop(self):
    source = self.find_elements("xpath=xpath_of_source")
    destination = self.find_elements("id=id_of_destination")
    ActionChains(self).drag_and_drop(source, destination).perform()
    return self
This raises the error: AttributeError: object has no attribute 'w3c'
Draggable part HTML Code:
<div id="textBox" class="whiteBox textBox" style="height:160px;width:100%;">
<span style="padding-top:4px;padding-bottom:4px;clear:both;float:left;" _attr="constant" _type="textName">
<div class="simpleClass" contenteditable="false" dontcancelselect="true" onselectstart="GetBrowser().allowDrag(event, this)" draggable="true">
Text1
<img class="textBox_icon" contenteditable="false" src="img/text_box.gif" style="display:none">
</div>
</span>
<span> same for Text2 </span>
<span> same for Text3 </span>
Droppable part HTML Code :
<div id="messageDiv" class="contentEditableOuterContainer multiLine" style="position:relative">
<pre id="messagearea" class="contentEditableContainer multiLine inputpre" contenteditable="true" spellcheck="false">
Source : xpath=//div[@id='textBox']//div[contains(text(),'Text1')]
Destination : id=messagearea
We can drag "Text1" to the droppable area as many times as we like.

Related

Scrapy error loop xpath

I have the following HTML structure:
<div id="mod_imoveis_result">
<a class="mod_res" href="#">
<div id="g-img-imo">
<div class="img_p_results">
<img src="/img/image.jpg">
</div>
</div>
</a>
</div>
This is a product results page, so each page has 7 blocks like the one above, all with that mod_imoveis_result id. I need to get the image src from every block.
I tried:
import scrapy
from scrapy.pipelines.images import ImagesPipeline
from scrapy.exceptions import DropItem

class QuotesSpider(scrapy.Spider):
    name = "magichat"
    start_urls = ['https://magictest/results']

    def parse(self, response):
        for bimb in response.xpath('//div[@id="mod_imoveis_result"]'):
            yield {
                'img_url': bimb.xpath('//div[@id="g-img-imo"]/div[@class="img_p_results"]/img/@src').extract_first(),
                'text': bimb.css('#titulo_imovel::text').extract_first()
            }
        next_page = response.xpath('//a[contains(@class, "num_pages") and contains(@class, "pg_number_next")]/@href').extract_first()
        if next_page is not None:
            yield response.follow(next_page, self.parse)
I can't understand why the text field is correct while img_url returns the first result on the page for every block. For example, each page has 7 blocks, so there should be 7 texts and 7 img_urls, but the img_url is the same for all 7 blocks while the text is right. Why?
If I change extract_first to extract I get the other URLs, but they all come back in the same list. Example:
text: 1aaaa
img_url : a,b,c,d,e,f,g
but I need
text: 1aaaa
img_url: a
text: 2aaaa
img_url: b
What is wrong with that loop?
A leading // searches from the root of the document, not from the node you are iterating over (<div id="mod_imoveis_result">), so on every pass the expression matches all the images on the page and extract_first() returns the same first one.
A leading . selects the current node, however deep in the document it is.
In your case xpath('.//div[@id="g-img-imo"]/div[@class="img_p_results"]/img/@src') searches only below the current block and reaches the div marked with the arrow:
<div id="mod_imoveis_result">
<a class="mod_res" href="#">
---> <div id="g-img-imo">
<div class="img_p_results">
<img src="/img/image.jpg">
</div>
</div>
</a>
</div>
I hope that makes it clear.
If the image div has its own distinctive class, as it does here, you can also select it directly across the whole page and extract the image URLs:
//*[@class="img_p_results"]/img/@src
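The effect of a block-relative path can be sketched with the standard library's xml.etree.ElementTree standing in for Scrapy selectors (an assumption made so the example runs without Scrapy; the two-block PAGE string and the image_urls helper are hypothetical). ElementTree only accepts relative paths on an element, so the sketch shows just the working form: a path beginning with .// is evaluated below each block and yields one image per block.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page with two result blocks.
PAGE = """<html><body>
<div id="mod_imoveis_result"><a class="mod_res" href="#">
  <div id="g-img-imo"><div class="img_p_results"><img src="/img/a.jpg"/></div></div>
</a></div>
<div id="mod_imoveis_result"><a class="mod_res" href="#">
  <div id="g-img-imo"><div class="img_p_results"><img src="/img/b.jpg"/></div></div>
</a></div>
</body></html>"""


def image_urls(page):
    """Return one image URL per result block, using a block-relative path."""
    root = ET.fromstring(page)
    urls = []
    for block in root.iterfind('.//div[@id="mod_imoveis_result"]'):
        # The leading "." anchors the search at the current block.
        img = block.find('.//div[@id="g-img-imo"]/div[@class="img_p_results"]/img')
        urls.append(img.get("src") if img is not None else None)
    return urls


print(image_urls(PAGE))
```

With Scrapy the equivalent change is the leading dot in the path passed to bimb.xpath.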

How do I scrape nested data using selenium and Python

I basically want to scrape Litigation Paralegal under <h3 class="Sans-17px-black-85%-semibold"> and Olswang under <span class="pv-entity__secondary-title Sans-15px-black-55%">, but I can't seem to get to them. Here's the HTML:
<div class="pv-entity__summary-info">
<h3 class="Sans-17px-black-85%-semibold">Litigation Paralegal</h3>
<h4>
<span class="visually-hidden">Company Name</span>
<span class="pv-entity__secondary-title Sans-15px-black-55%">Olswang</span>
</h4>
<div class="pv-entity__position-info detail-facet m0"><h4 class="pv-entity__date-range Sans-15px-black-55%">
<span class="visually-hidden">Dates Employed</span>
<span>Feb 2016 – Present</span>
</h4><h4 class="pv-entity__duration de Sans-15px-black-55% ml0">
<span class="visually-hidden">Employment Duration</span>
<span class="pv-entity__bullet-item">1 yr 2 mos</span>
</h4><h4 class="pv-entity__location detail-facet Sans-15px-black-55% inline-block">
<span class="visually-hidden">Location</span>
<span class="pv-entity__bullet-item">London, United Kingdom</span>
</h4></div>
</div>
And here is what I've been doing at the moment with selenium in my code:
if tree.xpath('//*[@class="pv-entity__summary-info"]'):
    experience_title = tree.xpath('//*[@class="Sans-17px-black-85%-semibold"]/h3/text()')
    print(experience_title)
    experience_company = tree.xpath('//*[@class="pv-position-entity__secondary-title pv-entity__secondary-title Sans-15px-black-55%"]text()')
    print(experience_company)
My output:
Experience title : []
[]
Your XPath expressions are incorrect:
//*[@class="Sans-17px-black-85%-semibold"]/h3/text() means the text content of an h3 that is a child of an element whose class attribute is "Sans-17px-black-85%-semibold". Instead you need
//h3[@class="Sans-17px-black-85%-semibold"]/text()
which means the text content of an h3 element whose class attribute is "Sans-17px-black-85%-semibold".
In //*[@class="pv-position-entity__secondary-title pv-entity__secondary-title Sans-15px-black-55%"]text() you forgot a slash before text() (you need /text(), not just text()). Also, the target span has no class name pv-position-entity__secondary-title. You need to use
//span[@class="pv-entity__secondary-title Sans-15px-black-55%"]/text()
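The corrected predicates can be checked against the HTML from the question. This is a minimal sketch that uses the standard library's xml.etree.ElementTree in place of the tree.xpath call (an assumption made so it runs without third-party packages; the extract_experience helper and the trimmed SNIPPET are hypothetical). ElementTree has no text() step, so the text is read from the matched element's .text attribute instead.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the markup from the question.
SNIPPET = """<div class="pv-entity__summary-info">
  <h3 class="Sans-17px-black-85%-semibold">Litigation Paralegal</h3>
  <h4>
    <span class="visually-hidden">Company Name</span>
    <span class="pv-entity__secondary-title Sans-15px-black-55%">Olswang</span>
  </h4>
</div>"""


def extract_experience(snippet):
    """Return (title, company) by matching the class on the element itself."""
    root = ET.fromstring(snippet)
    # The predicate sits on h3/span, not on a parent of h3/span.
    title = root.find('.//h3[@class="Sans-17px-black-85%-semibold"]')
    company = root.find(
        './/span[@class="pv-entity__secondary-title Sans-15px-black-55%"]'
    )
    return title.text, company.text


print(extract_experience(SNIPPET))
```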
You can get both of these easily with CSS selectors and I find them a lot easier to read and understand than XPath.
driver.find_element_by_css_selector("div.pv-entity__summary-info > h3").text
driver.find_element_by_css_selector("div.pv-entity__summary-info span.pv-entity__secondary-title").text
. indicates a class name
> indicates a child (one level below only)
A space indicates a descendant (any number of levels below)
Here are some references to get you started.
CSS Selectors Reference
CSS Selectors Tips
Advanced CSS Selectors

Selenium Python click a link to javascript in an unordered list

I'm trying to click and activate a javascript link with Selenium. It's for a 5-star rating widget.
The five-stars link is the exact item shown below; the other items (e.g. four stars) are not fully shown.
<div id="percentages_and_ratings">
<div id="percentages">
<div id="rating">
<ul id="personality-rating" class="star-rating profile_rating " onmouseout="Votes.publicStarOut(this)" onmouseover="Votes.publicStarOver(this)">
<li id="current-personality-3198779465475184989-1" class="current-rating" style="width: 0%;"></li>
<li>...
<li>...
<li>...
<li>...
<li>
<a class="five-stars" title="" href="javascript:processVoteNote('vote', 'personality', 5, '222222222222222', false, '', '', Profile.profileHeadingVote);">5</a>
</li>
<li class="cant-tell" style="display: none;">
<li class="click-away">
The selenium unit test output looks like
driver.find_element_by_xpath("(//a[contains(text(),'5')])[2]").click()
but that doesn't work. Selecting the xpath, CSS, HTML with firebug doesn't work either. Any ideas? I've been at it for a few nights now so it's time to ask :-)
I'm using Selenium web driver and python 2.7
Here is how I ended up solving it:
id = self.getID(driver)
script = "$(processVoteNote('vote', 'personality', 5, '"+id+"', false, '', '', Profile.profileHeadingVote));"
driver.execute_script(script)
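The script string above is assembled by plain concatenation. A possible variation, sketched under the assumption that the ID could contain characters that need escaping, quotes it with json.dumps before it is placed into the JavaScript call. The build_vote_script helper and its stars parameter are hypothetical names; the JavaScript call itself is copied from the snippet above.

```python
import json


def build_vote_script(profile_id, stars=5):
    """Return the JavaScript call as a string, with the ID quoted as a JS string."""
    # json.dumps yields a double-quoted, escaped literal that is valid JavaScript.
    quoted_id = json.dumps(str(profile_id))
    return (
        "$(processVoteNote('vote', 'personality', %d, %s, false, '', '', "
        "Profile.profileHeadingVote));" % (stars, quoted_id)
    )


print(build_vote_script("222222222222222"))
```

The result would then be passed to driver.execute_script in the same way as the script variable above.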
Based on the sample HTML you posted,
browser.find_element_by_class_name('five-stars').click() should successfully select and click that link. If there is more than one element on the page with that class name, you could use browser.find_elements_by_class_name('five-stars'), iterate through that list to identify the relevant links, and then click them.
If you want to use an XPATH search, I'd recommend using xPath Tester to try out different patterns.

flex-video in reveal only showing once

I use the following code to display a youtube video inside a reveal-box
<a href='#' data-reveal-id='myModal1'><img alt="Some text" src="images/logo.png" class="large-6 medium-6 small-6 columns" /></a>
<div id="myModal1" class="reveal-modal small" data-reveal>
<div class='flex-video'>
<iframe src="http://www.youtube.com/embed/dQw4w9WgXcQ?rel=0" class='no-border' allowfullscreen></iframe>
</div>
<a class="close-reveal-modal">×</a>
</div>
but this works only once...
When I open the box, it shows the video perfectly, but when I close it, and re-open it, I only get a white area.
It seems the flex-video div is the problem: when I remove it and put the iframe directly into the reveal-modal div, it works normally, but then the video obviously doesn't scale on different devices.
It always worked fine in Foundation 4, but in Foundation 5 it does this.
Please help.
Thank you
Give your reveal modal an ID (i.e. #myModal)
Go into your foundation.reveal.js file and look for the line of code that reads:
close_video : function (e) {
var video = $(this).find('.flex-video'),
iframe = video.find('#myModal iframe');
As you can see, in the third line of code I have added the id of #myModal
Open foundation.min.js and find (command+f) the two iframe references, and add your ID (#myModal) to both

Trouble accessing attribute after using BeautifulSoup's findAll

I'm trying to scrape sites like this one on the BBC website to grab the relevant parts of the programme listing, and I've just started using BeautifulSoup to do this.
The parts of interest start with sections like:
<li about="/programmes/p013zzsl#segment" class="segment track" id="segmentevent-p013zzsm" typeof="po:MusicSegment">
<li about="/programmes/p014003v#segment" class="segment speech alt" id="segmentevent_p014003w" typeof="po:SpeechSegment">
What I've done so far is open the HTML as soup and then use soup.findAll(typeof=['po:MusicSegment', 'po:SpeechSegment']) to get a ResultSet of the parts I'm interested in, in the order in which they appear.
What I then want to do is check whether a section refers to po:MusicSegment or po:SpeechSegment in HTML that looks like:
<li about="/programmes/p01400m9#segment" class="segment track" id="segmentevent-p01400mb" typeof="po:MusicSegment"> <span class="artist-image"> <span class="depiction" rel="foaf:depiction"><img alt="" height="63" src="http://static.bbci.co.uk/programmes/2.54.3/img/thumbnail/artists_default.jpg" width="112"/></span> </span> <script type="text/javascript"> window.programme_data.tracklist.push({ segment_event_pid : "p01400mb", segment_pid : "p01400m9", playlist : "http://www.bbc.co.uk/programmes/p01400m9.emp" }); </script> <h3> <span rel="mo:performer"> <span class="artist no-image" property="foaf:name" typeof="mo:MusicArtist">Mala</span> </span> <span class="title" property="dc:title">Calle F</span> </h3></li>
I want to access the typeof attribute associated with <li>, but if this chunk of HTML (as a BS4 tag) is called section and I enter section.li, it returns None.
Note that if I do section.img instead, I get something back:
<img alt="" height="63" src="http://static.bbci.co.uk/programmes/2.54.3/img/thumbnail/artists_default.jpg" width="112"/>
and I could then do, e.g. section.img['height'] to get back u'63'
What I want is something analogous for the section.li part, so section.li['typeof'] to give me po:MusicSegment or po:SpeechSegment
Of course, I could simply convert each result to text and then do a simple string search, but searching by attribute seems more elegant.
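Each item in the ResultSet is already the <li> tag itself, so section.li looks for a nested <li> inside it and finds none. Read the attribute from the item directly. I'd iterate over the list returned by findAll: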
from bs4 import BeautifulSoup

soup = BeautifulSoup('<li about="/programmes/p013zzsl#segment" class="segment track" id="segmentevent-p013zzsm" typeof="po:MusicSegment"><li about="/programmes/p014003v#segment" class="segment speech alt" id="segmentevent_p014003w" typeof="po:SpeechSegment">')
for elem in soup.findAll(typeof=['po:MusicSegment', 'po:SpeechSegment']):
    print(elem['typeof'])
returns
po:MusicSegment
po:SpeechSegment
and then conditionally perform your other tasks:
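for elem in soup.findAll(typeof=['po:MusicSegment', 'po:SpeechSegment']):
    if elem['typeof'] == 'po:MusicSegment':
        do.something()       # placeholder for your music-segment handling
    elif elem['typeof'] == 'po:SpeechSegment':
        do.something_else()  # placeholder for your speech-segment handling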
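The same point, that each match is the <li> itself and the attribute is read from it directly, can be sketched with the standard library's xml.etree.ElementTree in place of BeautifulSoup (an assumption made so the example runs without third-party packages). The LISTING string is a hypothetical, well-formed reduction of the BBC markup, the "Interview" title is invented for illustration, and segment_kinds is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed reduction of the programme listing.
LISTING = """<ul>
  <li about="/programmes/p013zzsl#segment" class="segment track"
      typeof="po:MusicSegment"><h3>Calle F</h3></li>
  <li about="/programmes/p014003v#segment" class="segment speech alt"
      typeof="po:SpeechSegment"><h3>Interview</h3></li>
</ul>"""

WANTED = {"po:MusicSegment", "po:SpeechSegment"}


def segment_kinds(listing):
    """Return the typeof value of each <li>, in document order."""
    root = ET.fromstring(listing)
    kinds = []
    for section in root.iter("li"):
        # section is the <li>; searching inside it for another <li> finds nothing.
        nested = section.find("li")
        kind = section.get("typeof")
        if nested is None and kind in WANTED:
            kinds.append(kind)
    return kinds


print(segment_kinds(LISTING))
```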