Find and click links in ugly table with Python and Selenium webdriver - python-2.7

I'm trying to get Selenium Webdriver to click x number of links in a table, and I can't get it to work. I can print the links like this:
links = driver.find_elements_by_xpath("//table[2]/tbody/tr/td/p/strong/a")
for i in range(0, len(links)):
    print links[i].text
But when I try to do links[i].click() instead of printing, Python throws an error.
The site uses JSP, and the hrefs of the links look like this: "javascript:loadTestResult(169)".
This is a sub-sub-page that can't be reached by a direct URL, and the table containing the links is very messy and large, so instead of pasting the whole source here I saved the page at this URL.
http://wwwe.aftonbladet.se/redaktion/martin/badplats.html
(I'm hunting the 12 blue links in the left column)
Any ideas?
Thanks
Martin

Sorry, too trigger-happy.
Simple solution to my own problem:
linkList = driver.find_elements_by_css_selector("a[href*='loadTestResult']")
for i in range(0, len(linkList)):
    # Re-locate the links on every iteration: clicking changes the page,
    # which would leave the previously found references stale.
    links = driver.find_elements_by_css_selector("a[href*='loadTestResult']")
    links[i].click()
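Since each href has the form "javascript:loadTestResult(...)", an alternative sketch is to parse the IDs out of the hrefs once and call the page's own loadTestResult() directly, so there are no stale references to worry about. This assumes loadTestResult is a global JavaScript function on the page, as the hrefs suggest:
import re

links = driver.find_elements_by_css_selector("a[href*='loadTestResult']")
# Pull the numeric ID out of each "javascript:loadTestResult(NNN)" href.
ids = [re.search(r'loadTestResult\((\d+)\)', a.get_attribute('href')).group(1)
       for a in links]
for test_id in ids:
    driver.execute_script("loadTestResult(%s);" % test_id)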

Related

How to make a web page to totally display when loaded in python selenium?

The main objective I have is to read a table in a web page and count the total number of elements it has. But you have to scroll down to reach the elements that this statement doesn't capture:
table_css=driver.find_elements_by_id('DeletButtn')
so I decided to zoom out to 30% to catch them all. But no approach I've tried for zooming works:
driver.execute_script("document.body.style_zoom='30%'")
driver.execute_script("document.body.style.zoom='30%'")
driver.execute_script("document.body.style.zoom='0.3'")
None of them does anything.
I have also tried using keys and send_keys, but they haven't worked either:
html = driver.find_element_by_tag_name("body")
html.send_keys(Keys.CONTROL, Keys.SUBTRACT)
driver.Keys(html, "ctrl").Keys(html, "-").perform()
action = ActionChains(driver)
action.key_down(Keys.CONTROL).key_down(Keys.SUBTRACT).perform()
None of these combinations has worked so far. They don't show any error either, so I don't know what I am doing wrong.
I am using Python 2.7, geckodriver, and Firefox.
Thanks for your help and greetings!
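One likely reason the zoom attempts fail silently is that Firefox historically does not support the non-standard CSS zoom property, so setting it has no visible effect there. A more portable sketch, under the assumption that the page loads more rows as you scroll, is to scroll to the bottom until the element count stops growing, reusing the 'DeletButtn' id from the question:
import time

last_count = -1
while True:
    elements = driver.find_elements_by_id('DeletButtn')
    if len(elements) == last_count:
        break  # no new elements appeared after the last scroll
    last_count = len(elements)
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(1)  # give the page a moment to render more content
print len(elements)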

RegEX to search links in a text file?

I'm trying to clean a database that contains a lot of links that don't work.
The problem is that there are a lot of links to pictures, and every picture has a different name, of course.
Is it possible to select every link that contains "http://example.com/img/bloguploads/" with a regex?
You can find all hyperlinks with:
https?://[a-zA-Z0-9./-]+
And all example.com links with:
http://example\.com/img/bloguploads/\S+
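A minimal Python sketch of how the second pattern could be applied to a text file (the file name links.txt is just a placeholder):
import re

with open('links.txt') as f:  # placeholder file name
    text = f.read()

# Every example.com image link in the file:
matches = re.findall(r'http://example\.com/img/bloguploads/\S+', text)
for url in matches:
    print url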

scraping the text from source code using python

I'm trying to scrape google search results using python and selenium. I'm able to get only the first search result. Here is the code I'm using.
driver.get(url)
res = driver.find_elements_by_css_selector('div.g')
link = res[0].find_element_by_tag_name("a")
href = link.get_attribute("href")
How can I get all the search results?
Try getting the list of links as below (from the first page only; to scrape more pages you need to click the "Next" button in a loop and append the results from the following pages):
href = [link.get_attribute("href") for link in driver.find_elements_by_css_selector('div.g a')]
P.S. You might also fetch the results as a plain GET-request response with the requests library.
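A hedged sketch of that multi-page loop; note that Google's markup changes frequently, so both the div.g a selector and the pnnext id of the "Next" button are assumptions that may need updating:
from selenium.common.exceptions import NoSuchElementException

all_hrefs = []
for _ in range(3):  # e.g. the first three result pages
    all_hrefs.extend(link.get_attribute("href")
                     for link in driver.find_elements_by_css_selector('div.g a'))
    try:
        driver.find_element_by_id('pnnext').click()  # the "Next" button (assumed id)
    except NoSuchElementException:
        break  # no more pages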

content empty when using scrapy

Thanks to everyone in advance.
I encountered a problem when using Scrapy on Python 2.7.
The web page I'm trying to crawl is a discussion board for the Chinese stock market.
When I try to get the first number, "42177", just under the banner of the page (the number you see there may differ, because it represents how many times the article has been read and is updated in real time), I always get empty content. I am aware that this is probably a dynamic-content issue, but I don't have a clue how to crawl it properly.
The code I used is:
item["read"] = info.xpath("div[#id='zwmbti']/div[#id='zwmbtilr']/span[#class='tc1']/text()").extract()
I think the XPath is set correctly, and I have checked the return value of the response; it indeed tells me there is nothing there. The result looks like this: 'read': [u'<div id="zwmbtilr"></div>']
If there were something, it would appear between <div id="zwmbtilr"> and </div>.
Really appreciated if you guys share any thoughts on this!
I just opened your link in Firefox with NoScript enabled. There is nothing inside <div id="zwmbtilr"></div>. If I enable JavaScript, I can see the content you want. So, as you already knew, it is a dynamic-content issue.
Your first option is to try to identify the request generated by the JavaScript. If you can do that, you can send the same request from Scrapy. If you can't, the next option is usually a package with JavaScript/browser emulation, something like ScrapyJS or Scrapy + Selenium.
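A rough sketch of the first option: watch the browser's network tab while the page loads to find the request that fills #zwmbtilr, then replicate it from Scrapy. The endpoint URL below is a placeholder, not the site's real one:
import scrapy

class ReadCountSpider(scrapy.Spider):
    name = 'readcount'

    def start_requests(self):
        # Placeholder URL: substitute the real request you find in the
        # browser's network tab while the page loads.
        yield scrapy.Request('http://example.com/api/read_count?id=12345',
                             callback=self.parse_count)

    def parse_count(self, response):
        # The response is often JSON or a small HTML fragment; adjust the
        # parsing to whatever the real endpoint returns.
        yield {'read': response.text.strip()}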

Using Selenium Python on Google page to click links

I'm trying to write a very simple script in Selenium Python. I open a Google page with a search string, but then I'm unable to locate any of the HTML elements like "Images" or "Maps", or any of the links appearing as part of the search results, even though I'm using Firebug. Only one thing worked, and that is the following:
links = driver.find_elements_by_tag_name("a")
for link in links:
    print ("hello")
What do I do if I want to click on "Images" or "Maps"?
What do I do if I want to click the 1st, 2nd, or a particular numbered link, or click a link by its partial text?
Any help would be appreciated.
Something like:
driver.get('http://www.google.com')
driver.find_element_by_xpath('//a[starts-with(@href,"https://maps.google")]').click()
But please note that your browser would often redirect 'http://www.google.com' to a slightly different URL, and that the web-page it displays might be slightly different.
What to do if I want to click on images or maps?
Images:
driver.find_element_by_css_selector('img#myImage').click()
Maps:
driver.find_element_by_css_selector("map[for='myImage']").click()
The 1st, 2nd, or nth link:
driver.find_elements_by_tag_name('a')[n].click()
or:
driver.find_elements_by_css_selector("div#someParent > a:nth-child(n)").click()
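For the "click a link by its partial text" part of the question, Selenium has dedicated locators; a minimal sketch:
# Click a link whose visible text contains "Maps" (partial match):
driver.find_element_by_partial_link_text("Maps").click()
# Or match the exact visible text:
driver.find_element_by_link_text("Images").click()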