Selenium - Click button after completing file conversion on webpage - python-2.7

I want to wait for a page to finish converting a json file, then automatically download it. The following python code works.
import time
from selenium import webdriver
chrome = webdriver.Chrome()
chrome.get('https://json-csv.com/')
load_data = chrome.find_element_by_id('fileupload')
load_data.send_keys('C:\\path_to_file')
load_data.submit()
# Wait arbitrary duration before downloading result
time.sleep(10)
get_results = chrome.find_element_by_id('download-link')
get_results.click()
chrome.quit()
However, every time I run the script, I need to wait 10 seconds, which is more than enough for the page to finish converting the file. This is not time-efficient; the page may finish converting the new file in 5 seconds.
How can I click the download button the moment the file is done converting?
What I've tried
I've read a solution to a similar problem, but it threw an error: ElementNotVisibleException: Message: element not visible.
Also tried following the documentation example:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
...
wait = WebDriverWait(chrome, 10)
get_result = wait.until(EC.element_to_be_clickable((By.ID, 'download-link')))
get_result.click()
This downloads some nonsense .tmp file instead.

You need to make a small change in the approach as follows:
Rather than waiting for the WebElement located by By.ID, 'download-link' with the element_to_be_clickable condition, I would suggest you wait for the WebElement located by By.ID, 'convert-another' with element_to_be_clickable, and then click on the DOWNLOAD link as follows:
wait = WebDriverWait(chrome, 10)
wait.until(EC.element_to_be_clickable((By.ID, 'convert-another')))
chrome.find_element_by_css_selector("a#download-link.btn-lg.btn-success").click()
chrome.quit()
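As an aside, element_to_be_clickable is just a condition that WebDriverWait polls repeatedly; the polling idea itself can be sketched without a browser. The helper below is my own illustration of what WebDriverWait.until does internally, not Selenium API:

```python
import time

def wait_until(condition, timeout=10.0, poll=0.5):
    """Repeatedly call `condition` until it returns a truthy value,
    mirroring what WebDriverWait.until does internally."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        result = condition()
        if result:
            return result
        time.sleep(poll)
    raise RuntimeError('condition not met within %.1f seconds' % timeout)
```

With Selenium, the condition would be a callable that checks the page, e.g. a lambda that looks up the 'convert-another' element and returns it once found.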

Your code is OK. The exception occurs because you call load_data.submit() after load_data.send_keys('C:\\path_to_file').
Remove the submit() call, so the script becomes:
chrome.get('https://json-csv.com/')
load_data = chrome.find_element_by_id('fileupload')
load_data.send_keys('C:\\path_to_file')
wait = WebDriverWait(chrome, 10)
get_result = wait.until(EC.element_to_be_clickable((By.ID, 'download-link')))
get_result.click()

Related

How can I add my web scrape process using bs4 to selenium automation in Python to make it one single process which just asks for a zipcode?

I am using Selenium to go to a website, click the search field, and type a zip code that I enter beforehand. For that zip code, I want to take the result link the page produces and feed it to my web scraper (built with Beautiful Soup), so that once the link comes up I can scrape the required data into my CSV.
What I want:
I am having trouble getting that link to the beautiful soup URL. I basically want to automate it so that I just have to enter a zip code and it gives me my CSV.
What I am able to get:
I am able to enter the zip code and search using selenium and then add that url to my scraper to give csv.
Code I am using for selenium :
driver = webdriver.Chrome('/Users/akashgupta/Desktop/Courses and Learning/Automating Python and scraping/chromedriver')
driver.get('https://www.weather.gov/')
messageField = driver.find_element_by_xpath('//*[@id="inputstring"]')
messageField.click()
messageField.send_keys('75252')
time.sleep(3)
showMessageButton = driver.find_element_by_xpath('//*[@id="btnSearch"]')
showMessageButton.click()
#web scraping Part:
import requests
import pandas as pd
from bs4 import BeautifulSoup

url = "https://forecast.weather.gov/MapClick.php?lat=32.99802500000004&lon=-96.79775499999994#.Xo5LnFNKgWo"
res = requests.get(url)
soup = BeautifulSoup(res.content, 'html.parser')
tag = soup.find_all('div', id='seven-day-forecast-body')
weekly = soup.find_all(class_='tombstone-container')
main = soup.find_all(class_='period-name')
description = soup.find_all(class_='short-desc')
temp = soup.find_all(class_='temp')
Period_Name = []
Desc = []
Temp = []
for a in range(0, len(main)):
    Period_Name.append(main[a].get_text())
    Desc.append(description[a].get_text())
    Temp.append(temp[a].get_text())
df = pd.DataFrame(list(zip(Period_Name, Desc, Temp)), columns=['Period_Name', 'Short_Desc', 'Temperature'])
from selenium import webdriver
import time
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
driver = webdriver.Chrome('chromedriver.exe')
driver.get('https://www.weather.gov/')
messageField = driver.find_element_by_xpath('//*[@id="inputstring"]')
messageField.click()
messageField.send_keys('75252')
time.sleep(3)
showMessageButton = driver.find_element_by_xpath('//*[@id="btnSearch"]')
showMessageButton.click()
WebDriverWait(driver, 10).until(EC.url_contains("https://forecast.weather.gov/MapClick.php"))  # wait until the URL matches the expected result pattern
currentURL = driver.current_url
print(currentURL)
time.sleep(3)
driver.quit()
#web scraping Part:
import requests
res = requests.get(currentURL)
....

Compiled console app quits immediately when importing ConfigParser (Python 2.7.12)

I am very new to Python and am trying to append some functionality to an existing Python program. I want to read values from a config INI file like this:
[Admin]
AD1 = 1
AD2 = 2
RSW = 3
When I execute the following code from IDLE, it works as it should (I was already able to read in values from the file, but deleted that part for a shorter code snippet):
#!/usr/bin/python
import ConfigParser
# built-in python libs
from time import sleep
import sys
def main():
    print("Test")
    sleep(2)

if __name__ == '__main__':
    main()
But the compiled exe quits before printing and waiting 2 seconds. If I comment out the import of ConfigParser, exe runs fine.
This is how I compile into exe:
from distutils.core import setup
import py2exe, sys
sys.argv.append('py2exe')
setup(
    options = {'py2exe': {'bundle_files': 1}},
    zipfile = None,
    console=['Test.py'],
)
What am I doing wrong? Is there maybe another easy way to read in a configuration, if ConfigParser for some reason doesn't work in a compiled exe?
Thanks in advance for your help!
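For reference, reading the INI file above with the module itself is straightforward. Here is a minimal sketch, written so it runs under both Python 2's ConfigParser and Python 3's configparser (the module was only renamed between versions); the in-memory ini_text string stands in for the real file:

```python
import io

try:
    import configparser                   # Python 3 name
except ImportError:
    import ConfigParser as configparser   # Python 2 name

# Stand-in for the contents of the INI file from the question
ini_text = u"[Admin]\nAD1 = 1\nAD2 = 2\nRSW = 3\n"

config = configparser.ConfigParser()
try:
    config.read_string(ini_text)            # Python 3
except AttributeError:
    config.readfp(io.StringIO(ini_text))    # Python 2

ad1 = config.getint('Admin', 'AD1')
rsw = config.getint('Admin', 'RSW')
```

In a real script you would call config.read('settings.ini') with the file path instead of parsing a string.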

chromedriver can't click when running a script, but can in shell

I have a problem in general with clicking in Chromedriver when the code is run by Python. This code is used in the script:
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
driver.get("https://www.marktplaats.nl/")
cook_button = WebDriverWait(driver, 15).until(EC.element_to_be_clickable((By.XPATH, "//form[@method='post']/input[@type='submit']"))).click()
It just times out giving "NoSuchElementException". But if I put those lines manually in the Shell, it clicks like normal. For what it's worth, I'm using the latest 2.40 Chromedriver and Chrome v67. Running it headless doesn't make any difference.
EDIT
The program actually breaks on the third command, when it tries to find an element that doesn't exist because the click wasn't completed:
driver.get(master_link) # get the first page
wait_by_class("search-results-table")
page_2_el = driver.find_element_by_xpath("//span[@id='pagination-pages']/a[contains(@data-ga-track-event, 'gination')]")
So, the page_2_el command gives this exception, but only because the earlier click wasn't completed successfully to dismiss the warning about cookies. And I'm sure the XPath search is good, because it works with geckodriver in Firefox, but won't work here with Chromedriver.
EDIT 2: See a video of the bug here: https://streamable.com/tv7w4. Notice how it flinches a bit; see when it writes "before click" and "after click" on the console.
SOLUTION
Replaced
cook_button = WebDriverWait(driver, 15).until(EC.element_to_be_clickable((By.XPATH, "//form[@method='post']/input[@type='submit']"))).click()
With
N_click_attempts = 0
while 1:
    if N_click_attempts == 10:
        print "Something is wrong. "
        break
    print "Try to click."
    N_click_attempts = N_click_attempts + 1
    try:
        cook_button = WebDriverWait(driver, 15).until(EC.element_to_be_clickable((By.XPATH, "//form[@method='post']/input[@type='submit']"))).click()
        time.sleep(2.0)
    except:
        time.sleep(2.0)
        break
It seems that the click is now completed. I have other clicks in the script and they work fine with element.click(), this one was problematic for some reason.
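The retry loop above can also be factored into a small reusable helper. This is only a sketch of the idea, with names of my own invention rather than Selenium API; note it uses the more conventional semantics (retry until the action stops raising), whereas the loop above treats the exception as its stop signal:

```python
import time

def click_with_retries(do_click, attempts=10, delay=2.0):
    """Call `do_click` until it stops raising an exception, or give up
    after `attempts` tries. Returns the number of attempts made."""
    for n in range(1, attempts + 1):
        try:
            do_click()
            return n
        except Exception:
            time.sleep(delay)
    raise RuntimeError('click still failing after %d attempts' % attempts)
```

In the script this would be called with a callable that locates and clicks the cookie button, so a stale lookup is retried along with the click itself.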
Your path is correct, but I would suggest a smaller one:
//form/input[2]
As for the NoSuchElementException, you can try adding a pause, to wait until the element loads and becomes 'visible' to Selenium. Like this:
import time
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
driver.get("https://www.marktplaats.nl/")
cook_button = WebDriverWait(driver, 15).until(EC.element_to_be_clickable((By.XPATH, "//form[@method='post']/input[@type='submit']"))).click()
time.sleep(5) # wait 5 seconds for the DOM to reload
According to the edit in the question, I would suggest adding time.sleep(5) after clicking on the button, for the same reason: after clicking, the whole DOM reloads, and Selenium should wait until the reload is done. On my computer it takes about 2-3 seconds to fully reload the DOM.

Delete pcap file after parsing it with PcapReader in scapy

I'm parsing a pcap file with PcapReader in scapy. After that I want to delete the pcap file, but it fails with this error:
OSError: [Errno 26] Text file busy: '/media/sf_SharedFolder/AVB/test.pcap'
This is my python code:
from scapy.all import *
import os
var = []
for packet in PcapReader('/media/sf_SharedFolder/AVB/test.pcap'):
    var.append(packet[Ether].src)
os.remove('/media/sf_SharedFolder/AVB/test.pcap')
I think that this error occurs with any pcap file.
Does somebody have any idea?
You may want to try with Scapy's latest development version (from https://github.com/secdev/scapy), since I cannot reproduce your issue with it.
If that does not work, check with lsof /media/sf_SharedFolder/AVB/test.pcap (as root) if another program has opened your capture file. If so, try to find (and kill, if possible) that program.
You can try two different hacks, to try to figure out what exactly is happening:
Test 1: wait.
from scapy.all import *
import os
import time
var = []
for packet in PcapReader('/media/sf_SharedFolder/AVB/test.pcap'):
    var.append(packet[Ether].src)
time.sleep(2)
os.remove('/media/sf_SharedFolder/AVB/test.pcap')
Test 2: explicitly close.
from scapy.all import *
import os
var = []
pktgen = PcapReader('/media/sf_SharedFolder/AVB/test.pcap')
for packet in pktgen:
    var.append(packet[Ether].src)
pktgen.close()
os.remove('/media/sf_SharedFolder/AVB/test.pcap')
Found the solution. I replaced PcapReader() with rdpcap(). It seems PcapReader keeps the file open until the Python script finishes.
This is the working code:
from scapy.all import *
import os
var = []
p=rdpcap('/media/sf_SharedFolder/AVB/test.pcap')
for packet in p:
    var.append(packet[Ether].src)
os.remove('/media/sf_SharedFolder/AVB/test.pcap')
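The underlying rule here is general, not scapy-specific: a file can only be removed cleanly once every handle to it has been closed, and rdpcap() reads everything eagerly and closes the file, while the PcapReader generator holds it open. A scapy-free sketch of the same close-then-delete pattern, using a throwaway temp file in place of test.pcap:

```python
import os
import tempfile

# Create a throwaway file standing in for test.pcap
fd, path = tempfile.mkstemp(suffix='.pcap')
os.close(fd)
with open(path, 'wb') as f:
    f.write(b'dummy capture bytes')

# Read it the "rdpcap way": consume everything, then the
# with-block guarantees the handle is closed
with open(path, 'rb') as f:
    data = f.read()

os.remove(path)  # safe: no open handle remains
removed = not os.path.exists(path)
```

The earlier "Test 2" answer does the same thing explicitly with pktgen.close() before os.remove().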

pygame program exits with no error message when run with pythonw

I'm trying to run a pygame program using pythonw to avoid having the console window show up. This causes a weird issue related to print statements.
Basically, the program will just exit after a few seconds with no error message. The more printing I do, the faster it happens.
If I run it in idle or at the command prompt (or in linux) the program works fine. This problem only happens when launched with pythonw (right-click, Open With, pythonw).
I'm using python 2.7.11 on Windows XP 32-bit. pygame 1.9.1release.
Is there a workaround for this? Why does the program simply terminate with no error?
import pygame
from pygame.locals import *
succeeded, failed = pygame.init()
display_surface = pygame.display.set_mode((320, 240))
clock = pygame.time.Clock()
terminate = False
while terminate is False:
    for event in pygame.event.get():
        if event.type == QUIT:
            terminate = True
    area = display_surface.fill((0,100,0))
    pygame.display.flip()
    elapsed = clock.tick(20)
    print str(elapsed)*20
pygame.quit()
You don't need to remove print statements. Save them for later debugging. ;-)
Two steps to solve this problem:
Firstly, keep all the code in a .py file (don't change it to .pyw for now); say it is actualCode.py
Then, create a new file runAs.pyw with the following lines in it:
# In runAs.pyw file, we will first send stdout to StringIO so that it is not printed
import sys # access to stdout
import StringIO # StringIO implements a file like class without the need of disc
sys.stdout = StringIO.StringIO() # sends stdout to StringIO (not printed anymore)
import actualCode # or whatever the name of your file is, see further details below
Note that just importing actualCode runs the file, so in actualCode.py you should not put the code to be executed inside what I call the "main running file" condition. For example,
# In actualCode.py file
....
....
....
if __name__ == '__main__': # Don't use this condition; it evaluates to False when imported
    ...                    # These lines won't be executed when this file is imported,
    ...                    # so keep these lines outside
# Note: The file in your question, as it is, is fine
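The effect of that stdout redirect can be checked without pygame at all. A minimal sketch of the mechanism (Python 2's StringIO module vs Python 3's io.StringIO is the only naming difference):

```python
import sys

try:
    from StringIO import StringIO   # Python 2
except ImportError:
    from io import StringIO        # Python 3

buffer = StringIO()
old_stdout = sys.stdout
sys.stdout = buffer            # from here on, print output goes to the buffer
print("frame time: 50")        # under pythonw this would otherwise hit a dead console
sys.stdout = old_stdout        # restore, so later output is visible again

captured = buffer.getvalue()   # everything printed while redirected
```

This is exactly what runAs.pyw does before importing actualCode: every print in the game loop lands harmlessly in the StringIO buffer instead of the console that pythonw doesn't have.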