I am trying to write a script in Python where the user is prompted to enter a YouTube link on the command line. The video at this link should then be downloaded:
from pytube import YouTube
downloadFile = raw_input("Enter your Youtube link: ")
YouTube(downloadFile).streams.first().download()
However when the link is entered on the command line I get the following:
File "dl.py", line 10, in <module>
YouTube(downloadFile).streams.first().download()
File "build/bdist.linux-x86_64/egg/pytube/__main__.py", line 69, in __init__
File "build/bdist.linux-x86_64/egg/pytube/extract.py", line 43, in video_id
File "build/bdist.linux-x86_64/egg/pytube/helpers.py", line 39, in regex_search
pytube.exceptions.RegexMatchError: regex pattern ((?:v=|\/)([0-9A-Za-z_-]{11}).*) had zero matches
I am able to get it working via the Python interpreter.
Any suggestions are welcome!
Ok, after a bit of rooting around this seems to do the trick
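import sys
import pytube
# Prompt for the video URL on the command line (raw_input is Python 2)
link = raw_input('Please enter a url link\n')
yt = pytube.YouTube(link)
# Take the first available stream and download it
stream = yt.streams.first()
finished = stream.download()
print 'Download is complete'
sys.exit()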
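The RegexMatchError in the question means pytube could not find an 11-character video id in the string it was given. As a rough sketch, the pattern printed in that traceback can be reused to check the typed link before handing it to pytube. The helper name `extract_video_id` is made up for illustration and is not part of pytube:

```python
import re

# Pattern copied from the RegexMatchError message in the traceback.
VIDEO_ID_PATTERN = re.compile(r"(?:v=|\/)([0-9A-Za-z_-]{11}).*")


def extract_video_id(link):
    """Return the 11-character video id, or None if the link does not match.

    Hypothetical helper for illustration only; pytube does its own parsing.
    """
    # Surrounding whitespace from the prompt is stripped before matching.
    match = VIDEO_ID_PATTERN.search(link.strip())
    return match.group(1) if match else None
```

If this returns None for the text read from the prompt, printing `repr(link)` should show what the script actually received.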
I am writing a script that will save the complete contents of a web page. If I try using urllib2 and bs4 it only writes the contents of the logon page and none of the content after navigating to a search within the page. However, if I do a ctrl + s on the search results page, an html file is saved to disk that when opened in a text editor has all of the contents from the search results.
I've read several posts here on the subject and am trying to use the steps in this one:
How to save "complete webpage" not just basic html using Python
However, after installing geckodriver and setting the sys path variable I continue to get errors. Here is my limited code:
>>> from selenium import webdriver
>>> from selenium.webdriver.common.action_chains import ActionChains
>>> from selenium.webdriver.common.keys import Keys
>>> br = webdriver.Firefox()
Here is the error:
Traceback (most recent call last):
File "<interactive input>", line 1, in <module>
File "C:\Python27\lib\site-packages\selenium\webdriver\firefox\webdriver.py", line 142, in __init__
self.service.start()
File "C:\Python27\lib\site-packages\selenium\webdriver\common\service.py", line 81, in start
os.path.basename(self.path), self.start_error_message)
WebDriverException: Message: 'geckodriver' executable needs to be in PATH.
And here is where I set the sys path variable:
I've restarted after setting sys path variable.
UPDATE:
I am now trying to use chromedriver, as this seemed more straightforward. I downloaded chromedriver_win32.zip (I'm on a Windows laptop) from chromedriver's download page and set the environment variable path to:
C:\Python27\Lib\site-packages\selenium\webdriver\chrome\chromedriver.exe
but I am getting the following similar error:
>>> br = webdriver.Chrome()
Traceback (most recent call last):
File "<interactive input>", line 1, in <module>
File "C:\Python27\lib\site-packages\selenium\webdriver\chrome\webdriver.py", line 62, in __init__
self.service.start()
File "C:\Python27\lib\site-packages\selenium\webdriver\common\service.py", line 81, in start
os.path.basename(self.path), self.start_error_message)
WebDriverException: Message: 'chromedriver' executable needs to be in PATH. Please see https://sites.google.com/a/chromium.org/chromedriver/home
You also have to add the path of Firefox to the system variables manually.
You may have installed Firefox in some other location, so Selenium tries to find and launch Firefox from the default location and cannot find it. In that case you need to provide the location of the installed Firefox binary explicitly:
from selenium import webdriver
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
# Replace the placeholder with the real path to the Firefox executable
binary = FirefoxBinary('path/to/installed firefox binary')
# Launch Firefox using the binary given above
browser = webdriver.Firefox(firefox_binary=binary)
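Both error messages in the question say the driver executable "needs to be in PATH". PATH is a list of directories, so the entry should be the folder that contains geckodriver.exe or chromedriver.exe rather than the full path to the .exe file itself. A minimal sketch of doing this from inside the script, assuming the driver has already been unzipped somewhere; `add_driver_dir_to_path` is a made-up helper name, not a Selenium function:

```python
import os


def add_driver_dir_to_path(driver_exe):
    """Prepend the folder containing driver_exe to PATH for this process.

    Hypothetical helper for illustration; it does not check that the file exists.
    """
    driver_dir = os.path.dirname(os.path.abspath(driver_exe))
    # Split the current PATH into its directories, dropping empty entries.
    entries = [e for e in os.environ.get("PATH", "").split(os.pathsep) if e]
    if driver_dir not in entries:
        os.environ["PATH"] = os.pathsep.join([driver_dir] + entries)
    return driver_dir


# Intended use, before creating the browser (path taken from the question):
# add_driver_dir_to_path(r"C:\Python27\Lib\site-packages\selenium\webdriver\chrome\chromedriver.exe")
# br = webdriver.Chrome()
```

This only changes PATH for the running Python process. Editing the Windows environment variable so that it ends at the `chrome` folder should have the same effect for every new console or IDE session.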
I'm trying to put an Item into an Amazon DynamoDB table using a python script, but when I run the python script I get following error:
Traceback (most recent call last):
File "./table.py", line 32, in <module>
item.put(None, None)
File "/usr/local/lib/python2.7/dist-packages/boto/dynamodb/item.py", line 183, in put
return self.table.layer2.put_item(self, expected_value, return_values)
File "/usr/local/lib/python2.7/dist-packages/boto/dynamodb/layer2.py", line 551, in put_item
object_hook=self.dynamizer.decode)
File "/usr/local/lib/python2.7/dist-packages/boto/dynamodb/layer1.py", line 384, in put_item
object_hook=object_hook)
File "/usr/local/lib/python2.7/dist-packages/boto/dynamodb/layer1.py", line 119, in make_request
retry_handler=self._retry_handler)
File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 954, in _mexe
status = retry_handler(response, i, next_sleep)
File "/usr/local/lib/python2.7/dist-packages/boto/dynamodb/layer1.py", line 159, in _retry_handler
data)
boto.exception.DynamoDBResponseError: DynamoDBResponseError: 400 Bad Request
{u'message': u'Requested resource not found', u'__type': u'com.amazonaws.dynamodb.v20111205#ResourceNotFoundException'}
My code is:
#!/usr/bin/python
import boto
import boto.s3
import sys
from boto import dynamodb2
from boto.dynamodb2.table import Table
from boto.s3.key import Key
import boto.dynamodb
conn = boto.dynamodb.connect_to_region('us-west-2', aws_access_key_id=<My_access_key>, aws_secret_access_key=<my_secret_key>)
entity = conn.create_schema(hash_key_name='RPI_ID', hash_key_proto_value=str, range_key_name='PIC_ID', range_key_proto_value=str)
table = conn.create_table(name='tblSensor', schema=entity, read_units=10, write_units=10)
item_data = {
    'Pic_id': 'P100',
    'RId': 'R100',
    'Temperature': '28.50'
}
item = table.new_item(
    # Our hash key is 'forum'
    hash_key='RPI_ID',
    # Our range key is 'subject'
    range_key='PIC_ID',
    # This has the
    attrs=item_data
)
item.put()  # I got error here.
My reference is: Setting/Getting/Deleting CORS Configuration on a Bucket
I ran your code in my account and it worked 100% perfect, returning:
{u'ConsumedCapacityUnits': 1.0}
You might want to check that you are using the latest version of boto:
pip install boto --upgrade
I searched on Google and solved the problem. I set the correct time and date on my Raspberry Pi board and ran the program again, and it now works fine.
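Since the fix here was correcting the system clock, and AWS request signatures include a timestamp, a wrong clock is worth ruling out on a board that may have lost its time. A rough sketch, written for Python 3, that measures how far the local clock is from a reference time given as an HTTP `Date` header value. The function name `clock_skew_seconds` is made up, and where the header comes from is left to the reader:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def clock_skew_seconds(http_date, local_now=None):
    """Seconds by which the local clock is ahead of the given HTTP Date value.

    Assumes http_date carries a time zone such as "GMT", as HTTP Date headers do.
    """
    server_time = parsedate_to_datetime(http_date)
    if local_now is None:
        local_now = datetime.now(timezone.utc)
    return (local_now - server_time).total_seconds()
```

A result of more than a few minutes in either direction suggests setting the date and time before retrying the script.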
I'm trying to loop through the list of companies in the link. The link for each company name is dynamic, for example http://ae.bizdirlib.com/node/946273, where the number 946273 is different for each company. I want to open each of these links on the page in a browser, but I'm really confused about how to do this. This is what I have tried so far:
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.keys import Keys
import time
browser = webdriver.Firefox() # Get local session of firefox
#wait until the pages are loaded
browser.implicitly_wait(3)
browser.get("http://ae.bizdirlib.com/taxonomy/term/1493") # Load page
browser.refresh()
page_source = browser.page_source
for node in page_source:
    link = browser.find_element_by_link_text('node').click
Executing this code gives an error:
Traceback (most recent call last):
File "C:/Python27/automation scripts/ggulf/large data.py", line 29, in <module>
link = browser.find_element_by_link_text('node').click
File "C:\Python27\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 276, in find_element_by_link_text
return self.find_element(by=By.LINK_TEXT, value=link_text)
File "C:\Python27\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 684, in find_element
{'using': by, 'value': value})['value']
File "C:\Python27\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 195, in execute
self.error_handler.check_response(response)
File "C:\Python27\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 170, in check_response
raise exception_class(message, screen, stacktrace)
NoSuchElementException: Message: Unable to locate element: {"method":"link text","selector":"node"}
Stacktrace:
at FirefoxDriver.prototype.findElementInternal_ (file:///c:/users/akrakhan/appdata/local/temp/tmppveyk8/extensions/fxdriver#googlecode.com/components/driver-component.js:10299)
at fxdriver.Timer.prototype.setTimeout/<.notify (file:///c:/users/akrakhan/appdata/local/temp/tmppveyk8/extensions/fxdriver#googlecode.com/components/driver-component.js:603)
You are better off looking for something more specific rather than looping through the page source. All the company links are links inside an h2 tag. You can find them using the CSS selector h2 > a, which reads as "find all a tags that are a direct child (>) of an h2 element".
browser.get("http://ae.bizdirlib.com/taxonomy/term/1493") # Load page
links = browser.find_elements_by_css_selector("h2 > a")
for link in links:
    link.click()  # click is a method, so it needs the parentheses
This isn't the final solution because clicking the link will take you off the main page but it's a parallel to what you were trying to accomplish. Probably a better approach would be to store the URLs of all the company links in a string array and then loop through that array navigating to each URL... or something like that. An exercise for the reader... :)
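One way to sketch the "store the URLs, then loop" idea: read every href first, while still on the listing page, and only then navigate. The function names are made up for illustration. The sketch calls `find_elements` with the strategy string "css selector", which is the value behind `By.CSS_SELECTOR`, in place of `find_elements_by_css_selector`:

```python
def collect_company_urls(driver, selector="h2 > a"):
    """Return the href of every element matching selector on the current page."""
    # "css selector" is the string value behind selenium's By.CSS_SELECTOR.
    links = driver.find_elements("css selector", selector)
    return [link.get_attribute("href") for link in links]


def visit_all(driver, urls):
    """Navigate to each URL in turn."""
    # The hrefs are plain strings, so they stay usable after leaving the page.
    for url in urls:
        driver.get(url)


# Intended use with the browser from the question:
# browser.get("http://ae.bizdirlib.com/taxonomy/term/1493")
# visit_all(browser, collect_company_urls(browser))
```

Collecting the strings up front avoids reusing link elements that belong to a page the browser has already left.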
I am currently using the following versions
Python - 2.7.10 ( 32 bit , win)
AndroidViewClient - androidviewclient-10.7.1-py2.7.egg
I have a simple program as below
import sys
import os
try:
    sys.path.insert(0, os.path.join(os.environ['ANDROID_VIEW_CLIENT_HOME'], 'src'))
except:
    pass
from com.dtmilano.android.viewclient import ViewClient
device, serialno = ViewClient.connectToDeviceOrExit()
vc = ViewClient(device=device, serialno=serialno)
device.takeSnapshot().save('Menu.png','PNG')
This is giving me the following error
Traceback (most recent call last):
File "dump.py", line 14, in <module>
device.takeSnapshot().save('Menu.png','PNG')
File "C:\Python27\lib\site-packages\androidviewclient-10.7.1-py2.7.egg\com\dtmilano\android\adb\adbclient.py", line 678, in takeSnapshot
image = Image.open(stream)
File "C:\Python27\lib\site-packages\PIL\Image.py", line 2126, in open
% (filename if filename else fp))
IOError: cannot identify image file <cStringIO.StringI object at 0x023462A8>
The same snippet works for some devices, but not for others.
How can I figure out what is wrong with the devices where it doesn't work?
Also, please help me identify any configuration issues, as I am new to this.
I have HTML that looks like the three following sample statements:
...
12
13
(I'd presently be on pg. 11.)
I don't know the Py/Selenium/Splinter syntax for selecting one of the page numbers in a list and clicking on it to go to that page. (Also, I need to be able to identify the element in the argument as, for example, 'Page$10' or 'Page$12', as seen in the __doPostBack notation. Maybe just a 'next page', in so many words, would be fine, but I don't even know how to do that.)
Thank you for any help.
UPDATE II: Here's the code I have to work from:
import time
import win32ui
import win32api
import win32con
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from ctypes import *
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.get('http://[site]');
UPDATE III:
Traceback (most recent call last):
File "montpa_05.py", line 47, in <module>
continue_link = driver.find_element_by_link_text('4')
File "C:\Python27\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 246, in find_element_by_link_text
return self.find_element(by=By.LINK_TEXT, value=link_text)
File "C:\Python27\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 680, in find_element
{'using': by, 'value': value})['value']
File "C:\Python27\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 165, in execute
self.error_handler.check_response(response)
File "C:\Python27\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 164, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: u'no such element\n (Session info: chrome=28.0.1500.95)\n (Driver info: chromedriver=2.2,platform=Windows NT 6.1 SP1 x86_64)'
The <a> element is defined as a link. That means that you can select it by link text.
I don't know Python, but the Java syntax would be By.linkText(##), where ## is the page number you want to click on.
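For reference, a Python counterpart of the Java locator might look like the sketch below. The helper name `click_page_link` is made up. It uses the strategy string "link text", which is the value behind `By.LINK_TEXT`. Note that the traceback in UPDATE III already shows a link-text lookup raising NoSuchElementException, so this alone may not be enough if the link is not present when the lookup runs:

```python
def click_page_link(driver, page_number):
    """Click the pager link whose visible text is the given page number."""
    # "link text" is the string value behind selenium's By.LINK_TEXT.
    link = driver.find_element("link text", str(page_number))
    link.click()
    return link


# Intended use with the driver from the question:
# click_page_link(driver, 12)
```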