Can't connect to SharePoint Online using Python - python-2.7

I'm trying to display all of a SharePoint site's list names, but I'm getting this error:
No handlers could be found for logger "office365.runtime.auth.saml_token_provider.SamlTokenProvider._process_service_token_response"
This is my code:
from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.sharepoint.client_context import ClientContext

url = 'https://abc.sharepoint.com/sites/siteName/'
ctx_auth = AuthenticationContext(url)
if ctx_auth.acquire_token_for_user(username='username@abc.com',
                                   password='password'):
    ctx = ClientContext(url, ctx_auth)
    lists = ctx.web.lists
    ctx.load(lists)
    ctx.execute_query()
    for l in lists:
        print(l.properties["Title"])
Thanks

I tested the code below with Python 2.7 and it works well.
from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.sharepoint.client_context import ClientContext

tenant_url = "https://company.sharepoint.com"
site_url = "https://company.sharepoint.com/sites/sname"
ctx_auth = AuthenticationContext(tenant_url)
if ctx_auth.acquire_token_for_user("abc@company.onmicrosoft.com", "mypassword"):
    ctx = ClientContext(site_url, ctx_auth)
    lists = ctx.web.lists
    ctx.load(lists)
    ctx.execute_query()
    for l in lists:
        print(l.properties["Title"])
else:
    print(ctx_auth.get_last_error())
If this is related to ADFS, please refer to this closed question:
https://github.com/vgrem/Office365-REST-Python-Client/issues/85
BR

Well, I found a solution to get data from a specific SharePoint list using SharePlum:
from shareplum import Site
from shareplum import Office365
import json
import pandas

# Authenticate against SharePoint Online and open the site
authcookie = Office365('https://abc.sharepoint.com/', username='username', password='password').GetCookies()
site = Site('https://abc.sharepoint.com/sites/SitesName/', authcookie=authcookie)
sp_list = site.List('ListName')
# print(sp_list)
data = sp_list.GetListItems(fields=['FieldName1', 'FieldName2'])
pandas.read_json(json.dumps(data)).to_csv("output.csv")
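Since GetListItems appears to return a list of dictionaries (which is why the json.dumps round trip above works), you can likely build the DataFrame from it directly; a minimal sketch under that assumption:
import pandas

# Assumption: sp_list is the SharePlum list object from the snippet above and
# GetListItems returns one dict per list item.
data = sp_list.GetListItems(fields=['FieldName1', 'FieldName2'])
df = pandas.DataFrame(data)            # columns come from the dict keys
df.to_csv("output.csv", index=False)   # index=False drops the extra row-number column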

Related

I have the following list of strings, yet I want to apply a filter so that I can get certain items from the list. How do I do that?

I am trying to obtain the image data from the following website.
However, I am getting a list of data that contains links that are not needed. I want to apply a filter so that I only get the data that starts with /PIAimages. How do I apply such a filter?
import requests
from bs4 import BeautifulSoup
import csv

result = []
response = requests.get("https://www.ikea.com/sa/en/catalog/products/00361049/")
assert response.ok
page = BeautifulSoup(response.text, "html.parser")
for des in page.find_all('img'):
    image = des.get('src')
    print(image)
Expected output:
/PIAimages/0531313_PE647261_S1.JPG
/PIAimages/0513228_PE638849_S1.JPG
/PIAimages/0618875_PE688687_S1.JPG
/PIAimages/0325432_PE517964_S1.JPG
/PIAimages/0690287_PE723209_S1.JPG
/PIAimages/0513996_PE639275_S1.JPG
/PIAimages/0325450_PE517970_S1.JPG
Actual output:
/ms/img/header/ikea-logo.svg
/ms/en_SA/img/header/ikea-store.png
/ms/img/header/main_menu_shadow.gif
/sa/en/images/products/strandmon-wing-chair-beige__0513996_PE639275_S4.JPG
/PIAimages/0531313_PE647261_S1.JPG
/PIAimages/0513228_PE638849_S1.JPG
/PIAimages/0618875_PE688687_S1.JPG
/PIAimages/0325432_PE517964_S1.JPG
/PIAimages/0690287_PE723209_S1.JPG
/PIAimages/0513996_PE639275_S1.JPG
/PIAimages/0325450_PE517970_S1.JPG
/ms/img/static/loading.gif
/ms/img/static/stock_check_green.gif
/ms/img/ads/services/ways_to_shop/20172_otav20a_assembly_20x20.jpg
/ms/en_SA/img/icons/picking-with-delivery.jpg
/ms/img/ads/services/ways_to_shop/20172_otav24a_pickingdelivery_20x20.jpg
/sa/en/images/products/strandmon-wing-chair-beige__0739100_PH147003_S4.JPG
https://smetrics.ikea.com/b/ss/ikeaallnojavascriptprod/5/?c8=sa&pageName=nojavascript
Use an if clause, then append the data to a list.
import requests
from bs4 import BeautifulSoup

result = []
response = requests.get("https://www.ikea.com/sa/en/catalog/products/00361049/")
assert response.ok
page = BeautifulSoup(response.text, "html.parser")
for des in page.find_all('img'):
    image = des.get('src')
    if 'PIAimages' in image:
        result.append(image)
print(result)
Or use a regular expression, which is faster:
import requests
import re
from bs4 import BeautifulSoup

result = []
response = requests.get("https://www.ikea.com/sa/en/catalog/products/00361049/")
assert response.ok
page = BeautifulSoup(response.text, "html.parser")
for des in page.find_all('img', src=re.compile("PIAimages")):
    image = des.get('src')
    result.append(image)
print(result)
I think it is faster and more concise to use a CSS attribute = value selector with the starts-with operator. You specify the starting substring for the src in the selector, so only qualifying elements are returned.
import requests
from bs4 import BeautifulSoup

response = requests.get("https://www.ikea.com/sa/en/catalog/products/00361049/")
page = BeautifulSoup(response.text, "lxml")
images = [item['src'] for item in page.select('img[src^="/PIAimages"]')]
print(images)
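If you later need absolute URLs rather than the relative /PIAimages paths, the standard library's urljoin can build them; a small sketch (Python 3), reusing the images list from above:
from urllib.parse import urljoin

base = "https://www.ikea.com/sa/en/catalog/products/00361049/"
full_urls = [urljoin(base, src) for src in images]  # resolves the relative src paths against the page URL
print(full_urls)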

Geocoding in Python: get latitude and longitude from addresses using an API key

I currently have a data frame with the address details of certain places. I want to use a Google geocoding API key to find the coordinates (latitude and longitude) in order to plot a map. Does anybody know how to do this? I have tried the code below, but it returns 'Error, skipping address...' for every address.
I would greatly appreciate any help!
import pandas as pd
import os
from geopy import geocoders
from geopy.geocoders import GoogleV3

API_KEY = os.getenv("API1234")
g = GoogleV3(api_key=API_KEY)
loc_coordinates = []
loc_address = []
for address in df.Address:
    try:
        inputAddress = Address
        location = g.geocode(inputAddress, timeout=15)
        loc_coordinates.append((location.latitude, location.longitude))
        loc_Address.append(inputAddress)
    except:
        print('Error, skipping address...')
df_geocodes = pd.DataFrame({'coordinate': loc_coordinates, 'address': loc_address})
You had some typos: Address instead of address, and loc_Address instead of loc_address.
But what is df.Address?
Try this:
import pandas as pd
import os
from geopy import geocoders
from geopy.geocoders import GoogleV3

API_KEY = os.getenv("API1234")
g = GoogleV3(api_key=API_KEY)
loc_coordinates = []
loc_address = []
for address in df.Address:
    try:
        inputAddress = address
        location = g.geocode(inputAddress, timeout=15)
        loc_coordinates.append((location.latitude, location.longitude))
        loc_address.append(inputAddress)
    except Exception as e:
        print('Error, skipping address...', e)
df_geocodes = pd.DataFrame({'coordinate': loc_coordinates, 'address': loc_address})
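If the failures come from sending requests too quickly rather than from the typos, geopy also ships a RateLimiter helper that spaces out the calls; a minimal sketch, assuming the same g geocoder from above:
from geopy.extra.rate_limiter import RateLimiter

# Wrap g.geocode so there is at least one second between requests.
geocode = RateLimiter(g.geocode, min_delay_seconds=1)
location = geocode("1600 Amphitheatre Parkway, Mountain View, CA", timeout=15)  # example address
if location:
    print(location.latitude, location.longitude)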

How to page through QueryResults

I am getting results from BigQuery using the following code:
from google.oauth2 import service_account
from google.cloud import bigquery
credential = service_account.Credentials.from_service_account_file(SERVICE_ACCOUNT_FILE)
scoped_credential = credential.with_scopes(BIG_QUERY_SCOPE)
client = bigquery.Client(project="XX-XX",credentials=scoped_credential)
query_results = client.run_sync_query(query_detail)
query_results.use_legacy_sql = False
query_results.run()
iterator = query_results.fetch_data()
rows = iterator.query_result.rows
But it only returns up to 50,000 rows. I tried to paginate while fetching the data, but failed to figure out how to do it:
page_token = query_results.page_token
iterator = query_results.fetch_data(max_results=500, page_token=page_token)
I could not find out how to get the updated page_token.
Thanks,
I think you are close. Try running this code now:
data = list(query_results.fetch_data())  # variable renamed from `iterator` to `data`
The management of page tokens is done automatically for you.
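If you do want to walk the pages yourself, for example with a newer google-cloud-bigquery release where client.query() replaces run_sync_query(), something like this sketch should work (the query string and page size are only examples):
from google.cloud import bigquery

client = bigquery.Client(project="XX-XX")                      # assumes default credentials
job = client.query("SELECT * FROM `project.dataset.table`")    # hypothetical query
row_iterator = job.result(page_size=500)                       # server-side pagination, 500 rows per page
for page in row_iterator.pages:                                # pages are fetched lazily; tokens handled for you
    for row in page:
        pass  # process each row here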

Regular expression to find precise pdf links in a webpage

Given url='http://normanpd.normanok.gov/content/daily-activity', the website has three types of reports: arrests, incidents, and case summaries. I was asked to use regular expressions in Python to discover the URL strings of all the Incident pdf documents.
The pdfs are to be downloaded to a defined location.
I have gone through the link and found that the Incident pdf file URLs are of the form:
normanpd.normanok.gov/filebrowser_download/657/2017-02-19%20Daily%20Incident%20Summary.pdf
I have written this code:
import urllib.request
import re

url = "http://normanpd.normanok.gov/content/daily-activity"
response = urllib.request.urlopen(url)
data = response.read()  # a `bytes` object
text = data.decode('utf-8')
urls = re.findall(r'(\w|/|-/%)+\sIncident\s(%|\w)+\.pdf$', text)
But the urls list comes back empty.
I am a beginner with Python 3 and regex. Can anyone help me?
This is not an advisable method. Instead, use an HTML parsing library like bs4 (BeautifulSoup) to find the links, and use a regex only to filter the results.
from urllib.request import urlopen
from bs4 import BeautifulSoup
import re

url = "http://normanpd.normanok.gov/content/daily-activity"
response = urlopen(url).read()
soup = BeautifulSoup(response, "html.parser")
links = soup.find_all('a', href=re.compile(r'(Incident%20Summary\.pdf)'))
for el in links:
    print("http://normanpd.normanok.gov" + el['href'])
Output :
http://normanpd.normanok.gov/filebrowser_download/657/2017-02-23%20Daily%20Incident%20Summary.pdf
http://normanpd.normanok.gov/filebrowser_download/657/2017-02-22%20Daily%20Incident%20Summary.pdf
http://normanpd.normanok.gov/filebrowser_download/657/2017-02-21%20Daily%20Incident%20Summary.pdf
http://normanpd.normanok.gov/filebrowser_download/657/2017-02-20%20Daily%20Incident%20Summary.pdf
http://normanpd.normanok.gov/filebrowser_download/657/2017-02-19%20Daily%20Incident%20Summary.pdf
http://normanpd.normanok.gov/filebrowser_download/657/2017-02-18%20Daily%20Incident%20Summary.pdf
http://normanpd.normanok.gov/filebrowser_download/657/2017-02-17%20Daily%20Incident%20Summary.pdf
But if you were asked to use only regexes, then try something simpler:
import urllib.request
import re

url = "http://normanpd.normanok.gov/content/daily-activity"
response = urllib.request.urlopen(url)
data = response.read()  # a `bytes` object
text = data.decode('utf-8')
urls = re.findall(r'(filebrowser_download.+?Daily%20Incident.+?\.pdf)', text)
print(urls)
for link in urls:
    print("http://normanpd.normanok.gov/" + link)
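Since the question also asks to download the pdfs to a defined location, here is a minimal sketch using urlretrieve with the relative links found above (the target folder name is just an example):
import os
import urllib.request

target_dir = "incident_pdfs"  # example location, change as needed
os.makedirs(target_dir, exist_ok=True)
for link in urls:
    full_url = "http://normanpd.normanok.gov/" + link
    filename = os.path.join(target_dir, full_url.split('/')[-1])
    urllib.request.urlretrieve(full_url, filename)  # saves the pdf to disk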
Using BeautifulSoup, this is an easy way:
from urllib.request import urlopen
from bs4 import BeautifulSoup

url = "http://normanpd.normanok.gov"
open_page = urlopen(url + "/content/daily-activity").read()
soup = BeautifulSoup(open_page, 'html.parser')
links = []
for link in soup.find_all('a'):
    current = link.get('href')
    if current and current.endswith('pdf') and "Incident" in current:
        links.append('{0}{1}'.format(url, current))

need to get the exact redirect link

I need to get the final URL of the link, but this code is only giving me a link to its store.
It returns this link: http://www.amazon.in/electronics/b?ie=UTF8&node=976419031
But what I need is: http://www.amazon.in/Samsung-G-550FY-On5-Pro-Gold/dp/B01FM7GGFI?tag=prdeskdetailmob-21&ascsubtag=desktop-mobile-15920-blank-27092016
import mechanize
br = mechanize.Browser()
br.open("https://priceraja.com/r/go2store.php?mpc=mobile--1178916--15920--deskdetail")
br.select_form(nr=0)
br.submit()
x=br.geturl()
print x
You can use Selenium and simply wait until the URL changes from the original link:
from selenium import webdriver
import time

chrome_path = r"C:\Users\Bhanwar\Desktop\price raja mobile\working\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
link = "https://priceraja.com/r/go2store.php?mpc=mobile--1185105--15236--deskdetail"
driver.get(link)
while link == driver.current_url:  # poll until the redirect has happened
    time.sleep(3)
redirected_url = driver.current_url
print redirected_url
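A variant of the same idea, if you prefer Selenium's explicit waits over the sleep loop (a sketch, assuming the same driver and link as above):
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Block for up to 30 seconds until the browser URL differs from the original link.
WebDriverWait(driver, 30).until(EC.url_changes(link))
print driver.current_url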