Scraping Jira and entering it into a Google Sheet (filling cells) - python-2.7

I need help figuring out how to finish my script.
So far I can get the info I need from Jira, and I now know the basics of entering values into Google Sheets, but my problem is combining the two.
If I just run the scraper for Jira I get 50 different values, but when I add the Google Sheets code it only enters the last value of the 50.
Does anyone know how I can fix this?
I am using Python and working in PyCharm.
Thanks!
# coding=utf-8
from jira.client import JIRA
import gspread
from oauth2client.service_account import ServiceAccountCredentials

SCOPES = ["https://spreadsheets.google.com/feeds"]
credentials = ServiceAccountCredentials.from_json_keyfile_name("blank", SCOPES)
connection = gspread.authorize(credentials)

options = {'server': 'https://jira.blank.com/'}
jira = JIRA(options, basic_auth=('blank', 'blank'))
projects = jira.projects()

for i in jira.search_issues('filter=11152'):
    print i

worksheet = connection.open("blank").sheet1
cell_list = worksheet.range('A2:A51')

for cell in cell_list:
    cell.value = i

# Update in batch
worksheet.update_cells(cell_list)
print("Done updating, check the spreadsheet now")

Related

Python win32com to forward a selected email with added content

Some time ago I wrote a simple Python app which asks users for input and generates a new mail via the Outlook app based on the input. Now I've been asked to add some functionality so that the app will no longer generate a new mail but instead forward a selected email and add content to it. While I was able to write code which generates a new mail, I'm completely lost when it comes to forwarding selected mails.
At the moment I use something like this to send a new email:
import win32com.client
from win32com.client import Dispatch
const=win32com.client.constants
olMailItem = 0x0
obj = win32com.client.Dispatch("Outlook.Application")
newMail = obj.CreateItem(olMailItem)
newMail.SentOnBehalfOfName = 'mail@mail.com'
newMail.Subject = ""
newMail.BodyFormat = 2
newMail.HTMLBody = output
newMail.To = ""
newMail.CC = ""
newMail.display()
And I know that by using something like this you can select an email in Outlook so that Python can interact with it:
obj = win32com.client.Dispatch("Outlook.Application")
selection = obj.ActiveExplorer().Selection
How do I merge these two together so that the app forwards a selected email and adds new content at the top? I tried to figure it out by trial and error, but finally I gave up. The Microsoft API documentation also wasn't very helpful, as I wasn't really able to understand much of it (I'm not a dev). Any help appreciated.
Replace the line newMail = obj.CreateItem(olMailItem) with
newMail = obj.ActiveExplorer().Selection.Item(1).Forward()
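Putting the two snippets together, a rough sketch might look like the following; the HTML snippet and recipient address are placeholders, and Forward() acts on whatever mail is currently selected:

import win32com.client

obj = win32com.client.Dispatch("Outlook.Application")

# Work on the first mail currently selected in the active Outlook window
selection = obj.ActiveExplorer().Selection
newMail = selection.Item(1).Forward()

# Prepend your own content above the quoted original message
newMail.HTMLBody = "<p>Please see the forwarded message below.</p>" + newMail.HTMLBody
newMail.To = "someone@example.com"   # placeholder recipient
newMail.Display()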

Scraper stopped scraping

I ran scraping ops this morning:
The scraper runs through the list fine, but just keeps writing "skipped", as per the code.
I have checked a few of the pages and confirmed the information I require is on the website.
I have pulled my code apart piece by piece but cannot find any changes - I've even gone back to a vanilla version of my code to check, and still no luck.
Could someone please run this and see what I am missing, as I am going insane!
Target website https://www.realestate.com.au/property/12-buckingham-dr-werribee-vic-3030
Code:
import requests
import csv
from lxml import html

text2search = '''<p class="property-value__title">
RECENTLY SOLD
</p>'''

quote_page = ["https://www.realestate.com.au/property/12-buckingham-dr-werribee-vic-3030"]

with open('index333.csv', 'w') as csv_file:
    writer = csv.writer(csv_file)
    for index, url in enumerate(quote_page):
        page = requests.get(url)
        if text2search in page.text:
            tree = html.fromstring(page.content)
            (title,) = (x.text_content() for x in tree.xpath('//title'))
            (price,) = (x.text_content() for x in tree.xpath('//div[@class="property-value__price"]'))
            (sold,) = (x.text_content().strip() for x in tree.xpath('//p[@class="property-value__agent"]'))
            writer.writerow([url, title, price, sold])
        else:
            writer.writerow([url, 'skipped'])
There was a change in the HTML code that introduced additional white space.
This stopped the text2search in page.text check from matching.
Thanks to @MarcinOrlowski for pointing me in the right direction.
Thanks to advice from @MT, the code has been shortened to lessen the chances of this occurring again.
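For reference, one way to make the check less fragile against markup changes like this is to collapse whitespace before comparing. This is only a sketch along those lines, reusing the URL and class name from the question:

import re
import requests

url = "https://www.realestate.com.au/property/12-buckingham-dr-werribee-vic-3030"
page = requests.get(url)

# Collapse every run of whitespace to a single space before comparing, so
# extra spaces or newlines added to the markup no longer break the check
normalised = re.sub(r'\s+', ' ', page.text)
if '<p class="property-value__title"> RECENTLY SOLD </p>' in normalised:
    print('recently sold marker found')
else:
    print('skipped')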

Tweepy rate limit / pagination issue.

I've put together a small Twitter tool to pull relevant tweets for later analysis in a latent semantic analysis. Ironically, that bit (the more complicated bit) works fine - it's pulling the tweets that's the problem. I'm using the code below to set it up.
This technically works, but not as expected - I thought the .items(200) parameter would pull 200 tweets per request, but the results come back in 15-tweet chunks (so the 200 items 'cost' me 13 requests). I understand that this is the original/default RPP variable (now 'count' in the Twitter docs), but I've tried that in the Cursor settings (rpp=100, which is the maximum in the Twitter documentation), and it makes no difference.
Tweepy/Cursor docs
The other nearest similar question isn't quite the same issue
Grateful for any thoughts! I'm sure it's a minor tweak to the settings, but I've tried various settings on page and rpp, to no avail.
import tweepy
from tweepy import Cursor
from tools import read_user, read_tweet
from auth import basic

auth = tweepy.OAuthHandler(apikey, apisecret)
auth.set_access_token(access_token, access_token_secret_var)
api = tweepy.API(auth)
current_results = []

for tweet in Cursor(api.search,
                    q=search_string,
                    result_type="recent",
                    include_entities=True,
                    lang="en").items(200):
    current_user, created = read_user(tweet.author)
    current_tweet, created = read_tweet(tweet, current_user)
    current_results.append(tweet)

print current_results
I worked it out in the end, with a little assistance from colleagues. Afaict, the rpp and items() settings are applied after the actual API call. The 'count' option from the Twitter documentation, which was formerly RPP as mentioned above and is still noted as rpp in Tweepy 2.3.0, seems to be at issue here.
What I ended up doing was modifying the Tweepy code - in api.py, I added 'count' to the search bind section (around L643 in my install, ymmv).
""" search """
search = bind_api(
    path = '/search/tweets.json',
    payload_type = 'search_results',
    allowed_param = ['q', 'count', 'lang', 'locale', 'since_id', 'geocode',
                     'max_id', 'since', 'until', 'result_type',
                     'include_entities', 'from', 'to', 'source']
)
This allowed me to tweak the code above to:
for tweet in Cursor(api.search,
                    q=search_string,
                    count=100,
                    result_type="recent",
                    include_entities=True,
                    lang="en").items(200):
Which results in two calls, not fifteen; I've double-checked this with
print api.rate_limit_status()["resources"]
after each call, and it only decrements my remaining searches by 2 each time.
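As an aside, if the limiting factor ever becomes the rate limit itself rather than the page size, newer Tweepy releases (later than the 2.3.0 mentioned above) let the API object pause automatically; this assumes a version in use that supports these flags:

# Sleep through rate-limit windows automatically (available in newer Tweepy versions)
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)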

Python library to access a CalDAV server

I run ownCloud on my webspace for a shared calendar. Now I'm looking for a suitable Python library to get read-only access to the calendar. I want to put some of the calendar's information on an intranet website.
I have tried http://trac.calendarserver.org/wiki/CalDAVClientLibrary but it always returns a NotImplementedError with the query command, so my guess is that the query command doesn't work well with the given library.
What library could I use instead?
I recommend the caldav library.
Read-only access works really well with this library and looks straightforward to me. It does the whole job of getting calendars and reading events, returning them in the iCalendar format. More information about the caldav library can be found in its documentation.
import caldav

client = caldav.DAVClient(<caldav-url>, username=<username>,
                          password=<password>)
principal = client.principal()
for calendar in principal.calendars():
    for event in calendar.events():
        ical_text = event.data
From there you can use the icalendar library to read specific fields such as the type (e.g. event, todo, alarm), name, times, etc. - a good starting point may be this question.
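As a short sketch of that last step, the raw text in ical_text can be handed straight to the icalendar parser; the field names below are standard iCalendar properties:

from icalendar import Calendar

cal = Calendar.from_ical(ical_text)
for component in cal.walk():
    if component.name == "VEVENT":
        # summary, dtstart and dtend are standard VEVENT properties
        print(component.get('summary'))
        print(component.get('dtstart').dt)
        print(component.get('dtend').dt)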
I wrote this code a few months ago to fetch data from CalDAV and present it on my website.
I converted the data into JSON format, but you can do whatever you want with it.
I have added some print statements so you can see the output; you can remove them in production.
from datetime import datetime
import json
from pytz import UTC  # timezone
import caldav
from icalendar import Calendar, Event

# CalDAV info
url = "YOUR CALDAV URL"
userN = "YOUR CALDAV USERNAME"
passW = "YOUR CALDAV PASSWORD"

client = caldav.DAVClient(url=url, username=userN, password=passW)
principal = client.principal()
calendars = principal.calendars()

if len(calendars) > 0:
    calendar = calendars[0]
    print("Using calendar", calendar)
    results = calendar.events()

    eventSummary = []
    eventDescription = []
    eventDateStart = []
    eventdateEnd = []
    eventTimeStart = []
    eventTimeEnd = []

    for eventraw in results:
        event = Calendar.from_ical(eventraw._data)
        for component in event.walk():
            if component.name == "VEVENT":
                print(component.get('summary'))
                eventSummary.append(component.get('summary'))

                print(component.get('description'))
                eventDescription.append(component.get('description'))

                startDate = component.get('dtstart')
                print(startDate.dt.strftime('%m/%d/%Y %H:%M'))
                eventDateStart.append(startDate.dt.strftime('%m/%d/%Y'))
                eventTimeStart.append(startDate.dt.strftime('%H:%M'))

                endDate = component.get('dtend')
                print(endDate.dt.strftime('%m/%d/%Y %H:%M'))
                eventdateEnd.append(endDate.dt.strftime('%m/%d/%Y'))
                eventTimeEnd.append(endDate.dt.strftime('%H:%M'))

                dateStamp = component.get('dtstamp')
                print(dateStamp.dt.strftime('%m/%d/%Y %H:%M'))
                print('')

    # Modify or change these values based on your CalDAV
    # Converting to JSON
    data = [{'Events Summary': eventSummary[0],
             'Event Description': eventDescription[0],
             'Event Start date': eventDateStart[0],
             'Event End date': eventdateEnd[0],
             'At:': eventTimeStart[0],
             'Until': eventTimeEnd[0]}]
    data_string = json.dumps(data)
    print('JSON:', data_string)
pyOwnCloud could be the right thing for you. I haven't tried it, but it should provide a command-line tool/API for reading the calendars.
You probably want to provide more details about how you are actually using the API, but in case the query command is indeed not implemented, there is a list of other Python libraries on the CalConnect website (archived version; the original link is dead now).

How to read the contents of active directory using python-ldap?

My script is like this:
import ldap, sys

server = 'ldap://my_server'
l = ldap.initialize(server)
dn = "myname@mydomain"
pw = "password"
l.simple_bind_s(dn, pw)
ldap.set_option(ldap.OPT_REFERRALS, 0)
print "valid"
I am using Python 2.7 on windows.
Is there any method to read or get the contents of active directory?
You can also do quite a lot using win32com.client (which I had trouble finding documentation for). For example, I needed to resolve a user's email knowing his ADS_NAME_TYPE_NT4-formatted name (domain\jonjoe).
First of all you need to convert it to ADS_NAME_TYPE_1779 format (CN=Jeff Smith,CN=users,DC=Fabrikam,DC=com):
name_resolver = win32com.client.Dispatch(dispatch='NameTranslate')
name_resolver.Set(3, 'domain\\jonjoe')
ldap_query = 'LDAP://{}'.format(name_resolver.Get(1))
Once you have that you can simply call GetObject():
ldap = win32com.client.GetObject(ldap_query)
print(ldap.Get('mail'))
Tested with Python 3.2.5
You really should read the python-ldap documentation: http://www.python-ldap.org/docs.shtml
You have a connection in your variable l, so you can then do this:
l.search_s('dc=your,dc=base,dc=dit', ldap.SCOPE_SUBTREE, '(uid=*)', ['uid', 'uidnumber'])
The above call searches the whole subtree for entries matching uid=* and, for each entry, returns the uid and uidnumber attributes.
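Putting that together with the connection from the question, a minimal read-only sketch could look like this; the base DN, filter and attribute names are placeholders you would adapt to your own domain:

import ldap

server = 'ldap://my_server'
l = ldap.initialize(server)
# Active Directory sends referrals that python-ldap chases by default;
# turning them off before binding avoids spurious errors
l.set_option(ldap.OPT_REFERRALS, 0)
l.simple_bind_s("myname@mydomain", "password")

base_dn = 'dc=mydomain,dc=com'         # placeholder base DN
search_filter = '(sAMAccountName=*)'   # placeholder filter
attributes = ['cn', 'mail']            # attributes to fetch

for dn, attrs in l.search_s(base_dn, ldap.SCOPE_SUBTREE, search_filter, attributes):
    if dn is not None:  # skip referral entries, which come back with a dn of None
        print dn, attrs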