fetch the retweets for the tweets using python - python-2.7

I have to fetch the retweets for a set of tweets and create a JSON file containing the retweets, user IDs, etc. using a Python script. Kindly help me sort out this issue.
Thanks in advance!!

This task requires knowledge of several areas. Since you ask in a general way, I reckon you want a script you can run immediately, but setting up the process takes some time.
This part connects to the Twitter API:
from twython import Twython, TwythonError
APP_KEY = 'YOUR_APP_KEY'
APP_SECRET = 'YOUR_APP_SECRET'
twitter = Twython(APP_KEY, APP_SECRET)
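Use the Twitter API calls exposed by Twython. You can find a list at https://twython.readthedocs.io/en/latest/api.html; the parameters are the same as in the Twitter API and are passed as keyword arguments:
# tweet_id is the ID of the tweet whose retweets you want
response = twitter.get_retweets(id=tweet_id, count=100)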
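To get from that call to the JSON file asked for in the question, something like the following could work. It is only a sketch: the helper names `retweets_to_records` and `save_retweets` and the chosen output fields are my own, and it assumes each retweet has the usual v1.1 tweet layout (`id`, `created_at`, `user.id`, `user.screen_name`). The `client` argument is meant to be the `twitter` object created above.

```python
import json


def retweets_to_records(retweets):
    """Keep a few fields from each retweet (assumes v1.1 tweet JSON)."""
    return [
        {
            "retweet_id": rt["id"],
            "user_id": rt["user"]["id"],
            "screen_name": rt["user"]["screen_name"],
            "created_at": rt["created_at"],
        }
        for rt in retweets
    ]


def save_retweets(client, tweet_ids, path):
    """Fetch up to 100 retweets per tweet and dump them to a JSON file."""
    result = {}
    for tweet_id in tweet_ids:
        # Twython endpoint methods take their parameters as keyword arguments
        retweets = client.get_retweets(id=tweet_id, count=100)
        result[str(tweet_id)] = retweets_to_records(retweets)
    with open(path, "w") as fh:
        json.dump(result, fh, indent=2)
    return result
```

With a real client this would be called as `save_retweets(twitter, [some_tweet_id], "retweets.json")`, where `some_tweet_id` is a placeholder for your own tweet ID.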
Pagination
Each API call limits how many results it returns. For example, twitter.get_friends_ids is limited to 5000 IDs per call (https://dev.twitter.com/rest/reference/get/friends/ids). To get more than 5000, use the cursor in the returned result (next_cursor = 0 in the returned JSON means there are no more results). The following example shows how to handle the cursor:
# Set a temp cursor to loop; -1 requests the first page
cur = -1
# Stop when there are no more results
while cur != 0:
    response = twitter.get_friends_ids(user_id=user_id, cursor=cur)
    # Some code to handle the response
    cur = response["next_cursor"]
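The same loop can be wrapped in a small helper that collects every page into one list. This is a sketch; `fetch_all_friend_ids` is a name I made up, and `client` is assumed to be a Twython instance (or anything with a compatible `get_friends_ids` method).

```python
def fetch_all_friend_ids(client, user_id):
    """Follow next_cursor until it is 0 and return all friend IDs."""
    ids = []
    cursor = -1  # -1 asks for the first page
    while cursor != 0:
        response = client.get_friends_ids(user_id=user_id, cursor=cursor)
        ids.extend(response["ids"])
        cursor = response["next_cursor"]  # 0 means no more pages
    return ids
```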
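API key
A key does not expire, but each key is rate limited: after a certain number of calls within a time window (https://dev.twitter.com/rest/public/rate-limits), further calls fail with error code 429. You therefore need code that either switches to another key automatically or waits until the window resets.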
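The "wait for some period" option could look roughly like this. It is a sketch under assumptions: the helper name `call_with_rate_limit_retry` is mine, it assumes the raised exception carries the HTTP status in an `error_code` attribute (which, as far as I know, Twython's errors do), and the 15 minute default wait is based on the length of Twitter's rate limit windows.

```python
import time


def call_with_rate_limit_retry(func, max_attempts=3, wait_seconds=15 * 60,
                               sleep=time.sleep):
    """Call func(); on a 429 error wait and try again, up to max_attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as exc:
            # Assumption: the exception exposes the HTTP status as error_code
            rate_limited = getattr(exc, "error_code", None) == 429
            if not rate_limited or attempt == max_attempts:
                raise
            sleep(wait_seconds)
```

A call would then be wrapped as `call_with_rate_limit_retry(lambda: twitter.get_friends_ids(user_id=user_id, cursor=cur))`.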
Response
The API response is in JSON format, which is easy to use: you access data by key as response[key], for example
response["ids"] or response["next_cursor"]

Related

AWS-sdk no paginators for rds cluster

I just found out that I can get a maximum of 100 records for DBClusterSnapshots per call. Luckily AWS supports pagination, so you can get the list page by page. I was going over the documentation for aws-sdk-go to see how my operation implements pagination. Unfortunately there isn't a pagination method for my operation.
This is the operation I want to paginate. The docs say that it supports pagination:
https://docs.aws.amazon.com/cli/latest/reference/rds/describe-db-cluster-snapshots.html
However the pagination method for my operation doesn't appear to be supported
doc: https://docs.aws.amazon.com/sdk-for-go/api/service/rds/
It only supports DBSnapshotsPages but not DBClusterSnapshotsPages
The AWS SDK for Go has the DescribeDBClusterSnapshots function:
func (c *RDS) DescribeDBClusterSnapshots(input *DescribeDBClusterSnapshotsInput) (*DescribeDBClusterSnapshotsOutput, error)
It accepts a parameter of DescribeDBClusterSnapshotsInput, which includes:
Marker *string type:"string"
An optional pagination token provided by a previous DescribeDBClusterSnapshots request. If this parameter is specified, the response includes only records beyond the marker, up to the value specified by MaxRecords.
Therefore, your code can call DescribeDBClusterSnapshots, store the marker that is returned, then make another call to DescribeDBClusterSnapshots, passing in that value for marker. This will return the next 'page' of results.
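The marker loop described above can be sketched as follows. I am writing it in Python with boto3-style naming rather than Go, so treat it as an illustration of the pattern only: `all_cluster_snapshots` is a made-up helper, and `client` is assumed to behave like a boto3 RDS client whose response contains `DBClusterSnapshots` and, when more pages exist, `Marker`.

```python
def all_cluster_snapshots(client, **filters):
    """Call describe_db_cluster_snapshots repeatedly, passing Marker along."""
    snapshots = []
    marker = None
    while True:
        kwargs = dict(filters, MaxRecords=100)
        if marker:
            # Resume after the last record of the previous page
            kwargs["Marker"] = marker
        page = client.describe_db_cluster_snapshots(**kwargs)
        snapshots.extend(page.get("DBClusterSnapshots", []))
        marker = page.get("Marker")
        if not marker:
            # No marker in the response means this was the last page
            return snapshots
```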
With the AWS SDK you can also handle pagination yourself: use the response's next_page? method to check whether there are more pages to retrieve, and next_page to retrieve the next page of results. Here is a Ruby example:
require 'aws-sdk-rds' # aws-sdk v3 gem; on v2 use require 'aws-sdk'
# object initialization:
rds_client = Aws::RDS::Client.new
# implementation:
def describe_all_db_snapshots(rds_client, db_instance_identifier:)
  response = rds_client.describe_db_snapshots(
    db_instance_identifier: db_instance_identifier,
    snapshot_type: "automated",
    include_shared: false,
    include_public: false,
    max_records: 100
  )
  loop do
    # use the response data here...
    puts response.db_snapshots.map(&:db_snapshot_identifier)
    # stop once the last page has been handled
    break unless response.next_page?
    # next pagination iterator
    response = response.next_page
  end
end
For more details read aws sdk documentation.

gspread Invalid token: Stateless token expired [duplicate]

I am using gspread and a service account key (type "Other", JSON file) to continually update a Google spreadsheet with Python 2.7. I have this running on a Raspberry Pi running the latest Raspbian Jessie. My oauth and gspread should both be the latest versions available for my platform. My script runs for one hour (the maximum token lifespan), then stops working with the error message: "Invalid token: Stateless token expired". My code is as follows:
import gspread
from oauth2client.service_account import ServiceAccountCredentials
import httplib2
from httplib2 import Http
scope = ['https://spreadsheets.google.com/feeds']
credentials = ServiceAccountCredentials.from_json_keyfile_name('filename.json', scope)
gc = gspread.authorize(credentials)
wks = gc.open('spreadsheet name')
p1 = wks.worksheet('Printer One')
def function():
    ...
    p1.append_row(printing)
Any help would be greatly appreciated. Thank you.
Authorisation expires every 30 to 60 minutes (I think it depends on which of the two available methods you use to connect).
I have a Google sheet connected 24/7 that updates every 2 seconds. The reason for a bad read/write is almost always an authorisation error, but the Google API can also throw a variety of other errors at you that normally resolve after a few seconds. Here's one of my functions to update a cell, using your details for auth_for_worksheet. Every operation (update a single cell, update a range, read a column of values) is wrapped in a similar function, which always returns an authorised worksheet. It's probably not the most elegant solution, but the sheet has been connected for 3 months with no downtime.
import logging
import time
logger = logging.getLogger(__name__)
def auth_for_worksheet():
    scope = ['https://spreadsheets.google.com/feeds']
    credentials = ServiceAccountCredentials.from_json_keyfile_name('filename.json', scope)
    gc = gspread.authorize(credentials)
    wks = gc.open('spreadsheet name')
    p1 = wks.worksheet('Printer One')
    return p1
def update_single_cell(worksheet, counter, message):
    """ No data to return, update a single cell in column B to reflect this """
    single_cell_updated = False
    while not single_cell_updated:
        try:
            cell_location = "B" + str(counter)
            worksheet.update_acell(cell_location, message)
            single_cell_updated = True
        except gspread.exceptions.HTTPError:
            logger.critical("Could not update single cell")
            time.sleep(10)
            # re-authorise to get a worksheet backed by a fresh token
            worksheet = auth_for_worksheet()
    logger.info("Updated single cell")
    return worksheet
if __name__ == '__main__':
    # your code here, but now to update a single cell
    wksheet = auth_for_worksheet()
    x = 1  # row number to update (example value)
    wksheet = update_single_cell(wksheet, x, "NOT FOUND")
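Since every operation ends up with the same try, sleep, re-authorise loop, the pattern could be factored into one generic helper. This is only a sketch of that idea, not something gspread provides: `run_with_reauth` and its parameters are names I invented, and it has an attempt limit instead of looping forever.

```python
import time


def run_with_reauth(operation, reauthorize, resource, retry_on=(Exception,),
                    max_attempts=5, delay=10, sleep=time.sleep):
    """Run operation(resource); on failure wait, re-authorise and retry.

    Returns (result, resource) so the caller keeps the fresh resource.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation(resource), resource
        except retry_on:
            if attempt == max_attempts:
                raise
            sleep(delay)
            # Get a new authorised object, e.g. a freshly opened worksheet
            resource = reauthorize()
```

For the code in the question this might be used as `_, p1 = run_with_reauth(lambda ws: ws.append_row(printing), auth_for_worksheet, p1, retry_on=(gspread.exceptions.HTTPError,))`.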

Is there any nicer way to get the full message from gmail with google-api

I'm working on a project where I, among other things, need to read the message in e-mails from my google account. I came up with a solution that works but wonder if there are any simpler ways?
The first part of the code is pretty standard to get access to the mailbox. But I post it so you can see what I did to get it to work.
from googleapiclient.discovery import build
from httplib2 import Http
from oauth2client import file, client, tools
SCOPES = 'https://www.googleapis.com/auth/gmail.modify'
CLIENT_SECRET = 'A.json'
store = file.Storage('storage.json')
credz = store.get()
flags = tools.argparser.parse_args(args=[])
if not credz or credz.invalid:
    flow = client.flow_from_clientsecrets(CLIENT_SECRET, SCOPES)
    if flags:
        credz = tools.run_flow(flow, store, flags)
GMAIL = build('gmail', 'v1', http=credz.authorize(Http()))
response = GMAIL.users().messages().list(userId='me', q='').execute()
messages = []
if 'messages' in response:
    messages.extend(response['messages'])
    print len(messages)
# keep requesting pages until no nextPageToken is returned
while 'nextPageToken' in response:
    page_token = response['nextPageToken']
    response = GMAIL.users().messages().list(userId='me', q='', pageToken=page_token).execute()
    messages.extend(response['messages'])
FromMeInd = 0
for message in messages:
    ReadMessage(GMAIL, 'me', message['id'])
It is this part that I'm more interested in improving. Is there any other way to get the message more directly with Python and the Gmail API? I've looked through the API documentation but could not find a more efficient way to read it.
def ReadMessage(service, userID, messID):
    message = service.users().messages().get(userId=userID, id=messID, format='full').execute()
    decoded = base64.urlsafe_b64decode(message['payload']['body']['data'].encode('ASCII'))
    print decoded
You can get the message in raw format and then parse it using the standard Python email module.
According to the official API documentation (https://developers.google.com/gmail/api/v1/reference/users/messages/get):
import base64
import email
message = service.users().messages().get(userId='me', id=msg_id,
                                         format='raw').execute()
print 'Message snippet: %s' % message['snippet']
msg_str = base64.urlsafe_b64decode(message['raw'].encode('ASCII'))
mime_msg = email.message_from_string(msg_str)
You'll get a mime message with a payload containing mime parts, e.g. plain text, HTML, quoted printable, attachments, etc.
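To pull the readable text out of those MIME parts, you can walk the parsed message and take the first text/plain part. The following is a sketch using only the standard library; `plain_text_body` is a name I chose, and its argument is assumed to be the `message['raw']` string returned by the API. The `getattr` line is there so the same code runs on Python 2 and Python 3.

```python
import base64
import email


def plain_text_body(raw_b64):
    """Decode a Gmail 'raw' message and return its first text/plain part."""
    raw_bytes = base64.urlsafe_b64decode(raw_b64.encode("ascii"))
    # Python 3 parses bytes with message_from_bytes; Python 2 has only
    # message_from_string, which accepts its byte strings directly
    parse = getattr(email, "message_from_bytes", email.message_from_string)
    mime_msg = parse(raw_bytes)
    for part in mime_msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)  # undoes base64/quoted-printable
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, "replace")
    return None  # no plain-text part found
```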

adding parameters to tweepy api request

How does one set the parameters for a request to Twitter via tweepy's API?
#https://api.twitter.com/1.1/statuses/user_timeline.json?exclude_replies=true&include_rts=false
import tweepy
#assume tokens and secrets are declared
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
status = api.user_timeline('xxxxxxxxx')
What I get back from this is the user's tweets and retweets as a collection of Status objects, but I only want the user's own tweets returned. After reading the docs, it's still unclear to me how to modify the request URL.
I've found success just filtering the json object returned from user_timeline.
This will filter out the user's retweets:
for tweetObj in status:
    if hasattr(tweetObj, 'retweeted_status'):
        continue
    else:
        print tweetObj  # or whatever else you want to do
But to answer your question, you can pass the optional parameter include_rts (the same name as in the URL you quoted) like so:
status = api.user_timeline('xxxxxxxxx', include_rts=False)
I like the first method better because the RTs still count against your count & maximum length parameters.
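Because retweets use up slots in each page, getting a fixed number of original tweets may take several requests. One way to do that is to keep paging backwards with max_id and filter as you go. This is a sketch: `collect_original_tweets` is a made-up helper, and `api` is assumed to be a tweepy API object whose `user_timeline` accepts `count` and `max_id`.

```python
def collect_original_tweets(api, user, wanted, page_size=200):
    """Page through a timeline until `wanted` non-retweets are collected."""
    originals = []
    max_id = None
    while len(originals) < wanted:
        kwargs = {"count": page_size}
        if max_id is not None:
            kwargs["max_id"] = max_id
        page = api.user_timeline(user, **kwargs)
        if not page:
            break  # reached the end of the available timeline
        # Retweets carry a retweeted_status attribute; skip them
        originals.extend(s for s in page if not hasattr(s, "retweeted_status"))
        # Next request starts just below the oldest tweet seen so far
        max_id = page[-1].id - 1
    return originals[:wanted]
```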

In python-oauth2 how do you retain oauth_token_secret parameter?

I'm trying to follow this OAuth guide for Sign in with Twitter, https://github.com/simplegeo/python-oauth2. Everything goes well until I get between steps 2 and 3. I handle the callback fine, but how do I pass along the oauth_token_secret? My confusion is that it seems to be lost after the redirect back to my handler.
From what I can tell the parameters I get back are oauth_token and oauth_verifier, and yet I need the oauth_token_secret to receive the access token in these steps.
token = oauth.Token(request_token['oauth_token'],
                    request_token['oauth_token_secret'])
token.set_verifier(oauth_verifier)
client = oauth.Client(consumer, token)
resp, content = client.request(access_token_url, "POST")
access_token = dict(urlparse.parse_qsl(content))
Am I supposed to store it in a cookie to retrieve later?
I was able to do this by storing the oauth_token and oauth_token_secret in a session during step one. These values come from the request token created in that step: request_token['oauth_token'] and request_token['oauth_token_secret'].
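In code, the session approach might look like the sketch below. The function names are my own, and `session` is assumed to be any dict-like, per-user session object provided by your web framework. Checking that the oauth_token in the callback matches the stored one is an extra safeguard I added.

```python
def remember_request_token(session, request_token):
    """Step 1: store the request token before redirecting to Twitter."""
    session["oauth_token"] = request_token["oauth_token"]
    session["oauth_token_secret"] = request_token["oauth_token_secret"]


def recall_request_token(session, returned_oauth_token):
    """Callback: return (oauth_token, oauth_token_secret) from the session."""
    if session.get("oauth_token") != returned_oauth_token:
        raise ValueError("callback token does not match the stored request token")
    # pop() so a request token cannot be reused after the exchange
    return session.pop("oauth_token"), session.pop("oauth_token_secret")
```

The two returned values can then be passed to oauth.Token(...) in the callback handler, in place of the request_token dict that is no longer available there.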