Variable number of tweets using Python - python-2.7

I am running the following code repeatedly at intervals of more than 15 minutes, and I get a variable number of tweets each time: sometimes 1400, other times 1200, 1000, or 1600. Is there a way to get a fixed number of tweets every time I run the code, even if I change the keyword?
for tweet in tweepy.Cursor(api.search, q="#narendramodi", rpp=100).items(200):
    print(tweet.text)

Your search does not specify any id limit.
Because of pagination, the Twitter Search API looks for the latest tweets every time you call it. Since tweets are posted continuously, a simple call to the Search API returns the most recent ones, and you'll get a different number of tweets depending on how many were posted while you were querying. See Working with Timelines.
Please also note that the Twitter Search API focuses on relevance rather than completeness of the results. See The Search API.
If you want to iterate over tweets, starting from the moment you run your application and continuing on to older tweets, I recommend using max_id in your next query's parameters, setting it to the id field of the last result from the previous query, as suggested here.
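For example, here is a minimal sketch of that max_id walk using tweepy's older api.search method (the hashtag, the target of 1000 tweets, and the credential placeholders are illustrative assumptions, not part of the original question):

import tweepy

# Hypothetical credentials (substitute your own).
auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
api = tweepy.API(auth)

max_id = None      # no upper bound on the first request
collected = []
while len(collected) < 1000:
    kwargs = {"q": "#narendramodi", "count": 100}
    if max_id is not None:
        kwargs["max_id"] = max_id
    results = api.search(**kwargs)
    if not results:
        break      # reached the end of the search index
    collected.extend(results)
    # Ask only for tweets strictly older than the oldest one seen so far.
    max_id = results[-1].id - 1

Because each request's upper bound is pinned to ids you have already seen, tweets posted while you iterate cannot inflate or reshuffle the result set.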

Related

How to search for pages by name in Facebook Graph API with least amount of calls using Python?

I have a list of football team names, my_team_list = ['Bayern Munich', 'Chelsea FC', 'Manchester United', ...], and I am trying to search for each club's official Facebook page to get its fan_count using the Python facebook-api. This is my code so far:
club_list = []
for team in my_team_list:
    data = graph.request('/pages/search?q=' + team)  # team is already a string
    for page in data['data']:
        info = graph.get_object(id=page['id'], fields='id,name,fan_count,is_verified')
        if info['is_verified'] is True:
            club_list.append([team, info['name'], info['fan_count'], info['id']])
However, because my list contains over 3,000 clubs, this code hits the following rate limit error:
GraphAPIError: (#4) Application request limit reached
How can I reduce the number of calls, e.g. by fetching more than one club's page per call (batch calls)?
As the comment on the OP states, batching will not save you. You need to actively watch the rate limiting:
"All responses to calls made to the Graph API include an X-App-Usage HTTP header. This header contains the current percentage of usage for your app. This percentage is equal to the usage shown to you in the rate limiting graphs. Use this number to dynamically balance your call load to avoid being throttled."
https://developers.facebook.com/docs/graph-api/advanced/rate-limiting#application-level-rate-limiting
On your first run through, save all of the valid page ids so that in the future you can query those ids directly instead of doing a search.
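As an illustration of that header check, here is a rough sketch using the requests library directly against the Graph API (the API version, the 90% back-off threshold, the sleep duration, and the token placeholder are all assumptions):

import json
import time

import requests

GRAPH = "https://graph.facebook.com/v2.12"  # hypothetical API version
ACCESS_TOKEN = "..."                        # substitute a real app token

def throttled_get(path, **params):
    """GET a Graph API path, backing off when app usage nears the limit."""
    params["access_token"] = ACCESS_TOKEN
    resp = requests.get(GRAPH + path, params=params)
    usage = json.loads(resp.headers.get("X-App-Usage", "{}"))
    # Each field is a percentage of the allowance; back off before 100%.
    if max(usage.get("call_count", 0),
           usage.get("total_time", 0),
           usage.get("total_cputime", 0)) > 90:
        time.sleep(60)
    return resp.json()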

Is there a limit to how long a filename URL statement can be?

I am on design number three, I think, of a program that submits a series of stock tickers and metrics to Yahoo Finance. I don't need to go into too much detail about what it does, as I have most of it up and running now apart from one remaining issue.
The Yahoo Finance site lists about 2700 stock tickers on the NASDAQ alone. I anticipated that submitting all of these in one filename URL statement might fall over for some reason, so I set an initial string length of 500 tickers and built some nested macros to iterate through in 500-ticker blocks until everything I wanted had been extracted.
However, during development it seems that if I build a string with any more than about 200 tickers, I get an error telling me that SSL Support cannot be run, and the code falls over.
Does anyone have any idea why this is? In an ideal world I would like to do this in one pass where all 2700 stock tickers are pulled down. If that isn't possible, it would be great if someone could explain why not.
Thanks

How do I get the most recent report data?

I'm trying to build a tool that collects a few data points from a user usage report with
https://www.googleapis.com/admin/reports/v1/usage/{user}/all/dates/{yyyy-mm-dd}
Since the data is delayed - how do I get the most recent report? If I were to query today's (2013-11-22) date I would get something like:
Data for dates later than 2013-11-19 is not yet available. Please check back later
Is there a set number of days/hours before reports become available, or do I have to trial-and-error backwards until I get a successful response?
I believe there is a delay of about 48 hours for the reports as of right now. However, if Google is able to improve on that, you'll want your app to take advantage of those improvements without any code changes.
I suggest you make a first attempt using today's date. When that fails, parse the error response to grab the last date for which report data is available and use that value. This way you're always making at most 2 attempts, and if Google improves the delay to 24 hours or even less, your app takes immediate advantage of that change.
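A rough sketch of that two-attempt approach, assuming the error text keeps the format quoted above (the token handling, regex, and function name are illustrative assumptions):

import datetime
import re

import requests

REPORT_URL = "https://www.googleapis.com/admin/reports/v1/usage/{user}/all/dates/{date}"

def fetch_latest_report(user, token):
    """Try today's report first; on failure, retry with the date the API reports."""
    headers = {"Authorization": "Bearer " + token}
    today = datetime.date.today().isoformat()
    resp = requests.get(REPORT_URL.format(user=user, date=today), headers=headers)
    if resp.status_code == 200:
        return resp.json()
    # e.g. "Data for dates later than 2013-11-19 is not yet available..."
    match = re.search(r"later than (\d{4}-\d{2}-\d{2})", resp.text)
    if not match:
        resp.raise_for_status()
    resp = requests.get(REPORT_URL.format(user=user, date=match.group(1)),
                        headers=headers)
    resp.raise_for_status()
    return resp.json()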

Retrieve Facebook Page posts comments total count (July 2013 Breaking Changes)

Currently I am using the following API call to retrieve Post Likes and Post Comments for a Facebook Page (PageId). With this single API call I retrieve ALL posts and their total comment counts.
1). https://graph.facebook.com/PageId/posts?access_token=xyz&method=GET&format=json
But, as per the "July 2013 Breaking Changes", comment counts are no longer available with the above API call. So, as per the Road Map documentation, I am using the following API call to retrieve the comments count ('total_count') for a particular POST ID.
2). https://graph.facebook.com/post_ID/?summary=true&access_token=xyz&method=GET&format=json
So, with the second API call I am able to retrieve the comments count per post. But, as you can see, I need to iterate through each post and retrieve its comments count one by one per post id, then sum them all up to find the total comments count. That requires far too many API calls.
My question is: considering the 10 July breaking changes, is it possible to retrieve Page -> Posts -> ALL comments total count in a single API call?
Is there any alternative to my second API call for retrieving the total comments count across all of a Facebook page's posts?
Hmm, well, I don't believe there is a way to bundle this all into a single API call. But you can batch requests to get this in what is seemingly the same API call (which will save time), although they still count against your rate limits separately (my example below would be 4 calls against the limits).
Example batch call (JSON encoded, with the post ID stored in the PHP variable $postId):
[{"method":"GET","relative_url":"' . $postId . '"},
{"method":"GET","relative_url":"' . $postId . '/likes?limit=1000&summary=true"},
{"method":"GET","relative_url":"' . $postId . '/comments?filter=stream&limit=1000&summary=true"},
{"method":"GET","relative_url":"' . $postId . '/insights"}]
I'm batching 4 queries in this single call: first to get the post info, second to get the likes (up to 1000, plus the total count), third to get all the comments plus the summary count, and finally the insights (if it's the page's own post).
You can drastically simplify this batch call if you don't want all the details I'm pulling.
In this case you still need to iterate through all the posts. But Facebook allows you to bundle up to 50 calls per batch request, I believe, so you could request multiple post ids in the same batch call to speed things up too.
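For instance, here is a sketch of batching up to 50 posts' summary counts per request in Python (the limit=0 trick, the token placeholder, and the helper name are illustrative assumptions):

import json

import requests

ACCESS_TOKEN = "..."  # substitute a valid token

def total_comment_count(post_ids):
    """Sum summary comment counts for many posts, 50 sub-requests per batch."""
    total = 0
    for i in range(0, len(post_ids), 50):  # Facebook caps a batch at 50 calls
        batch = [{"method": "GET",
                  "relative_url": "%s/comments?summary=true&limit=0" % pid}
                 for pid in post_ids[i:i + 50]]
        resp = requests.post("https://graph.facebook.com",
                             data={"access_token": ACCESS_TOKEN,
                                   "batch": json.dumps(batch)})
        for item in resp.json():
            body = json.loads(item["body"])
            total += body["summary"]["total_count"]
    return total

Each sub-request still counts against the rate limit, but the round trips collapse into one HTTP call per 50 posts.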

SimpleDB Incremental Index

I understand SimpleDB doesn't have an auto-increment, but I am working on a script where I need to query the database by sending the id of the last record I've already pulled and then pull all subsequent records. In normal SQL fashion, if there were 6200 records and I already had 6100 of them, I would query for records with an ID greater than 6100 when I run the script. Looking at the response object, I don't see anything I can use for this; it just seems like there should be a sequential index there. The other option I was considering is a real timestamp. Any ideas are much appreciated.
Using a timestamp was perfect for what I needed to do. I followed this article to help me on my way: http://aws.amazon.com/articles/1232. I would still welcome it if anyone knows of a way to get an incremental index number.
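For reference, a minimal sketch of the timestamp approach with the classic boto SimpleDB client (the domain name, item id, and last_seen value are illustrative assumptions; SimpleDB compares strings lexicographically, so the zero-padding matters):

import time

import boto.sdb  # classic boto; SimpleDB predates boto3

conn = boto.sdb.connect_to_region("us-east-1")
domain = conn.get_domain("my_domain")  # hypothetical domain

# Store a zero-padded millisecond timestamp so string order matches time order.
created = "%013d" % int(time.time() * 1000)
domain.put_attributes("item-123", {"created": created})

# Later: fetch everything newer than the last timestamp already processed.
last_seen = "1384000000000"
query = ("select * from `my_domain` where created > '%s' "
         "order by created asc" % last_seen)
for item in domain.select(query):
    print(item.name)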