C# Facebook SDK and automatic paging

If I make a request for something that has a large number of objects (like if you had 10000 friends or 10000 photos in an album), does the C# Facebook SDK automatically follow the "Paging:Next" links for me, or is there anything I need to do?
I looked through their code and don't see any mention of paging, but could have missed it.
Note that I'm -not- talking about Batch requests; I'm speaking of a simple api.Get("/me/friends") where Facebook decides there are too many objects to put in a single response. Unfortunately I don't have an account with enough of anything to test the results...

Pagination is always up to the user of the SDK, no matter which Facebook SDK you use. I don't think they've gotten that creative in adding it, or maybe there are legal reasons they have not.

Here's the code I wound up using. Since I know from the album's "count" how many images to expect, I just request them in batches up to that count. It'd be trickier in scenarios where you don't know in advance how many objects you'll get back, but I haven't needed that yet.
const long maxBatchSize = 50;
// Request the album's photos in windows of up to maxBatchSize,
// using limit/offset paging until the album's known count is covered.
for (long q = 0; q < album.Count; q += maxBatchSize)
{
    var facebook = new FacebookClient(FacebookSession.AccessToken);
    facebook.GetCompleted += new EventHandler<FacebookApiEventArgs>(GetPhotosCallback);

    // The last window may be smaller than maxBatchSize.
    long take = album.Count - q;
    if (take > maxBatchSize)
        take = maxBatchSize;

    dynamic parameters = new ExpandoObject(); // requires System.Dynamic
    parameters.limit = take;
    parameters.offset = q;
    facebook.GetAsync("/" + album.Id + "/photos", parameters, null);
}

If you look at the data Facebook returns, there is usually a paging element within the results. The paging element contains "next" and "previous" URLs; these URLs are created by Facebook, and the "next" one can be used to retrieve more information. Below is an example for a /me/posts request:
{
    "data": [ ... ],
    "paging": {
        "previous": "https://graph.facebook.com/6644332211/posts?limit=25&since=1374648795",
        "next": "https://graph.facebook.com/6644332211/posts?limit=25&until=1374219441"
    }
}
If you want to automate the retrieval of all items, you can pick the relevant parameters from the "next" URL and pass them to the Facebook SDK's get method.
Which parameters are relevant depends on the type of information you are retrieving: for posts you only have "until", while for checkins there is also something called a paging token.
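
If you'd rather not pick the parameters apart, you can also just follow the "next" URL itself until it stops appearing. A minimal sketch in Python, assuming the requests library (the endpoint and token are placeholders):
import requests

def get_all(url):
    # Follow Facebook's paging "next" links until the last page.
    items = []
    while url:
        response = requests.get(url).json()
        items.extend(response.get("data", []))
        # "next" is absent from "paging" on the last page, which ends the loop
        url = response.get("paging", {}).get("next")
    return items

posts = get_all("https://graph.facebook.com/me/posts?limit=25&access_token=<accessToken>")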

Related

Graph API Limiting the amount of data I can request

I am trying to extract all the posts from a private FB group of which I am an admin. I am using a Python script to access the data. Whether I use the Graph API Explorer via the web, or my Python script, I am having the exact same problem. I am able to gather the first 6 pages of the feed, each page containing 25 posts. The very first request looks like this:
https://graph.facebook.com/<groupID>/feed?access_token=<accessToken>
That will return, as I stated, the latest 25 posts on the group page.
At the bottom of the JSON that is returned for each request is a section like this:
"paging": {
"previous": "https://graph.facebook.com/v13.0/<pageID>/feed?access_token=<tokenID>&pretty=0&until&__previous=1&since=1649789940&__paging_token=<paging_token>",
"next": "https://graph.facebook.com/v13.0/<pageID>/feed?access_token=<tokenID>&pretty=0&until=1647885515&since&__paging_token=<paging_token>&__previous"
}
I use the value in next to launch the next query. This works until I get to the 6th request. At that point, when I request the URL in next, it spins for about 15 seconds and then I get the following error:
{
    "error": {
        "code": 1,
        "message": "Please reduce the amount of data you're asking for, then retry your request"
    }
}
How exactly do I reduce the amount of data I'm requesting? I've tried adding feed.limit() to the request, and it works for the very first request, but that limit is never included in the next URL. Adding it to the next URL myself via the script still always returns 25 posts, regardless of the limit on the first request. So if I set feed.limit(7) it returns 7 posts on the first request, but when I use the next link I get 25.
I've set the limit to 100: the first request works, and next works the first time, but not the second. If I set the limit to 120, the first query works, but now next doesn't. So it seems like there is a built-in barrier at around 125 posts; it won't give me any more data than that. Any help would be greatly appreciated.
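
For reference, here is roughly what re-applying the limit to each "next" URL looks like, as described above. This is a sketch using Python's standard urllib.parse; whether the API honors the rewritten limit is exactly the problem reported here:
from urllib.parse import urlsplit, parse_qs, urlencode, urlunsplit

def with_limit(url, limit):
    # Rewrite the paging URL's query string to force a limit parameter
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    query["limit"] = [str(limit)]
    return urlunsplit(parts._replace(query=urlencode(query, doseq=True)))

next_url = with_limit(paging["next"], 10)  # paging comes from the previous response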

How to search for pages by name in Facebook Graph API with least amount of calls using Python?

I have a list of football team names my_team_list = ['Bayern Munich', 'Chelsea FC', 'Manchester United', ...] and I am trying to search for their official Facebook pages to get their fan_count using the Python facebook-api. This is my code so far:
club_list = []
for team in my_team_list:
    # team is already a string, so search with team rather than team[0]
    data = graph.request('/pages/search?q=' + team)
    for i in data['data']:
        likes = graph.get_object(id=i['id'], fields='id,name,fan_count,is_verified')
        if likes['is_verified'] is True:
            club_list.append([team, likes['name'], likes['fan_count'], likes['id']])
However, because my list contains over 3000 clubs, with this code I will get the following rate limit error:
GraphAPIError: (#4) Application request limit reached
How can I reduce the calls to get more than one club's page per call (e.g. batch calls?)
As the comment on the OP states, batching will not save you. You need to actively watch the rate limiting:
"All responses to calls made to the Graph API include an X-App-Usage HTTP header. This header contains the current percentage of usage for your app. This percentage is equal to the usage shown to you in the rate limiting graphs. Use this number to dynamically balance your call load to avoid being throttled."
https://developers.facebook.com/docs/graph-api/advanced/rate-limiting#application-level-rate-limiting
On your first run through, you should save all of the valid page ids so in the future you can just query those ids instead of doing a search.
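
A rough sketch of what watching that header might look like in Python, assuming the requests library (the field names come from the X-App-Usage documentation above; the 90% threshold and 60-second pause are arbitrary choices):
import json
import time
import requests

def graph_get(url, params=None):
    # GET a Graph API endpoint, backing off when app usage nears the cap
    response = requests.get(url, params=params)
    usage = json.loads(response.headers.get("x-app-usage", "{}"))
    # usage holds percentages, e.g. {"call_count": 97, "total_time": 12, "total_cputime": 5}
    if usage.get("call_count", 0) > 90:
        time.sleep(60)  # give the rolling window time to drain
    return response.json()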

Writing to Google Spreadsheet API Extremely Slow

I am trying to write data from here (http://acleddata.com/api/acled/read) to Google Sheets via its API. I'm using the gspread package to help.
Here is the code:
import gspread
import requests
from oauth2client.service_account import ServiceAccountCredentials

r = requests.get("http://acleddata.com/api/acled/read")
data = r.json()
data = data['data']

scope = ['https://spreadsheets.google.com/feeds']
credentials = ServiceAccountCredentials.from_json_keyfile_name('credentials.json', scope)
gc = gspread.authorize(credentials)
sheet = gc.open("acled").sheet1  # open the target worksheet; the name is a placeholder

# One API call per row -- this is the bottleneck
for row in data:
    sheet.append_row(list(row.values()))
The data is a list of dictionaries, each dictionary representing a row in the spreadsheet. This writes to my Google Sheet, but it is unusably slow: it easily took 40 minutes to write a hundred rows, and then I interrupted the script.
Is there anything I can do to speed up this process?
Thanks!
Based on your code, you're using the older V3 Google Data API. For better performance, switch to the V4 API. A migration guide is available here.
Here is the faster solution:
# Build the target cell range in one call, then update every cell at once.
# numberToLetters is a helper from the linked article that converts a
# column number to its A1-notation letters.
cell_list = sheet.range('A2:' + numberToLetters(num_columns) + str(num_lines + 1))
for cell in cell_list:
    # Map each cell back to the corresponding DataFrame entry
    val = df.iloc[cell.row - 2, cell.col - 1]
    if type(val) is str:
        val = val.decode('utf-8')  # Python 2 str -> unicode
    elif isinstance(val, (int, long, float, complex)):
        val = int(round(val))
    cell.value = val
sheet.update_cells(cell_list)
This is derived from https://www.dataiku.com/learn/guide/code/python/export-a-dataset-to-google-spreadsheets.html
I believe the change here is that this solution builds a cell_list object and writes it with update_cells, so the whole batch requires only one API call.
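
With a recent gspread (3.1 or later) there is an even shorter route for the original append use case: append_rows sends every row in a single request. A sketch, assuming data is the list of dictionaries from the question:
# Build all rows client-side, then write them with one API call
rows = [list(row.values()) for row in data]
sheet.append_rows(rows)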
Based on this thread, the Google Spreadsheets API can be pretty slow depending on many factors, including your connection speed to Google's servers, use of a proxy, etc. Avoid calling gspread.login inside a loop, because this method is slow.
...get_all_records came to my rescue; it's much faster than range for the entire sheet.
I have also read in this forum that performance depends on the size of the worksheet, so as the rows in the worksheet increase, the program runs even slower.
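
For the reading side, the get_all_records approach mentioned above is a one-liner; it fetches the whole sheet in a single call and maps each row to a dict keyed by the header row:
records = sheet.get_all_records()  # one API call for the entire sheet
print(len(records), "rows read")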

Retrieve Facebook likes after October 2013 Breaking Changes

from https://developers.facebook.com/roadmap/
"Currently the API returns all likes by default. After the migration, fetching a user's likes via the Graph API will return 25 results at a time. We've added pagination to the results so you can page through to all of a user's likes."
I have read the paging how-to here https://developers.facebook.com/blog/post/478/ but still it is not very clear to me which is the best practice to use:
1) The document says that "With the Graph API, when there is more data available, you will notice that paging links are provided:", but at the moment (with no limit) I'm getting the paging links even when all the results are already retrieved in the first page. Do I have to manually check the number of results on the following page to verify whether it is empty?
2) The document also says: "You might notice that the number of results returned is not always equal to the “limit” specified. This is expected behavior. Query parameters are applied on our end before checking to see if the results returned are visible to the viewer. Because of this, it is possible that you might get fewer results than expected." This should not affect retrieving likes, am I right? I don't think it's possible that some likes are visible and others are not.
Thanks.
I'll try to answer my own question.
1) Yes, I have to check manually. I just did something like this (in this example I retrieve music likes):
$fb_music_likes_ar = array();
$end = 0;
$offset = 0;
// Page through /me/music 25 items at a time until a short page signals the end
while ($end === 0) {
    $temp_ar = $facebook->api('/me/music?limit=25&offset=' . $offset);
    $fb_music_likes_ar = array_merge($fb_music_likes_ar, $temp_ar['data']);
    $offset = $offset + 25;
    if (count($temp_ar['data']) < 25) {
        $end = 1; // fewer than 25 results means this was the last page
    }
}
This of course takes more time than before. I don't understand the reason for the change; if I need all the likes I end up making several calls, and I don't see how that is more efficient...
Maybe we can use batch processing to launch several calls at once? (See the sketch below.)
2) I don't think this affects retrieving likes.
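
For what it's worth, batching several offset windows into one HTTP round trip could look like the Python sketch below (the Graph API batch endpoint accepts up to 50 sub-requests per call; the token is a placeholder):
import json
import requests

ACCESS_TOKEN = "<accessToken>"  # placeholder

# One POST carrying four GETs, covering offsets 0, 25, 50 and 75
batch = [
    {"method": "GET", "relative_url": "me/music?limit=25&offset=%d" % offset}
    for offset in range(0, 100, 25)
]
response = requests.post(
    "https://graph.facebook.com/",
    data={"access_token": ACCESS_TOKEN, "batch": json.dumps(batch)},
)
for sub in response.json():
    items = json.loads(sub["body"])["data"]  # each sub-response body is a JSON string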

Insights API returns only 7 fields for some posts, 31 for others?

I have two otherwise identical posts on a Facebook page that I administer. One post, which we'll call "full", returns the full range of insight values (31) I'd expect, even when the values are zero, while the other, which we'll call "subset", returns only a very limited subset of values (7). See below for the actual values returned.
Note that I've confirmed this is the case by using both the GUI-driven export to Excel and the Facebook Graph API Explorer (https://developers.facebook.com/tools/explorer).
My first thought was that the API suppresses certain values such as post_negative_feedback if they are zero (i.e., nobody has clicked hide or report as spam/abusive), but this is not the case: the "full" post has no such reports (or at the very least, the return value for all of its post_negative_* fields is zero).
I've even tried intentionally reporting the post (the one with no negative return values) as spam, and then re-pulling what I thought was a real-time field (i.e., post_negative_feedback), but the data still comes back empty:
{
    "data": [
    ],
    (paging data)
}
What gives?
Here is the more limited subset returned for the problematic post:
post_engaged_users
post_impressions
post_impressions_fan
post_impressions_fan_unique
post_impressions_organic
post_impressions_organic_unique
post_impressions_unique
And here is the full set returned for most other posts (with asterisks added to show the subset returned above):
post_consumptions
post_consumptions_by_type
post_consumptions_by_type_unique
post_consumptions_unique
*post_engaged_users
*post_impressions
post_impressions_by_story_type
post_impressions_by_story_type_unique
*post_impressions_fan
post_impressions_fan_paid
post_impressions_fan_paid_unique
*post_impressions_fan_unique
*post_impressions_organic
*post_impressions_organic_unique
post_impressions_paid
post_impressions_paid_unique
*post_impressions_unique
post_impressions_viral
post_impressions_viral_unique
post_negative_feedback
post_negative_feedback_by_type
post_negative_feedback_by_type_unique
post_negative_feedback_unique
post_stories
post_stories_by_action_type
post_story_adds
post_story_adds_by_action_type
post_story_adds_by_action_type_unique
post_story_adds_unique
post_storytellers
post_storytellers_by_action_type
The issue (besides "why does this happen?") is that I've tried giving negative feedback to the post that fails to report any count whatsoever for this -- and I still receive no data (I would expect "1" or something around there). I started out waiting the obligatory 15 minutes (it's a real-time field) and then, when that didn't work, gave it a full 24 hours. What gives?
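
One debugging step that can help isolate this: request a single metric explicitly instead of the default set. A sketch in Python, assuming the requests library, with placeholder IDs and tokens:
import requests

# Ask for exactly one insights metric on the post in question
response = requests.get(
    "https://graph.facebook.com/<postID>/insights",
    params={"metric": "post_negative_feedback", "access_token": "<accessToken>"},
)
print(response.json())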