Information of re-shared status - facebook-graph-api

I have been working with the Facebook Graph API for a few days. I am trying to extract a user's statuses along with information about any reshares. I can easily get a user's statuses with the fields=id,name,statuses query, but I could not find any information about resharing. I found a status field called sharedposts, but could not work out what it actually does. Can anyone explain how I can collect information about resharing (who reshared, when, and from what location)? I used an access token with the user_status permission.

The sharedposts field applies to a status id. For example, the status id 10151794781777494 is from a status update by TheKrazyCouponLady which has been shared 4 times. This query:
/10151794781777494?fields=sharedposts
Will return all the information about the users that have shared it. If you want to limit the returned fields to the name and id of the sharer, and the time and location it was shared, you could do this:
/10151794781777494?fields=sharedposts.fields(from,created_time,place)
Although I expect there won't be any location data most of the time.
To find the status id in the first place, you could just query the statuses field for a particular user. Again, using TheKrazyCouponLady (uid 255919387493) as an example:
/255919387493?fields=statuses
To get just the ids:
/255919387493?fields=statuses.fields(id)
As an alternative, you may want to consider querying the user's posts instead. The advantage of using posts is that you get back the share count for each post in the same query.
/255919387493?fields=posts.fields(id,shares)
If the share count on a post is zero, then there is obviously no need to run another query to retrieve the users that have shared that post.
The downside of using posts is that the post id is slightly different from a status id. You'll see ids that look like this:
255919387493_10151794781777494
The first half of that string is the user id of the post owner. The second half is the actual status id. If you want to query the sharedposts field for the post, you first have to extract the second half (the status id) and use that for the query.
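The id-splitting step described above can be sketched as follows. This is only an illustration: the helper names status_id_from_post_id and sharedposts_path are made up for this example, and the code only builds the relative query path shown earlier in this answer without calling the API.

```python
def status_id_from_post_id(post_id: str) -> str:
    """Return the status id part of a composite '<owner id>_<status id>' post id."""
    owner_id, separator, status_id = post_id.partition("_")
    # Assumption: an id without an underscore is already a plain status id.
    return status_id if separator else owner_id


def sharedposts_path(post_id: str) -> str:
    """Build the relative sharedposts query path for a post or status id."""
    status_id = status_id_from_post_id(post_id)
    return f"/{status_id}?fields=sharedposts.fields(from,created_time,place)"


if __name__ == "__main__":
    print(sharedposts_path("255919387493_10151794781777494"))
```

Posts whose share count is zero can simply be skipped before building the path.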
Having said that, it occurs to me that you could actually retrieve all the information you need in one go if you chain the statuses query and the sharedposts query together. For example, something like this:
/255919387493?fields=statuses.fields(id,message,sharedposts.fields(from,created_time,place))
That will return the status id and message text for each status from that user, and the user details, create time and location for each person that shared each of those statuses.
Even with paging, though, that is likely to be a fairly slow query, so I'm not sure if that's such a good idea. It's worth considering though.

According to version 2.1 of the API and the documentation here
https://developers.facebook.com/docs/graph-api/reference/v2.1/post
there is a new edge called "sharedposts".
As described here https://developers.facebook.com/docs/graph-api/reference/v2.1/object/sharedposts
This reference describes the /sharedposts edge that is common to multiple Graph API nodes. The structure and operations are the same for each node.
This edge represents any posts where the original object was shared on Facebook.

If the post type is photo, sharedposts returns an empty result when queried with the post id, because the underlying object id is different from the post id:
/317380948302131_847979698575584 => Object: 847979378575616
/317380948302131_847979698575584/sharedposts?fields=from,via
Querying with the object id works as expected:
/847979378575616/sharedposts?fields=from,via
The only problem is that if the object is itself a shared post, the result also includes all shares of the original post object, and no via node is present.
I struggled for some time to work out why the API only sometimes returns sharedposts.

Model Post and Topic through DynamoDB

Here's the relation I'm trying to model in DynamoDB:
My service contains posts and topics. A post may belong to multiple topics. A topic may have multiple posts. All posts have an interest value which would be adjusted based on a combination of likes and time since posted, interest measures the popularity of a post at the current moment. If a post gets too old, its interest value will be 0 and stay that way forever (archival).
The REST api end points work like this:
GET /posts/{id} returns a post-object containing title, text, author name and a link to the authors rest endpoint (doesn't matter for this example) and the number of likes (the interest value is not included)
GET /topics/{name} should return an object with both a list with the N newest posts of the topics as well as one for the N currently most interesting posts
POST /posts/ creates a new post where multiple topics can be specified
POST /topics/ creates a new topic
POST /likes/ creates a like for a specified post (does not actually create an object, just adds the user to the given post-object's list of likers, which is invisible to the users)
The problem now becomes: how do I create a relationship between topics and posts in DynamoDB (NoSQL)?
I thought about adding a list of copies of posts to the tag entries in DynamoDB, where every tag has a list of both the newest and the most interesting posts.
One way I could do this is by creating a cloudwatch job that would run every 10 minutes and loop through every topic object, finding both the most interesting and newest entries and then replacing the old lists of the topic.
Another job would also have to regularly update the "interest" value of every non archived post (keep in mind both likes and time have an effect on the interest value).
One problem with this is that a lot of posts in the Tag list would be out of date for 10 minutes in case the User makes a change or deletes the post. Likes will also not be properly tracked on the Tags post list. This could perhaps be solved with transactions, although dynamoDB is limited to 10 objects per transaction.
Another problem is that it would require the add-posts-to-tags job to load all the non-archived posts into memory in order to manually sort them by both time and interest, split them up by tag and then add the first N of both sets to the tag lists every 10 minutes.
I also had another idea: by limiting the number of tags allowed on a post to 1, I could use the tag as the partition key, with the post time as the sort key, and use a GSI to add interest as a second sort key.
This does have several downsides though:
Very popular tags may be limited to a single partition, since all their posts share a single partition key
Tag limit is 1
A cloudwatch job to adjust the Interest value of posts may still be required
It would require use of a GSI which may lead to dangerous race conditions
But it would have the advantage that there are no replications of the post objects aside from the GSI. It would also allow basically infinite paging of all posts by date instead of being limited to just the N newest posts.
So what is a good approach here? It seems both of my solutions have horrible dealbreakers. Is this just one of those problems that NoSQL simply can't solve?
You are trying to model relational data in a non-relational database. To do this, I would use two types of database.
I would store the post information in DynamoDB. In your example, that covers:
GET /posts/{id}
POST /posts/
POST /likes/
For the topic-related information, I would use Elasticsearch (Amazon Elasticsearch Service):
GET /topics/{name}: the search index would store the full topic info as well as the post ids that belong to it, plus the relevant fields you want to search on (in your case, the update date, to get the most recent posts).
This entails a background process (with DynamoDB this can be done via Streams) that picks up changes in DynamoDB, such as new posts and like-count updates, and populates the search index.
Note: this can also be solved with a graph database, but for scaling purposes it is better to separate the source of the data (posts) from the data relations (topics).
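As a rough sketch of the background process mentioned above, the function below turns a single DynamoDB Streams record into a document that could be sent to the search index. The attribute names (post_id, topics, created_at, likes) are hypothetical and not taken from the question, and the sketch assumes the stream is configured to include new images. Actually writing the document to the index is left out.

```python
from typing import Any, Dict, Optional


def post_to_index_doc(record: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Convert one DynamoDB Streams record into a search-index document.

    Returns None for events that carry no new post image (e.g. REMOVE).
    """
    if record.get("eventName") not in ("INSERT", "MODIFY"):
        return None
    # Stream images use DynamoDB's typed attribute format: {"S": ...}, {"N": ...}, {"SS": [...]}.
    image = record["dynamodb"]["NewImage"]
    return {
        "post_id": image["post_id"]["S"],                      # hypothetical attribute name
        "topics": sorted(image.get("topics", {}).get("SS", [])),  # hypothetical string set
        "created_at": image["created_at"]["S"],                # hypothetical ISO timestamp
        "likes": int(image.get("likes", {}).get("N", "0")),    # numbers arrive as strings
    }


if __name__ == "__main__":
    sample = {
        "eventName": "INSERT",
        "dynamodb": {
            "NewImage": {
                "post_id": {"S": "p1"},
                "topics": {"SS": ["music", "art"]},
                "created_at": {"S": "2020-01-01T00:00:00Z"},
                "likes": {"N": "3"},
            }
        },
    }
    print(post_to_index_doc(sample))
```

With documents shaped like this, GET /topics/{name} becomes a search filtered on topics and sorted by created_at or by an interest score.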

Musicbrainz querying artist and release

I am trying to get an artist and their albums. After reading this page https://musicbrainz.org/doc/Development/XML_Web_Service/Version_2 I created the following query to get Michael Jackson's albums:
http://musicbrainz.org/ws/2/artist/?query=artist:michael%20jackson?inc=releases+recordings
My understanding is that adding ?inc=releases+recordings at the end of the URL should return Michael Jackson's albums. However, this doesn't seem to return the correct results, and I can't seem to narrow the results down. I then thought of using the {MBID}, but that isn't returned by the artist query either (which is why I'm trying to use inc in my query):
http://musicbrainz.org/ws/2/artist/?query=artist:michael%20jackson
Can anyone suggest where I'm going wrong with this?
You're not searching for the correct entity. What you want is the discography, not the artist's details. Additionally, the query field syntax is not correct (you must use Lucene search syntax).
Here is what you're looking for:
http://musicbrainz.org/ws/2/release-group/?query=artist:"michael jackson" AND primarytype:"album"
We're targeting the release-group entity to get the albums, searching for a specific artist and filtering the results to limit them to albums. (accepted values are: album, single, ep, other)
There are more options to fit your needs. For example, you can filter the type of albums using the secondarytype field. Here is the query to retrieve only live albums:
http://musicbrainz.org/ws/2/release-group/?query=artist:"michael jackson" AND primarytype:"album" AND secondarytype:"live"
Here is the doc:
https://musicbrainz.org/doc/Development/XML_Web_Service/Version_2/Search
Note that to be able to use MB's API you need to understand how it is structured, especially, the relations between release_group, release and medium.
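Since the queries above contain spaces and quotes, they need to be URL-encoded before being sent. A minimal sketch of building them with Python's standard library follows. The function name release_group_search_url is made up for this example, and fmt=json is added on the assumption that a JSON response is wanted instead of the default XML.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "http://musicbrainz.org/ws/2/release-group/"


def release_group_search_url(artist: str,
                             primary_type: str = "album",
                             secondary_type: Optional[str] = None) -> str:
    """Build a release-group search URL using Lucene field:"value" terms."""
    terms = [f'artist:"{artist}"', f'primarytype:"{primary_type}"']
    if secondary_type:
        terms.append(f'secondarytype:"{secondary_type}"')
    query = " AND ".join(terms)
    # urlencode percent-encodes the quotes and colons and turns spaces into '+'.
    return BASE_URL + "?" + urlencode({"query": query, "fmt": "json"})


if __name__ == "__main__":
    print(release_group_search_url("michael jackson"))
    print(release_group_search_url("michael jackson", secondary_type="live"))
```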

Querying earliest post of a Facebook user using Facebook Graph API or FQL

I would like to query the earliest posts of a Facebook user using FQL or the Graph API. The big issue is that, by default, Facebook limits the number of returned items, which are ordered by descending time.
I know I can bound my query with until, but I don't know what date to use, because I have no idea when the user became a Facebook member. I have to search like this:
find post until Jan 2006
if null, then find post until Jan 2007
if null, then find post until Jan 2008
....
which I hate so much.
Is there a smarter way to find out earliest posts by user?
First off, it's near impossible to have an all encompassing program that determines when a user joined Facebook, to put it quite bluntly. I know from your past questions, you have been trying but many have tried before you, it's not possible.
For example what happens if no one decides to write anything on my wall from the date I joined to 1 year after? That indicator becomes pretty inaccurate now does it?
Anything smarter is based on assumptions that may or may not hold true.
e.g.
Assumption 1: Every Facebook user publishes a post on or near the date they joined.
This gives an initial guess based on A1.
Assumption 2: Given A1, any post by a friend on the user's wall that was posted before the Unix time returned by A1 will be earlier in date.
This will always be true as long as A1 holds.
All of this falls apart when there is a year between the join date and the first actual activity.
You can minimize the set returned by calling less data per item and more items overall
/me/feed?fields=created_time&limit=200
Then you page until there is no next paging parameter left.
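The paging loop described above might look like the sketch below. To keep it independent of any HTTP library, fetch_page is a hypothetical callable that takes a URL and returns the decoded JSON page as a dict. The sketch also assumes that all created_time values share one ISO 8601 format and timezone offset, so that comparing them as strings matches chronological order.

```python
from typing import Any, Callable, Dict, Optional


def earliest_created_time(fetch_page: Callable[[str], Dict[str, Any]],
                          first_url: str) -> Optional[str]:
    """Follow paging.next links until none is left; return the smallest created_time seen."""
    earliest: Optional[str] = None
    url: Optional[str] = first_url
    while url:
        page = fetch_page(url)
        for item in page.get("data", []):
            created = item.get("created_time")
            # Assumption: same-format timestamps compare correctly as strings.
            if created and (earliest is None or created < earliest):
                earliest = created
        # The loop ends when the response carries no "next" paging link.
        url = page.get("paging", {}).get("next")
    return earliest
```

In practice fetch_page would wrap an authenticated GET request, starting from the /me/feed?fields=created_time&limit=200 query.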
If you are indeed trying to find when did a user join Facebook, I agree with phwd's answer.
The best way I have been able to find out (which is also cheaper than having to reiterate through tons of posts) is accessing the earliest "profile pictures" of the user. This is making the assumption that a user would post a profile picture soon after creating their account.
Once you can get access to "Profile Pictures" album, you might be able to use created_time field for the album (or sort Profile Pictures by created_time for individual photos).
Even if the earliest photo was deleted, what are the chances that the user stays without any profile picture for a long time?
Reference:
https://developers.facebook.com/docs/graph-api/reference/v2.0/album

Find User's First Post?

Using the Graph API or FQL, is there a way to efficiently find a user's first post or status? As in, the first one they ever made?
The slow way, I assume, would be to paginate through the feed, but for users like me who joined in 2005 or earlier, that would take a very long time with a huge amount of API calls.
From what I have found, we cannot obtain the date the user registered with Facebook for a good starting point, and we cannot sort by date ascending (not outside of the single page of data returned) to get the oldest post on top.
Is there any reasonable way to do this?
You can use the Facebook Query Language (FQL) to get the first post's information, for example with the query below:
SELECT message, time FROM status WHERE uid= me() ORDER BY time ASC LIMIT 1
I think the public API is limited in how far back it is allowed to query. Facebook probably put these constraints in place for performance and cost reasons. Maybe they've changed it since, but when I tried to go backwards through a person's stream about 4 months ago, there seemed to be a limit on how far back I could go. It may be a time limit or a limit on the number of posts. If you know when your user first posted, then getting to that post should be fairly quick using the since/until timestamps in your queries.

Difference between a post's likes count and the likes data?

I'm seeing a discrepancy between the number of likes reported in the Graph API vs the number of entries in the "data" that has the name and ID of the people who liked a post.
When I view a certain post on Facebook, I see that it has 5 people who have liked it.
When I use the Graph API to fetch the post, the "likes" field has a "data" field with 3 entries in it, and a "count" field whose value is 5.
When I use the Graph API to fetch the likes for the post (eg, {post_id}/likes), I get a "data" field with 5 entries in it (and no "count" field).
Clearly the true answer to how many people have liked the post is 5. But then why are there only 3 entries in the "data" when I fetch the post object?
Here's another example of the same discrepancy:
https://graph.facebook.com/40796308305_10150394134258306 returns data for a post whose "likes/data" only has 1 entry in it, but whose "likes/count" says that there are 3. But https://graph.facebook.com/40796308305_10150394134258306/likes returns "data" with 3 entries. Finding that same entry on Coca-Cola's page finds that there are, in fact, 3 people who have liked it.
The documentation of the post object doesn't mention that the likes list may be incomplete, and the documentation of the FQL stream table explicitly says to use the post object to get the full list, so it's either a bug in the API or in the documentation.
I suspect it may be a deliberate but undesirable "feature" to limit the detailed list for performance reasons, as some posts may have hundreds or even thousands of likes.
It ends up actually causing a huge performance problem as I need to find all posts that have been liked by a particular user, and the only way to do that is to do a separate fetch of likes for each post in the list whose like count is higher than the like list length.
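The check described above (only fetch the full likes list when the count is higher than the number of listed entries) can be sketched like this. The function name is made up for this example, and the input is assumed to be a list of post objects shaped like the responses described in the question, with an optional "likes" field holding "data" and "count".

```python
from typing import Any, Dict, List


def posts_needing_like_fetch(posts: List[Dict[str, Any]]) -> List[str]:
    """Return ids of posts whose embedded likes list is shorter than the reported count."""
    incomplete = []
    for post in posts:
        likes = post.get("likes", {})
        listed = len(likes.get("data", []))
        # If no count is present, treat the embedded list as complete.
        if likes.get("count", listed) > listed:
            incomplete.append(post["id"])
    return incomplete


if __name__ == "__main__":
    sample_posts = [
        {"id": "a", "likes": {"data": [{"id": "1"}], "count": 3}},
        {"id": "b", "likes": {"data": [{"id": "1"}, {"id": "2"}], "count": 2}},
        {"id": "c"},
    ]
    print(posts_needing_like_fetch(sample_posts))
```

Only the returned ids then need a separate {post_id}/likes request.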
2 people have their privacy settings set to not show their name to people who are not their friends.