I am looking for a way to get the last 50 comments on a page's wall, or all comments made on the page's wall within the last hour. The post's date won't matter: it could have been posted two years ago, but if it gets a comment within the hour I need to get it. I don't want to fetch all posts and look through them one by one.
Thank you for your effort.
The first one is easy. Issue an API call to this endpoint:
/PAGE_NAME_OR_ID/feed?fields=comments.limit(50)
You will be restricted to the normal limits of feed, so the comments returned here will only be those made in the last 30 days or on the last 50 posts, whichever is fewer.
If you want the last 50 comments, you'll need to use FQL.
SELECT time, text, text_tags, post_id FROM comment WHERE post_id IN
(SELECT post_id FROM stream WHERE source_id IN
(SELECT id FROM profile WHERE username="cocacola") LIMIT 100)
ORDER BY time DESC LIMIT 50
Keep in mind that Facebook's filtering algorithms operate after FQL, so you may need to increase the LIMIT values substantially to be guaranteed to get 50 results.
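For reference, a minimal Python sketch of issuing both calls over HTTP; the page name is taken from the FQL example above, while the access token and the urllib usage are my own assumptions:
import json
import urllib.parse
import urllib.request

ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"  # placeholder token with read permissions

def graph_get(path, **params):
    # Issue a GET against the Graph API and decode the JSON response.
    params["access_token"] = ACCESS_TOKEN
    url = "https://graph.facebook.com/%s?%s" % (path, urllib.parse.urlencode(params))
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Option 1: comments attached to the recent feed (subject to the feed limits above).
feed = graph_get("cocacola/feed", fields="comments.limit(50)")

# Option 2: the FQL endpoint for the "last 50 comments" query.
fql = ('SELECT time, text, text_tags, post_id FROM comment WHERE post_id IN '
       '(SELECT post_id FROM stream WHERE source_id IN '
       '(SELECT id FROM profile WHERE username="cocacola") LIMIT 100) '
       'ORDER BY time DESC LIMIT 50')
comments = graph_get("fql", q=fql)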
I'm using AWS Machine Learning regression to predict the waiting time in a restaurant's line on a specific weekday/time.
Today I have around 800k data points.
Example Data:
restaurantID (rowID) | weekDay (categorical) | time (categorical) | tablePeople (numeric) | waitingTime (numeric, target)
1 | sun | 21:29 | 2 | 23
2 | fri | 20:13 | 4 | 43
...
I have two questions:
1)
Should I use time as Categorical or Numeric?
Is it better to split it into two fields: minutes and seconds?
2)
I would like in the same model to get the predictions for all my restaurants.
Example:
I expected to send the rowID identifier and get back different predictions based on each restaurant's own data (ignoring the other restaurants' data).
I tried, but it's returning the same prediction for any rowID. Why?
Should I have a model for each restaurant?
There are several problems with the way you set up your model:
1) Time in the form you have it should never be categorical. Your model treats the times 12:29 and 12:30 as two completely independent attributes, so it will never use facts it learns about 12:29 to predict what's going to happen at 12:30. In your case you should make time numeric. I am not sure whether Amazon ML can convert it for you automatically; if not, just multiply the hour by 60 and add the minutes to it. Another interesting thing to do is to bucketize your time by mapping it to a half-hour or wider interval. You do this by dividing (h*60+m) by some number, depending on how many buckets you want; for example, divide by 120 to get 2-hour intervals. Generally, the more data you have, the smaller your intervals can be. The key is to have a lot of samples in each bucket.
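A minimal Python sketch of both transformations (numeric minutes and coarse buckets); the 'HH:MM' format follows the example data in the question:
def time_to_minutes(hhmm):
    # Convert an 'HH:MM' string to minutes since midnight (numeric feature).
    hours, minutes = map(int, hhmm.split(":"))
    return hours * 60 + minutes

def time_to_bucket(hhmm, bucket_minutes=120):
    # Map an 'HH:MM' string to a coarse bucket, e.g. 2-hour intervals.
    return time_to_minutes(hhmm) // bucket_minutes

print(time_to_minutes("21:29"))  # 1289
print(time_to_bucket("21:29"))   # 10, i.e. the 20:00-21:59 interval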
2) You should really think about removing restaurantID from your input data. Having it there will cause the model to overfit on it, so it will not be able to make predictions about the restaurant with id 5 based on the facts it learns from restaurants with id 3 or id 9. Having the restaurant id there might be okay if you have a lot of data about each restaurant and you don't care about extrapolating your predictions to restaurants that are not in the training set.
3) You never send restaurantID to predict data about it. The way it usually works, you need to pick what you are trying to predict; in your case 'waitingTime' is probably the most useful attribute. So you send weekDay, time and the number of people, and the model will output the waiting time.
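To illustrate point 3), a hedged sketch of a real-time prediction request with the boto3 Amazon Machine Learning client; the model id, endpoint, and record values are placeholders, not taken from the question:
import boto3

client = boto3.client("machinelearning", region_name="us-east-1")

# The record carries only the trained features; restaurantID is deliberately absent.
response = client.predict(
    MLModelId="ml-EXAMPLE_MODEL_ID",  # placeholder model id
    Record={"weekDay": "sun", "time": "1289", "tablePeople": "2"},
    PredictEndpoint="https://realtime.machinelearning.us-east-1.amazonaws.com",
)
print(response["Prediction"]["predictedValue"])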
You should think what is relevant for the prediction to be accurate, and you should use your domain expertise to define the features/attributes you need to have in your data.
For example, time of day is not just a number. From my limited understanding of restaurants, I would drop the minutes and focus only on the hours.
I would certainly create a model for each restaurant, as the popularity of the restaurant or the type of food it serves has an impact on the wait time. With Amazon ML it is easy to create many models, as you can build each model using the SDK, and even schedule retraining of the models using AWS Lambda (that is, automatically).
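As a sketch of that idea with the boto3 SDK, a loop creating one regression model per restaurant; it assumes a training data source already exists per restaurant, and every id shown is a placeholder:
import boto3

client = boto3.client("machinelearning", region_name="us-east-1")

restaurant_ids = [1, 2, 3]  # placeholder list of restaurants

for rid in restaurant_ids:
    # Assumes per-restaurant data sources (e.g. filtered CSVs in S3) were
    # created beforehand, e.g. with create_data_source_from_s3.
    client.create_ml_model(
        MLModelId="waiting-time-restaurant-%d" % rid,
        MLModelName="Waiting time model for restaurant %d" % rid,
        MLModelType="REGRESSION",
        TrainingDataSourceId="ds-restaurant-%d" % rid,  # placeholder data source id
    )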
I'm not sure what the feature called tablePeople means, but a general recommendation is to have as many relevant features as possible to get a better prediction. For example, month or season is probably important as well.
In contrast with some answers to this post, I think restaurantID helps, and it actually gives valuable information. If you have a significant amount of data per restaurant then you can train a model per restaurant and get good accuracy, but if you don't have enough data then restaurantID is very informative.
1) Just imagine you had only two columns in your dataset: restaurantID and waitingTime. Then wouldn't you think the restaurantID from the testing data helps you find a rough waiting time? In the simplest implementation, your waiting time per restaurantID would be the average of waitingTime. So restaurantID is definitely valuable information. Now that you have more features in your dataset, you need to check whether restaurantID is as effective as the other features or not.
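That baseline is a one-liner in pandas; a small sketch (the column names follow the example data in the question, the file path is a placeholder):
import pandas as pd

df = pd.read_csv("restaurants.csv")  # placeholder path to the training data

# Simplest possible predictor: the mean waiting time per restaurant.
baseline = df.groupby("restaurantID")["waitingTime"].mean()
print(baseline)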
2) If you decide to keep restaurantID, then you must use it as a categorical string. It should be a non-parametric feature in your dataset, and maybe that's why you did not get a proper result.
On the issue of day and time I agree with the other answers; considering that you are building your model for a restaurant, hourly time may give a more accurate result.
I want to answer this question:
Does the average time on page A (or more accurately page group A) affect the conversion rate of goal B?
So far in the GUI I have:
A) Created an advanced segment of Time on Page >= 120 ("per hit" option):
http://grab.by/tKOA
B) Modified the segment to also add a filter for Page = regex matching my group:
http://grab.by/tKOU
...But I don't know if this gives me the results I'm after; that is, whether they are accurate.
I have some other ideas, including assigning the page group as a funnel step and then segmenting by Time on Page; I'm still waiting on data to come in for that one.
I want to know if there is a better solution or if I'm on the right track.
Drewdavid,
Your approach is quite smart and correct, I would say; however, keep in mind that in this context you are mixing different scopes:
Time on Page is a page-level metric
Page seen is a visit-level dimension
What you would get in your report is the average time on page calculated from all the pages that were seen during visits which met the regex condition set in the filter (that's what a segment does: it includes all the pages, not just those you want to filter on). I know this can be confusing, but see this article, which gives more examples and goes into greater detail.
To achieve what you are after, remove the segment filter and simply use the advanced filtering above the report table (and choose exactly the same regex you mentioned in your question).
Hope this helps!
Currently I am using the following API call to retrieve Post Likes and Post Comments for a Facebook Page (PageId). Below I am making only one API call and retrieving ALL posts and their total comment counts.
1). https://graph.facebook.com/PageId/posts?access_token=xyz&method=GET&format=json
But, as per the "July 2013 Breaking Changes", comment counts are no longer available with the above API call. So, as per the Road Map documentation, I am using the following API call to retrieve the comments count ('total_count') for a particular POST ID.
2). https://graph.facebook.com/post_ID/?summary=true&access_token=xyz&method=GET&format=json
So, with the second API call I am able to retrieve the comments count per post. But, as you can see, I need to iterate through each post and retrieve its comments count one by one per post id, then sum them all up to find the total comments count. That requires too many API calls.
My question is: Is it possible to retrieve Page -> Posts -> ALL comments total count in a single API call, considering the July 10 breaking changes?
Is there any alternative to my second API call for retrieving the total comments count across all of a Facebook page's posts?
Hmm, well, I don't believe there is a way to bundle this all into a single API call. But you can batch requests to get this in what is effectively one API call (it will save time), although they will count against your rate limits separately (my example below would be 4 calls against the limits).
Example batch call (JSON encoded); I'm storing the post ID in the PHP variable $postId:
[{"method":"GET","relative_url":"' . $postId . '"},
{"method":"GET","relative_url":"' . $postId . '/likes?limit=1000&summary=true"},
{"method":"GET","relative_url":"' . $postId . /comments?filter=stream&limit=1000&summary=true"},
{"method":"GET","relative_url":"' . $postId . '/insights"}]
I'm batching 4 queries in this single call: first to get the post info, second to get likes (up to 1000, plus the total count), third to get all the comments plus the summary count, and finally insights (if it's the page's own post).
You can drastically simplify this batch call if you don't want all the details I'm pulling.
In this case you still need to iterate through all of them. But Facebook allows you to bundle up to 50 calls per batch request, I believe, so you could request multiple post ids in the same batch call to speed things up too.
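For completeness, a minimal Python sketch of sending such a batch request; the access token and post ids are placeholders, and using the requests library is my own assumption:
import json
import requests

ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"  # placeholder
post_ids = ["PAGEID_POSTID1", "PAGEID_POSTID2"]  # placeholder post ids

# One sub-request per post, asking only for the comments summary count.
batch = [{"method": "GET",
          "relative_url": "%s/comments?filter=stream&limit=1000&summary=true" % pid}
         for pid in post_ids]

resp = requests.post("https://graph.facebook.com",
                     data={"access_token": ACCESS_TOKEN, "batch": json.dumps(batch)})

for sub in resp.json():
    body = json.loads(sub["body"])  # each sub-response body is a JSON string
    print(body.get("summary", {}).get("total_count"))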
I am trying to use the Graph API with the limit and since parameters.
I think the highest limit is 5000, so I am using that (I want to make the fewest calls).
I am also trying to look 1 month back.
So I try:
https://graph.facebook.com/[ID of page]/feed?access_token=[accesstoken]&limit=5000&since=11-12-24
and I get 207 results, with the earliest date being December 24th. This is all fine; it's saying there are only 207 results in the last month. The problem is there is a next link:
"next": "https://graph.facebook.com/[id of page]/feed?limit=5000&until=1324702511"
If I get this page, I start getting posts before December 24th.
So my question is: how can I be sure I get all posts after a given date with the fewest calls?
The kludge I am thinking of is to set since on the first call to one day earlier; then if I get a post with that date, I know I got them all, and if not, I paginate... 5000 posts in one month is a lot, but I think it's possible...
It seems like Facebook should provide a way to use since with the highest limit possible... I read this http://developers.facebook.com/blog/post/478/ but I'm still confused.
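A minimal sketch of the pagination kludge described above: follow the next links and stop as soon as a post older than the cutoff shows up (the page id, access token, and the requests library are assumptions on my part):
import requests

ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"  # placeholder
SINCE = "2011-11-24T00:00:00+0000"  # placeholder cutoff in created_time format

url = "https://graph.facebook.com/PAGE_ID/feed"
params = {"access_token": ACCESS_TOKEN, "limit": 5000, "since": SINCE}

posts = []
while url:
    data = requests.get(url, params=params).json()
    page = data.get("data", [])
    # Keep posts at or after the cutoff; stop once older ones appear.
    fresh = [p for p in page if p["created_time"] >= SINCE]
    posts.extend(fresh)
    if not page or len(fresh) < len(page):
        break
    url = data.get("paging", {}).get("next")
    params = {}  # the next link already carries its own query parameters
print(len(posts))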
After hours of searching the web (including SO), I am asking the community for advice. RRD seems to be the right tool for this, but I could not get a straight answer until now.
My question is: Is it possible to get RRD to output a graph for the day that averages data from the past year?
In other words, I want the "view span" to be one day long, but the "data span" to extend over the last 12 months, so that the value at 6pm is computed as the average of ALL traffic measured at 6pm over the last 12 months.
Any hints or instructions are welcome!
There is no direct way to create such a graph. In theory it would be possible using multiple DEF lines together with the SHIFT operation to build such a chart; you would have to use a program to generate the necessary command line, though.
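A rough sketch of such a generator in Python, averaging the same day across the last 52 weeks; the RRD file name, data source name, and output file are assumptions, and unknown-data handling is omitted for brevity:
# Generates an rrdtool graph command that overlays the last 52 weeks of data,
# shifted onto today, and draws their average.
WEEK = 7 * 24 * 3600  # one week in seconds
RRD, DS = "traffic.rrd", "traffic"  # placeholder file and data source names

parts = ["rrdtool graph day_avg.png --end now --start end-1d"]
for i in range(1, 53):
    # Define the same one-day window i weeks ago, then SHIFT it onto today.
    parts.append("DEF:w%d=%s:%s:AVERAGE:end=now-%dw:start=end-1d" % (i, RRD, DS, i))
    parts.append("SHIFT:w%d:%d" % (i, i * WEEK))

# RPN expression that sums the 52 shifted series and divides by 52.
rpn = "w1," + ",".join("w%d,+" % i for i in range(2, 53)) + ",52,/"
parts.append("CDEF:avg=%s" % rpn)
parts.append("LINE2:avg#0000FF:52-week-average")

print(" ".join(parts))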