How do you handle large relationship data attributes and compound documents? - ember.js

If an article has many comments (think thousands over time), should data.relationships.comments be returned with a limit?
{
"data": [
{
"type": "articles",
"id": "1",
"attributes": {
"title": "Some title"
},
"relationships": {
"comments": {
"links": {
"related": "https://www.foo.com/api/v1/articles/1/comments"
},
"data": [
{ "type": "comments", "id": "1" },
...
{ "type": "comments", "id": "2000" }
]
}
}
}
],
"included": [
{
"type": "comments",
"id": "1",
"attributes": {
"body": "Lorem ipsum"
}
},
...
{
"type": "comments",
"id": "2000",
"attributes": {
"body": "Lorem ipsum"
}
}
]
}
This starts to feel concerning when you think of compound documents (http://jsonapi.org/format/#document-compound-documents): the included section will list all comments as well, making the JSON payload quite large.

If you want to limit the number of records you get at a time from a long list, use pagination (JSON API spec).
I would load the comments separately with store.query (ember docs), like so:
store.query('comments', { author_id: <author_id>, page: 3 });
which will return the relevant subset of comments.
If you don't want to make two requests per author initially, you could include the first 'page' in the author's request, as you're doing now.
You may also want to look into an addon like Ember Infinity (untested), which will provide an infinite scrolling list and automatically make pagination requests.
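On the server side, limiting the relationship linkage to one page could look roughly like the sketch below. It is a minimal illustration, not taken from the original answer: the helper name build_comments_relationship is hypothetical, and the page[number] and page[size] query parameters are just one page-based strategy the JSON API spec mentions as an example (brackets are left unencoded here for readability).

```python
def build_comments_relationship(article_id, comment_ids, page_number, page_size,
                                base_url="https://www.foo.com/api/v1"):
    """Return a JSON:API relationship object holding one page of comment linkage."""
    start = (page_number - 1) * page_size
    page_ids = comment_ids[start:start + page_size]

    related = "{}/articles/{}/comments".format(base_url, article_id)
    links = {"related": related}
    # Only advertise a "next" link when more comments remain after this page.
    if start + page_size < len(comment_ids):
        links["next"] = "{}?page[number]={}&page[size]={}".format(
            related, page_number + 1, page_size)

    return {
        "links": links,
        # Resource identifiers use the plural type and string ids.
        "data": [{"type": "comments", "id": str(cid)} for cid in page_ids],
    }


first_page = build_comments_relationship(1, list(range(1, 2001)), 1, 25)
print(len(first_page["data"]), first_page["links"].get("next"))
```

The client then follows the next link, or issues store.query with the next page number, only when the user actually asks for more comments.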

Related

How to interpret user search query (in Elasticsearch)

I would like to serve my visitors the best results possible when they use our search feature.
To achieve this I would like to interpret the search query.
For example a user searches for 'red beds for kids 120cm'
I would like to interpret it as follows:
Category-Filter is "beds" AND "children"
Color-filter is red
Size-filter is 120cm
Are there ready to go tools for Elasticsearch?
Will I need NLP in front of Elasticsearch?
Elasticsearch is pretty powerful on its own and is very much capable of returning the most relevant results to full-text search queries, provided that data is indexed and queried adequately.
Under the hood it always performs text analysis for full-text searches (for fields of type text). A text analyzer consists of zero or more character filters, exactly one tokenizer, and zero or more token filters.
For instance, a synonym token filter can replace kids with children in the user query.
On top of that, search on modern websites is often assisted by category selectors in the UI, which can easily be implemented by querying keyword fields in Elasticsearch.
It might be enough to model your data correctly and tune its indexing to implement the search you need. If that is not enough, you can always add an extra layer of NLP-like logic on the client side, like @2ps suggested.
Now let me show a toy example of what you can achieve with a synonym token filter and the copy_to feature.
Let's define the mapping
Let's pretend that our products are characterized by the following properties: Category, Color, and Size.LengthCM.
The mapping will look something like:
PUT /my_index
{
"mappings": {
"properties": {
"Category": {
"type": "keyword",
"copy_to": "DescriptionAuto"
},
"Color": {
"type": "keyword",
"copy_to": "DescriptionAuto"
},
"Size": {
"properties": {
"LengthCM": {
"type": "integer",
"copy_to": "DescriptionAuto"
}
}
},
"DescriptionAuto": {
"type": "text",
"analyzer": "MySynonymAnalyzer"
}
}
},
"settings": {
"index": {
"analysis": {
"analyzer": {
"MySynonymAnalyzer": {
"tokenizer": "standard",
"filter": [
"MySynonymFilter"
]
}
},
"filter": {
"MySynonymFilter": {
"type": "synonym",
"lenient": true,
"synonyms": [
"kid, kids => children"
]
}
}
}
}
}
}
Notice that we selected type keyword for the fields Category and Color.
Now, what about these copy_to and synonym?
What will copy_to do?
Every time we send an object for indexing into our index, the value of the keyword field Category will be copied to the full-text field DescriptionAuto. The same happens for Color and Size.LengthCM. This is what copy_to does.
What will synonym do?
To enable synonym we need to define a custom analyzer, see MySynonymAnalyzer which we defined under "settings" above.
Roughly, it will replace every token that matches something on the left of => with the token on the right.
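To make the "left of => is replaced by the right" idea concrete, here is a toy sketch in Python. It is not how Elasticsearch implements the filter: the function names are hypothetical, and a plain whitespace split stands in for the standard tokenizer.

```python
SYNONYM_RULES = ["kid, kids => children"]


def parse_rules(rules):
    """Turn 'a, b => c' rules into a {token: replacement} mapping."""
    mapping = {}
    for rule in rules:
        left, right = rule.split("=>")
        for token in left.split(","):
            mapping[token.strip()] = right.strip()
    return mapping


def analyze(text, mapping):
    """Tokenize on whitespace and replace every token that has a synonym rule."""
    tokens = text.split()  # crude stand-in for the standard tokenizer
    return [mapping.get(token, token) for token in tokens]


print(analyze("red bed for kids 120cm", parse_rules(SYNONYM_RULES)))
# → ['red', 'bed', 'for', 'children', '120cm']
```

Because the same analyzer runs on the indexed text and on the query, a query containing kids ends up searching for children.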
What will the documents look like?
Let's insert a few example documents:
POST /my_index/_doc
{
"Category": [
"beds",
"adult"
],
"Color": "red",
"Size": {
"LengthCM": 150
}
}
POST /my_index/_doc
{
"Category": [
"beds",
"children"
],
"Color": "red",
"Size": {
"LengthCM": 120
}
}
POST /my_index/_doc
{
"Category": [
"couches",
"adult",
"family"
],
"Color": "blue",
"Size": {
"LengthCM": 200
}
}
POST /my_index/_doc
{
"Category": [
"couches",
"adult",
"family"
],
"Color": "red",
"Size": {
"LengthCM": 200
}
}
As you can see, DescriptionAuto is not present in the original documents - though due to copy_to we will be able to query it.
Let's see how.
Performing the search!
Now we can try out our index with a simple query_string query:
POST /my_index/_search
{
"query": {
"query_string": {
"query": "red beds for kids 120cm",
"default_field": "DescriptionAuto"
}
}
}
The results will look something like the following:
"hits": {
...
"max_score": 2.3611186,
"hits": [
{
...
"_score": 2.3611186,
"_source": {
"Category": [
"beds",
"children"
],
"Color": "red",
"Size": {
"LengthCM": 120
}
}
},
{
...
"_score": 1.0998137,
"_source": {
"Category": [
"beds",
"adult"
],
"Color": "red",
"Size": {
"LengthCM": 150
}
}
},
{
...
"_score": 0.34116736,
"_source": {
"Category": [
"couches",
"adult",
"family"
],
"Color": "red",
"Size": {
"LengthCM": 200
}
}
}
]
}
The document with categories beds and children and color red is on top, and its relevance score is more than twice that of the runner-up!
How can I check how Elasticsearch interpreted the user's query?
It is easy to do via the analyze API:
POST /my_index/_analyze
{
"text": "red bed for kids 120cm",
"analyzer": "MySynonymAnalyzer"
}
The response will look like this:
{
"tokens": [
{
"token": "red",
"start_offset": 0,
"end_offset": 3,
"type": "<ALPHANUM>",
"position": 0
},
{
"token": "bed",
"start_offset": 4,
"end_offset": 7,
"type": "<ALPHANUM>",
"position": 1
},
{
"token": "for",
"start_offset": 8,
"end_offset": 11,
"type": "<ALPHANUM>",
"position": 2
},
{
"token": "children",
"start_offset": 12,
"end_offset": 16,
"type": "SYNONYM",
"position": 3
},
{
"token": "120cm",
"start_offset": 17,
"end_offset": 22,
"type": "<ALPHANUM>",
"position": 4
}
]
}
As you can see, there is no token kids, but there is token children.
On a side note, in this example Elasticsearch wasn't able to parse the size of the bed: the token 120cm didn't match anything, since all sizes are integers, like 120, 150, etc. Another layer of tweaking will be needed to extract 120 from the 120cm token.
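One way to do that extra tweaking is to rewrite the query on the client side before sending it to Elasticsearch. The following is a minimal sketch under that assumption; the function name normalize_sizes and the rule "digits followed by cm" are illustrative choices, not part of the original setup.

```python
import re

# Matches tokens such as "120cm" or "150CM" and captures the number.
SIZE_PATTERN = re.compile(r"^(\d+)cm$", re.IGNORECASE)


def normalize_sizes(query):
    """Rewrite tokens like '120cm' to '120' so they can match integer sizes."""
    rewritten = []
    for token in query.split():
        match = SIZE_PATTERN.match(token)
        rewritten.append(match.group(1) if match else token)
    return " ".join(rewritten)


print(normalize_sizes("red beds for kids 120cm"))
# → red beds for kids 120
```

A similar rewrite could probably be done inside the analyzer itself, for example with a pattern_replace character filter, so that the client does not need to know about units.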
I hope this gives an idea of what can be achieved with Elasticsearch's built-in text analysis capabilities!

How to get school information by id?

I am trying to find a way to get some information (address, geo-data and so on) about an educational institution by its id.
My current request returns the following object:
{
"education": [
{
"school": {
"id": "110415448986654",
"name": "BULME"
},
"type": "High School",
"id": "587913304561851"
},
{
"school": {
"id": "114054375298355",
"name": "HTL Bulme Graz-Gösting"
},
"type": "College",
"id": "587913327895182"
}
],
"id": "769605149725998"
}
I see that there are IDs for the schools, but how can I use them to load some data about each institution?
I have read a lot on the Facebook developers pages but I can't find a solution.

Extract specific users comments from a list using Wikipedia API and Python 2.7

I am using the Wikipedia API through the wikitools package to extract some data from Wikipedia. I get output in the format shown below, and now I want to extract the timestamp and the comment for revisions made by a specific user across several pages. Let's say I just want the comments made by TechBot; I figured I could do something like:
for revision in res["query"]["pages"]["7940378"]["revisions"]:
    if revision["user"] == "TechBot":
        do.something()
But the problem is ["7940378"], because this is a unique page ID that changes for every page, and I don't know how to get the page ID. Is there another way of doing this?
[{
"query": {
"pages": {
"7940378": {
"ns": 0,
"pageid": 7940378,
"revisions": [
{
"comment": "robot Modifying: [[az:T\u00fcrk Tarixi]]",
"timestamp": "2009-01-03T19:47:11Z",
"user": "TechBot"
},
{
"comment": "",
"timestamp": "2009-02-14T02:07:49Z",
"anon": "",
"user": "88.231.237.130"
},
{
"comment": "fixing recent deletion by merging it with the next paragraph",
"timestamp": "2009-04-03T14:49:27Z",
"user": "Soap"
},
{
"comment": "robot Modifying: [[az:T\u00fcrk tarixi]]",
"timestamp": "2009-04-09T14:35:19Z",
"user": "RibotBOT"
},
{
"comment": "Repairing link to disambiguation page - [[Wikipedia:Disambiguation pages with links|You can help!]]",
"timestamp": "2009-06-12T23:55:55Z",
"user": "J04n"
}
],
"title": "History of the Turkic peoples"
}
}
},
"continue": {
"rvcontinue": "20090807172715|306635892",
"continue": "||"
},
"warnings": {
"main": {
"*": "Unrecognized parameter: 'user'"
}
}
}]
Instead of using a single for loop, you can split it into two loops: the outer loop iterates over the pages, and the inner loop iterates over each page's revisions.
for pageid, pagedetails in res["query"]["pages"].iteritems():
    for revision in pagedetails["revisions"]:
        if revision["user"] == "TechBot":
            do.something()
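As a self-contained variant of the two loops above, the sketch below collects the timestamp and comment of every revision by a given user. The function name revisions_by_user and the trimmed sample dictionary are only illustrative. It uses .values(), which works on both Python 2.7 and Python 3, since the page ID itself is not needed.

```python
def revisions_by_user(res, user):
    """Return (title, timestamp, comment) for each revision made by `user`."""
    results = []
    for page in res["query"]["pages"].values():
        # A page may come back without revisions, so fall back to an empty list.
        for revision in page.get("revisions", []):
            if revision.get("user") == user:
                results.append(
                    (page["title"], revision["timestamp"], revision["comment"]))
    return results


# Trimmed copy of the response shown in the question.
sample = {
    "query": {
        "pages": {
            "7940378": {
                "pageid": 7940378,
                "title": "History of the Turkic peoples",
                "revisions": [
                    {"comment": "robot Modifying: [[az:T\u00fcrk Tarixi]]",
                     "timestamp": "2009-01-03T19:47:11Z", "user": "TechBot"},
                    {"comment": "", "timestamp": "2009-02-14T02:07:49Z",
                     "anon": "", "user": "88.231.237.130"},
                ],
            }
        }
    }
}

for title, timestamp, comment in revisions_by_user(sample, "TechBot"):
    print(title, timestamp, comment)
```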

Search words with 'and' logical condition with Facebook Graph API

Using the Facebook Graph API, I am trying to search for all public pages related to two or more words. I want the AND condition to be satisfied.
I am trying to use a query like this:
https://graph.facebook.com/v2.5/search?access_token=my_token&type=page&q=marziano+venusiano&limit=1000
but it gives me an empty data response.
I've tried something suggested in old questions, but it no longer seems to work.
What is the right syntax to use, if one exists?
I suspect there is no page with these two words in the name. If you try scuderia and ferrari, the results look as desired:
/search?type=page&q=scuderia+ferrari
returns
{
"data": [
{
"name": "Scuderia Ferrari",
"id": "500214176674878"
},
{
"name": "Scuderia Ferrari",
"id": "105467226153743"
},
{
"name": "Scuderia Ferrari Club Prealpi Venete",
"id": "342159749154826"
},
{
"name": "Scuderia Ferrari Club Prato",
"id": "1414918342093728"
},
{
"name": "Scuderia Ferrari Club Zola Predosa",
"id": "226860530777232"
},
...
],
"paging": {
"cursors": {
"before": "MAZDZD",
"after": "MjQZD"
},
"next": "https://graph.facebook.com/v2.5/search?access_token=&pretty=0&q=scuderia+ferrari&type=page&limit=25&after=MjQZD"
}
}
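For reference, the request URL used above can be assembled as in the following sketch (Python 3). The helper names are hypothetical. The second function is a client-side check that a page name contains every search word; it is included only as an assumption about what "AND" should mean here, since the answer merely suspects how the search endpoint matches names.

```python
from urllib.parse import urlencode


def build_page_search_url(words, access_token, limit=25, version="v2.5"):
    """Build a page search URL; urlencode joins the words with '+'."""
    params = {
        "type": "page",
        "q": " ".join(words),
        "access_token": access_token,
        "limit": limit,
    }
    return "https://graph.facebook.com/{}/search?{}".format(
        version, urlencode(params))


def name_has_all_words(name, words):
    """Client-side AND check: every word must occur in the page name."""
    lowered = name.lower()
    return all(word.lower() in lowered for word in words)


print(build_page_search_url(["scuderia", "ferrari"], "my_token"))
print(name_has_all_words("Scuderia Ferrari Club Prato", ["scuderia", "ferrari"]))
```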

What is the "likes" field from an FB Open Graph GET og.likes request?

I've implemented functionality to Like a non-FB URL in a cross-platform mobile app (Phonegap) I'm developing. Part of the functionality is that I need to find out whether a user has liked a URL before, hence a GET on the og.likes object. In the result of this request there's a field in the og.likes data that I'm unsure about.
My request:
GET me/og.likes?access_token={AccessToken}&object={EncodedExternalURL}
The response:
{
"data": [
{
"id": "_____",
"from": {
"id": "______",
"name": "______"
},
"start_time": "2015-01-12T06:17:24+0000",
"end_time": "2015-01-12T06:17:24+0000",
"publish_time": "2015-01-12T06:17:24+0000",
"application": {
"name": "______",
"namespace": "______",
"id": "______"
},
"data": {
"object": {
"id": "____",
"url": "____",
"type": "website",
"title": "____"
}
},
"type": "og.likes",
"no_feed_story": false,
"likes": { // <-- this guy here and its properties
"count": 0,
"can_like": true,
"user_likes": false
},
"comments": {
"count": 0,
"can_comment": true,
"comment_order": "chronological"
}
}
],
"paging": {
"next": "____"
}
}
What is the likes field? And what are its sub-properties count, can_like and user_likes? Is it that other users can like this Like?
The likes field describes the likes that this Like itself has received. Users can like or comment on a Like.