RESTservice, resource with two different outputs - how would you do it? - web-services

Im currently working on a more or less RESTful webservice, a type of content api for my companys articles. We currently have a resource for getting all the content of a specific article
http://api.com/content/articles/{id}
will return a full set of article data of the given article id.
Currently we control alot of the article's business logic becasue we only serve a native-app from the webservice. This means we convert tags, links, images and so on in the body text of the article, into a protocol the native-app can understand. Same with alot of different attributes and data on the article, we will transform and modify its original (web) state into a state that the native-app will understand.
fx. img tags will be converted from a normal <img src="http://source.com"/> into a <img src="inline-image//{imageId}"/> tag, samt goes for anchor tags etc.
Now i have to implement a resource that can return the articles data in a new representation
I'm puzzled over how best to do this.
I could just implement a completely new resource, on a different url like: content/articles/web/{id} and move the old one to content/article/app/{id}
I could also specify in my documentation of the resource, that a client should always specify a specific request header maybe the Accept header for the webservice to determine which representation of the article to return.
I could also just use the original url, and use a url parameter like .../{id}/?version=app or .../{id}/?version=web
What would you guys reckon would be the best option? My personal preference lean towards option 1, simply because i think its easier to understand for clients of the webservice.
Regards, Martin.
EDIT:
I have chosen to go with option 1. Thanks for helping out and giving pros and cons. :)

I would choose #1. If you need to preserve the existing URLS you could add a new one content/articles/{id}/native or content/native-articles/{id}/. Both are REST enough.
Working with paths make content more easily cacheable than both header or param options. Using Content-Type overcomplicates the service especially when both are returning JSON.

Use the HTTP concept of Content Negotiation. Use the Accept header with vendor types.
Get the articles in the native representation:
GET /api.com/content/articles/1234
Accept: application/vnd.com.exmaple.article.native+json
Get the articles in the original representation:
GET /api.com/content/articles/1234
Accept: application/vnd.com.exmaple.article.orig+json

Option 1 and Option 3
Both are perfectly good solutions. I like the way Option 1 looks better, but that is just aesthetics. It doesn't really matter. If you choose one of these options, you should have requests to the old URL redirect to the new location using a 301.
Option 2
This could work as well, but only if the two responses have a different Content-Type. From the description, I couldn't really tell if this was the case. I would not define a custom Content-Type in this case just so you could use Content Negotiation. If the media type is not different, I would not use this option.

Perhaps option 2 - with the header being a Content-Type?
That seems to be the way resources are served in differing formats; e.g. XML, JSON, some custom format

Related

Correct REST API URL format for related objects

I'm designing a REST API where, amongst others, there are two objects.
Journey
Report
For each Journey there are many Reports enroute, and each Report has exactly one associated Journey.
A user might create a Journey using the API as follows...
POST /journey/
Then retrieve the details...
GET /journey/1226/
The first question is, if a user wanted to post an Report to their Journey, which is the 'correct' URL structure that the API should impose? This seems intuitive to me...
POST /journey/1226/report/
...or is this the case...
POST /report/
...whereby in the latter, the Journey ID is passed in the request body somewhere?
The second question is, how might one go about implementing the first case in a tool such as the Django REST framework?
Thanks!
The URL/URI structure is almost completely irrelevant. It is nice to be able to read it, or easily change or even guess it, but that is it. There is no "requirement" official or unwritten how they should look like.
The point is however, that you supply the URIs to your clients in your responses. Each GET will get you a representation that contains links to the next "states" that your client can reach. This means the server has full control over URI structure, the client usually has to only know the "start" or "homepage" URI, and that's it.
Here is an article which discusses this question, has some good points: http://www.ben-morris.com/hackable-uris-may-look-nice-but-they-dont-have-much-to-do-with-rest-and-hateoas/
Pass for the second question :) I didn't use that particular framework.

Read AICC Server response in cross domain implementation

I am currently trying to develop a web activity that a client would like to track via their Learning Management System. Their LMS uses the AICC standard (HACP binding), and they keep the actual learning objects on a separate content repository.
Right now I'm struggling with the types of communication between the LMS and the "course" given that they sit on two different servers. I'm able to retreive the sessionId and the aicc_url from the URL string when the course launches, and I can successfully post values to the aicc_url on the LMS.
The difficulty is that I can not read and parse the return response from the LMS (which is formatted as plain text). AICC stipulates that the course start with posting a "getParam" command to the aicc_url with the session id in order to retrieve information like completion status, bookmarking information from previous sessions, user ID information, etc, all of which I need.
I have tried three different approaches so far:
1 - I started with using jQuery (1.7) and AJAX, which is how I would typically go about a same-server implementation. This returned a "no transport" error on the XMLHttpRequest. After some forum reading, I tried making sure that the ajax call's crossdomain property was set to true, as well as a recommendation to insert $.support.cors = true above the ajax call, neither of which helped.
2 & 3 - I tried using an oldschool frameset with a form in a bottom frame which would submit and refresh with the returned text from the LMS and then reading that via javascript; and then a variation upon that using an iFrame as a target of an actual form with an onload handler to read and parse the contents. Both of these approaches worked in a same-server environment, but fail in the cross-domain environment.
I'm told that all the other courses running off the content repository bookmark as well as track completion, so obviously it is possible to read the return values from the LMS somehow; AICC is pitched frequently as working in cross-server scenarios, so I'm thinking there must be a frequently-used method to doing this in the AICC structure that I am overlooking. My forum searches so far haven't turned up anything that's gotten me much further, so if anyone has any experience in cross-domain AICC implementations I could certainly use recommendations!
The only idea I have left is to try setting up a PHP "relay" form on the same server as the course, and having the front-end page send values to that, and using the PHP to submit those to the LMS, and relay the return text from the LMS to the front-end iframe or ajax call so that it would be perceived as being within the same domain.... I'm not sure if there's a way to solve the issue without going server-side. It seems likely there must be a common solution to this within AICC.
Thanks in advance!
Edits and updates:
For anyone encountering similar problems, I found a few resources that may help explain the problem as well as some alternate solutions.
The first is specific to Plateau, a big player in the LMS industry that was acquired by Successfactors. It's some documentation that provide on setting up a proxy to handle cross-domain content:
http://content.plateausystems.com/ContentIntegration/content/support_files/Cross-domain_Proxlet_Installation.pdf
The second I found was a slide presentation from Successfactors that highlights the challenge of cross-domain content, and illustrates so back-end ideas for resolving it; including the use of reverse proxies. The relevant parts start around slide 21-22 (page 11 in the PDF).
http://www.successfactors.com/static/docs/successconnect/sf/successfactors-content-integration-turley.pdf
Hope that helps anyone else out there trying to resolve the same issues!
The answer in this post may lead you in the right direction:
Best Practice: Legitimate Cross-Site Scripting
I think you are on the right track with setting up a PHP "relay." I think this is similar to choice #1 in the answer from the other post and seems to make most sense with what you described in your question.

Can you build a truly RESTful service that takes many parameters?

After reading an article on REST ("Restful Grails"), I have gotten the impression that it is not possible to truly conform to a REST style in a service that demands a lot of parameters. Is this so? All the examples I have seen so far seem to imply that true REST style services are "parameterless". Using parameters would be RPC-ish and not truly RESTful.
To be more specific, say we have a service that returns graph data for stock prices, and this service needs to know the start date, end date, the currency, stock name, and whatever else might be applicable. In any case, at least 4-5 parameters are needed to retrieve the information needed.
I would imagine the URL to be something like this : /stocks/YAHOO?startDate="2008-09-01"&endDate=...
("YAHOO" is here a made-up stock name).
Would this really be REST or is this more RPC-like, what the author of the aforementioned article calls "GETful" (i.e. just low ceremony rpc)?
You can see the querystring as a filter on the resource you are GETing. Here, your resource is the stock prices of yahoo. Doing a GET on that resource give you all the available data, or the most recents. The query string filter the prices you want. Content negociation allow you to change the representation, e.g. a png graph, a csv file, and so on. To add a price, simply POST a representation (e.g. CSV) to the same resource.
The "restfulness" is not realy in the URL itself, since URIs are obscures to client, but in the way you interact with resources themselves identified by their URI
Feel free to use as many parameters as you need to identify the resource you wish to access. REST doesn't care.
Why would you think it is not possible?
Google uses REST for their charts api, and they take alot of params:
http://chart.apis.google.com/chart?cht=bvg&chs=350x300&chd=t:20,35,10&chxr=1,0,40&chds=0,40&chco=FF0000|FFA000|00FF00&chbh=65,0,35&chxt=x,y,x&chxl=0:|High|Medium|Low|2:||Task+Priority||&chxs=2,000000,12&chtt=Tasks+on+my+To+Do+list&chts=000000,20&chg=0,25,5,5

How to solve two REST problems: the interface document; loss of privacy in descriptive URLs

Coming from a lot of frustrating times with WSDL/Soap, I very much like the REST paradigm, but am trying to solve two basic problems in our application, before moving over to REST. The first problem relates to the lack of an interface document. I think I finally see how to handle this situation: One can query his way down from a top-level "/resources" resource using various requests of GET, HEAD, and OPTIONS to find the one needed resource in the correct hypermedia format. Is this the idea? If so, the client need only be provided with a top-level resource URI: http://www.mywebservicesite.com/mywebservice/resources. He will then have to do some searching and possible keep track of what he is discovering, so that he can use the URIs again efficiently in future to do GETs, POSTs, PUTs, and DELETEs. Are there any thoughts on what should happen here?
The other problem is that we cannot use descriptive URLs like /resources/../customer/Madonna/phonenumber. We do have an implementation of opaque URLs we use in the context of a session, and I'm wondering how opaque URLs might be applied to REST. The general problem is how to keep domain-specific details out of URLs, and still benefit from what REST has to offer.
The other problem is that we cannot use descriptive URLs like /resources/../customer/Madonna/phonenumber.
I think you've misunderstood the point of opaque URIs. The notion of opaque URIs is with respect to clients: A client shall not decipher a URI to guess anything of semantic meaning from it. So a service may well have URIs like /resources/.../customer/Madonna/phonenumber, and that's quite a good idea. The URIs should be treated as opaque by clients: not infer from the URI that it represents Madonna's phone number, and that Madonna is a customer of some sort. That knowledge can only be obtained by looking inside the URI itself, or perhaps by remembering where the URI was discovered.
Edit:
A consequence of this is that navigation should happen by links, not by deconstructing the URI. So if you see /resouces/customer/Madonna/phonenumber (and it actually represents Customer Madonna's phone number) you should have links in that resource to point to the Madonna resource: e.g.
{
"phone_number" : "01-234-56",
"customer_URI": "/resources/customer/Madonna"
}
That's the only way to navigate from a phone number resource to a customer resource. An important aspect is that the server implementation might or might not have domain specific information in the URI, The Madonna record might just as well live somewhere else: /resources/customers/byid/81496237. This is why clients should treat URIs as opaque.
Edit 2:
Another question you have (in the comments) is then how a client, with the required no knowledge of the server's URIs is supposed to be able to find anything. Clients have the following possibilities to find resources:
Provide a search interface. This could be done by providing an OpenSearch description document, which tells clients how to search for items. An OpenSearch template can include several variables, and several endpoints, depending on what you're looking for. So if you have a "customer ID" that's unique, you could have the following template: /customers/byid/{proprietary:customerid}", the customerid element needs to be documented somewhere, inside the proprietary namespace. A client can then know how to use such a template.
Provide a custom form. This implies making a custom media type in which you explicitly define how (based on an instance of the document) a URI to a customer can be forged. <customers template="/customers/byid/{id}"/>. The documentation (for the media type) would have to state that the template attribute must be interpreted as a relative URI after the string substitution "{id}" to an actual customer ID.
Provide links to all resources. Some resources aren't innumerable, so you can simply make a link to each and every one of them, optionally including identifying information along with the links. This could also be done in a custom media type: <customer id="12345" href="/customer/byid/12345"/>.
It should be noted that #1 and #2 are two ways of saying the same thing: Clients are allowed to create URIs if they
haven't got the URI structure a priori
a media type exists for which the documentation states that URIs should be created
This is much the same way as a web browser has no idea of any URI structure on the web, except for the rules laid out in the definition of HTML forms, to add a ? and then all the query parameters separated by &.
In theory, if you have a customer with id 12345, then you could actually dispense with the href, since you could plug the customer id 12345 into #1 or #2. It's more common to actually provide real links between resources, rather than always relying on lookup or search techniques.
I haven't really used web RPC systems (WSDL/Soap), but i think the 'interface document' is there mostly to allow client libraries to create the service API, right? if so, REST shouldn't need it, because the verbs are already defined and don't really need to be documented again.
AFAIUI, the REST way is to document the structure of each resource (usually encoded in XML or JSON). In that document, you'll also have to document the relationship between those resources. In my case, a resource is often a container of other resources (sometimes more than one type), therefore the structure doc specifies what field holds a list of URLs pointing to the contained resources. Ideally, only one unique resource will need a single, fixed (documented) URL. everithing else follows from there.
The URL 'style' is meaningless to the client, since it shouldn't 'construct' an URL. Every URL it needs should be already constructed on a resource field. That let's you change the URL structure without changing the client (that has saved tons of time to me). Your URLs can be as opaque or as descriptive as you like. (personally, i don't like text keys or slugs; my keys are all BIGINTs or UUIDs)
I am currently building a REST "agent" that addresses the first part of your question. The agent offers a temporary bookmarking service. The client code that is interacting with the agent can request that an URL be bookmarked using some identifier. If the client code needs to retrieve that representation again, it simply asks the agent for the url that corresponds to the saved bookmark and then navigates to that bookmark. Currently those bookmarks are not persisted so they only last for the lifetime of the client application, but I have found it a useful mechanism for accessing commonly used resources. E.g. The root representation provides a login link. I bookmark that link and if the client ever receives a 401 then I can redirect to the "login" bookmark.
To address an issue you mentioned in a comment, the agent also has the ability to store retrieved representations in a dictionary. If it becomes necessary to aggregate and manipulate multiple representations at the same time then I can simply request that the agent store the current representation in a dictionary associated to a key and then continue navigating to the next resource. Once the client has accumulated all the necessary representation it can do what it needs to do.

Best way to decide on XML or HTML response?

I have a resource at a URL that both humans and machines should be able to read:
http://example.com/foo-collection/foo001
What is the best way to distinguish between human browsers and machines, and return either HTML or a domain-specific XML response?
(1) The Accept type field in the request?
(2) An additional bit of URL? eg:
http://example.com/foo-collection/foo001 -> returns HTML
http://example.com/foo-collection/foo001?xml -> returns, er, XML
I do not wish to oblige machines reading the resource to parse HTML (or XHTML for that matter). Machines like the googlebot should receive the HTML response.
It is reasonable to assume I control the machine readers.
If this is under your control, rather than adding a query parameter why not add a file extension:
http://example.com/foo-collection/foo001.html - return HTML
http://example.com/foo-collection/foo001.xml - return XML
Apart from anything else, that means if someone fetches it with wget or saves it from their browser, it'll have an appropriate filename without any fuss.
My preference is to make it a first-class part of the URI. This is debatable, since there are -- in a sense -- multiple URI's for the same resource. And is "format" really part of the URI?
http://example.com/foo-collection/html/foo001
http://example.com/foo-collection/xml/foo001
These are very easy deal with in a web framework that has URI parsing to direct the request to the proper application.
If this is indeed the same resource with two different representations, the HTTP invites you to use the Accept-header as you suggest. This is probably a very reliable way to distinguish between the two different scenarios. You can be plenty sure that user agents (including search engine spiders) send the Accept-header properly.
About the machine agents you are going to give XML; are they under your control? In that case you can be doubly sure that Accept will work. If they do not set this header properly, you can give XML as default. User agents DO set the header properly.
I would try to use the Accept heder for this, because this is exactly what the Accept header is there for.
The problem with having two different URLs is that is is not automatically apparent that these two represent the same underlying resource. This can be bad if a user finds an URL in one program, which renders HTML, and pastes it in the other, which needs XML. At this point a smart user could probably change the URL appropriately, but this is just a source of error that you don't need.
I would say adding a Query String parameter is your best bet. The only way to automatically detect whether your client is a browser(human) or application would be to read the User-Agent string from the HTTP Request. But this is easily set by any application to mimic a browser, you're not guaranteed that this is going to work.