Defining a REST api - web-services

I was wondering what the most correct or best-practice way to define a REST API would be. Let's say I have an application that deals with cars and their parts/components. Any API call will always start with the top-level car. So, in defining my API, should I go with:
[url]/{resourceName}/{carId}/{resourceId}
so as an example, lets say I want to get an engine, so:
[url]/engine/123/456
where 123 is the car id and 456 is the engine id. Or, is it a better practice to place the resource after the top level container, in this case a car, so something like:
[url]/{carId}/{resourceName}/{resourceId}
so the same example would be:
[url]/123/engine/456
Or does it matter? Is there an "industry best practice" type of approach? As I understand it, most of what I read has said something to the effect of "make the api intuitive and easy to understand".
Also, should the API be fully expressive? For example, should every URI, regardless of approach, look like the one below, even though the API is always about a car in this particular application?
[url]/car/{carId}/{resourceName}/{resourceId}
so, examples being:
[url]/car/123/engine/456
[url]/car/123/wheel/1
I understand this could be interpreted as a post that will just create debate, but I am really looking for help in tracking down what might be most widely accepted.

REST really has nothing to do with URIs or HTTP. It is not a standard; it's an architectural style or paradigm. In Roy Fielding's thesis, he identifies certain criteria for an architectural style to be "RESTful". It just so happens that HTTP satisfies these criteria, which makes it a good protocol to realize a RESTful architecture. But REST itself has nothing to do with HTTP.
In REST, URIs are opaque, meaning they convey no semantics beyond uniquely identifying a resource. So for example, a URI like /dfjSuX7kx is perfectly valid. Of course, that doesn't mean we have to use URIs like that; it's better to use something you can make sense of.
You wanted to know whether you should go with:
[url]/{resourceName}/{carId}/{resourceId}
I would say that it is better to go with just:
[url]/{resourceName}/{resourceId}
And then make carId part of the representation. Why is that? We know that one of the requirements is that your URI must uniquely identify a resource. Right now we have an engine 456 that is part of car 123, which means that you are identifying that particular resource with the URI /engine/123/456. But let's say car 123 was totaled and the engine was moved to car 789. Now your URI for the same resource is /engine/789/456. Keep in mind here that nothing about engine 456 has changed; it is still the same engine. The only thing is that it is now attached to a different car. But that doesn't mean that the URI that uniquely identifies it has to change as well (because then it would no longer uniquely identify it). What identifies an engine uniquely is its id, and not the car that it is attached to. Hence it's better to just have /engine/456 and return a representation like:
{
"id": 456,
"carId": 123,
"cylinders": 6,
"hp": 350,
...
}
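As a rough sketch of the point above, assuming a hypothetical in-memory store and helper names (`engines`, `engine_uri`, `move_engine`) that are not part of the original answer: the URI depends only on the engine's own id, so moving the engine to another car changes the representation but not the identifier.

```python
# Hypothetical in-memory store; the field names mirror the JSON above.
engines = {456: {"id": 456, "carId": 123, "cylinders": 6, "hp": 350}}


def engine_uri(engine_id: int) -> str:
    """Build the URI from the engine's own id only."""
    return f"/engine/{engine_id}"


def move_engine(engine_id: int, new_car_id: int) -> dict:
    """Attach the engine to another car by updating its representation."""
    engines[engine_id]["carId"] = new_car_id
    return engines[engine_id]


uri_before = engine_uri(456)
moved = move_engine(456, 789)   # car 123 was totaled
uri_after = engine_uri(456)     # same identifier as before the move
```

With the nested scheme, the same move would have turned /engine/123/456 into /engine/789/456.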
You also asked if you could do this:
[url]/{carId}/{resourceName}/{resourceId}
I would not recommend that at all. Let's say you have hosted your API at http://my.car-api.com. What does http://my.car-api.com/123 actually mean? Of course, it's RESTful because URI's are opaque, but it's hard to understand what it means when looking at the URI, so it's better to use /car/{carId}.
For your final question, you asked if the following is acceptable:
[url]/car/{carId}/{resourceName}/{resourceId}
Subresources are ok, and I have seen people use it, and I have also used it in some REST API's I have created. But personally I'm moving towards giving every resource a top-level URI. This is because we can already convey hierarchical relationships or the concept of "ownership" via the representation we return from the URI, instead of encoding that into the URI itself (remember, the URI has no additional meaning other than the fact that it uniquely identifies a resource). I've also noticed issues with orthogonality in the way I implement my controllers, error-handlers and transformers (helper classes that convert DTO's to domain objects and vice-versa); supporting a nested URI-structure makes it hard for me to write general code (although some of this is due to the way the framework I'm using, Spring, does things).
You can decide how "RESTful" you want to be; the best metric I've seen for quantifying the RESTfulness of your API is the Richardson Maturity Model. I would suggest that you decide on a level that you're comfortable with. It's not necessary to be completely dogmatic and be completely RESTful; it depends on your situation. But I have found that getting to Level 3 (HATEOAS) has brought me a lot of benefits because my custom media-types (i.e. I use application/vnd.my.api.resource.v1+json instead of just application/json) are well-documented and convey their semantics very well. Also none of my clients have to construct URIs, since they can just follow links. The representations are self-documenting as well, since you can follow links to associated documentation, for instructions on processing resources.
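To illustrate what "clients just follow links" might look like, here is a minimal sketch. The `_links` layout, the relation names, and the `link_for` helper are assumptions made for the example, not a prescribed media type.

```python
# Hypothetical representation; the "_links" layout and the relation
# names ("self", "car") are assumptions made for this sketch.
engine = {
    "id": 456,
    "cylinders": 6,
    "_links": {
        "self": {"href": "/engine/456"},
        "car": {"href": "/car/123"},
    },
}


def link_for(representation: dict, rel: str) -> str:
    """Return the href for a named link relation, or raise if it is absent."""
    try:
        return representation["_links"][rel]["href"]
    except KeyError:
        raise LookupError(f"no link with relation {rel!r}") from None


# The client never concatenates ids into paths; it asks for a relation.
car_href = link_for(engine, "car")
```

If the server later changes where cars live, a client written this way keeps working as long as the relation name stays the same.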
The only thing I would strongly suggest is picking a style and sticking with it. I don't like to mix paradigms; so if you are aiming to have something RESTful, don't let anything RPC-ish sneak into your API. This can be a little difficult because REST is noun-based and RPC is verb-based (i.e. behavior), and some concepts just map quite easily over to RPC. An example is if you had a Bank API where you wanted to provide the ability to transfer money from one account to another. A "REST" implementation that is really RPC-ish may expose a URI like /account/{accountId}/transfer?to={toAccountId}&amount={amount} (which directly maps to a method call on an Account object: account.transfer(amount, toAccountId); hence RPC). The RESTful way would be to expose a transaction resource at /transaction, where you would post something like:
{
"from_account_id": 123,
"to_account_id": 456,
"amount": "100.00"
}
and the /transaction could respond with:
{
"id": 789,
"source_account_id": 123,
"destination_account_id": 456,
"amount": "100.00",
"_links": {
"self": { "href": "/transaction/789" }
}
}
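A minimal sketch of how such a /transaction resource could behave on POST, assuming a hypothetical in-memory store and a handler named `post_transaction`. The choice of 201 with a Location header is a common convention rather than something stated above.

```python
import itertools

# Hypothetical in-memory storage and id sequence for the sketch.
transactions = {}
_next_id = itertools.count(789)


def post_transaction(body: dict):
    """Handle POST /transaction: create a resource and describe the response."""
    required = ("from_account_id", "to_account_id", "amount")
    missing = [name for name in required if name not in body]
    if missing:
        # 400 Bad Request: the client sent an incomplete representation.
        return 400, {}, {"error": "missing fields: " + ", ".join(missing)}

    transaction_id = next(_next_id)
    uri = f"/transaction/{transaction_id}"
    representation = {
        "id": transaction_id,
        "source_account_id": body["from_account_id"],
        "destination_account_id": body["to_account_id"],
        "amount": body["amount"],
        "_links": {"self": {"href": uri}},
    }
    transactions[transaction_id] = representation
    # 201 Created plus a Location header pointing at the new resource.
    return 201, {"Location": uri}, representation


status, headers, created = post_transaction(
    {"from_account_id": 123, "to_account_id": 456, "amount": "100.00"}
)
```

The transfer is modeled as a noun that gets created, so the outcome is reported through the status code rather than through a custom return value.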
A lot of times people end up duplicating what HTTP already provides through RPC calls. For example, HTTP already gives you methods that translate to CRUD operations (quick note: the methods themselves have nothing to do with CRUD semantics; they just happen to map over conveniently), and also has response codes that convey information about the end result of the operation (Was it successful? Did it fail? Why did it fail? etc.). The idea with REST is that you can concentrate on modeling your resources, their representations, and their relationships with each other and use what HTTP already provides you (hypermedia, response codes, methods) instead of duplicating all of that.

Related

How to offer lists of valid values for parameters in a RESTful API?

This is more a conceptual than technical question, I guess. Suppose I have a REST API for dealing with a huge rent-a-car fleet.
The API is modelled around the business entities/resources in the very standard and coherent (even if controversial) way like that:
/cars/1234 - detailed data about a certain car
/clients/5678 - detailed data about a certain client
/cars - a list of cars and their URIs
/clients - a list of clients
However, the fleet is huge and a list of all cars is not that useful. I would rather have it filtered, like:
GET /cars?type=minivan
For properly using the "type" parameter, I should have a list of valid values such as "minivan", "convertible", "station-wagon", "hatchback", "sedan", etc. Ok, there are not that many kinds of cars out there, but let's suppose this list is something too big for an enum in the API's Swagger definition.
So... What would be the most consistent and natural way for a REST API to offer a list of valid values for a query parameter like that?
As a subordinated resource like /cars/types ? This would break the /cars/{id} URL pattern, isn't it?
As a separate resource such as /tables/cars/types? This would break the consistency around the main resources of the business model itself, right?
As part of the body of a response for OPTIONS /cars ? It looks like the "RESTfullest" way to me, but some of my coworkers disagree, and OPTIONS seems to be rarely used for things like that.
Maybe as part of a response to GET /cars?&metadata=values or something alike? The "values" here would seem more semantically related to the returned data than to the query parameters, isn't it?
Anything else?
I have googled and searched in SO for some recommendations about this particular subject, but I could not find anything to help me with arguments for such a decision...
Thank you!
Fabricio Rocha
Brasilia, Brasil
"a good REST API is like an ugly website" -- Rickard Öberg
So how would you do it in a website? Well, you'd probably have a link that goes to a form, and the form would have a list control / radio buttons with semantic cues for each option, with the expectation that the user would select a value from the options available, and the user agent would encode that value into the URL of the GET request when the form was submitted.
So in REST, you do the same thing. In the initial response, you would include a link to your "form" resource; and when a user agent gets the form resource you return a hypermedia representation of your form, with the available options encoded within it, and when the form is submitted your resources pick the client choice(s) out of the query part of the identifier.
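As a sketch of that idea, assuming a made-up form representation (the `action`, `method`, `fields`, and `options` keys are illustrative, not an established media type) and a hypothetical `submit` helper on the client side:

```python
from urllib.parse import urlencode

# Hypothetical form representation returned by the "form" resource.
search_form = {
    "action": "/cars",
    "method": "GET",
    "fields": [
        {
            "name": "type",
            "options": ["minivan", "convertible", "station-wagon",
                        "hatchback", "sedan"],
        }
    ],
}


def submit(form: dict, choices: dict) -> str:
    """Validate choices against the form's options and build the GET URI."""
    allowed = {field["name"]: field["options"] for field in form["fields"]}
    for name, value in choices.items():
        if name not in allowed:
            raise ValueError(f"unknown field: {name}")
        if value not in allowed[name]:
            raise ValueError(f"invalid value for {name}: {value}")
    # The user agent encodes the selection into the query part.
    return form["action"] + "?" + urlencode(choices)


uri = submit(search_form, {"type": "minivan"})
```

The list of valid values travels with the form, so the client does not need a separate lookup endpoint to learn them.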
But you probably aren't doing REST: it's a colossal PITA, and the benefits of the REST architecture constraints probably don't pay off in your context. So you are likely just looking for a reasonable spelling of an identifier for a resource that returns a message with a list of options.
As a subordinated resource like /cars/types ? This would break the /cars/{id} URL pattern, isn't it?
Assuming your routing implementation can handle the ambiguity, that's a fine choice. You might consider whether there's just one list, or different lists for different contexts, and how to handle that.
As a separate resource such as /tables/cars/types? This would break the consistency around the main resources of the business model itself, right?
Remember OO programming and encapsulation? Decoupling the API from the underlying data model is a good thing.
That said, I'm personally not fond of the "tables" as an element in your hierarchy. If you wanted to head that direction, I'd suggest /dimensions -- it's a spelling you might use if you were designing a data warehouse
As part of the body of a response for OPTIONS /cars ? It looks like the "RESTfullest" way to me, but some of my coworkers disagree, and OPTIONS seems to be rarely used for things like that.
Yikes! RFC 7231 suggests that is a very confusing idea.
The OPTIONS method requests information about the communication options available for the target resource, at either the origin server or an intervening intermediary.
(emphasis added). When writing APIs for the web, you should always be keeping in mind that a client request may go through intermediaries that you do not control; your ability to provide a good experience in those circumstances depends on not confusing the intermediaries by deviating from the uniform interface.
Maybe as part of a response to GET /cars?&metadata=values or something alike?
For the most part, machines are pretty comfortable with any spelling. URI design guidelines normally focus on the human audience. I think that particular spelling will confuse your human consumers, especially if /cars?... would otherwise identify the resource that is a search result.
Anything else? I still feel that what people expect to find under /cars is... a bunch of cars (their representations, I mean), not a list of values among them...
So let's change up your question a little bit
What would be the most consistent and natural way for a REST API to document a list of valid values for a query parameter like that?
If there's one thing the web is really good for, it is documenting things. Pick almost any well documented web API, and pay careful attention to where you are reading about the endpoints -- that will give you some good ideas.
For instance, you could look at the StackExchange API where
https://api.stackexchange.com/docs/questions
tells you everything you need to know about the family of resources at
https://api.stackexchange.com/2.2/questions
Types, unsurprisingly, are documented like:
https://api.stackexchange.com/docs/types/flag-option
If you wanted to be sexy about it, you could use the Accept header to negotiate a redirect to either human-readable documentation or machine-readable documentation.
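A rough sketch of that negotiation, based on the standard Accept request header. The documentation paths, the function names, and the choice of a 303 redirect are all assumptions for illustration, and the parser handles only the simple `type;q=value` form.

```python
# Hypothetical documentation locations; only the Accept header is standard.
HUMAN_DOCS = "/docs/types/car-type.html"
MACHINE_DOCS = "/docs/types/car-type.json"


def parse_accept(header: str) -> list:
    """Return media types from an Accept header, most preferred first."""
    entries = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_type, quality = pieces[0], 1.0
        for param in pieces[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        if media_type:
            entries.append((quality, media_type))
    # sorted() is stable, so equal q-values keep their header order.
    return [m for _, m in sorted(entries, key=lambda e: -e[0])]


def docs_redirect(accept_header: str):
    """Pick a redirect target based on the client's preferred media type."""
    for media_type in parse_accept(accept_header):
        if media_type == "application/json":
            return 303, MACHINE_DOCS
        if media_type in ("text/html", "*/*"):
            return 303, HUMAN_DOCS
    return 406, None  # nothing acceptable to offer
```

A browser sending `text/html` first would land on the readable page, while a tool asking for `application/json` would get the machine-readable list.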
I'm in a similar situation but I have a large number of fields each of which has a large number of possible values, and in some cases the values come from a hierarchy so my field is an array of strings. (Borrowing from your example: you might want to record the plant that manufactured the car but rather than a 1-dimensional list these are organised by: continent, country, and state).
I think I'll implement a /taxonomies resource to provide all the data to users. I see WordPress uses a similar scheme (http://v2.wp-api.org/reference/taxonomies/) although I haven't studied it closely yet.

tastypie: why reference objects using uris rather than ids?

When creating or editing a model that contains a reference/foreign key to another object, you have to use the uri of that object. For example, imagine we have two classes: User and Group. Each Group has many Users and each User can belong to exactly one group.
Then, if we are creating a User, we might send an object that looks like this:
{"name":"John Doe", "group":"/path/to/group/1/"}
instead of
{"name":"John Doe", "group_id":1}
I believe this is related to one of the principles of HATEOAS, but I can't find the rationale for using the resource uri rather than the id. What are some reasons for using the uri?
(I'm not interested in opinions about which is better, but in any resources that can help me understand this design choice.)
I'll take a stab
The simplest reason is that surrogate keys like your 1 only mean something within the boundaries of your system. They are meaningless outside of the system.
Expanding on this, you could build your app such that there are no limitations on the URLs that identify groups, only on the conformance of the resources gathered from the responses of those URLs. Someone could add a user in your system that is in a group in the Facebook system, as long as the two systems could negotiate what a group is. There are standards for concepts like group, and it's not impossible to do such a thing.
This is how most web apps work. EG: the citation links in a wikipedia article which can point to any other article (until the wiki trolls remove it for not being an appropriate citation resource...)
Having your app work like this gets you closer to RESTful conformance. Whether or not you consider RESTful architecture a good idea is what you asked us not to discuss, so I won't.
Another often cited benefit would be the ability for you to completely re-key your setup. You may dismiss this at first...but if you really use 1 for id's, that's probably an int or long, and you'll soon run out of those. Also such an id means you have to sequence them appropriately. At some point you may wish you had used a guid as your id's. Anyone holding on to your old ID scheme would be considered legacy. The URLs give you a little abstraction from this..old url's remain a legacy thing, but it's easier to identify a legacy url than it is to identify a legacy id (granted not much...it's pretty easy to know if you're getting a long or a guid, but a bit easier to see a url as /old/path/group/1 vs /new/path/group/). Generally using URLs gives you a little more forward compatibility and room to grow.
I also find providing URLs as identifiers makes it very easy for a client to retrieve information about that thing. the self link is so VERY convenient. Suppose i have some reference to a group:1....what good is that? How many UI's are going to show a control that says "add group 1". You'll want to show more. If you pass around URLs as identifiers of selections then clients can always retrieve more information about what that selection actually is. In some apps you could pass around the whole object (which would include the id) to deal with this, but it's nice to just save the URL for later retrieval (think bookmarks). Even more importantly it's always nice to be able to refresh that object regularly in order to get the latest state of it. A self link can do that very nicely, and i'd argue it's useful enough to always include...and if an always included self link identifies the resource...why do you need to also provide your surrogate key as a secondary identifier?
One side note: I try to avoid services that require a URL as a parameter. I'd prefer to create the user, then have the service offer up possible group memberships as links, then have the client choose to request those state transitions from non-membership to membership. If you need to "create the user with groups", I'd go with intermediate states prior to actual submission/commitment of the new user to the service. I've found that the fewer inputs the client has to provide, the easier the application is to use.

REST vs RPC for a C++ API

I am writing a C++ API which is to be used as a web service. The functions in the API take in images/path_to_images as input parameters, process them, and give a different set of images/paths_to_images as outputs. I was thinking of implementing a REST interface to enable developers to use this API for their projects (independent of whatever language they'd like to work in). But, I understand REST is good only when you have a collection of data that you want to query or manipulate, which is not exactly the case here.
[The collection I have is of different functions that manipulate the supplied data.]
So, is it better for me to implement an RPC interface for this, or can this be done using REST itself?
Like lcfseth, I would also go for REST. REST is indeed resource-based and, in your case, you might consider that there's no resource to deal with. However, that's not exactly true: the image converter in your system is the resource. You POST images to it and it returns new images. So I'd simply create a URL such as:
POST http://example.com/image-converter
You POST images to it and it returns some array with the path to the new images.
Potentially, you could also have:
GET http://example.com/image-converter
which could tell you about the status of the image conversion (assuming it is a time consuming process).
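A minimal sketch of that single-resource design, written as a plain dispatch function so it stays framework-neutral. The `handle` and `convert` names, the request body shape, and the in-memory `state` are assumptions; a real service would call into the C++ image-processing code instead.

```python
# Hypothetical in-memory state; a real service would hand the work to the
# C++ image-processing code instead of echoing renamed paths.
state = {"status": "idle", "results": []}


def convert(path: str) -> str:
    """Placeholder for the real conversion step (assumption for the sketch)."""
    return path + ".converted"


def handle(method: str, path: str, body=None):
    """Dispatch a request for the single /image-converter resource."""
    if path != "/image-converter":
        return 404, {"error": "not found"}
    if method == "POST":
        images = (body or {}).get("images", [])
        if not images:
            return 400, {"error": "no images supplied"}
        state["results"] = [convert(p) for p in images]
        state["status"] = "done"
        # Respond with the paths of the new images.
        return 200, {"images": state["results"]}
    if method == "GET":
        # Report the status of the conversion.
        return 200, {"status": state["status"]}
    return 405, {"error": "method not allowed"}
```

No new verbs are introduced here: the standard methods and status codes carry the meaning.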
The advantage of doing it like that is that you are re-using HTTP verbs that developers are familiar with, the interface is almost self-documenting (though of course you still need to document the format accepted and returned by the POST call). With RPC, you would have to define new verbs and document them.
REST uses the common HTTP operations GET, POST, DELETE, HEAD, and PUT. As you can imagine, this is very data-oriented. However, there is no restriction on the data type and no restriction on the size of the data (none I'm aware of, anyway).
So it's possible to use it in almost every context (including sending binary data). One of the advantages of REST is that web browsers understand it, and your users won't need a dedicated application to send requests.
RPC presents more possibilities and can also be used. You can define custom operations for example.
Not sure you need that much power given what you intend to do.
Personally I would go with REST.
Here's a link you might wanna read:
http://www.sitepen.com/blog/2008/03/25/rest-and-rpc-relationship/
Compared to RPC, a REST (JSON-style) interface is lightweight and easy for API users to work with. RPC (SOAP/XML) seems complex and heavy.
I guess that what you want is an HTTP+JSON based API, not a REST API as defined by the author of REST:
http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven

Can you help clarify some points regarding RESTful services and Code Generation?

I've been struggling with understanding a few points I keep reading regarding RESTful services. I'm hoping someone can help clarify.
1a) There seems to be a general aversion to generated code when talking about RESTful services.
1b) The argument that if you use a WADL to generate a client for a RESTful service, when the service changes - so does your client code.
Why I don't get it: Whether you are referencing a WADL and using generated code or you have manually extracted data from a RESTful response and mapped them to your UI (or whatever you're doing with them) if something changes in the underlying service it seems just as likely that the code will break in both cases. For instance, if the data returned changes from FirstName and LastName to FullName, in both instances you will have to update your code to grab the new field and perhaps handle it differently.
2) The argument that RESTful services don't need a WADL because the return types should be well-known MIME types and you should already know how to handle them.
Why I don't get it: Is the expectation that for every "type" of data a service returns there will be a unique MIME type in existence? If this is the case, does that mean the consumer of the RESTful services is expected to read the RFC to determine the structure of the returned data, how to use each field, etc.?
I've done a lot of reading to try to figure this out for myself so I hope someone can provide concrete examples and real-world scenarios.
REST can be very subtle. I've also done lots of reading on it and every once in a while I went back and read Chapter 5 of Fielding's dissertation, each time finding more insight. It was as clear as mud the first time (although some things made sense) but only got better once I tried to apply the principles and used the building blocks.
So, based on my current understanding let's give it a go:
Why do RESTafarians not like code generation?
The short answer: If you make use of hypermedia (+links), there is no need.
Context: Explicitly defining a contract (WADL) between client and server does not reduce coupling enough: If you change the server the client breaks and you need to regenerate the code. (IMHO even automating it is just a patch to the underlying coupling issue).
REST helps you to decouple on different levels. Hypermedia discoverability is one of the good ones to start with. See also the related concept, HATEOAS.
We let the client “discover” what can be done from the resource we are operating on instead of defining a contract before. We load the resource, check for “named links” and then follow those links or fill in forms (or links to forms) to update the resource. The server acts as a guide to the client via the options it proposes based on state. (Think business process / workflow / behavior). If we use a contract we need to know this "out of band" information and update the contract on change.
If we use hypermedia with links there is no need to have “separate contract”. Everything is included within the hypermedia – why design a separate document? Even URI templates are out of band information but if kept simple can work like Amazon S3.
Yes, we still need a common ground to stand on when transferring representations (hypermedia), so we define our own media types or use widely accepted ones such as Atom or Micro-formats. Thus, with the constraints of basic building blocks (link + forms + data - hypermedia) we reduce coupling by keeping out of band information to a minimum.
At first it seems that going for hypermedia does not change the impact of change :) But there are subtle differences. For one, if I have a WADL I need to update another doc and deploy/distribute it. Using pure hypermedia there is no such impact, since it's embedded. (Imagine changes rippling through a complex interweave of systems.) As per your example, having FirstName + LastName and adding FullName does not really impact the clients, but removing First+Last and replacing them with FullName does, even in hypermedia.
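The difference between the additive and the breaking change can be sketched with a client that reads only the fields it needs. The `greeting` function and the sample dictionaries are hypothetical; the field names come from the question.

```python
def greeting(person: dict) -> str:
    """A client written against the original fields; extra keys are ignored."""
    return f"Hello, {person['FirstName']} {person['LastName']}"


original = {"FirstName": "John", "LastName": "Doe"}
# Additive change: FullName appears next to the existing fields.
additive = {"FirstName": "John", "LastName": "Doe", "FullName": "John Doe"}
# Breaking change: the fields the client relies on are gone.
breaking = {"FullName": "John Doe"}

same_result = greeting(original) == greeting(additive)

try:
    greeting(breaking)
    survived_removal = True
except KeyError:
    survived_removal = False
```

Adding a field leaves this client untouched, while removing the fields it reads breaks it regardless of whether the contract was a WADL or hypermedia.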
As a side note: The REST uniform interface (verb constraints - GET, PUT, POST, DELETE + other verbs) decouples implementation from services.
Maybe I'm totally wrong but another possibility might be a “psychological kick back” to code generation: WADL makes one think of the WSDL(contract) part in “traditional web services (WSDL+SOAP)” / RPC which goes against REST. In REST state is transferred via hypermedia and not RPC which are method calls to update state on the server.
Disclaimer: I've not completed the referenced article in detail but it does give some great points.
I have worked on API projects for quite a while.
To answer your first question.
Yes, if the service's return values change (e.g., first name and last name become full name), your code might break. You will no longer get the first name and last name.
You have to understand that a WADL is an agreement. If it has to change, then the client needs to be notified. To avoid breaking the client code, we release a new version of the API.
Version 1.0 will keep first name and last name, so your code does not break. We will release version 1.1, which will have the change to full name.
So the short answer: WADL is there to stay. As long as you use that version of the API, your code will not break. If you want to get the full name, then you have to move to the new version. With lots of code generation plugins on the market, generating the code should not be an issue.
To answer your next question of why not WADL and how you get to know the mime types.
WADL is for code generation and serves as a contract. With that you can use JAXB or any mapping framework to convert the JSON string to generated bean objects.
If not WADL, you don't need to inspect every element to determine the type. You can easily do this.
var obj = jQuery.parseJSON('{"name":"John"}');
alert(obj.name === "John");
Let me know if you have any questions.

What does using RESTful URLs buy me?

I've been reading up on REST, and I'm trying to figure out what the advantages to using it are. Specifically, what is the advantage to REST-style URLs that make them worth implementing over a more typical GET request with a query string?
Why is this URL:
http://www.parts-depot.com/parts/getPart?id=00345
Considered inferior to this?
http://www.parts-depot.com/parts/00345
In the above examples (taken from here) the second URL is indeed more elegant looking and concise. But it comes at a cost... the first URL is pretty easy to implement in any web language, out of the box. The second requires additional code and/or server configuration to parse out values, as well as additional documentation and time spent explaining the system to junior programmers and justifying it to peers.
So, my question is, aside from the pleasure of having URLs that look cool, what advantages do RESTful URLs gain for me that would make using them worth the cost of implementation?
The hope is that if you make your URL refer to a noun then there is a better chance that you will implement the HTTP verbs correctly. Beyond that there is absolutely no advantage of one URL versus another.
The reality is that the contents of an URL are completely irrelevant to a RESTful system. It is simply an identifier.
It's not what it looks like, it is what you do with it that is important.
One way of looking at REST:
http://tomayko.com/writings/rest-to-my-wife (which has now been taken down, sadly, but can still be seen on web.archive.org)
So anyway, HTTP—this protocol Fielding
and his friends created—is all about
applying verbs to nouns. For instance,
when you go to a web page, the browser
does an HTTP GET on the URL you type
in and back comes a web page.
...
Instead, the large majority are busy
writing layers of complex
specifications for doing this stuff in
a different way that isn’t nearly as
useful or eloquent. Nouns aren’t
universal and verbs aren’t
polymorphic. We’re throwing out
decades of real field usage and proven
technique and starting over with
something that looks a lot like other
systems that have failed in the past.
We’re using HTTP but only because it
helps us talk to our network and
security people less. We’re trading
simplicity for flashy tools and
wizards.
One thing that jumps out at me (nice question by the way) is what they describe. The first describes an operation (getPart), the second describes a resource (part 00345).
Also, maybe you couldn't use other HTTP verbs with the first - you'd need a new method for putPart, for example. The second can be reused with different verbs (like PUT, DELETE, POST) to 'manipulate' the resource? I suppose you're also kinda saying GET twice - once with the verb, again in the method, so the second is more consistent with the intent of the HTTP protocol?
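A small sketch of that reuse, assuming a hypothetical in-memory `parts` collection and a `handle_part` function: one URI, /parts/{part_id}, serves several verbs, so no getPart/putPart/deletePart names are needed.

```python
# Hypothetical in-memory collection keyed by part number.
parts = {"00345": {"name": "widget"}}


def handle_part(method: str, part_id: str, body=None):
    """Apply an HTTP method to the one URI /parts/{part_id}."""
    if method == "GET":
        if part_id not in parts:
            return 404, None
        return 200, parts[part_id]
    if method == "PUT":
        created = part_id not in parts
        parts[part_id] = body
        # 201 when the resource is new, 200 when it was replaced.
        return (201 if created else 200), body
    if method == "DELETE":
        if part_id not in parts:
            return 404, None
        del parts[part_id]
        return 204, None
    return 405, None
```

The verb lives in the request method and the noun lives in the URI, so neither has to be repeated in the other.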
One advantage that I always like as a savvy web user, though it certainly shouldn't be the guiding principle for choosing such a URL scheme, is that these URLs are "hackable". This is particularly handy for things like blogs, where I can just edit a date or a page number within the URL instead of having to find the "next page" button.
The biggest advantage of REST IMO is that it allows a clean way to use the HTTP Verbs (which are the most important on REST services). Actually, using REST means you are using the HTTP protocol and its verbs.
Using your urls, and imagining you want to post a "part", instead of getting it
In the first case it would look like this, using a GET where you should have used a POST:
http://www.parts-depot.com/parts/postPart?param1=lalala&param2=lelele&param3=lilili
While on a REST context, it should be
http://www.parts-depot.com/parts
and on the body, (for example) a xml like this
<part>
<param1>lalala</param1>
<param2>lelele</param2>
<param3>lilili</param3>
</part>
URI semantics are defined by RFC 2396. The extracts particularly pertinent to this question are 3.3. "Path Component":
The path component contains data, specific to the authority (or the
scheme if there is no authority component), identifying the resource
within the scope of that scheme and authority.
And 3.4 "Query Component":
The query component is a string of information to be interpreted by
the resource.
Note that the query component is not part of the resource identifier, it is merely information to be interpreted by the resource.
As such, the resource being identified by your first example is actually just /parts/getPart. If your intention is that the URL should identify one specific part resource then the first example does not do that, whereas the second one (/parts/00345) does.
So the 'advantage' of the second style of URL is that it is semantically correct, whereas the first one is not.
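The split between path and query can be checked directly with Python's standard `urllib.parse` module. This sketch only takes the two URLs from the question apart; the variable names are illustrative.

```python
from urllib.parse import parse_qs, urlsplit

first = urlsplit("http://www.parts-depot.com/parts/getPart?id=00345")
second = urlsplit("http://www.parts-depot.com/parts/00345")

# In the first URL the part number sits in the query component,
# in the second it is part of the path component.
first_path, first_query = first.path, parse_qs(first.query)
second_path, second_query = second.path, parse_qs(second.query)
```

Here `first_path` is `/parts/getPart` with the part number left in the query, while `second_path` is `/parts/00345` with an empty query.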
"The second requires additional code and/or server configuration to parse out values,"
Really? You chose a poor framework, then. My experience is that the RESTful version is exactly the same amount of code. Maybe I just lucked into a cool framework.
"as well as additional documentation and time spent explaining the system to junior programmers"
Only once. After they get it, you shouldn't have to explain it again.
"and justifying it to peers."
Only once. After they get it, you shouldn't have to explain it again.
Don't use query/search parts in URLs which aren't queries or searches. If you do that, then according to the URL spec you are likely implying something about that resource that you don't really want to.
Use query parts for resources that are a subset of some bigger resource - pagination is a good example of where this could be applied.
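As a sketch of the pagination case, assuming hypothetical parameter names `page` and `per_page` and a helper called `page_links`: the path names the collection and the query selects a subset of it.

```python
from urllib.parse import urlencode


def page_links(base: str, page: int, per_page: int, total: int) -> dict:
    """Build self/prev/next URIs for one page of a larger collection."""
    last_page = max(1, -(-total // per_page))  # ceiling division

    def uri(n: int) -> str:
        return base + "?" + urlencode({"page": n, "per_page": per_page})

    links = {"self": uri(page)}
    if page > 1:
        links["prev"] = uri(page - 1)
    if page < last_page:
        links["next"] = uri(page + 1)
    return links


links = page_links("/parts", page=2, per_page=50, total=120)
```

Each page is a view onto the same /parts collection, which is the kind of subset the query part suits.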