Can you help clarify some points regarding RESTful services and Code Generation? - web-services

I've been struggling with understanding a few points I keep reading regarding RESTful services. I'm hoping someone can help clarify.
1a) There seems to be a general aversion to generated code when talking about RESTful services.
1b) The argument that if you use a WADL to generate a client for a RESTful service, when the service changes - so does your client code.
Why I don't get it: Whether you are referencing a WADL and using generated code, or you have manually extracted data from a RESTful response and mapped it to your UI (or whatever you're doing with it), if something changes in the underlying service it seems just as likely that the code will break in both cases. For instance, if the data returned changes from FirstName and LastName to FullName, in both instances you will have to update your code to grab the new field and perhaps handle it differently.
2) The argument that RESTful services don't need a WADL because the return types should be well-known MIME types and you should already know how to handle them.
Why I don't get it: Is the expectation that for every "type" of data a service returns there will be a unique MIME type in existence? If this is the case, does that mean the consumer of the RESTful services is expected to read the RFC to determine the structure of the returned data, how to use each field, etc.?
I've done a lot of reading to try to figure this out for myself so I hope someone can provide concrete examples and real-world scenarios.

REST can be very subtle. I've also done lots of reading on it, and every once in a while I go back and reread Chapter 5 of Fielding's dissertation, each time finding more insight. It was as clear as mud the first time (although some things made sense), but it only got better once I tried to apply the principles and use the building blocks.
So, based on my current understanding let's give it a go:
Why do RESTafarians not like code generation?
The short answer: if you make use of hypermedia (+ links), there is no need.
Context: Explicitly defining a contract (WADL) between client and server does not reduce coupling enough: if you change the server, the client breaks and you need to regenerate the code. (IMHO even automating that is just a patch over the underlying coupling issue.)
REST helps you decouple on different levels. Hypermedia discoverability is one of the good ones to start with. See also the related concept of HATEOAS.
We let the client “discover” what can be done with the resource we are operating on instead of defining a contract up front. We load the resource, check for “named links” and then follow those links or fill in forms (or links to forms) to update the resource. The server acts as a guide to the client via the options it proposes based on state (think business process / workflow / behavior). If we use a contract, we need to know this “out of band” information and update the contract whenever it changes.
If we use hypermedia with links, there is no need for a separate contract. Everything is included within the hypermedia, so why design a separate document? Even URI templates are out-of-band information, but if kept simple they can work, as Amazon S3 shows.
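To make this concrete, here is a hypothetical order representation (the field names, link relations and URIs are invented for illustration, loosely in the style of HAL). The client is not told up front which operations exist; it just follows whatever named links the server includes based on the order's current state:

{
  "orderId": "42",
  "status": "pending",
  "_links": {
    "self":    { "href": "/orders/42" },
    "cancel":  { "href": "/orders/42/cancellation" },
    "payment": { "href": "/orders/42/payment" }
  }
}

Once the order is paid, the server might simply stop including the "cancel" link, and a hypermedia-driven client stops offering that option without any contract change.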
Yes, we still need common ground to stand on when transferring representations (hypermedia), so we define our own media types or use widely accepted ones such as Atom or microformats. Thus, with the constraints of the basic building blocks (links + forms + data = hypermedia), we reduce coupling by keeping out-of-band information to a minimum.
At first it seems that going for hypermedia does not change the impact of change :) But there are subtle differences. For one, if I have a WADL I need to update another document and deploy/distribute it; with pure hypermedia there is no such impact since everything is embedded. (Imagine changes rippling through a complex interweave of systems.) As per your example, having FirstName + LastName and adding FullName does not really impact clients, but removing First + Last and replacing them with FullName breaks clients even in hypermedia.
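A made-up before/after illustrates the difference. An additive change leaves existing consumers alone:

{ "firstName": "Jane", "lastName": "Doe", "fullName": "Jane Doe" }

whereas replacing the old fields breaks anything that still reads firstName/lastName, hypermedia or not:

{ "fullName": "Jane Doe" }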
As a side note: The REST uniform interface (verb constraints - GET, PUT, POST, DELETE + other verbs) decouples implementation from services.
Maybe I'm totally wrong, but another possibility might be a “psychological kickback” against code generation: WADL makes one think of the WSDL (contract) part of “traditional web services” (WSDL + SOAP) / RPC, which goes against REST. In REST, state is transferred via hypermedia rather than via RPC-style method calls that update state on the server.
Disclaimer: I've not read the referenced article in detail, but it does give some great points.

I have worked on API projects for quite a while.
To answer your first question.
Yes, if the service's return values change (e.g. First Name and Last Name become Full Name), your code might break: you will no longer get the first name and last name.
You have to understand that a WADL is an agreement. If it has to change, then the client needs to be notified. To avoid breaking the client code, we release a new version of the API.
Version 1.0 will keep First Name and Last Name, so your code does not break. We then release version 1.1, which carries the change to Full Name.
So, in short: the WADL is there to stay. As long as you use that version of the API, your code will not break. If you want to get the full name, then you have to move to the new version. With the many code generation plugins on the market, generating the code should not be an issue.
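As a sketch (the paths and field names here are invented, not from any particular API), the two versions can simply live side by side:

GET /api/v1.0/customers/42  ->  { "firstName": "Jane", "lastName": "Doe" }
GET /api/v1.1/customers/42  ->  { "fullName": "Jane Doe" }

Clients generated from the v1.0 WADL keep calling v1.0 and never see the breaking change.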
To answer your next question, about why not WADL and how you get to know the MIME types:
A WADL is for code generation and serves as a contract. With it you can use JAXB or any mapping framework to convert the JSON string into generated bean objects.
Without a WADL, you don't need to inspect every element to determine its type. You can easily do this:
// parse the JSON string and read a field directly (requires jQuery)
var obj = jQuery.parseJSON('{"name":"John"}');
alert( obj.name === "John" );
Let me know if you have any questions.

Related

tastypie: why reference objects using uris rather than ids?

When creating or editing a model that contains a reference/foreign key to another object, you have to use the uri of that object. For example, imagine we have two classes: User and Group. Each Group has many Users and each User can belong to exactly one group.
Then, if we are creating a User, we might send an object that looks like this:
{"name":"John Doe", "group":"/path/to/group/1/"}
instead of
{"name":"John Doe", "group_id":1}
I believe this is related to one of the principles of HATEOAS, but I can't find the rationale for using the resource uri rather than the id. What are some reasons for using the uri?
(I'm not interested in opinions about which is better, but in any resources that can help me understand this design choice.)
I'll take a stab
The simplest reason is that surrogate keys like your 1 only mean something within the boundaries of your system. They are meaningless outside of the system.
Expanding on this, you could build your app such that there are no limitations on the URLs that identify groups, only on the conformance of the resources returned from those URLs. Someone could add a user in your system that is in a group in the Facebook system, as long as the two systems could negotiate what a group is. There are standards for concepts like group, so it's not impossible to do such a thing.
This is how most web apps work, e.g. the citation links in a Wikipedia article, which can point to any other article (until the wiki trolls remove them for not being an appropriate citation resource...).
Having your app work like this gets you closer to RESTful conformance. Whether or not you consider RESTful architecture a good idea is what you asked us not to discuss, so I won't.
Another often-cited benefit is the ability to completely re-key your setup. You may dismiss this at first... but if you really use 1 for IDs, that's probably an int or long, and you may eventually run out of those. Such an ID also means you have to sequence them appropriately. At some point you may wish you had used a GUID for your IDs, and anyone holding on to your old ID scheme would be considered legacy. URLs give you a little abstraction from this: old URLs remain a legacy thing, but it's easier to identify a legacy URL than a legacy ID (granted, not by much... it's pretty easy to know whether you're getting a long or a GUID, but a bit easier still to see a URL as /old/path/group/1 vs /new/path/group/). Generally, using URLs gives you a little more forward compatibility and room to grow.
I also find that providing URLs as identifiers makes it very easy for a client to retrieve information about that thing; the self link is VERY convenient. Suppose I have some reference to group 1... what good is that? How many UIs are going to show a control that says "add group 1"? You'll want to show more. If you pass around URLs as identifiers of selections, then clients can always retrieve more information about what that selection actually is. In some apps you could pass around the whole object (which would include the ID) to deal with this, but it's nice to just save the URL for later retrieval (think bookmarks). Even more importantly, it's always nice to be able to refresh that object regularly to get its latest state. A self link can do that very nicely, and I'd argue it's useful enough to always include... and if an always-included self link identifies the resource, why do you need to also provide your surrogate key as a secondary identifier?
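To illustrate (the paths are invented for the example), compare what a client can do with each form of reference. With a bare id the client must already know, out of band, how to turn 1 into a request:

{ "name": "John Doe", "group_id": 1 }

With a URI it can simply GET it later to refresh or display the group:

{ "name": "John Doe", "group": "/api/groups/1/" }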
One side note: I try to avoid services that require a URL as a parameter. I'd prefer to create the user, then have the service offer up possible group memberships as links, and then have the client choose to request those state transitions from non-membership to membership. If you need to "create the user with groups", I'd go with intermediate states prior to the actual submission/commitment of the new user to the service. I've found that the fewer inputs the client has to provide, the easier the application is to use.

REST: forms, links and hypermedia format

I am currently learning REST practices with the help of Richardson's excellent book "RESTful Web Services". I would like to design a REST API that follows Richardson's maturity model, especially level 3, called HATEOAS, which seems to be the most complicated to handle.
Firstly, I don't really understand the difference in meaning between a link and a form (regarding hypermedia; I know the HTML explanation).
Is it simply a matter of "a link is for the GET method" and "a form is for GET/POST/PUT methods"?
EDIT1: I got the point: forms can be application forms for constructing a URI with a GET method, or resource forms for PUT/POST methods (more or less what I asked). Correct me if I'm wrong: links are supposed to be used carefully by the client, with the OPTIONS method, to know how they can be used.
As I want to be compliant with HATEOAS, I need to choose a hypermedia format... and I know that several formats exist, such as Siren, HAL, Collection+JSON, JSON-LD, Hydra, etc.
But, well, I don't know which one to use.
In Richardson's book, he uses XHTML, which has one main good point: testing your API with a browser. But XHTML seems heavy.
I would probably prefer something more lightweight, but the recent hypermedia formats (Siren, HAL, ...) are probably too new and complex to test without a programmable client.
I would definitely recommend giving the Siren format a try for your API. As correctly mentioned in one of the comments, it has a nice browser... but that one is poorly supported (you can see it in their GitHub repository).
So you should use this one, which was based on the one first mentioned but has some extras: nice error handling and support for actions on nested entities (among others).
Regarding the difference between links and forms... well, my five cents is that you use links for GET requests that require no input params from your client, and actions (in Siren terms) for POST, PUT, PATCH, or GET requests that do require sending some params to the server.
Now, you could say: "But I can have this link http://testsite.com/api/v1/users?param1=value1 and still be passing params." You're correct, BUT how would your client know that it can pass this or that param?
That's why you have actions with a key called 'fields', where you describe the fields you're willing to accept.
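Here is a rough Siren-style sketch (the entity, field and URL details are invented) showing a plain link next to an action that advertises its fields:

{
  "class": [ "users" ],
  "links": [
    { "rel": [ "self" ], "href": "http://testsite.com/api/v1/users" }
  ],
  "actions": [
    {
      "name": "search-users",
      "method": "GET",
      "href": "http://testsite.com/api/v1/users",
      "fields": [
        { "name": "param1", "type": "text" }
      ]
    }
  ]
}

The client no longer has to guess that param1 exists; the representation tells it.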
Enjoy hypermedia APIs!

REST vs RPC for a C++ API

I am writing a C++ API which is to be used as a web service. The functions in the API take in images/path_to_images as input parameters, process them, and give a different set of images/paths_to_images as outputs. I was thinking of implementing a REST interface to enable developers to use this API for their projects (independent of whatever language they'd like to work in). But, I understand REST is good only when you have a collection of data that you want to query or manipulate, which is not exactly the case here.
[The collection I have is of different functions that manipulate the supplied data.]
So, is it better for me to implement an RPC interface for this, or can this be done using REST itself?
Like lcfseth, I would also go for REST. REST is indeed resource-based and, in your case, you might consider that there's no resource to deal with. However, that's not exactly true: the image converter in your system is the resource. You POST images to it and it returns new images. So I'd simply create a URL such as:
POST http://example.com/image-converter
You POST images to it and it returns some array with the path to the new images.
Potentially, you could also have:
GET http://example.com/image-converter
which could tell you about the status of the image conversion (assuming it is a time consuming process).
The advantage of doing it like that is that you are reusing HTTP verbs that developers are familiar with, and the interface is almost self-documenting (though of course you still need to document the formats accepted and returned by the POST call). With RPC, you would have to define new verbs and document them.
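A rough sketch of what the exchange could look like (the JSON field names and paths are invented; the real API would define its own):

POST http://example.com/image-converter
{ "images": [ "/uploads/in/a.png", "/uploads/in/b.png" ] }

200 OK
{ "results": [ "/uploads/out/a.png", "/uploads/out/b.png" ] }

A later GET on the same (or a job-specific) URL could then report conversion status if the processing is asynchronous.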
REST uses the common operations GET, POST, DELETE, HEAD, and PUT. As you can imagine, this is very data-oriented. However, there is no restriction on the data type and no restriction on the size of the data (none I'm aware of, anyway).
So it's possible to use it in almost every context (including sending binary data). One of the advantages of REST is that web browsers understand it, so your users won't need a dedicated application to send requests.
RPC presents more possibilities and can also be used. You can define custom operations for example.
Not sure you need that much power given what you intend to do.
Personally I would go with REST.
Here's a link you might wanna read:
http://www.sitepen.com/blog/2008/03/25/rest-and-rpc-relationship/
Compared to RPC, REST (with a JSON-style interface) is lightweight and easy for API users to consume. RPC (SOAP/XML) seems complex and heavy.
I guess that what you want is an HTTP+JSON based API, not the REST API as described by the REST author:
http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven

What does using RESTful URLs buy me?

I've been reading up on REST, and I'm trying to figure out what the advantages to using it are. Specifically, what is the advantage to REST-style URLs that make them worth implementing over a more typical GET request with a query string?
Why is this URL:
http://www.parts-depot.com/parts/getPart?id=00345
Considered inferior to this?
http://www.parts-depot.com/parts/00345
In the above examples (taken from here) the second URL is indeed more elegant looking and concise. But it comes at a cost... the first URL is pretty easy to implement in any web language, out of the box. The second requires additional code and/or server configuration to parse out values, as well as additional documentation and time spent explaining the system to junior programmers and justifying it to peers.
So, my question is, aside from the pleasure of having URLs that look cool, what advantages do RESTful URLs gain for me that would make using them worth the cost of implementation?
The hope is that if you make your URL refer to a noun then there is a better chance that you will implement the HTTP verbs correctly. Beyond that there is absolutely no advantage of one URL versus another.
The reality is that the contents of a URL are completely irrelevant to a RESTful system. It is simply an identifier.
It's not what it looks like, it is what you do with it that is important.
One way of looking at REST:
http://tomayko.com/writings/rest-to-my-wife (which has now been taken down, sadly, but can still be seen on web.archive.org)
So anyway, HTTP—this protocol Fielding and his friends created—is all about applying verbs to nouns. For instance, when you go to a web page, the browser does an HTTP GET on the URL you type in and back comes a web page.
...
Instead, the large majority are busy writing layers of complex specifications for doing this stuff in a different way that isn’t nearly as useful or eloquent. Nouns aren’t universal and verbs aren’t polymorphic. We’re throwing out decades of real field usage and proven technique and starting over with something that looks a lot like other systems that have failed in the past. We’re using HTTP but only because it helps us talk to our network and security people less. We’re trading simplicity for flashy tools and wizards.
One thing that jumps out at me (nice question by the way) is what they describe. The first describes an operation (getPart), the second describes a resource (part 00345).
Also, you arguably couldn't use other HTTP verbs with the first - you'd need a new method such as putPart, for example. The second can be reused with different verbs (like PUT, DELETE, POST) to 'manipulate' the resource. I suppose with the first you're also kind of saying GET twice - once with the verb, again in the method name - so the second is more consistent with the intent of the HTTP protocol.
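For example (purely illustrative), the same resource URL can carry the whole set of operations:

GET    /parts/00345    (fetch the part)
PUT    /parts/00345    (replace/update the part)
DELETE /parts/00345    (remove the part)
POST   /parts          (create a new part)

whereas the operation-style scheme needs a new method name for each: getPart, putPart, deletePart, and so on.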
One that I always like as a savvy web-user, but certainly shouldn't be used as a guiding principle for when to use such a URL scheme is that those types of URLs are "hackable". In particular for things like blogs where I can just edit a date or a page number within a URL instead of having to find where the "next page" button is.
The biggest advantage of REST, IMO, is that it allows a clean way to use the HTTP verbs (which are the most important part of REST services). Actually, using REST means you are using the HTTP protocol and its verbs.
Using your URLs, and imagining you want to post a "part" instead of getting it, the first case would look like this (you'd be using a GET where you should have used a POST):
http://www.parts-depot.com/parts/postPart?param1=lalala&param2=lelele&param3=lilili
While in a REST context, it would be
http://www.parts-depot.com/parts
and in the body, (for example) an XML document like this:
<part>
  <param1>lalala</param1>
  <param2>lelele</param2>
  <param3>lilili</param3>
</part>
URI semantics are defined by RFC 2396. The extracts particularly pertinent to this question are section 3.3, "Path Component":
The path component contains data, specific to the authority (or the scheme if there is no authority component), identifying the resource within the scope of that scheme and authority.
And 3.4 "Query Component":
The query component is a string of information to be interpreted by the resource.
Note that the query component is not part of the resource identifier, it is merely information to be interpreted by the resource.
As such, the resource being identified by your first example is actually just /parts/getPart. If your intention is that the URL should identify one specific part resource then the first example does not do that, whereas the second one (/parts/00345) does.
So the 'advantage' of the second style of URL is that it is semantically correct, whereas the first one is not.
"The second requires additional code
and/or server configuration to parse
out values,"
Really? Then you chose a poor framework. My experience is that the RESTful version is exactly the same amount of code. Maybe I just lucked into a cool framework.
"as well as additional documentation
and time spent explaining the system
to junior programmers"
Only once. After they get it, you shouldn't have to explain it again.
"and justifying it to peers."
Only once. After they get it, you shouldn't have to explain it again.
Don't use query/search parts in URLs that aren't queries or searches; if you do, then - according to the URL spec - you are likely implying something about that resource that you don't really want to.
Use query parts for resources that are a subset of some bigger resource - pagination is a good example of where this could be applied.
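For example, something like this reads naturally as "a page of the parts collection" rather than as a different resource (the parameter names are just an illustration):

http://www.parts-depot.com/parts?page=2&per-page=50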

Files in domain model

What are the best practices for dealing with binaries in domain model? I frequently must associate images and other files with business objects and the simple byte[] is not adequate even for the simplest of cases.
The files:
Do not have a fixed size and can be quite large, thus they:
Have to be streamed or buffered, preferably in an asynchronous manner;
Must be cached both on the server and on the client to avoid redundant transfers;
On unreliable connections the transfer can easily be interrupted and has to be resumed - therefore it could start not from the beginning of the file but from an arbitrary position.
Are handled differently than the rest of the data:
In web applications they are not part of the page content but are downloaded by the browser separately;
Might be a black box that is handled by third-party software;
For performance reasons might not even be stored in the database.
How do we go about expressing such files in the domain model (or, more specifically, in model classes)? If the rest of the model is transferred via DTOs and WCF web services and persisted with NHibernate in the database, but the files not necessarily so, how do we make file handling transparent and part of the overall transaction where applicable, yet still support everything necessary for the files to be consumed not only in web applications but also in ordinary desktop applications?
For WPF and ASP.NET the file object must expose some form of Url property that can be data-bound to WPF controls or used in IMG tags or other HTML markup. Uploading a file is a lot more complicated. Preferably, proper presentation and content practices such as MVVM must be maintained there.
I am really lost here, as I am not satisfied with any of my previous solutions. What would you advise?
You have to be careful not to try to shoehorn too much functionality into a single class here. Your wording sounds a bit like you want a single "File" object that will do everything; that is not a good idea.
You will need a concept of a File representation that can be passed around everywhere, as you have identified - but this needs to be little more than an identifier and possibly a name. It is then up to individual components to decide how they treat it: for example, the HTML page may use a File JSON object and infer that jsFile.Id needs to be retrieved using ftp://xxx/uploads/{id} or something, while, in order to display additional associated information, a WCF service might receive the file id and look up info in a database.
It probably makes sense to have a FileAttributesDTO class or some such just to distinguish it from the case where you are dealing with the physical file. You need to consider separation of concerns and nail down as many use cases as you can before you proceed. For example, will you really need additional information, or would a simple wrapper around an FTP service get you all you need?
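As a sketch only (the field names and values are invented), such a lightweight file representation might carry nothing more than:

{ "id": "8f3c2", "name": "invoice.pdf", "contentType": "application/pdf", "url": "ftp://xxx/uploads/8f3c2" }

with everything else (streaming, caching, resuming) handled by whichever component dereferences the url.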