REST philosophy - how to handle services and side effects

I've been diving into REST lately, and a few things still bug me:
1) Since there are only resources and no services to call, how can I provide operations to the client that only do stuff and don't change any data?
For example, in my application it is possible to trigger a service that connects to a remote server and executes a shell script. I don't see how this scenario maps to a resource.
2) Another thing I'm not sure about is side effects: Let's say I have a resource that can be in certain states. When transitioning into another state, a lot of things might happen (e-mails might be sent). The transition is triggered by the client. Should I handle this transition merely by letting the resource be updated via PUT? This feels a bit odd.
For the client this means that updating an attribute of this resource might only change the attribute, or it might also do a lot of other things. So PUT =/= PUT, kind of.
And implementation-wise, I have to check what exactly the PUT request changed and trigger the side effects accordingly. So there would be a lot of checks like if(old_attribute != new_attribute) {side_effects}
Is this how it's supposed to be?
BR,
Philipp

Since there are only resources and no services to call, how can I provide operations to the client that only do stuff and don't change any data?
HTTP is a document transport application. Send documents (i.e., messages) that trigger the behaviors that you want.
In other words, you can think about the message you are sending as a description of a task, or as an entry being added to a task queue. "I'm creating a task resource that describes some work I want done."
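For example, a rough sketch of what that could look like for your shell-script scenario (the /script-runs URI and the fields are invented for illustration, not a prescribed design): the client creates a task resource describing the work, and the server reports on it.

    POST /script-runs HTTP/1.1
    Host: api.example.com
    Content-Type: application/json

    { "script": "cleanup.sh", "target": "remote-host-01" }

    HTTP/1.1 202 Accepted
    Location: /script-runs/42

The client can then GET /script-runs/42 to see whether the run is pending, has succeeded, or has failed.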
Jim Webber covers this pretty well.
Another thing I'm not sure about is side effects: Let's say I have a resource that can be in certain states. When transitioning into another state, a lot of things might happen (e-mails might be sent). The transition is triggered by the client. Should I handle this transition merely by letting the resource be updated via PUT?
Maybe, but that's not your only choice -- you could handle the transition by having the client put some other resource (i.e., a message describing the change to be made). That affords having a number of messages (commands) that describe very specific modifications to the domain entity.
In other words, you can work around PUT =/= PUT by putting more specific things.
(In HTTP, the semantics of PUT are effectively create or replace. Which is great for dumb documents, or CRUD, but needs a bit of design help when applied to an entity with its own agency.)
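A sketch of that idea, with invented URIs and payloads: rather than PUTting the whole resource with a changed status attribute, the client posts a message that names the transition it wants.

    POST /orders/17/cancellations HTTP/1.1
    Content-Type: application/json

    { "reason": "customer request" }

The server decides what the transition entails (updating the state, sending the e-mails, and so on); the client only states its intent, so the side effects stay on the server's side of the boundary.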
And implementation wise, I have to check what exacty the PUT request changed, and according to that trigger the side effects.
Is this how it's supposed to be?
Sort of. Review Udi Dahan's talk on reliable messaging; it's not REST specific, but it may help clarify the separation of responsibilities here.

Is there a way to detect from which source an API is being called?

Is there any method to identify which source an API call comes from? By "source" I mean an iOS application, or a web application such as a page or button click (Ajax calls, etc.).
Passing a flag like ?source=ios or ?source=webapp with each API call would work, but I wanted to know whether there is a better way to accomplish this.
I also feel this requirement is a bit odd, because an app or a web application is generally used by any number of users, so it is difficult to monitor that many API calls.
Please share your suggestions.
There is no perfect way to solve this. Designating a special flag won't solve your problem, because the consumer can put in whatever she wants and you cannot be sure if it is legit or not. The same holds true if you issue different API keys for different consumers - you never know if they decide to switch them up.
The only option that comes to my mind is to analyze the HTTP request headers and see what you can deduce from them. As you probably know, typical request headers look something like this:
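(The values below are invented for illustration; real requests will differ, but fields such as User-Agent, Referer and X-Requested-With are the ones worth comparing.)

    GET /api/orders HTTP/1.1
    Host: api.example.com
    User-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X) AppleWebKit/605.1.15
    Accept: application/json
    Referer: https://www.example.com/orders
    X-Requested-With: XMLHttpRequest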
You can try and see how the requests from all sources differ in your case and decide if you can reliably differentiate between them. If you have the luxury of developing the client (i.e. this is not a public API), you can set your custom User-Agent strings for different sources.
But keep in mind that the Referer header is not mandatory, and thus it is not very reliable, and the User-Agent can also be spoofed. So it is a solution that is better than nothing, but it's not 100% reliable.
Hope this helps, also here is a similar question. Good luck!

How to handle global hypermedia in a HATEOAS API for GUI apps?

Edit: To clarify, this question concerns building GUI applications on HATEOAS APIs: how to design interfaces built on hypermedia "discoverability" (i.e., dynamic) principles, and specifically how to avoid purely "modal" GUIs that "link" back to "home" for global functionality that should be "always on" (as represented at the hypermedia API entry point).
In a strict REST API implementation, utilising Hypermedia As The Engine Of Application State (HATEOAS), what patterns (if any) are used to indicate/represent globally "always valid" actions (if such a concept even truly exists for REST)?
The meta question is, can you at all 'factor out' repeated hypermedia?
As a simplified example, let's say we have a resource /version with Allow: OPTIONS, HEAD, GET. This resource depends on nothing and is NEVER affected by any stateful transitions that may occur elsewhere.
Is the requirement that the /version hypermedia simply be sent along with every single other resource representation?
Or is the alternative pattern for the client to link back to home (likely cached) and THEN trigger the always-valid /version call? (A "modal" pattern in GUI terms: close this resource, return home, and move on.)
Or is there some kind of method/pattern to create independent decoupled modules for any given application? (perhaps namespacing of some kind?)
In a complex but loosely coupled API, option 1 ends up buried in hypermedia hell, with 80-95% of your payload being repeated on each resource call. Which seems "right" but is so nasty. Option 2 leads either to strange quirks in GUI client behaviour (hiding valid elements until you 'return home': modal-type operations) OR to lots of non-RESTful assumptions by the GUI client, hardcoding out-of-band actions it "knows" are always valid.
Option 3 relates back to my initial question: is there a flag, or some other pattern for indicating globally valid actions that can be sent once (say with the root/home resource) and then "factored out" of subsequent responses?
Well, there's a couple of things going on here.
First, you could simply add a link to any of your common "global" resources. As much as I am loath to compare a REST architecture to a web site, a web site is a fitting example here. Consider that many of the resources here on SO have links to common, "global" resources -- such as the home page, /questions, /tags, /badges, and /users.
The fact that a resource is "static" and never changes has no effect on whether you make the resource available as a link via another resource as part of HATEOAS; that's an orthogonal issue.
The second point is that there is nothing that says the entire service must be continually accessible from every resource. You can have "well known" entry points into your service, and these can be well documented (externally). The only downside is that if the URL for a certain relation changes (/questions to /questions-2, say), those URLs may not be picked up automatically by clients. But this is likely not an issue, since by changing the URL you are likely changing something else (such as the payload) that affects clients, to the point that an older client may well be incompatible with the new URLs.
Having clients "know" where things are in a nested system is not an issue either. For example, if they "know", as documented, that they can only access /version from the /home resource, and every other resource has a path to /home (either directly or indirectly), then this is not a problem either. When a client wants /version, it has an idea of what the path is to get to it.
As long as the client is traversing the graph based on what you tell it, rather than what "it thinks", then all is well. If the client is currently processing the /blog/daily_news/123 resource, and it has a link rel to "parent" which has a url of /blog, and /blog has a link rel of "version" to /version, then the client can walk the graph (traversing the parent rel to /blog, and the version rel to /version). What the client should not do (unless otherwise documented) is that it should not ASSUME that it can visit /version whenever it wants. Since it's not linked from /blog/daily_news/123, the client should not just jump over to it. The rel isn't there, so the client "doesn't know".
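To make that concrete, a sketch of what the two representations might contain (the JSON link style here is purely illustrative, not a specific media type):

    GET /blog/daily_news/123
    { "title": "Daily News", "_links": { "self": "/blog/daily_news/123", "parent": "/blog" } }

    GET /blog
    { "_links": { "self": "/blog", "version": "/version" } }

The article's representation offers only "parent"; the client reaches /version by following "parent" to /blog and then the "version" rel it finds there.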
This is the key point. The fact that it is not there means it's not an option right now, for whatever reason, and it's not the client's task to force the point; the URL space is not in its hands, it's in the service's hands. The service controls it, not the client.
If /version suddenly vanishes, well, that's a different issue. Perhaps they timed out and aren't allowed to "see" /version anymore, perhaps you deleted it. The client at that point will simply error out "can't find version rel" and then quit. This is an unrelated problem, just a truth of the matter (what else can you do when resources suddenly vanish behind your back).
Addendum for the question:
Which, if I understand this, means: if /home is not expired, and we navigate to /blog (which contains a rel back to /home), then the rel methods at /home are still immediately "accessible" from /blog, right?
No, not really. That's the key point. Save for some global resources specifically documented (out of band), you should not traverse to any link not specified in your current resource. Whether /home is expired or not is not relevant at all.
The server could certainly have sent appropriate caching instructions, letting you know that you could cache /home for some time, but you should still traverse through the rel to /home to get any links you think are there.
If /home is well cached on your client, then this traversal is effectively free, but logically and semantically you should stick with the protocol of traversing the links.
Because if it's NOT cached, then you simply have no idea what rels will be there when you get back. Yes, 99.999999% of the time it will always be the same, and shame on the server for not sending the appropriate caching headers, but by definition, the server isn't promising you anything, so both you, the client, and it, the server, eat the processing costs of hitting an effectively static resource over and over and over again.
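For instance, appropriate caching could look like this on the /home response (illustrative values; the policy is entirely the server's to set):

    HTTP/1.1 200 OK
    Content-Type: application/json
    Cache-Control: max-age=3600
    ETag: "home-v7"

With headers like these, re-traversing the rel through /home is effectively free while the cached copy is fresh, and costs only a conditional GET (If-None-Match) once it expires.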
By mandating that your client follow the steps, perhaps with internal optimizations due to caching and pre-processing to make these static traversals quick and efficient, you stick with the model of HATEOAS and defer to the system to make it optimal, rather than pre-supposing at the code level and jumping through rels you think you already have.
This way your code will always work, regardless of what the server does. Who knows when they may turn caching on or off, your code certainly shouldn't care, not at the level of deciding whether or not to dereference a link certainly.
The premise of HATEOAS is that the server is in charge of, and mandates, its URL space. Arbitrarily jumping around without guidance from the server is off spec; it's not your graph to navigate. REST is for more coarse-grained operations, but proper caching and such can make jumping through these hoops quick and efficient for you, the client.
Maybe I'm misunderstanding the question, but it seems like client-side caching should solve this. Look for the Expires and Cache-Control headers.

How to monitor communication in a SOA environment with an intermediary?

I'm looking for a possibility to monitor all messages in a SOA environment with an intermediary that will be designed to enforce different rule sets over the messages' structure and sequencing (e.g., it'll check and ensure that Service A has to be consumed before Service B).
Obviously the first idea that came to mind is how WS-Addressing might help here, but I'm not sure it does, as I don't really see any mechanism there to ensure that a message gets delivered via a given intermediary (as there is in WS-Routing, which is an outdated proprietary protocol by Microsoft).
Or maybe there's even a different approach in which the monitor wouldn't be part of the route but would instead be notified of requests/responses, which in turn might make it harder to actively enforce rules.
I'm looking forward to any suggestions.
You can implement a "service firewall" by intercepting all the calls in each service as part of your basic service host. Alternatively, you can use third-party solutions and route all your service calls through them (they will do the intercepting and then forward the calls to your services).
You can use ESBs to do the routing (and intercepting), or you can use dedicated solutions like IBM's DataPower, the XML firewall from Layer 7, etc.
For all my (technical) services I use messaging and the command processor pattern, which I describe here, without actually calling the pattern by name. I send a message and the framework finds the corresponding class that implements the interface corresponding to my message. I can create multiple classes that can handle my message, or a single class that handles a multitude of messages. In the article these are classes implementing the IHandleMessages interface.
Either way, as long as I can create multiple classes implementing this interface, and they are all called, I can easily add auditing without adding that logic to my business logic. Just add an additional implementation for every single message, or enhance the framework so it also accepts IHandleMessages implementations. That class can then audit every single message and store all of them centrally.
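To illustrate the shape of that (a rough, generic sketch in C++, not the NServiceBus or WCF API, and all names here are invented): every registered handler sees every message, so auditing becomes just one more handler.

    #include <iostream>
    #include <string>
    #include <vector>

    // A generic message exchanged between services (illustrative only).
    struct Message {
        std::string type;
        std::string correlationId;
        std::string payload;
    };

    // Anything that wants to process messages implements this interface.
    struct IHandleMessages {
        virtual ~IHandleMessages() = default;
        virtual void handle(const Message& msg) = 0;
    };

    // The auditing concern lives in its own handler, separate from business logic.
    struct AuditHandler : IHandleMessages {
        void handle(const Message& msg) override {
            // In a real system this would write to a central store.
            std::cout << "AUDIT " << msg.type << " " << msg.correlationId << "\n";
        }
    };

    // A minimal dispatcher: every registered handler sees every message.
    struct Dispatcher {
        std::vector<IHandleMessages*> handlers;
        void dispatch(const Message& msg) {
            for (auto* h : handlers) h->handle(msg);
        }
    };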
After doing that, you can find out more information about the messages and the flow. For example, if you put into the header information of your WCF/MSMQ message where it came from, and perhaps some unique identifier for that single message, you can track the flow across various components.
NServiceBus also has this functionality for auditing and the team is working on additional tooling for this, called ServiceInsight.
Hope this helps.

Can you help clarify some points regarding RESTful services and Code Generation?

I've been struggling with understanding a few points I keep reading regarding RESTful services. I'm hoping someone can help clarify.
1a) There seems to be a general aversion to generated code when talking about RESTful services.
1b) The argument is that if you use a WADL to generate a client for a RESTful service, then when the service changes, so does your client code.
Why I don't get it: Whether you are referencing a WADL and using generated code, or you have manually extracted data from a RESTful response and mapped it to your UI (or whatever you're doing with it), if something changes in the underlying service it seems just as likely that the code will break in both cases. For instance, if the data returned changes from FirstName and LastName to FullName, in both instances you will have to update your code to grab the new field and perhaps handle it differently.
2) The argument that RESTful services don't need a WADL because the return types should be well-known MIME types and you should already know how to handle them.
Why I don't get it: Is the expectation that for every "type" of data a service returns there will be a unique MIME type in existence? If this is the case, does that mean the consumer of the RESTful services is expected to read the RFC to determine the structure of the returned data, how to use each field, etc.?
I've done a lot of reading to try to figure this out for myself so I hope someone can provide concrete examples and real-world scenarios.
REST can be very subtle. I've also done lots of reading on it, and every once in a while I went back and read Chapter 5 of Fielding's dissertation, each time finding more insight. It was as clear as mud the first time (although some things made sense), but it only got better once I tried to apply the principles and use the building blocks.
So, based on my current understanding let's give it a go:
Why do RESTafarians not like code generation?
The short answer: if you make use of hypermedia (+ links), there is no need.
Context: Explicitly defining a contract (WADL) between client and server does not reduce coupling enough: if you change the server, the client breaks and you need to regenerate the code. (IMHO even automating that is just a patch over the underlying coupling issue.)
REST helps you to decouple on different levels. Hypermedia discoverability is one of the good ones to start with. See also the related concept HATEOAS.
We let the client "discover" what can be done from the resource we are operating on, instead of defining a contract up front. We load the resource, check for "named links", and then follow those links or fill in forms (or links to forms) to update the resource. The server acts as a guide to the client via the options it proposes based on state (think business process / workflow / behavior). If we use a contract, we need to know this "out of band" information and update the contract on every change.
If we use hypermedia with links there is no need for a separate contract. Everything is included within the hypermedia, so why design a separate document? Even URI templates are out-of-band information, but if kept simple they can work, as Amazon S3 shows.
Yes, we still need common ground to stand on when transferring representations (hypermedia), so we define our own media types or use widely accepted ones such as Atom or microformats. Thus, with the constraints of basic building blocks (links + forms + data: hypermedia), we reduce coupling by keeping out-of-band information to a minimum.
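A small illustrative example of what such a hypermedia representation could look like (invented field names and URIs, and the link style is a sketch rather than a particular media type):

    GET /orders/17
    {
      "status": "placed",
      "_links": {
        "self":   { "href": "/orders/17" },
        "cancel": { "href": "/orders/17/cancellations", "method": "POST" }
      }
    }

The client doesn't need a separate contract to know it may cancel; when cancelling is no longer allowed, the server simply stops advertising the "cancel" link.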
At first it seems that going for hypermedia does not change the impact of change :) But there are subtle differences. For one, if I have a WADL I need to update another document and deploy/distribute it. Using pure hypermedia there is no such impact, since it's embedded. (Imagine changes rippling through a complex interweave of systems.) As per your example, having FirstName + LastName and adding FullName does not really impact the clients, but removing First + Last and replacing them with FullName does, even in hypermedia.
As a side note: The REST uniform interface (verb constraints - GET, PUT, POST, DELETE + other verbs) decouples implementation from services.
Maybe I'm totally wrong, but another possibility might be a "psychological kickback" against code generation: WADL makes one think of the WSDL (contract) part of "traditional web services" (WSDL + SOAP) / RPC, which goes against REST. In REST, state is transferred via hypermedia, not via RPC-style method calls that update state on the server.
Disclaimer: I've not read the referenced article in detail, but it does give some great points.
I have worked on API projects for quite a while.
To answer your first question.
Yes, if the service's return values change (e.g., First Name and Last Name become Full Name), your code might break. You will no longer get the first name and last name.
You have to understand that the WADL is an agreement. If it has to change, then the client needs to be notified. To avoid breaking client code, we release a new version of the API.
Version 1.0 will keep First Name and Last Name, so your code does not break. We then release version 1.1, which carries the change to Full Name.
So the short answer is that the WADL is there to stay. As long as you use that version of the API, your code will not break. If you want the full name, then you have to move to the new version. With the many code-generation plugins on the market, generating the code should not be an issue.
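As an illustration of that versioning story (invented URIs and payloads, not a specific product's convention):

    GET /v1/customers/7   ->   { "firstName": "Jane", "lastName": "Doe" }
    GET /v2/customers/7   ->   { "fullName": "Jane Doe" }

Clients generated against the v1 WADL keep working untouched; picking up the full-name shape is an explicit step of regenerating against v2.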
To answer your next question, about why not use a WADL and how you get to know the MIME types:
A WADL is for code generation and serves as a contract. With it, you can use JAXB or any mapping framework to convert the JSON string into generated bean objects.
Without a WADL, you don't need to inspect every element to determine its type. You can easily do this:
    var obj = jQuery.parseJSON('{"name":"John"}');
    alert( obj.name === "John" );
Let me know, If you have any questions.

Best way to keep the user-interface up-to-date?

This question is a refinement of my question Different ways of observing data changes.
I still have a lot of classes in my C++ application, which are updated (or could be updated) frequently in complex mathematical routines and in complex pieces of business logic.
If I go for the 'observer' approach, and send out notifications every time a value of an instance is changed, I have 2 big risks:
sending out the notifications itself may slow down the application seriously
if user-interface elements need to be updated by the change, they are updated with every change, resulting in e.g. screens being updated thousands of times while some piece of business logic is executing
Some problems may be solved by adding buffering mechanisms (where you send out notifications when you are about to start an algorithm, and again when the algorithm is finished), but since the business logic may be executed in many places in the software, we end up adding buffering almost everywhere, after every possible action chosen in the menu.
Instead of the 'observer' approach, I could also use the 'mark-dirty' approach, only marking the instances that have been altered, and at the end of the action telling the user interface that it should update itself.
Again, business logic may be executed from everywhere within the application, so in practice we may have to add an extra call (telling all windows they should update themselves) after almost every action executed by the user.
Both approaches seem to have similar, but opposite disadvantages:
With the 'observer' approach we have the risk of updating the user-interface too many times
With the 'mark-dirty' approach we have the risk of not updating the user-interface at all
Both disadvantages could be solved by embedding every application action within additional logic (for observers: sending out start-end notifications, for mark-dirty: sending out update-yourself notifications).
Notice that in non-windowing applications this is probably not a problem. You could e.g. use the mark-dirty approach and only if some calculation needs the data, it may need to do some extra processing in case the data is dirty (this is a kind of caching approach).
However, for windowing applications, there is no signal that the user is 'looking at your screen' and that the windows should be updated. So there is no real good moment where you have to look at the dirty-data (although you could do some tricks with focus-events).
What is a good solution to solve this problem? And how have you solved problems like this in your application?
Notice that I don't want to introduce windowing techniques in the calculation/datamodel part of my application. If windowing techniques are needed to solve this problem, it must only be used in the user-interface part of my application.
Any idea?
An approach I used with a large Windows app a few years back was WM_KICKIDLE. All things that are update-able utilise an abstract base class called IdleTarget. An IdleTargetManager then intercepts the KICKIDLE messages and calls the update on a list of registered clients. In your case you could create a list of specific targets to update, but I found the list of registered clients enough.
The only gotcha I hit was with a realtime graph. Using just the kick-idle message, it would spike the CPU to 100% due to constant updating of the graph. Using a timer to sleep until the next refresh solved that problem.
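A rough sketch of the shape of that setup, in C++ (the class names follow the description above, but the details are reconstructed and simplified, not the original code):

    #include <vector>

    // Anything that can be refreshed during idle time derives from this.
    class IdleTarget {
    public:
        virtual ~IdleTarget() = default;
        virtual void updateOnIdle() = 0;
    };

    // Receives WM_KICKIDLE from the message loop and fans it out to registered clients.
    class IdleTargetManager {
    public:
        void registerTarget(IdleTarget* target) { targets_.push_back(target); }

        // Call this when WM_KICKIDLE arrives (or from a timer for throttled targets).
        void onKickIdle() {
            for (auto* target : targets_) target->updateOnIdle();
        }

    private:
        std::vector<IdleTarget*> targets_;
    };

A target like the real-time graph can additionally remember when it last redrew and skip its refresh until the next scheduled tick, which is effectively what the timer fix described above does.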
If you need more assistance - I am available at reasonable rates...:-)
Another point I was thinking about.
If you are overwhelmed by the number of events generated, and possibly by the extra work they are causing, you could take a two-phase approach:
Do the work
Commit
where notifications are only sent on commit.
It does have the disadvantage of forcing you to rewrite some code...
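A minimal sketch of that do-the-work/commit split in C++ (invented names; the point is only that observers hear about a batch of changes once, at commit):

    #include <functional>
    #include <utility>

    class ChangeBatch {
    public:
        explicit ChangeBatch(std::function<void()> notifyObservers)
            : notifyObservers_(std::move(notifyObservers)) {}

        // Phase 1: do the work; just remember that something changed.
        void markChanged() { dirty_ = true; }

        // Phase 2: commit; send a single notification for the whole batch.
        void commit() {
            if (dirty_) notifyObservers_();
            dirty_ = false;
        }

    private:
        std::function<void()> notifyObservers_;
        bool dirty_ = false;
    };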
You could use the observer pattern with coalescing. It might be a little ugly to implement in C++, though. It would look something like this:
    m_observerList.beginCoalescing();
    m_observerList.notify();
    m_observerList.notify();
    m_observerList.notify();
    m_observerList.endCoalescing(); // observers are notified here, only once
So even though you call notify three times, the observers aren't actually notified until endCoalescing, and then only once.
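For completeness, one possible way the coalescing observer list could be implemented (a sketch; memory management and thread safety are left out):

    #include <vector>

    class Observer {
    public:
        virtual ~Observer() = default;
        virtual void onChanged() = 0;
    };

    class ObserverList {
    public:
        void add(Observer* observer) { observers_.push_back(observer); }

        void beginCoalescing() { ++coalesceDepth_; }

        void notify() {
            if (coalesceDepth_ > 0) { pending_ = true; return; }  // remember, don't fire yet
            fire();
        }

        void endCoalescing() {
            if (--coalesceDepth_ == 0 && pending_) {
                pending_ = false;
                fire();  // a single combined notification
            }
        }

    private:
        void fire() { for (auto* o : observers_) o->onChanged(); }

        std::vector<Observer*> observers_;
        int coalesceDepth_ = 0;
        bool pending_ = false;
    };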