SOAP Pooling Advantages / Disadvantages - web-services

I am doing some research on SOAP, for a personal project, and I came across a website with a list of pros and cons for using SOAP, and I understood what most of them meant, except for this one under disadvantages:
SOAP is typically limited to pooling, and not event notifications, when leveraging HTTP for transport. What's more, only one client can use the services of one server in typical situations.
From my understanding of pooling, there should be no issue pooling a SOAP Object for re usability. Pooling is simply a way to use the same resources over and over again, like a connection to a database. Also not entirely certain on the context of Event Notifications.
So my two questions here are, what does the above block quoted text actually mean, and is this information correct?
Website: http://searchsoa.techtarget.com/definition/SOAP

SOAP is RPC, and in RPC some local client invokes a method on some remote target and receives a result. That's how it works, so SOAP works that way too. A client invokes a service asking for something and the service just responds.
If you want "events" in this type of communication the most simple approach is to invoke the service more often (i.e. polling). This has the advantage that nothing changes for the server or the client. It's the same RPC call but done more frequently.
These days everyone is connected to the web and everyone is subscribed to all sorts of services. They want to get notified as soon as something happens to the world around them. Pooling becomes inefficient in this sea of users and services because you are wasting resources. You might poll a service a hundred times just to get back one notification. For this reason technology is evolving so that resource use is minimized. And the direction this is moving to is push services.
Now almost everything happens in the browser. Every browser manufacturer rushes to implement the latest technology changes and HTML5 spec. This means actual pages that push notifications to users instead of faking it with Ajax, comet, etc.
SOAP has been around since 1998 and it's not moving as fast as the rest of the web, mainly because SOAP is mostly an enterprise player and because it's a protocol. Because it's a protocol you have to make new technology available to it without breaking that protocol. Things move slower so people have abandoned SOAP in favor of other ways of doing server-client communication.
SOAP is typically limited to pooling, and not event notifications...
That is correct. But be aware that "typically" does not mean "always".
You can have events, but it's harder. It involves using WS-* specifications like WS-Eventing and WS-Addressing. This is a change in the way SOAP clients operate because a client now becomes some sort of a service too because it needs to receive calls too, not just initiate them. If your technology stack implements these specifications then good for you, but if it doesn't, then you have to build it yourself and it's a real pain.
So for these reasons, if you don't have blocking performance or resource usage issues, you "typically" chose doing polling with SOAP and not event notifications.

Related

How to create a full-duplex communication between API and various clients?

In my website, I'd like to create a public API that would allow clients (unknown people) to interact with my services. A classic REST API would work well in that case.
However, I need to be able to send events to the clients too. These events are not related to client HTTP requests. I saw "webhooks" are a way to deal with this. If I understood well, with webhooks, my service would send HTTP POST requests to a URL specified by the client, with event data inside this request.
I think websocket can be used too as a solution for this full-duplex communication need.
What I want to know, is which method would be the simplest for clients to implement to talk to my services? Simplicity is the key point here.
The hard thing is that my clients can use various technologies (full websites with HTTP servers, iOS/Android apps without server, etc.)
What are implications for clients if I use REST API + webhooks? Websockets? etc?
How to make a choice?
Hope it's clear (but not sure). Thanks :)
I would consider webhooks a simpler solution. And yes, you understood it well, that with webhooks, a developer using your API would register a URL where your backend would POST event data. It's a common pattern that's used in APIs.
A great benefit of using a webhooks design is that a client/server connection does not need to stay open. After all, if events occur infrequently (i.e. only a few times per hour, or per day) or keeping a consistent connection open is a challenge, establishing a connection only when it's needed is rather efficient.
The challenge of using webhooks for you, the API provider, is designing an evented backend system that deals with change of state detection and reliable webhook calling mechanisms (i.e. dealing with webhook receiver URLs that are unresponsive or throw errors).
The challenge of using webhooks on the developer end is that they need to stand up a reliable web server that listens for the event POST data from your server.
Realtime APIs (i.e. based on Websockets, Bayeux/CometD) are really swell because that live connection means that new connections do not have to be established, which is particularly useful with very chatty sessions. Additionally, there are a lot of projects and companies out there that have taken care of the heavy lifting on the server and client with fully-baked libraries. One of those is Fanout.io which makes pushing messages between the client/server possible with just a few lines of code, utilizing XMPP, Bayeux, and Websockets when possible.
(I am not affiliated with Fanout, but I have used it)
So, to sum it up, webhooks are simple mostly because you are already familiar with the architecture needed to implement them, and the pattern is a well traveled one. If you are leaning toward a persistent connection approach, I would look at tools/platforms like Fanout because it takes care of the heavy lifting (i.e. subscribe/publish, concurrent connection scale, client/server libraries).

How can SOAP-based service can be seen as a special case of a REST-style service?

A quote from Java Web Services: Up And Running, Second Edition book :
"At present, the distinction between the two flavours of web service is
not sharp,
because a SOAP-based service delivered over HTTP can be seen as a special case
of a REST-style service;"
How ?
How?
I believe the writer's statement is incorrect.
What is SOAP?
According to wikipedia:
SOAP can form the foundation layer of a web services protocol stack,
providing a basic messaging framework upon which web services can be
built. This XML based protocol consists of three parts: an envelope,
which defines what is in the message and how to process it, a set of
encoding rules for expressing instances of application-defined
datatypes, and a convention for representing procedure calls and
responses. SOAP has three major characteristics: Extensibility
(security and WS-routing are among the extensions under development),
Neutrality (SOAP can be used over any transport protocol such as HTTP,
SMTP, TCP, or JMS) and Independence (SOAP allows for any programming
model).
As you can see, there really isn't anything in this description of SOAP that takes any ideological stance over what the structure of your API calls (url wise) must adhere to. Of course, soap uses XML, and XML can have a data structure that essentially works as the rule-set of your API call... thats cool.
In contrast, we have REST.
What is REST?
According to wikipedia:
The REST architectural style describes the following six constraints
applied to the architecture, while leaving the implementation of the
individual components free to design:
Client–server: Servers are not concerned with the user interface or user state, so that servers can be simpler and more scalable.
Stateless: The client–server communication is further constrained by no client context being stored on the server between requests.
Cacheable: Responses must, implicitly or explicitly, define themselves as cacheable, or not, to prevent clients reusing stale or inappropriate data in response to further requests.
Layered system: A client cannot ordinarily tell whether it is connected directly to the end server, or to an intermediary along the way. Intermediary servers may improve system scalability by enabling load-balancing and by providing shared caches.
Code on demand (optional): Servers can temporarily extend or customize the functionality of a client by the transfer of executable code.
Uniform interface: The uniform interface between clients and servers, discussed below, simplifies and decouples the architecture, which enables each part to evolve independently. (i.e. HTTP GET, POST, PUT, PATCH, DELETE)
Comparison
In my mind, it shouldn't be described as SOAP vs REST, it should be RPC vs REST. RPC is remote procedural call, which basically means that every single functionality of your API gets 1 distinct API endpoint, and so on. so, REST can do with 1 url what RPC does with 7. SOAP is RPC (right?)
Yes, both are web services.
But saying that an RPC API is RESTful-ish because its transmitted over HTTP is hardly grounds to say they are similar... from the detailed information above, you can see that REST takes a much more ideological approach to the structure, transfer, purpose, scalability, and state of your service, whereas SOAP doesn't really talk about those things, and presumably the developer can choose to do, or not do, those things.
In conclusion, more context is needed for me to really understand what point the author was trying to make. An RPC API can be similar to REST if you make it do RESTful things.. but that is really circumstantial, isn't it?

Best practice to integrate web services (with Camel)

I have the following situation. Several services provide their functionality mainly via SOAP interfaces. There is one module that wants to consume this functionality for integration into a website. What would be the best practice to do that?
The functionality of the services is subject to change. Therefore, each single function/method should be "reroutable". The web service is probably hosted on a different machine.
Is it reasonable to map all web services to JMS queues (my first idea)? The website module would only talk to JMS then. A router would route all incoming JMS messages to the different web services (or elsewhere).
Or: There could be one dedicated web service, that integrates all functions, to be used exclusively by the web site? The advantage here would be that parameters and return values are typed.
What would you suggest? What could be another, better approach?
If I understood you right, what you're aiming at is providing a homogeneous interface, a coherent API for your webapp module, serving a purpose of a facade for multiple remote interfaces (mainly SOAP ones).
Regarding the JMS approach you've mentioned - it seems reasonable, but:
instead of many queues I'd rather go with a single JMS destination with a Camel content-based router immediately after the queue (or two queues for Request/Reply pattern) That would make things "reroutable" and isolate the web module from service changes.
your requests would be less prone to services-related errors, brief problems in service availability and such, while still retaining an ability to perform Request/Reply style calls (which are common to RPC-ish nature of SOAP).
i'd use one-way style calls wherever they are applicable (for increased reliability and reactivity).
You should not worry about lack of typing, WSDL+SOAP seems to enforce strong typing but that's an illusion driven by auto-generated stubs. You still have to marshal the data back and forth.
Instead of SOAP I'd go with JSON, as it is far cleaner and less redundant than XML (probably faster, but that's usually irrelevant). Jackson is a very efficient JSON library and it's already supported in Camel distributions. JMS ObjectMessage is a big NO (a good article on some of the ObjectMessage pitfalls )
The single-service approach seems to be a good way to separate the web-module from the service layer. It lacks the flexibility and fault-tolerance of the JMS approach but seems tad easier to implement.
If there are many calls that can be concluded in a one-way fashion, I'd say go with JMS and reroute the messages after the queue.

Designing an architecture for exchanging data between two systems

I've been tasked with creating an intermediate layer which needs to exchange data (over HTTP) between two independent systems (e.g. Receiver <=> Intermediate Layer (IL) <=> Sender). Receiver and Sender both expose a set of API's via Web Services. Everytime a transaction occurs in the Sender system, the IL should know about it (I'm thinking of creating a Windows Service which constantly pings the Sender), massage the data, then deliver it to the Receiver. The IL can temporarily store the data in a SQL database until it is transferred to the Receiver. I have the following questions -
Can WCF (haven't used it a lot) be used to talk to the Sender and Receiver (both expose web services)?
How do I ensure guaranteed delivery?
How do I ensure security of the messages over the Internet?
What are best practices for handling concurrency issues?
What are best practices for error handling?
How do I ensure reliability of the data (data is not tampered along the way)
How do I ensure the receipt of the data back to the Sender?
What are the constraints that I need to be aware of?
I need to implement this on MS platform using a custom .NET solution. I was told not to use any middleware like BizTalk. The receiver is an SDFC instance, if that matters.
Any pointers are greatly appreciated. Thank you.
A Windows Service that orchestras the exchange sounds fine.
Yes WCF can deal with traditional Web Services.
How do I ensure guaranteed delivery?
To ensure delivery you can use TransactionScope to handle the passing of data between the
Receiver <=> Intermediate Layer and Intermediate Layer <=> Sender but I wouldn't try and do them together.
You might want to consider some sort of queuing mechanism to send the data to the receiver; I guess I'm thinking more of a logical queue rather than an actual queuing component. A workflow framework could also be an option.
make sure you have good logging / auditing in place; make sure it's rock solid, has the right information and is easy to read. Assuming you write a service it will execute without supervision so the operational / support aspects are more demanding.
Think about scenarios:
How do you manage failed deliveries?
What happens if the reciever (or sender) is unavailbale for periods of time (and how long is that?); for example: do you need to "escalate" to an operator via email?
How do I ensure security of the messages over the Internet?
HTTPS. Assuming other existing clients make calls to the Web Services how do they ensure security? (I'm thinking encryption).
What are best practices for handling concurrency issues?
Hmm probably a separate question. You should be able to find information on that easily enough. How much data are we taking? what sort of frequency? How many instances of the Windows Service were you thinking of having - if one is enough why would concurrency be an issue?
What are best practices for error handling?
Same as for concurrency, but I can offer some pointers:
Use an established logging framework, I quite like MS EntLibs but there are others (re-using whatever's currently used is probably going to make more sense - if there is anything).
Remember that execution is unattended so ensure information is complete, clear and unambiguous. I'd be tempted to log more and dial it down once a level of comfort is reached.
use a top level handler to ensure nothing get's lost; but don;t be afraid to log deep in the application where you can still get useful context (like the metadata of the data being sent / recieved).
How do I ensure the receipt of the data back to the Sender?
Include it (sending the receipt) as a step that is part of the transaction.
On a different angle - have a look on CodePlex for ESB type libraries, you might find something useful: http://www.codeplex.com/site/search?query=ESB&ac=8
For example ESBasic which seems to be a class library which you could reuse.

Best messaging medium for real-time SOA applications?

I'm working on a real time application implemented using in a SOA-style (read loosely coupled components connected via some messaging protocol - JMS, MQ or HTTP).
The architect who designed this system opted to use JMS to connect the components. This system is real time so there no need to queue up messages should one component fail (the transaction will simply time out). Further, there is no need for guaranteed delivery or rollback.
In this instance, is there any benefit to using JMS over something like an HTTP web service (speed, resource footprint, etc)?
One thing that I'm thinking is since the JMS approach requires us to set a thread pool size (the number of components listening to a JMS topic/queue), wouldn't a HTTP service be a better fit since this additional configuration is not needed (a new thread is created for each HTTP request making the application scalable to an "unlimited" number of requests until the server runs out of resources).
Am I missing something?
I don't disagree with the points made by S.Lott at all, but here are a couple of points to consider regarding HTTP web services:
Your clients only need to know how to communicate via HTTP - a protocol well supported by just about every modern langauge in one form or another. JMS, though popular, is more specialist than HTTP, and so restricts the languages your interconnected systems can use. Perhaps not an issue for your system at the moment, but will you need to plug in other systems later that might struggle to support JMS connectivity?
Standards like WSDL and SOAP which you could levarage for your services are well supported by many langauges and there are plenty of tools around that will generate code to implement both ends of the pipeline (client and server) for you from a WSDL file, reducing the amount of dev you'll have to do. These standards also make it relatively simple to define and publish the specification of the data you'll be passing between your systems, something you'll presumably have to do by hand using a queueing technology like JMS.
On the downside, as pointed out by S.Lott, JMS gives you functionality that you throw away using the (stateless) HTTP protocol: guaranteed ordering & reliability; monitoring; scalability; etc. Are you sure you don't need these, and won't need these going forward?
Great question, btw.
I think it's really dependent on the situation. Where I work, we support Remoting, JMS, MQ, HTTP, and sFTP. We are implementing a middleware appliance that speaks Remoting, JMS, MQ, and HTTP, and a software middleware component that speaks JMS, MQ, and HTTP.
As sgreeve alluded to above, standards help us become flexible, but proprietary formats allow more functionality.
In a nutshell, I'd say use HTTP for stateless calls (which could end up meeting almost all of your needs), and whatever proprietary formats you need for stateful calls. If you work in a big enterprise, a hardware appliance is usually a great fit as middleware: Lightning fast compression, encryption, transformation, and translation, with very low total cost of ownership.
I don't know enough about your requirements, but you may be overlooking Manageability, Flexibility and Performance.
JMS allows you to monitor and manage the queue. These are features HTTP lacks, and you'd have to build rather than buy from a vendor.
Also, There are queues and topics in JMS, allowing multiple subscribers to a single publisher. Not possible in HTTP.
While you may not need those things in release 1.0, you might want them in the future.
Also, JMS may be able to use other transport mechanisms like named sockets, which reduces the overheads if there isn't all that socket negotiation going on with (almost) every request.
If you go down the HTTP route and you want to support more than one machine or some kind of reliability - you are going to need a load balancer capable of discovering the available web servers and loading requests across them - then failing over to another web server if a particular box/process dies. Clients making HTTP requests are also going to have to deal with servers failing and retrying operations in some loop.
This is one of the main features of a message queue - reliable load balancing with failover and loose coupling among the producers and consumers without them having to include retry logic - so your client or server code doesn't have to worry about this kinda thing. This is totally separate to whether or not you want message persistence or want to use ACID transactions to produce/consume messages (which can be very handy BTW).
If you focus just on the server side using Java - whether Servlets or MessageListener/MDBs they are kinda similar either way really. The difference is the load balancer.
So maybe the question should really be - is a JMS broker easier to setup & work with than setting up your DNS/NAT/IP/HTTP load balancer infrastructure?
I suppose it depends on what you mean by real-time... Neither JMS nor HTTP in my opinion support "real-time" applications well, meaning they cannot offer predictable/deterministic performance nor properly prioritize flows in the presence of contention.
Part of it is that these technologies are built on top of TCP which serializes all traffic into a single FIFO meaning that different traffic flows cannot be easily prioritized. Moreover TCP timers are not easily controlled resulting unpredictable blocking and timeouts... For this reason many streaming applications use UDP instead of TCP as an underlying protocol.
Another problem with JMS is that typical implementations use a broker that centralizes message dispatch. This is not the best architecture to get deterministic performance.
If you are looking for a middleware that can offer you the kind of reliability guarantees and publish-subscribe semantics you get with JMS, but was developed to fit the real-time application domain I recommend you take a look at the OMG Data-Distribution Service (DDS). See dds.omg.org and this article I wrote arguing why DDS is the best middleware to implement a real-time SOA. http://soa.sys-con.com/node/467488