On SO there is a heavily-upvoted (albeit "closed") question entitled "Good use case for Akka" that I am looking for some further resolution on.
Although not the accepted answer, Akka's own Viktor Klang weighed in with a heavily-upvoted answer where he states that RESTful web services are a good use case for the actor-based framework.
But this seems to be in direct conflict with the most basic staple of Akka: Akka is for asynchronous systems, whereas REST services need to be synchronous, and are typically expected to produce results within 200ms (e.g., I shouldn't have to wait 6 seconds for a simple GET to return some JSON to me).
So I ask: how is Akka ideal for REST services if it is implicitly asynchronous and non-blocking, or is this just self-marketing hype?
Why do you think that asynchronous means slow? :)
Of course, because of the nature of the HTTP protocol, HTTP calls will be synchronous from the client's point of view. But internally Akka will use its asynchronous capabilities to process requests as soon as possible.
spray.io is a standard Akka HTTP layer that will soon be replaced by the Akka HTTP module (which is basically Spray 2.0). It's very lightweight and super fast.
Here is an example of how you can integrate synchronous HTTP with asynchronous Akka actors. As you can see, it creates a Future and sends the result back when it's done.
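To make the "asynchronous does not mean slow" point concrete, here is a minimal sketch in plain Python asyncio. It is not the Spray/Akka example referred to above, only an illustration of the same idea. The names `backend_lookup`, `handle_get` and `serve_many` are made up for this sketch, and the 50 ms sleep is an assumed stand-in for non-blocking I/O.

```python
import asyncio
import time

async def backend_lookup(item_id):
    """Hypothetical slow dependency; the sleep stands in for non-blocking I/O."""
    await asyncio.sleep(0.05)
    return {"id": item_id, "status": "ok"}

async def handle_get(item_id):
    """One request/response exchange: the caller simply awaits the answer."""
    return await backend_lookup(item_id)

async def serve_many(n):
    """Handle n requests concurrently and report the wall-clock time taken."""
    start = time.perf_counter()
    results = await asyncio.gather(*(handle_get(i) for i in range(n)))
    return list(results), time.perf_counter() - start

def run_demo(n=100):
    # Each caller still sees request -> response; no thread is parked
    # while a lookup is pending, so the waits overlap.
    return asyncio.run(serve_many(n))
```

Handled one after another, 100 such requests would need about 5 seconds. Because the waits overlap, the whole batch should finish in roughly the time of a single lookup, and each individual client still gets an ordinary request/response exchange.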
More advanced examples: http://techblog.net-a-porter.com/2013/12/ask-tell-and-per-request-actors/
Related
We have a monolithic Rails API that was also serving our Websockets. Recently, we outgrew ActionCable and we decided to move our Websockets to Elixir's Phoenix.
In this model, clients still interact with the Rails application for HTTP requests, but Phoenix handles all the Websocket traffic. Rails tells Phoenix what data to send and on which channel; Phoenix then acts essentially as a passthrough for whatever Rails sends it.
I had initially set this up using Redis PubSub for communication from Rails to Phoenix. It works well at its current scale, but I'm starting to think that it may have been an inferior choice. Here is my list of pros and cons:
Redis
Pros:
Ordered messages (not important in our case)
Acts as a proper queuing mechanism
Wicked fast and dead simple publishing from Rails
Cons:
No competing consumers - I would have to manually implement balancing if I had multiple Phoenix consumers (a real possibility)
Concurrency is more difficult to implement well (which really acts against Elixir's strengths)
HTTP
Pros:
Concurrency comes for free
Load balancing comes for free - a request will only be fulfilled by a single Phoenix consumer
Slightly simpler to implement
Cons:
Unordered messages (not important for us)
Much slower to send a message from Rails
Would have to manually implement retry and timeouts on the HTTP requests from Rails
If a message is lost (due to server restarts or similar), it's gone for good
Even after weighing it out, I still find it hard to claim one as being the clear choice. Are there patterns for Redis or HTTP communication between services that alleviate some of my problems? If not, which of these two would be preferred, considering the cons?
Is there another simple alternative that I'm overlooking? I don't want to involve something like RabbitMQ if it can be avoided.
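For the "manually implement retry and timeouts" item in the HTTP cons, a rough sketch of what that might involve is below. It is in Python rather than Ruby, purely for illustration. `send` is a hypothetical callable standing in for whatever HTTP client call the publisher would make (assumed to raise `TimeoutError` or `ConnectionError` on failure), and `DeliveryError` is a made-up exception name.

```python
import time

class DeliveryError(Exception):
    """Raised when every attempt to deliver a message has failed."""

def send_with_retry(send, message, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call send(message), retrying with exponential backoff on transient errors.

    `send` is assumed to enforce its own per-request timeout and to raise
    TimeoutError or ConnectionError when the request does not go through.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return send(message)
        except (TimeoutError, ConnectionError) as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Wait base_delay, then 2x, 4x, ... before the next try.
                sleep(base_delay * (2 ** attempt))
    raise DeliveryError(f"gave up after {attempts} attempts") from last_error
```

Note that this only narrows the "message is gone for good" window: if the publisher itself restarts between attempts, the message is still lost unless it is persisted somewhere first.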
I am doing some research on SOAP, for a personal project, and I came across a website with a list of pros and cons for using SOAP, and I understood what most of them meant, except for this one under disadvantages:
SOAP is typically limited to pooling, and not event notifications, when leveraging HTTP for transport. What's more, only one client can use the services of one server in typical situations.
From my understanding of pooling, there should be no issue pooling a SOAP object for reusability. Pooling is simply a way to use the same resources over and over again, like a connection to a database. I am also not entirely certain what event notifications mean in this context.
So my two questions here are, what does the above block quoted text actually mean, and is this information correct?
Website: http://searchsoa.techtarget.com/definition/SOAP
SOAP is RPC, and in RPC some local client invokes a method on some remote target and receives a result. That's how it works, so SOAP works that way too. A client invokes a service asking for something and the service just responds.
If you want "events" in this type of communication, the simplest approach is to invoke the service more often (i.e. polling). This has the advantage that nothing changes for the server or the client. It's the same RPC call but done more frequently.
These days everyone is connected to the web and everyone is subscribed to all sorts of services. They want to get notified as soon as something happens in the world around them. Polling becomes inefficient in this sea of users and services because you are wasting resources. You might poll a service a hundred times just to get back one notification. For this reason technology is evolving so that resource use is minimized. The direction this is moving in is push services.
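The difference in wasted calls can be shown with a toy simulation. This is only a sketch and has nothing SOAP-specific in it; `EventSource` and `poll_until_event` are hypothetical names invented for the illustration.

```python
class EventSource:
    """Toy service that supports both polling and push-style callbacks."""

    def __init__(self):
        self._pending = []
        self._subscribers = []
        self.poll_calls = 0

    def poll(self):
        """Client-initiated check: counts every call, useful or not."""
        self.poll_calls += 1
        events, self._pending = self._pending, []
        return events

    def subscribe(self, callback):
        """Push style: the service calls the client back when something happens."""
        self._subscribers.append(callback)

    def publish(self, event):
        self._pending.append(event)
        for callback in self._subscribers:
            callback(event)

def poll_until_event(source, ticks, event_at):
    """Poll once per tick; a single event is published at tick `event_at`."""
    received = []
    for tick in range(ticks):
        if tick == event_at:
            source.publish("price-changed")
        received.extend(source.poll())
    return received, source.poll_calls
```

With `ticks=100` and the event arriving on the last tick, the polling client makes 100 calls to learn about one event, while a subscribed callback is invoked exactly once.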
Now almost everything happens in the browser. Every browser manufacturer rushes to implement the latest technology changes and HTML5 spec. This means actual pages that push notifications to users instead of faking it with Ajax, comet, etc.
SOAP has been around since 1998 and it's not moving as fast as the rest of the web, mainly because SOAP is mostly an enterprise player and because it's a protocol. Because it's a protocol you have to make new technology available to it without breaking that protocol. Things move slower so people have abandoned SOAP in favor of other ways of doing server-client communication.
SOAP is typically limited to pooling, and not event notifications...
That is correct. But be aware that "typically" does not mean "always".
You can have events, but it's harder. It involves using WS-* specifications like WS-Eventing and WS-Addressing. This is a change in the way SOAP clients operate because a client now becomes some sort of a service too because it needs to receive calls too, not just initiate them. If your technology stack implements these specifications then good for you, but if it doesn't, then you have to build it yourself and it's a real pain.
So for these reasons, if you don't have blocking performance or resource usage issues, you "typically" choose polling with SOAP and not event notifications.
I need to work with a REST service (built with a JAX-RS implementation) in a heterogeneous environment, so I wondered how the abstractions of programming languages are converted to the real RESTful endpoints. I think most aspects are clear, but when it comes to asynchronous communication in REST I know of several possibilities: keeping the connection open, returning a resource that can be queried repeatedly, chunked messages, or having the client transmit a callback resource.
My approach was to read the JAX-RS 2.0 Specification, but I think there is actually little stated about the REST implementation of asynchronous requests. Then I read the Jersey documentation and came to the conclusion that the JAX-RS implementations just keep the connection open for as long as the processing needs. So by "asynchronous", JAX-RS just refers to the blocking of methods on the server/client side and does not use any special behavior of REST. My first question: is my analysis correct?
If this is the case, I have two new questions:
Is this really compliant with the REST paradigm with respect to the stateless constraint?
Considering the long-running processes that maybe work for days, is an open connection eventually automatically closed (e.g. by the OS or by a TCP timer)?
Thanks in advance!
REST architecture has got nothing to do with asynchronous programming paradigms IMO. Asynchronous implementation using @Suspended and the AsyncResponse interface in JAX-RS involves suspending the thread which initiated the request.
To answer your questions
'So with "asynchronous" JAX-RS just refers to the blocking of methods on the server/client side and does not use any special behavior of REST'
-> REST has got nothing to do with async design in JAX-RS, but the way you design the Resource class and set up the async method should follow RESTful principles.
Also, there is no 'blocking' as such - in fact it's exactly the opposite. The I/O thread on the server end is immediately suspended and returned to the container. The actual processing might still take a long time, but the real goal was to 'not block' and occupy threads. A Web container has a limited number of threads dedicated to serving incoming requests. Prospective clients will get blocked if ALL the container threads are busy processing other clients. This is avoided by JAX-RS because it suspends the thread, returns it to the web container and responds on a different thread (internal server thread). All this increases the overall responsiveness of the application.
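The hand-off described above can be sketched outside JAX-RS with two thread pools. This is an analogy in Python, not the JAX-RS API: `PendingResponse` is a made-up class that loosely mirrors the role of AsyncResponse (its `resume` method completes the response), and `serve` and `slow_job` are hypothetical names.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

class PendingResponse:
    """Hypothetical stand-in for a suspended response that is resumed later."""

    def __init__(self):
        self._done = threading.Event()
        self._value = None

    def resume(self, value):
        self._value = value
        self._done.set()

    def wait(self, timeout=None):
        if not self._done.wait(timeout):
            raise TimeoutError("no response in time")
        return self._value

def slow_job(x):
    time.sleep(0.1)  # stand-in for long-running processing
    return x * 2, threading.current_thread().name

def serve(values, timeout=5.0):
    """One 'container' thread accepts requests; a worker pool does the work."""
    with ThreadPoolExecutor(max_workers=1, thread_name_prefix="container") as container:
        with ThreadPoolExecutor(max_workers=4, thread_name_prefix="worker") as workers:
            def resource(x, response):
                # Runs on the container thread: hand off and return at once.
                job = workers.submit(slow_job, x)
                job.add_done_callback(lambda f: response.resume(f.result()))

            responses = []
            for value in values:
                response = PendingResponse()
                container.submit(resource, value, response)
                responses.append(response)
            return [r.wait(timeout) for r in responses]
```

Even with a single container thread, new requests keep being accepted while earlier ones are still processing, because the container thread never waits on `slow_job`. The `timeout` argument to `wait` shows, in the same loose way, how a caller can be kept from waiting forever.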
'Considering the long-running processes that maybe work for days, is an open connection eventually automatically closed (e.g. by the OS or by a TCP timer)?'
--> Not sure what would happen in that case. But it's not necessary to have your clients waiting 'forever' - you can specify timeouts using the TimeoutHandler (I guess you might have already read this).
Just my two cents!
I venture that most, but not all, web services today are synchronous. A fundamental design decision is whether to implement asynchronous processing.
Is there value in implementing a processing queue system for asynchronous web services? It is a MOM/infrastructure decision with which I am toying. Instead of going system-to-system, I would implement middleware to broker these transactions. The ease of managing, tracking, and troubleshooting a spider web of services seems to make this the most sensible option.
How best have you implemented asynchronous web services?
It is interesting that I stumbled onto this question. I have exactly the same concern with the project I am currently developing.
Our web services are developed using TIBCO technology, and they are also synchronous by default. We are considering creating a queue mechanism to process these requests asynchronously, the reason being that the back-end storage technology we have to interface with is notoriously slow (it is an imposed technology, and we have to deal with it).
Personally I am considering creating a 2nd WSDL definition for the asynchronous replies (which can occur from a few seconds to a few hours later than the request, depending on the load on the mentioned back-end storage.) Clients calling our Web Services will have to in turn implement a web service using this "2nd WSDL" to which we act as clients.
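The request/callback arrangement described above boils down to handing the caller a correlation ID right away and delivering the result to a callback endpoint later. A minimal in-process sketch follows, with no WSDL or TIBCO specifics. `SlowService`, `Client` and the upper-casing "work" are all invented for illustration.

```python
import uuid

class SlowService:
    """Accepts requests immediately and answers them later via a callback."""

    def __init__(self):
        self._queue = []

    def submit(self, payload, callback):
        """Acknowledge at once with a correlation ID; the work happens later."""
        correlation_id = str(uuid.uuid4())
        self._queue.append((correlation_id, payload, callback))
        return correlation_id

    def process_pending(self):
        """Stand-in for the slow back end finally getting around to the work."""
        while self._queue:
            correlation_id, payload, callback = self._queue.pop(0)
            callback(correlation_id, payload.upper())

class Client:
    """Plays both roles: caller of the service and receiver of the reply."""

    def __init__(self):
        self.waiting = {}
        self.results = {}

    def request(self, service, payload):
        correlation_id = service.submit(payload, self.on_reply)
        self.waiting[correlation_id] = payload
        return correlation_id

    def on_reply(self, correlation_id, result):
        # The ID ties the late reply back to the original request.
        self.waiting.pop(correlation_id, None)
        self.results[correlation_id] = result
```

The cost of this design is the one already mentioned: every client has to expose its own endpoint (here `on_reply`) and keep track of outstanding correlation IDs.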
I'd be interested in knowing the directions you are exploring.
I'm working on a real-time application implemented in a SOA style (read: loosely coupled components connected via some messaging protocol - JMS, MQ or HTTP).
The architect who designed this system opted to use JMS to connect the components. This system is real-time, so there is no need to queue up messages should one component fail (the transaction will simply time out). Further, there is no need for guaranteed delivery or rollback.
In this instance, is there any benefit to using JMS over something like an HTTP web service (speed, resource footprint, etc)?
One thing that I'm thinking is that, since the JMS approach requires us to set a thread pool size (the number of components listening to a JMS topic/queue), wouldn't an HTTP service be a better fit, since this additional configuration is not needed (a new thread is created for each HTTP request, making the application scalable to an "unlimited" number of requests until the server runs out of resources)?
Am I missing something?
I don't disagree with the points made by S.Lott at all, but here are a couple of points to consider regarding HTTP web services:
Your clients only need to know how to communicate via HTTP - a protocol well supported by just about every modern language in one form or another. JMS, though popular, is more specialist than HTTP, and so restricts the languages your interconnected systems can use. Perhaps not an issue for your system at the moment, but will you need to plug in other systems later that might struggle to support JMS connectivity?
Standards like WSDL and SOAP, which you could leverage for your services, are well supported by many languages, and there are plenty of tools around that will generate code to implement both ends of the pipeline (client and server) for you from a WSDL file, reducing the amount of dev you'll have to do. These standards also make it relatively simple to define and publish the specification of the data you'll be passing between your systems, something you'll presumably have to do by hand using a queueing technology like JMS.
On the downside, as pointed out by S.Lott, JMS gives you functionality that you throw away using the (stateless) HTTP protocol: guaranteed ordering & reliability; monitoring; scalability; etc. Are you sure you don't need these, and won't need these going forward?
Great question, btw.
I think it's really dependent on the situation. Where I work, we support Remoting, JMS, MQ, HTTP, and sFTP. We are implementing a middleware appliance that speaks Remoting, JMS, MQ, and HTTP, and a software middleware component that speaks JMS, MQ, and HTTP.
As sgreeve alluded to above, standards help us become flexible, but proprietary formats allow more functionality.
In a nutshell, I'd say use HTTP for stateless calls (which could end up meeting almost all of your needs), and whatever proprietary formats you need for stateful calls. If you work in a big enterprise, a hardware appliance is usually a great fit as middleware: Lightning fast compression, encryption, transformation, and translation, with very low total cost of ownership.
I don't know enough about your requirements, but you may be overlooking Manageability, Flexibility and Performance.
JMS allows you to monitor and manage the queue. These are features HTTP lacks, and you'd have to build rather than buy from a vendor.
Also, there are queues and topics in JMS, allowing multiple subscribers to a single publisher. That is not possible in HTTP.
While you may not need those things in release 1.0, you might want them in the future.
Also, JMS may be able to use other transport mechanisms like named sockets, which reduces the overheads if there isn't all that socket negotiation going on with (almost) every request.
If you go down the HTTP route and you want to support more than one machine or some kind of reliability - you are going to need a load balancer capable of discovering the available web servers and loading requests across them - then failing over to another web server if a particular box/process dies. Clients making HTTP requests are also going to have to deal with servers failing and retrying operations in some loop.
This is one of the main features of a message queue - reliable load balancing with failover and loose coupling among the producers and consumers without them having to include retry logic - so your client or server code doesn't have to worry about this kinda thing. This is totally separate to whether or not you want message persistence or want to use ACID transactions to produce/consume messages (which can be very handy BTW).
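As a rough illustration of the competing-consumers behaviour described above, here is an in-process sketch using Python's standard `queue` and `threading` modules. It is not JMS: putting the message back on the queue imitates a broker redelivering an unacknowledged message, and `run_consumers` and `crash_consumer` are hypothetical names.

```python
import queue
import threading

def run_consumers(messages, n_consumers=3, crash_consumer=0):
    """Several consumers share one queue; one of them dies mid-stream."""
    q = queue.Queue()
    for message in messages:
        q.put(message)

    handled = []
    lock = threading.Lock()

    def consume(consumer_id):
        while True:
            try:
                message = q.get(timeout=0.2)
            except queue.Empty:
                return  # nothing left to do
            if consumer_id == crash_consumer:
                # Simulated crash: the message goes back on the queue, as a
                # broker would redeliver it, and this consumer stops.
                q.put(message)
                return
            with lock:
                handled.append((consumer_id, message))

    threads = [threading.Thread(target=consume, args=(i,)) for i in range(n_consumers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return handled
```

The producer side contains no retry or failover logic at all: every message is handled exactly once by one of the surviving consumers, which is the decoupling the paragraph above is describing.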
If you focus just on the server side using Java - whether Servlets or MessageListener/MDBs they are kinda similar either way really. The difference is the load balancer.
So maybe the question should really be - is a JMS broker easier to set up & work with than setting up your DNS/NAT/IP/HTTP load balancer infrastructure?
I suppose it depends on what you mean by real-time... Neither JMS nor HTTP in my opinion support "real-time" applications well, meaning they cannot offer predictable/deterministic performance nor properly prioritize flows in the presence of contention.
Part of it is that these technologies are built on top of TCP, which serializes all traffic into a single FIFO, meaning that different traffic flows cannot be easily prioritized. Moreover, TCP timers are not easily controlled, resulting in unpredictable blocking and timeouts... For this reason many streaming applications use UDP instead of TCP as an underlying protocol.
Another problem with JMS is that typical implementations use a broker that centralizes message dispatch. This is not the best architecture to get deterministic performance.
If you are looking for middleware that offers the kind of reliability guarantees and publish-subscribe semantics you get with JMS, but that was developed to fit the real-time application domain, I recommend you take a look at the OMG Data-Distribution Service (DDS). See dds.omg.org and this article I wrote arguing why DDS is the best middleware to implement a real-time SOA: http://soa.sys-con.com/node/467488