I need to work with a REST service (build with an JAX-RS implementation) in an heterogeneous environment, so I wondered how the abstractions of programming languages are converted to the real restful endpoints. I think most aspects are clear, but when it comes to asynchronous communications in REST I know several possibilities: keeping the connection open, returning a resource that can constantly be queried, chunked messages or the client transmits a callback resource.
My approach was to read the JAX-RS 2.0 Specification, but I think there is actually little stated about the REST implementation of asynchronous requests. Then I read the Jersey documentation and came to the conclusion that the JAX-RS implementations just keep the connection open for as long as the processing needs. So with "asynchronous" JAX-RS just refers to the blocking of methods on the server/client side and does not use any special behavior of REST. My first question: Is my analysis correct?
If this is the case, I have two new questions:
Is this really compliant to the REST paradigm in respect to the stateless constraint?
Considering the long-running processes that maybe work for days, is an open connection eventually automatically closed (e.g. by the OS or by a TCP timer)?
Thanks in advance!
REST architecture has got nothing to do with asynchronous programming paradigms IMO. Asynchronous implementation using #Suspended and AsynResponse interface in JAX-RS involves suspending the thread which initiated the request
To answer your questions
'So with "asynchronous" JAX-RS just refers to the blocking of methods on the server/client side and does not use any special behavior of REST'
-> REST has got nothing to do with async design in JAX-RS, but the way you design that Resource class and the setup the async method should involve RESTful principles.
Also, there is no 'blocking' as such - in fact its exactly the opposite. The I/O thread on server end is immediately suspended and returned to the container. The actual processing might still take a long time, but the real goal was to 'not block' and occupy threads. A Web container has limited number of threads dedicated to serving input requests. Prospective clients will get blocked if ALL the container threads are busy processing other clients. This is avoided by JAX-RS because it suspends the thread, returns it to the web container and responds on a different thread (internal server thread). All this increases the overall responsiveness of the application
'Considering the long-running processes that maybe work for days, is an open connection eventually automatically closed (e.g. by the OS or by a TCP timer)?'
--> Not sure what would happen in case this happens. But its not necessary to have your clients waiting 'forever' - you can specify timeouts using the TimeoutHandler (guess you might have already read this)
Just my two cents!
Related
On SO there is a heavily-upvoted (albeit "closed") question entitled "Good use case for Akka" that I am looking for some further resolution on.
Although not the accepted answer, Akka's own Viktor Klang weighed in with a heavily-upvoted answer where he states that RESTful web services are a good use case for the actor-based framework.
But this seems to be in direct conflict with the most basic staple of Akka: Akka is for asynchronous systems, whereas REST services need to be synchronous, and are typically expected to produce results within 200ms (e.g., I shouldn't have to wait 6 seconds for a simple GET to return some JSON to me).
So I ask: how is Akka ideal for REST services if it is implicitly asynchronous and non-blocking, or is this just self-marketing hype?
Why do you think that asynchronous means slow? :)
Of course because of the nature of HTTP protocol, from client point of view HTTP calls will be synchronous. But internally Akka will use its asynchronous capabilities to process requests as soon as possible.
spray.io is a standard Akka HTTP layer that will be replaced with Akka HTTP module soon (which is basically Spray 2.0). It's very lightweight and super fast.
And this is an example about how you can integrate synchronous HTTP and asynchronous Akka Actors. As you can see it creates Future and sends result back when it's done.
More advanced examples: http://techblog.net-a-porter.com/2013/12/ask-tell-and-per-request-actors/
We would like to run our cms::MessageConsumer and cms::MessageProducer on different threads of the same process.
How do we do this safely?
Would having two cms::Connection objects and two cms::Session objects, one each for consumer and producer, be sufficient to guarantee safety? Is this necessary?
Is there shared state between objects at the static library level that would prevent this type of usage?
You should read the JMS v1.1 specification, it calls out clearly which objects are valid to use in multiple threads and which are not. Namely the Session, MessageConsumer and MessageProducer are considered unsafe to share amongst threads. We generally try to make them as thread safe as we can but there are certainly ways in which you can get yourself into trouble. Its generally a good idea to use a single session in each thread and in general its a good idea to use a session for each MessageConsumer / MessageProducer since the Session contains a single dispatch thread which means that a session with many consumers must share its dispatch thread for sending messages on to each consumer which can lower latency depending on the scenario.
I'm answering my own question to supplement Tim Bish's answer, which I am accepting as having provided the essential pieces of information.
From http://activemq.apache.org/cms/cms-api-overview.html
What is CMS?
The CMS API is a C++ corollary to the JMS API in Java which is used to
send and receive messages from clients spread out across a network or
located on the same machine. In CMS we've made every attempt to
maintain as much parity with the JMS api as possible, diverging only
when a JMS feature depended strongly on features in the Java
programming language itself. Even though there are some differences
most are quite minor and for the most part CMS adheres to the JMS
spec, so having a firm grasp on how JMS works should make using CMS
that much easier.
What does the JMS spec say about thread safety?
Download spec here:
http://download.oracle.com/otndocs/jcp/7195-jms-1.1-fr-spec-oth-JSpec/
2.8 Multithreading JMS could have required that all its objects support concurrent use. Since support for concurrent access typically
adds some overhead and complexity, the JMS design restricts its
requirement for concurrent access to those objects that would
naturally be shared by a multithreaded client. The remainder are
designed to be accessed by one logical thread of control at a time.
JMS defines some specific rules that restrict the concurrent use of
Sessions. Since they require more knowledge of JMS specifics than we
have presented at
Table 2-2 JMS Objects that Support Concurrent Use
Destination: YES
ConnectionFactory: YES
Connection: YES
Session: NO
MessageProducer: NO
MessageConsumer: NO
this point, they will be described later. Here we will describe the
rationale for imposing them.
There are two reasons for restricting concurrent access to Sessions.
First, Sessions are the JMS entity that supports transactions. It is
very difficult to implement transactions that are multithreaded.
Second, Sessions support asynchronous message consumption. It is
important that JMS not require that client code used for asynchronous
message consumption be capable of handling multiple, concurrent
messages. In addition, if a Session has been set up with multiple,
asynchronous consumers, it is important that the client is not forced
to handle the case where these separate consumers are concurrently
executing. These restrictions make JMS easier to use for typical
clients. More sophisticated clients can get the concurrency they
desire by using multiple sessions.
As far as I know from the Java side, the connection is thread safe (and rather expensive to create) but Session and messageProducer are not thread safe. Therefore it seems you should create a Session for each of your threads.
In synchronous model, when a client connects to the server, both the client and server have to sync with each other to finish some operations.
Meanwhile, the asynchronous model allows client and server to work separated and independently. The client sends a request to establish a connection and do something. While the server is processing the request, the client can do something else. Upon completion of an operation, the completion event is placed onto a queue in an Event Demultiplexer, waiting for a Proactor (such as HTTP Handler) to send the request back and invoke a Completion Handler (on the client). The terms are used as in boost::asio document The Proactor Design Pattern: Concurrency Without Threads.
By working this way, the asynchronous model can accepts simultaneous connections without having to create a thread per connection, thus improve overall performance. In order to achieve the same effect as asynchronous model, the first model (synchronous) must be multi-threaded. For more detail, refer to: Proactor Pattern (I actually learn proactor pattern which is used to asynchronous model from that document. In here it has description on a typical synchronous I/O web server).
Is my understanding on the subject correct? If so, which means the asynchronous server can accepts request and return results asynchronously (the first connection request the service on web server does not need to be the first to reply to)? In essence, asynchronous model does not use threading (or threading is used in individual components, such as in the Proactor, Asynchronous Event Multiplexer (boost::asio document) component, not by creating an entire client-server application stack, which is describe in the multi-threaded model in Proactor Pattern document, section 2.2 - Common Traps and Pitfalls of Conventional Concurrency Models).
The Proactor model assumes splitting the network session process in a subtasks like: resolving hostname, accepting or connecting, reading or writing some part of information, closing connection - and allows you to switch between subtasks from different sessions. Whereas, the Reactor model sees the network session process as a (almost) single task.
The absolute Proactor advantages:
The performance is boosted because of the task "outsourcing". For example, you can send resolution request to the DNS and wait 5 minutes for answer doing nothing (Reactor) - or you can do other stuff while waiting (Proactor).
The absolute Proactor disadvantages:
The performance is decreased because of the task switching, which means that for the single session you execute more code (Proactor) than it should be (Reactor).
But the overall performance usually is measured in a number of "satisfied" clients per time period. So, the advantages of Proactor vs. Reactor depend on the situation. Here goes some examples.
HTTP server. The client wants to see something in his browser window. He doesn't need to wait before the whole page is loaded to see the first pieces of text. The Proactor is effective, since the partial page loading is faster than the whole page loading. Still the whole page is loaded about the same time as in the Reactor model.
Low-latency game server. The client wants to get the complete result of his command as quick as possible. The Reactor is effective, since there are no subtasks like partial reading or writing - the client won't see anything until he reads the full response. So, the Reactor won't do additional switches between subtasks and at each moment it's guaranteed that some client gets progress on his command, while the Proactor will force all of the clients wait each other unpredictable time.
The multi-threading can give you a linear acceleration in both cases.
I'm writing a tcp server for an online turn-based game. I've already written a prototype using php sockets, but would like to move to C++. I've been looking at the popular network libraries (ASIO, ACE, POCO, LibEvent), but currently unclear which one would best suit my needs:
1) Connections are persistent (on the order of minutes), and the server must be able to handle 100+ simultaneous connections.
2) Connections must be able to maintain state information (user login info). [my php prototype currently requires each client request to contain the login info]
3) Optionally and preferably multi-threaded, but a single process. Prefer not to have 1 thread per connection, but a fixed number of threads working on all open connections.
I'm leaning towards POCO's TCPServer or Reactor frameworks, but not exactly sure if they meet my requirements. I think the Reactor is single threaded, and the TCPServer enforces 1:1 threading/connection. Am I correct?
In either case case, I'm not exactly sure how to do the most important task of associating login info to a specific connection with connections coming and going at random.
Boost.Asio should meet your requirements. The reactor queue can be serviced by multiple threads. Using asynchronous methods will enable your design of a fixed number of threads servicing all connections.
The tutorials and examples are probably the best place to start if you are unfamiliar with the library.
You might also take a look at MUSCLE, a multi-user networking library and server I wrote with this sort of application in mind. It's BSD-licensed, handles hundreds of users, and includes a server-side database mechanism for storing and sharing any information you want the clients to know about each other. The server is single-threaded by default, but I haven't found that to be a problem in practice (and it's possible to extend the server to be multithreaded if that turns out to be necessary).
I want to know the technical reasons why the lift webframework has high performance and scalability? I know it uses scala, which has an actor library, but according to the install instructions it default configuration is with jetty. So does it use the actor library to scale?
Now is the scalability built right out of the box. Just add additional servers and nodes and it will automatically scale, is that how it works? Can it handle 500000+ concurrent connections with supporting servers.
I am trying to create a web services framework for the enterprise level, that can beat what is out there and is easy to scale, configurable, and maintainable. My definition of scaling is just adding more servers and you should be able to accommodate the extra load.
Thanks
Lift's approach to scalability is within a single machine. Scaling across machines is a larger, tougher topic. The short answer there is: Scala and Lift don't do anything to either help or hinder horizontal scaling.
As far as actors within a single machine, Lift achieves better scalability because a single instance can handle more concurrent requests than most other servers. To explain, I first have to point out the flaws in the classic thread-per-request handling model. Bear with me, this is going to require some explanation.
A typical framework uses a thread to service a page request. When the client connects, the framework assigns a thread out of a pool. That thread then does three things: it reads the request from a socket; it does some computation (potentially involving I/O to the database); and it sends a response out on the socket. At pretty much every step, the thread will end up blocking for some time. When reading the request, it can block while waiting for the network. When doing the computation, it can block on disk or network I/O. It can also block while waiting for the database. Finally, while sending the response, it can block if the client receives data slowly and TCP windows get filled up. Overall, the thread might spend 30 - 90% of it's time blocked. It spends 100% of its time, however, on that one request.
A JVM can only support so many threads before it really slows down. Thread scheduling, contention for shared-memory entities (like connection pools and monitors), and native OS limits all impose restrictions on how many threads a JVM can create.
Well, if the JVM is limited in its maximum number of threads, and the number of threads determines how many concurrent requests a server can handle, then the number of concurrent requests will be determined by the number of threads.
(There are other issues that can impose lower limits---GC thrashing, for example. Threads are a fundamental limiting factor, but not the only one!)
Lift decouples thread from requests. In Lift, a request does not tie up a thread. Rather, a thread does an action (like reading the request), then sends a message to an actor. Actors are an important part of the story, because they are scheduled via "lightweight" threads. A pool of threads gets used to process messages within actors. It's important to avoid blocking operations inside of actors, so these threads get returned to the pool rapidly. (Note that this pool isn't visible to the application, it's part of Scala's support for actors.) A request that's currently blocked on database or disk I/O, for example, doesn't keep a request-handling thread occupied. The request handling thread is available, almost immediately, to receive more connections.
This method for decoupling requests from threads allows a Lift server to have many more concurrent requests than a thread-per-request server. (I'd also like to point out that the Grizzly library supports a similar approach without actors.) More concurrent requests means that a single Lift server can support more users than a regular Java EE server.
at mtnyguard
"Scala and Lift don't do anything to either help or hinder horizontal scaling"
Ain't quite right. Lift is highly statefull framework. For example if a user requests a form, then he can only post the request to the same machine where the form came from, because the form processeing action is saved in the server state.
And this is actualy a thing which hinders scalability in a way, because this behaviour is inconistent to the shared nothing architecture.
No doubt that lift is highly performant but perfomance and scalability are two different things. So if you want to scale horizontaly with lift you have to define sticky sessions on the loadbalancer which will redirect a user during a session to the same machine.
Jetty maybe the point of entry, but the actor ends up servicing the request, I suggest having a look at the twitter-esque example, 'skitter' to see how you would be able to create a very scalable service. IIRC, this is one of the things that made the twitter people take notice.
I really like #dre's reply as he correctly states the statefulness of lift being a potential problem for horizontal scalability.
The problem -
Instead of me describing the whole thing again, check out the discussion (Not the content) on this post. http://javasmith.blogspot.com/2010/02/automagically-cluster-web-sessions-in.html
Solution would be as #dre said sticky session configuration on load balancer on the front and adding more instances. But since request handling in lift is done in thread + actor combination you can expect one instance handle more requests than normal frameworks. This would give an edge over having sticky sessions in other frameworks. i.e. Individual instance's capacity to process more may help you to scale
you have Akka lift integration which would be another advantage in this.