I tried scouring the net, and 90% of the time I came across pages detailing "HOW" to use Apache to implement a reverse proxy.
I am wondering how exactly the reverse proxy plugins are coded.
I know they parse the request and work out which server it should be routed to.
Do they then create a thread for every connection from the end user and delegate to that thread the responsibility of connecting to the right server,
while continuing to accept requests from other clients and creating similar threads?
When a thread gets the response from the server, does it reply to the client with it and then exit? Or do they use a thread pool?
I am thinking about this from a C++ angle, and about whether multithreading is used to increase the proxy's throughput.
A bit dated, but very much worth the read - http://www.kegel.com/c10k.html. After reading that you should have a good idea of why a thread per connection is a really bad idea. If you are really interested in learning how scalable or high-performance servers are implemented, I suggest digging in and reading some source code. I particularly like the source for Apache HTTPD.
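To make the alternative to thread-per-connection concrete, here is a minimal sketch of an event-driven relay in Python's asyncio. It is not how Apache's proxy modules are written; it only illustrates one thread serving many connections. The names `pipe`, `handle_client` and `start_proxy` are made up for this example, and the "routing" is an assumption: every client is sent to a single fixed backend.

```python
import asyncio


async def pipe(reader, writer):
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while True:
            data = await reader.read(4096)
            if not data:
                break
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()


async def handle_client(client_reader, client_writer, backend_host, backend_port):
    # Hypothetical routing rule: every client goes to the same backend.
    backend_reader, backend_writer = await asyncio.open_connection(
        backend_host, backend_port
    )
    # Relay both directions concurrently; no thread is created per connection.
    await asyncio.gather(
        pipe(client_reader, backend_writer),
        pipe(backend_reader, client_writer),
    )


async def start_proxy(listen_port, backend_host, backend_port):
    """Start listening on localhost and return the asyncio server object."""

    async def on_connect(reader, writer):
        await handle_client(reader, writer, backend_host, backend_port)

    return await asyncio.start_server(on_connect, "127.0.0.1", listen_port)
```

A real reverse proxy would parse the request first and pick the backend from it; here the event loop simply wakes whichever relay has data ready, so idle connections cost no thread.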
I have to design a server that can send the same objects to many clients. A client may send a request to the server if it wants to update something in the database.
Things which are confusing:
1. My server should start the program (where I perform some operations and produce 'results'; these will be sent to the client).
2. My server should listen for incoming connections from clients and, if there are any, accept them and start sending the 'results'.
3. The server should accept as many clients as possible (not more than 100).
4. My 'result' should be secured. I don't want someone to take my 'result' and see what my program logic looks like.
I thought point 1 is one thread, and point 2 is another thread that creates multiple threads within its scope to serve point 3. Point 4 should be handled by my application logic while serialising the 'result', rather than by the server.
Is it a bad idea? If so, where can I improve?
Thanks
Putting every connection on its own thread is a very bad idea, and is apparently a common beginner's mistake. Each thread reserves its own stack (often around 1 MB or more by default, depending on the platform), so this wastes memory for no good reason. I asked the very same question before and got a very good answer. I used Boost.Asio; the server/client project was finished months ago and is now running beautifully.
If you use C++ and SSL (to secure your connection), no one will see your logic, since your programs are compiled. But you have to write your own communication protocol/serialization in that case.
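To give an idea of what "your own communication protocol/serialization" can look like, here is a minimal sketch of length-prefixed framing, written in Python for brevity. The wire format (a 4-byte big-endian length followed by a JSON payload) and the names `encode_message` and `decode_messages` are assumptions for this example, not anything from the question. Encryption is not shown; the frames would travel over a TLS-wrapped socket.

```python
import json
import struct

# Assumed wire format: 4-byte big-endian unsigned length, then a JSON payload.
HEADER = struct.Struct("!I")


def encode_message(obj):
    """Serialise obj to JSON and prepend its length."""
    payload = json.dumps(obj).encode("utf-8")
    return HEADER.pack(len(payload)) + payload


def decode_messages(buffer):
    """Extract every complete frame from a receive buffer.

    Returns (messages, remaining_bytes); the remainder is an incomplete
    frame that should be kept until more data arrives.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        start = offset + HEADER.size
        end = start + length
        if end > len(buffer):
            break  # the payload has not fully arrived yet
        messages.append(json.loads(buffer[start:end].decode("utf-8")))
        offset = end
    return messages, buffer[offset:]
```

The length prefix matters because TCP is a byte stream: one `recv` can return half a message or two messages glued together, so the receiver needs a way to find the boundaries.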
I have been working with Spring web applications on the Jetty/Tomcat app servers for around two years now; however, the thing that still eludes me is how multiple requests are handled in these servers. I understand that Spring is helpful in making singletons, but my understanding is limited to that.
Can someone point me to any good resource that can help me understand how multiple requests are handled?
This can be answered at so many levels that I have been staring at it for two days trying to figure out how to answer it... so I'll take a kinda high-level shot at it.
There is this server port that jetty listens on and some number of acceptor threads whose job it is to get connection objects made between the client and server side. Once you have that connection it flows through the jetty handler architecture doing things like authentication perhaps, or pulling off a session id and attaching a session object to the request. Then it works its way into the servlet handler and the appropriate servlet is found and you start dealing with the servlet-api. At that point you have a thread allocated to your request for all of the time you are in the servlet-api, at least under servlet 2.5. In servlet 3.0 you have some async mechanisms available to you, or you can use jetty-continuations as a way to get async support on servlet 2.5 api.
Anyway, there is a thread pool that the server uses to allocate threads to those connectors which ultimately are the threads spending all their time in the servlet-api. The jetty continuations api and the newer servlet 3.0 support provide mechanisms to release threads back to the primary threadpool so they can spend time on accepting and processing other requests.
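The acceptor-plus-thread-pool arrangement described above can be sketched roughly like this. This is plain Python, not Jetty code, and `handle_request` and `serve` are names invented for the example. The worker stays with the connection for the whole request, which is the blocking servlet 2.5 style of handling.

```python
import socket
import threading
from concurrent.futures import ThreadPoolExecutor


def handle_request(conn):
    """Worker: owns this connection for the whole request (blocking style)."""
    with conn:
        data = conn.recv(1024)
        worker = threading.current_thread().name
        conn.sendall(b"handled by " + worker.encode() + b": " + data)


def serve(listener, pool, stop_event):
    """Acceptor loop: accept connections and hand each one to the pool."""
    listener.settimeout(0.1)  # wake up regularly to check stop_event
    while not stop_event.is_set():
        try:
            conn, _addr = listener.accept()
        except socket.timeout:
            continue
        # If all workers are busy, the job waits in the pool's queue.
        pool.submit(handle_request, conn)
```

The pool size caps how many requests are inside `handle_request` at once; the async mechanisms mentioned above exist precisely so a request that is only waiting does not hold one of those workers.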
There is obviously a lot more going on under the covers related to the use of the NIO APIs and how Jetty efficiently manages all of this, but maybe this is enough to answer your initial question. If not, feel free to peruse the Jetty docs (http://www.eclipse.org/jetty/documentation/current) or look to the Jetty mailing lists. There has been some discussion of Jetty 9 optimizations, as they relate to what happens under the covers with HTTP, SPDY, and WebSocket connection handling and processing, in the blogs at Webtide (http://webtide.com/blogs).
My customer did not give me details regarding the nature of his application. It might be multithreaded, it might not. His server serves SOAP messages (HTTP requests).
Is there any special trick to find out whether the peer is single- or multi-threaded?
I don't want to ask the customer and I don't have access to his server/machine. I want to find out myself.
It's irrelevant. Why do you feel it matters to you?
A more useful question would be:
Can the server accept multiple simultaneous sessions?
The answer is likely to be 'yes, of course' but it's certainly possible to implement a server that's incapable of supporting multiple sessions.
Just because a server supports multiple sessions, it doesn't mean that it's multi-threaded. And, just because it's multi-threaded doesn't mean it will have good performance. When servers need to support many hundreds or thousands of sessions, multi-threading may be a very poor choice for performance.
Are you asking this question because you want to 'overlap' SOAP messages on the same connection - in other words, have three threads send requests, and then all three wait for a response? That won't work, because (like HTTP) request and response messages are paired together on each connection. You would need to open three connections in order to have three overlapped messages.
Unfortunately, no, at least not without accessing the computer directly. Multiple connections can even be managed by a single thread; however, the good news is that this is highly unlikely. Most servers use thread pooling and assign a thread to a connection upon a handshake. Is there a particular reason why you need to know? If you're presumably going to work on this server, you'll know first-hand how it works.
It doesn't matter if the server is multithreaded or not. There are good and efficient ways to implement I/O multiplexing without threads [like select(2) and suchlike], if that's what worries you.
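For illustration, here is a minimal sketch of that kind of multiplexing using Python's `selectors` module, which wraps select(2) and its relatives. The function name `serve_once` and the "reply in upper case" behaviour are assumptions for the example. One thread services every registered connection.

```python
import selectors
import socket


def serve_once(sel):
    """Service every socket that is readable right now.

    Returns the number of ready sockets that were handled.
    """
    events = sel.select(timeout=1.0)
    for key, _mask in events:
        conn = key.fileobj
        data = conn.recv(1024)
        if data:
            conn.sendall(data.upper())  # toy "response"
        else:
            # An empty read means the peer closed the connection.
            sel.unregister(conn)
            conn.close()
    return len(events)


def register_client(sel, conn):
    """Put a connected socket under the selector's watch."""
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ)
```

A server would call `serve_once` in a loop. From the outside, a client cannot tell whether its reply came from this single thread or from one thread in a pool, which is why the question cannot really be answered from the client side.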
I already wrote here about the http chat server I want to create: Alternative http port?
This http server should stream text to every user in the same chat room on the website. The browser will stay connected and wait for further html code. (yes that works, the browser won't reject the connection).
I have a new question: because this chat server doesn't need to receive information from the client, it's not necessary to listen to the client after the server has sent its first response. New chat messages will be sent to the server on a new connection.
So I can open 2 threads, one waiting for new clients (or new messages) and one for the html streaming.
Is this a good idea or should I use one thread per client? I don't think it's good to have one thread/client when there are many chat users online, since the server should handle multiple different chats with their own rooms.
3 possibilities:
1. One thread for all clients, send text to each client successively - there shouldn't be much lag since it's only text
this will be like: user1.send("text");user2.send("text"),...
2. One thread per chat or chatroom
3. One thread per chat user - ... many...
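Option 1 might look roughly like the sketch below (Python rather than any particular server framework; the `ChatRoom` class and its methods are names made up for this example). One thread walks the list of connections for a room and sends the same text to each.

```python
import socket


class ChatRoom:
    """Holds the open connections of one room and sends text to all of them."""

    def __init__(self, name):
        self.name = name
        self.clients = []

    def join(self, conn):
        self.clients.append(conn)

    def broadcast(self, text):
        """Send text to every client in turn; drop clients whose send fails.

        Returns the number of clients still connected afterwards.
        """
        payload = text.encode("utf-8")
        alive = []
        for conn in self.clients:
            try:
                conn.sendall(payload)
                alive.append(conn)
            except OSError:
                conn.close()
        self.clients = alive
        return len(alive)
```

The weak spot of this approach is that `sendall` on a blocking socket waits when a client reads slowly, which delays everyone after it in the list. Non-blocking sockets with per-client output buffers avoid that.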
Thank you, I haven't done much with sockets yet ;).
Right now, you seem to be thinking in terms of a given thread always carrying out a given (type of) task. While that basic design can make sense, to produce a scalable server like this, it generally doesn't work very well.
Often a slightly more abstract viewpoint works out better: you have tasks that need to get done, and threads that do those tasks -- but a thread doesn't really "care" about what task it executes.
With this viewpoint, you simply need to create some sort of data structure that describes each task that needs to be done. When you have a task you want done, you fill in a data structure to describe the task, and hand it off to get done. Somewhere, there are some threads that do the tasks.
In this case, the exact number of threads becomes mostly irrelevant -- it's something you can (and do) adjust to fit the number of CPU cores available, the type of tasks, and so on, not something that affects the basic design of the program.
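As a rough sketch of that idea in Python: a `Task` record describes the work, a queue holds pending tasks, and identical worker threads pull from it. `Task`, `worker` and `start_workers` are hypothetical names for this example, and using `None` as a shutdown marker is just one common convention.

```python
import queue
import threading
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Task:
    """Describes one unit of work; says nothing about which thread runs it."""
    func: Callable[..., Any]
    args: tuple = ()
    # One-slot queue that the worker uses to hand the result back.
    result: "queue.Queue" = field(default_factory=lambda: queue.Queue(maxsize=1))


def worker(tasks):
    """Run tasks from the shared queue until a None sentinel arrives."""
    while True:
        task = tasks.get()
        if task is None:
            break
        task.result.put(task.func(*task.args))


def start_workers(tasks, count):
    """Start `count` identical worker threads and return them."""
    threads = [threading.Thread(target=worker, args=(tasks,)) for _ in range(count)]
    for thread in threads:
        thread.start()
    return threads
```

Changing `count` changes nothing else in the program, which is the point being made above: the number of threads becomes a tuning knob rather than part of the design.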
I think the easiest pattern for this simple app is to have a pool of threads, and for each client either pick an available thread or make the client wait until one becomes available.
If you want a serious understanding of HTTP server architecture concepts, google the following:
apache architecture
nginx architecture
I want to know the technical reasons why the Lift web framework has high performance and scalability. I know it uses Scala, which has an actor library, but according to the install instructions its default configuration is with Jetty. So does it use the actor library to scale?
Is the scalability built in right out of the box? Just add additional servers and nodes and it will automatically scale; is that how it works? Can it handle 500000+ concurrent connections with supporting servers?
I am trying to create a web services framework for the enterprise level that can beat what is out there and is easy to scale, configure, and maintain. My definition of scaling is just adding more servers and you should be able to accommodate the extra load.
Thanks
Lift's approach to scalability is within a single machine. Scaling across machines is a larger, tougher topic. The short answer there is: Scala and Lift don't do anything to either help or hinder horizontal scaling.
As far as actors within a single machine, Lift achieves better scalability because a single instance can handle more concurrent requests than most other servers. To explain, I first have to point out the flaws in the classic thread-per-request handling model. Bear with me, this is going to require some explanation.
A typical framework uses a thread to service a page request. When the client connects, the framework assigns a thread out of a pool. That thread then does three things: it reads the request from a socket; it does some computation (potentially involving I/O to the database); and it sends a response out on the socket. At pretty much every step, the thread will end up blocking for some time. When reading the request, it can block while waiting for the network. When doing the computation, it can block on disk or network I/O. It can also block while waiting for the database. Finally, while sending the response, it can block if the client receives data slowly and TCP windows get filled up. Overall, the thread might spend 30 - 90% of its time blocked. It spends 100% of its time, however, on that one request.
A JVM can only support so many threads before it really slows down. Thread scheduling, contention for shared-memory entities (like connection pools and monitors), and native OS limits all impose restrictions on how many threads a JVM can create.
Well, if the JVM is limited in its maximum number of threads, and the number of threads determines how many concurrent requests a server can handle, then the number of concurrent requests will be determined by the number of threads.
(There are other issues that can impose lower limits---GC thrashing, for example. Threads are a fundamental limiting factor, but not the only one!)
Lift decouples threads from requests. In Lift, a request does not tie up a thread. Rather, a thread does an action (like reading the request), then sends a message to an actor. Actors are an important part of the story, because they are scheduled via "lightweight" threads. A pool of threads gets used to process messages within actors. It's important to avoid blocking operations inside of actors, so these threads get returned to the pool rapidly. (Note that this pool isn't visible to the application, it's part of Scala's support for actors.) A request that's currently blocked on database or disk I/O, for example, doesn't keep a request-handling thread occupied. The request handling thread is available, almost immediately, to receive more connections.
This method for decoupling requests from threads allows a Lift server to have many more concurrent requests than a thread-per-request server. (I'd also like to point out that the Grizzly library supports a similar approach without actors.) More concurrent requests means that a single Lift server can support more users than a regular Java EE server.
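The effect of decoupling requests from threads can be shown with a small analogy. This is Python asyncio, not Scala actors or Lift itself, and the names `handle_request`, `serve_all` and `run_demo` are invented for the example. `asyncio.sleep` stands in for a database or disk wait, and a thousand "requests" are in flight at once on a single thread.

```python
import asyncio
import threading
import time


async def handle_request(request_id, io_delay):
    # Stand-in for a database or disk wait; the thread is released while waiting.
    await asyncio.sleep(io_delay)
    return request_id, threading.get_ident()


async def serve_all(count, io_delay):
    """Start `count` requests at once and wait for all of them."""
    return await asyncio.gather(
        *(handle_request(i, io_delay) for i in range(count))
    )


def run_demo(count=1000, io_delay=0.1):
    """Return (requests completed, distinct threads used, elapsed seconds)."""
    started = time.perf_counter()
    results = asyncio.run(serve_all(count, io_delay))
    elapsed = time.perf_counter() - started
    thread_ids = {thread_id for _request_id, thread_id in results}
    return len(results), len(thread_ids), elapsed
```

With a thread-per-request model, the same load would need a thousand threads to overlap the waits. Here the waits overlap on one thread, so the total time stays close to a single `io_delay` plus scheduling overhead.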
@mtnyguard
"Scala and Lift don't do anything to either help or hinder horizontal scaling"
That isn't quite right. Lift is a highly stateful framework. For example, if a user requests a form, he can only post it back to the same machine the form came from, because the form-processing action is saved in the server state.
This actually hinders scalability in a way, because this behaviour is inconsistent with a shared-nothing architecture.
No doubt Lift is highly performant, but performance and scalability are two different things. So if you want to scale horizontally with Lift, you have to define sticky sessions on the load balancer, which will route a user to the same machine for the duration of a session.
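To show what "sticky" means in practice, here is a minimal sketch of one way a balancer could pin a session to a machine: hash the session ID and use it to pick a backend. The function `pick_backend` and the backend names are hypothetical, and real load balancers often use a cookie or consistent hashing instead.

```python
import hashlib


def pick_backend(session_id, backends):
    """Map a session ID to one backend, always the same one for that ID."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as a stable integer.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]
```

Every request carrying the same session ID lands on the same instance, so the form-processing state saved there is found again. Note that with this simple modulo scheme, adding or removing a backend remaps many sessions, which is part of why stateful servers are harder to scale out.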
Jetty may be the point of entry, but the actor ends up servicing the request. I suggest having a look at the Twitter-esque example, 'skitter', to see how you would be able to create a very scalable service. IIRC, this is one of the things that made the Twitter people take notice.
I really like @dre's reply, as he correctly identifies the statefulness of Lift as a potential problem for horizontal scalability.
The problem -
Instead of me describing the whole thing again, check out the discussion (Not the content) on this post. http://javasmith.blogspot.com/2010/02/automagically-cluster-web-sessions-in.html
The solution would be, as @dre said, a sticky-session configuration on the front-end load balancer, plus adding more instances. But since request handling in Lift is done by a thread + actor combination, you can expect one instance to handle more requests than with normal frameworks. This gives Lift an edge over sticky sessions in other frameworks, i.e. an individual instance's capacity to process more may help you to scale.
You also have Akka-Lift integration, which would be another advantage here.