boost asio: different thread pool for different tasks - c++

There are many examples on the net about creating a simple thread pool such as Sample1 and Sample2
What I wanted to implement though is to have a separate thread pool for different tasks. For example, the app may have a pool of threads for processing incoming tcp connections (let's call this the network pool), while another pool for talking to a database (database pool).
These incoming tcp requests might want information from the database. In this case it will need to ask the those threads from the database pool to perform query, and return the result asynchronously.
Is there a recommended way to do so using boost::asio? Would it be having one instance of io_service for each pool? And how should those threads communicate with each other (using boost)?
I understand to explain all these, the code won't be that short and trivial, but if possible some sort of pseudo code would be much appreciated.
Thanks!

The communication between thread / thread pools should be through thread safe queues.
In your example, you should have a networking thread pool for handling network connections, a process pool for executing the network requests, and a database connection / thread pool (one pool per database; one thread per database connection, but possibly you could have multiple connections to the same database).
You would also need a thread safe queues, one for the network pool, one for the process pool and one for each of the database pools.
Say you have a network request that needs to get information from the database. You would receive the request while executing on a network thread, and append the handler for the request onto the process queue.
The process handler (in a process thread) would see that the request needs something from the database, and so it would append a database request as well as a callback handler onto the appropriate database queue.
The appropriate database thread would pick up the request from the database queue, execute the query, get the results back, and add the results to the callback handler. The callback handler object with the database results would then be pushed onto the process queue.
The callback handler (in a process thread) would then continue executing the request, and possibly package a response message, which is then pushed onto the network queue.
The network handler (in a network thread) would then pick up the response messsage and deliver it (encoding as necessary).
An example of a thread safe queue can be found here.
Albeit a little complicated, you can see an implementation of an application server that can handle what you're talking about here, although it may be overkill for what you're trying to do. The source code is fairly well documented so you should be able to follow it and see what it's doing.
My example uses boost for asio (see the TCP Connection implementation within that same system), but it does not use boost io_service for handlers.

Related

Handling multiple clients simultaneously in C++ UDP server

I have developed a C++ UDP based server application and I am in the process of implementing code to handle multiple clients simultaneously .
I have the following understanding regarding how to handle multiple clients and want to fill in the knowledge gaps
My step wise understanding is as mentioned below
UDP server listens at a specific port(say xxxx)
The server has a message queue .It can be array or linked list or Queue or anything for that matter
As soon as a request arrives at the port xxxx, its placed in the message queue
After putting it in the message queue a new thread(let us call it worked thread) is spawned and it picks up the queued message and the same is removed from the message queue
The worked thread knows about the clients IP:port from the message header
The worker thread processes the request and sends the response to the clients IP:port
The clients gets the response and the worker thread terminates.
Steps 3 to 7 take care of multiple client being handled simultaneously.
Is my understanding sufficient ? Where do I need improvement?
Thanks in advance
The clients gets the response and the worker thread terminates.
The worker thread should terminate when it completes processing. There is no practical way for it to wait for an acknowledgement from the client.
The worker thread processes the request and sends the response to the clients IP:port
I think it will be better to place the response on a queue. The main server thread can check the queue and send any responses found there. This prevents race conditions when two worker threads overlap in their attempts to send responses.
The server has a message queue .It can be array or linked list or Queue or anything for that matter
It pretty much has to be a queue. The interesting question is what queue priority. Initially FIFO would do. If your server becomes overloaded, then you need to consider alternatives. Perhaps it would be good to estimate the processing time required, and do the fast ones first. Or perhaps different clients deserve different priorities.
After putting it in the message queue a new thread(let us call it worked thread) is spawned
This is fine initially. However, you will want to do some time profiling and determine if a thread pool would be advantageous.
Deeper Discussion of threading issues
The job processing must be done in a separate worker thread, so that a long job will not block the server from accepting connections from other clients. However, you should consider carefully whether or not you want to use multiple worker threads. Since you are placing the job requests on a queue, a single worker thread can be used to process them one by one.
PRO single thread
Simpler, more reliable code. The processing code must be thread safe for context switches back to the main thread. However, there will not be any context switches between job processing code. This makes it easier to design and debug the processing code. For example, if the jobs are updating a database, then you do not require any extra code to ensure the database is always consistent - just that consistency is guaranteed at the end of each job process.
Faster response for short jobs. If there are many short jobs submitted at the same time, your CPU can spend more cycles switching between jobs than actually doing useful processing.
CON single thread
A big job will block other jobs until it completes.

C++ gRPC thread number configuration

Does gRPC server/client have any concept of thread pool for connections? That it is possible to reuse threads, pre-allocate threads, queue request on thread limit reached, etc.
If no, how does it work, is it just allocating/destroying a thread when it need, without any limitation and/or reuse? If yes, is it possible to configure it?
It depends whether you are using sync or async API.
For a sync client, your RPC calls blocks the calling thread, so it is not really relevant. For a sync server, there is an internal threadpool handling all the incoming requests, you can use a grpc::ResourceQuota on a ServerBuilder to limit the max number of threads used by the threadpool.
For async client and server, gRPC uses CompletionQueue as a way for users to define their own threading model. A common way of building clients and servers is to use a user-provided threadpool to run CompletionQueue::Next in each thread. Then once it gets some tag from the Next call, you can cast it to a user-defined type and run some methods to proceed the state. In this case, the user has full control of threads being used.
Note gRPC does create some internal threads, but they should not be used for the majority of the rpc work.

Connecting to remote services from multiple threaded requests

I have a boost asio application with many threads, similar to a web server, handling hundreds of concurrent requests. Every request will need to make calls to both memcached and redis (via libmemcached and redispp respectively). Is the best practice in this situation to make a separate connection to both redis and memcached from each thread (effectively tripling the open sockets on the server, three per request)? Or is there a way for me to build a static object, with a single memcached/redis connection, and allow all threads to share that single connection? I'm a bit confused when it comes to the thread safety of something like this, and everything needs to be asynchronous between the threads, but blocking for each thread's individual request (so each thread has a linear progression, but many threads can be in different places in their own progression at any given time). Does that make sense?
Thanks so much!
Since memcached have syncronous protocol you should not write next request before you got answer to prevous. So, no other thread can chat in same memcached connection. I'd prefer to make thread-local connection if you work with it in "blocking" mode.
Or you can make it work in "async" manner: make pool of connections, pick a connection from it (and lock it). After request is done, return it to pool.
Also, you can make a request queue and process it in special thread(s) (using multigets and callbacks).

How to design a client server architect

I like to know the server (TCP based) architecture to support large scale of clients(at least10K) to implement Fix server. My points are
How we design it.
How to listen on the open port? Use select or poll or any other function.
How to process the response of the client? On large scale we cannot create the one thread for each client.
Should the processing of response is in the different executable and share the request and response to the server executable through IPC.
There is much more on it. I would appreciate if anyone explains it or provide any link.
Thanks
An excellent resource for information on this topic is The C10K problem. Although the dimensions there seem a little old, the techniques are still applicable today.
The architecture depends on what you want to do with the clients incoming data. My guess is that for every incoming message you would perform some computations and probably also return a response.
In that case I would create 1 main listener thread that receives all the incoming messages (Actually, if your hardware has more than 1 physical network device, I would use a listener thread per device and make sure each one is listening to a specific device).
Get the number of CPUs that you have on your machine and create worker threads for each CPU and bind them each thread to one cpu (Maybe number of working thread should be num_of_cpu-1, to leave an availalbe cpu for the listener and dispatcher).
Each thread has a queue and semaphore, the main listener thread just push the incoming data into those queues. There are many way to perform load balancing (Will talk about it later).
Each working thread just works on the requests given to it, and put the response on another queue that is read by the dispatcher.
The dispatcher - there are 2 options here, use a thread for dispatcher (or thread per network device as for listeners), or have the dispatcher actually be the same thread as the listener.
There is some advantage to put them both on the same thread, since it makes it easier to detect lost socket connection and use the same fds for both reading and writing without thread synchronization. However, it could be that using 2 different threads would give better performance, it need to be tested.
Note about load balancing:
This is a topic of its own.
The simplest thing is to use 1 queue for all working threads, but the problem is that they have to lock in order to pop items and the locking can damage performance. (But you get the most balanced load).
Another quite simple approach would be to have a private queue for every worker and perform round-robin when inserting. After every X cycles check the size of all the queues. If some queues are much larger than others then leave them out for the next X cycles and then recheck them again. This is not the best approach, but a simple one to implement and gives some load balancing while no locking is needed.
By the way - There is a way to implement queue between 2 threads without blocking - but this is also another topic.
I hope it helps,
Guy
If the client and server are on a secure network then the security aspect is to be minimal - to the extent that the transfers are encrypted. If the clients and the server are not on a secure network - you first want the server and client to authenticate each other and then initiate encrypted data transfer. For data transfer, server-side authentication should suffice. At the end of this authentication use the session key to generate encrypted data stream (symmetric). consider using TFTP it is simple to implement and scales reasonably well.

How to get a Win32 Thread to wait on a work queue and a socket?

I need a client networking thread to be able to respond both to new messages to be transmitted, and the receipt of new data on the network. I wish to avoid this thread performing a polling loop, but rather to process only as needed.
The scenario is as follows:
A client application needs to communicate to a server via a protocol that is largely, but not entirely, synchronous. Typically, the client sends a message to the server and blocks until a response is received.
The server may process client requests asynchronously, in which case the response to client
is not a result, but a notification that processing has begun. A result message is sent to to the client at some point in the future, when the server has finish processing the client request.
The asynchronous result notifications can arrive at the client at any time. These notifications need processed when they are received i.e. it is not possible to process a backlog only when the client transmits again.
The clients networking thread receives and processes notifications from the server, and to transmit outgoing messages from the client.
To achieve this, I need to to make a thread wake to perform processing either when network data is received OR when a message to transmit is enqueued into an input queue.
How can a thread wake to perform processing of an enqueued work item OR data from a socket?
I am interested primarily in using the plain Win32 APIs.
A minimal example or relevant tutorial would be very welcome!
An alternative to I/O Completion Ports for sockets is using WSAEventSelect to associate an event with the socket. Then as others have said, you just need to use another event (or some sort of waitable handle) to signal when an item has been added to your input queue, and use WaitForMultipleObjects to wait for either kind of event.
You can set up an I/O Completion Port for the handles and have your thread wait on the completion port:
http://technet.microsoft.com/en-us/sysinternals/bb963891.aspx
Actually, you can have multiple threads wait on the port (one thread per processor usually works well).
Following on from Michael's suggestion, I have some free code that provides a framework for IO Completion Port style socket stuff; and it includes an IOCP based work queue too. You should be able to grab some stuff from it to solve your problem from here.
Well, if both objects have standard Windows handles, you can have your client call WaitForMultipleObjects to wait on them.
You might want to investiate splitting the servicing of the network port off onto its own thread. That might simplify things greatly. However, it won't help if you just end up having to synchonize something else between that new thread and your main one.