Synchronisation in a Multi-Threaded C++11 application

I work on a multi-threaded server application written in C++ that runs on embedded Linux. One thread (I call it the Communication-Thread) should handle all socket I/O (sending and receiving messages).
Depending on the received message, the Communication-Thread passes it to another thread (e.g. the Controller-Thread), which handles the required sequence. At the end of the sequence the Controller-Thread creates a return message. This message is passed back to the Communication-Thread, which should transfer it to the client.
The communication between these two threads is implemented with queues protected by a mutex and a condition_variable. Currently, when the Communication-Thread receives a socket message, it hands it to the Controller-Thread and then waits for a message from the Controller-Thread.
So the multi-threaded architecture gives no benefit. My goal is to wait in the Communication-Thread for a socket message OR a 'queued' message.
For that I am thinking of changing the queue implementation between the threads and replacing it with a pipe or an eventfd. Then I would use the select() function in the Communication-Thread to observe the queue and the socket simultaneously. But I have some concerns about the performance of this solution.
Does someone have a better idea or solution for this problem?
Compact question:
I would like to observe a socket and some kind of message queue simultaneously in a multi-threaded application.
Does anyone know a more efficient implementation than pipes or eventfd for this type of problem?
Thanks for any hint on this topic.
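For reference, the eventfd idea described above could look roughly like the following. This is only a minimal sketch, assuming Linux and a blocking poll() loop (poll() is used here in place of select(); the idea is the same). The names NotifyingQueue and wait_socket_or_queue are hypothetical and not taken from any library. The existing mutex-protected queue stays, and the eventfd is only used to make "queue not empty" visible to poll().

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <mutex>
#include <string>
#include <utility>
#include <poll.h>
#include <sys/eventfd.h>
#include <unistd.h>

// Hypothetical helper (not from the question): a mutex-protected queue that
// mirrors "not empty" onto an eventfd so poll() can watch it next to a socket.
class NotifyingQueue {
public:
    NotifyingQueue() : efd_(eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC)) {
        assert(efd_ >= 0 && "eventfd() failed");
    }
    ~NotifyingQueue() { if (efd_ >= 0) close(efd_); }
    NotifyingQueue(const NotifyingQueue&) = delete;
    NotifyingQueue& operator=(const NotifyingQueue&) = delete;

    int fd() const { return efd_; }

    // Called by the Controller-Thread.
    void push(std::string msg) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push_back(std::move(msg));
        }
        const std::uint64_t one = 1;
        ssize_t n = write(efd_, &one, sizeof one);  // adds 1 to the eventfd counter
        (void)n;
    }

    // Called by the Communication-Thread when fd() is readable. A wake-up with
    // an empty result is possible and harmless.
    std::deque<std::string> drain() {
        std::uint64_t counter = 0;
        ssize_t n = read(efd_, &counter, sizeof counter);  // resets the counter to 0
        (void)n;
        std::deque<std::string> out;
        std::lock_guard<std::mutex> lock(mutex_);
        out.swap(items_);
        return out;
    }

private:
    int efd_;
    std::mutex mutex_;
    std::deque<std::string> items_;
};

// Waits for the socket OR the queue. Returns a bitmask (1 = socket readable,
// 2 = queue readable), 0 on timeout, -1 on poll() error.
int wait_socket_or_queue(int sock_fd, const NotifyingQueue& q, int timeout_ms) {
    pollfd fds[2] = {{sock_fd, POLLIN, 0}, {q.fd(), POLLIN, 0}};
    int rc = poll(fds, 2, timeout_ms);
    if (rc <= 0) return rc;
    int ready = 0;
    if (fds[0].revents & (POLLIN | POLLHUP | POLLERR)) ready |= 1;
    if (fds[1].revents & POLLIN) ready |= 2;
    return ready;
}
```

The Communication-Thread would call wait_socket_or_queue() in its loop, read from the socket when bit 1 is set, and call drain() when bit 2 is set. Each push costs one small write() system call, which is the overhead the question is worried about; whether that matters can only be settled by measuring on the target.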

Use boost::asio, if it is available on your embedded Linux version. It is designed to do exactly what you want, has a great interface and really good functionality. Have a look at the tutorials. It takes a little getting used to, but it makes multithreaded networking applications a lot easier.

Related

accept a socket in one thread and write data in different thread [duplicate]

I am implementing a simple server that accepts a single connection and then uses that socket to simultaneously read and write messages from a read thread and a write thread.
What is the safe and easy way to simultaneously read and write from the same socket descriptor in C/C++ on Linux?
I don't need to worry about multiple threads reading and writing from the same socket, as there will be a single dedicated read thread and a single dedicated write thread using the socket.
In the above scenario, is any kind of locking required?
Does the above scenario require a non-blocking socket?
Is there any open-source library that would help in the above scenario?

In the above scenario, is any kind of locking required?
None.
Does the above scenario require non blocking socket?
The bit you're probably worried about - the read/recv and write/send threads on an established connection - do not need to be non-blocking if you're happy for those threads to sit there waiting to complete. That's normally one of the reasons you'd use threads rather than select, epoll, async operations, or io_uring - keeps the code simpler too.
If the thread accepting new clients is happy to block in the call to accept(), then you're all good there too.
Still, there's one subtle issue with TCP servers you might want to keep in the back of your mind... if your program grows to handle multiple clients and have some periodic housekeeping to do. It's natural and tempting to use a select or epoll call with a timeout to check for readability on the listening socket - which indicates a client connection attempt - then accept the connection. There's a race condition there: the client connection attempt may have dropped between select() and accept(), in which case accept() will block if the listening socket's not non-blocking, and that can prevent a timely return to the select() loop and halt the periodic on-timeout processing until another client connects.
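One way to sidestep that race is to make the listening socket non-blocking, so accept() returns immediately when the pending connection has gone away. A minimal sketch, assuming Linux; make_nonblocking, try_accept, and the two return-code constants are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <sys/socket.h>

// Puts a descriptor into non-blocking mode; returns false on failure.
bool make_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0) return false;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == 0;
}

const int kNoClient = -1;     // nothing to accept right now, go back to select()
const int kAcceptError = -2;  // a real error, see errno

// Call after select()/epoll reports the listening socket readable.
// Returns the client descriptor (>= 0), kNoClient, or kAcceptError.
int try_accept(int listen_fd) {
    for (;;) {
        int client = accept(listen_fd, nullptr, nullptr);
        if (client >= 0) return client;
        if (errno == EINTR) continue;  // interrupted by a signal, retry
        if (errno == EAGAIN || errno == EWOULDBLOCK || errno == ECONNABORTED)
            return kNoClient;          // the connection attempt vanished
        return kAcceptError;
    }
}
```

With this in place the select() loop keeps its timeout behaviour: a vanished connection attempt costs one failed accept() call instead of blocking the thread until the next client arrives.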
Is there any opensource library, that would help in the above scenario?
There are hundreds of libraries for writing basic servers (and asking for 3rd party lib recommendations is off-topic on SO so I won't get into it), but ultimately what you've asked for is easily achieved atop an OS-provided BSD sockets API or the Windows bastardisation ("winsock").

Sockets are BI-DIRECTIONAL. If you've ever actually dissected an Ethernet or Serial cable or seen the low-level hardware wiring diagram for them, you can actually SEE distinct copper wires for the "TX" (transmit) and "RX" (receive) lines. The software for sending the signals, from the device controller up to most OS APIs for a 'socket', reflects this and it is the key difference between a socket and an ordinary pipe on most systems (e.g. Linux).
To really get the most out of sockets, you need:
1) Async IO support that uses IO Completion Ports, epoll(), or some similar async callback or event system to 'wake up' whenever data comes in on the socket. This then must call your lowest-level 'ReadData' API to read the message off the socket connection.
2) A 2nd API that supports the low-level writes, a 'WriteData' (transmit) that pushes bytes onto the socket and does not depend on anything the 'ReadData' logic needs. Remember, your send and receive are independent even at the hardware level, so don't introduce locking or other synchronization at this level.
3) A pool of Socket IO threads, which blindly do any processing of data that is read from or will be written to a socket.
4) PROTOCOL CALLBACK: A callback object the socket threads have smart pointers to. It handles any PROTOCOL layer- such as parsing your data blob into a real HTTP request- that sits on top of the basic socket connection. Remember, a socket is just a data pipe between computers and data sent over it will often arrive as a series of fragments- the packets. In protocols like UDP the packets aren't even in order. The low-level 'ReadData' and 'WriteData' will callback from their threads into here, because it is where content-aware data processing actually begins.
5) Any callbacks the protocol handler itself needs. For HTTP, you package the raw request buffers into nice objects that you hand off to a real servlet, which should return a nice response object that can be serialized into an HTTP spec-compliant response.
Notice the basic pattern: You have to make the whole system fundamentally async (an 'onion of callbacks') if you wish to take full advantage of bi-directional, async IO over sockets. The only way to read and write simultaneously to the socket is with threads, so you could still synchronize between a 'writer' and 'reader' thread, but I'd only do it if the protocol or other considerations forced my hand. The good news is that you can get great performance with sockets using highly async processing, the bad is that building such a system in a robust way is a serious effort.

You don't have to worry about it. One thread reading and one thread writing will work as you expect. Sockets are full duplex, so you can read while you write and vice-versa. You'd have to worry if you had multiple writers, but this is not the case.
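To illustrate the point, here is a small sketch of one reader thread and one writer thread sharing a descriptor without any lock. It assumes a connected, blocking stream socket; the function name read_and_write_concurrently is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <sys/socket.h>
#include <sys/types.h>

// Sends `outgoing` on fd from a writer thread while a reader thread collects
// `expected` bytes from the same fd. No locking: the two directions of a
// socket are independent, and each direction has exactly one thread.
std::string read_and_write_concurrently(int fd, const std::string& outgoing,
                                        std::size_t expected) {
    std::string received;  // touched only by the reader thread until join()

    std::thread reader([&] {
        char buf[256];
        while (received.size() < expected) {
            ssize_t n = recv(fd, buf, sizeof buf, 0);
            if (n <= 0) break;  // peer closed the connection or error
            received.append(buf, static_cast<std::size_t>(n));
        }
    });

    std::thread writer([&] {
        std::size_t sent = 0;
        while (sent < outgoing.size()) {
            ssize_t n = send(fd, outgoing.data() + sent, outgoing.size() - sent,
                             MSG_NOSIGNAL);
            if (n <= 0) break;  // error
            sent += static_cast<std::size_t>(n);
        }
    });

    writer.join();
    reader.join();
    return received;
}
```

Any data the two threads share above the socket level (queues, session state) still needs its own synchronization; only the descriptor itself is safe to use this way.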

boost ASIO and message passing between thread

I am working on designing a websocket server which receives a message and saves it to an embedded database. For reading the messages I am using boost asio. To save the messages to the embedded database I see a few options in front of me:
Save the messages synchronously as soon as I receive them over the same thread.
Save the messages asynchronously on a separate thread.
I am pretty sure the second answer is what I want. However, I am not sure how to pass messages from the socket thread to the IO thread. I see the following options:
Use one io service per thread and use the post function to communicate between threads. Here I have to worry about lock contention. Should I?
Use Linux domain sockets to pass messages between threads. No lock contention as far as I understand. Here I can probably use BOOST_ASIO_DISABLE_THREADS macro to get some performance boost.
Also, I believe it would help to have multiple IO threads which would receive messages in a round robin fashion to save to the embedded database.
Which architecture would be the most performant? Are there any other alternatives from the ones I mentioned?
A few things to note:
The messages are exactly 8 bytes in length.
Cannot use an external database. The database must be embedded in the running process.
I am thinking about using RocksDB as the embedded database.

I don't think you want to use a unix socket, which is always going to require a system call and pass data through the kernel. That is generally more suitable as an inter-process mechanism than an inter-thread mechanism.
Unless your database API requires that all calls be made from the same thread (which I doubt) you don't have to use a separate boost::asio::io_service for it. I would instead create an io_service::strand on your existing io_service instance and use the strand::dispatch() member function (instead of io_service::post()) for any blocking database tasks. Using a strand in this manner guarantees that at most one thread may be blocked accessing the database, leaving all the other threads in your io_service instance available to service non-database tasks.
Why might this be better than using a separate io_service instance? One advantage is that having a single instance with one set of threads is slightly simpler to code and maintain. Another minor advantage is that using strand::dispatch() will execute in the current thread if it can (i.e. if no task is already running in the strand), which may avoid a context switch.
For the ultimate optimization I would agree that using a specialized queue whose enqueue operation cannot make a system call could be fastest. But given that you have network i/o by producers and disk i/o by consumers, I don't see how the implementation of the queue is going to be your bottleneck.

After benchmarking/profiling I found the Facebook folly implementation of the MPMC queue to be the fastest by at least a 50% margin. If I use the non-blocking write method, then the socket thread has almost no overhead and the IO threads remain busy. The number of system calls is also much lower than with other queue implementations.
The SPSC queue with a condition variable in boost is slower. I am not sure why that is. It might have something to do with the adaptive spin that the folly queue uses.
Also, message passing (UDP domain sockets in this case) turned out to be orders of magnitude slower, especially for larger messages. This might have something to do with the data being copied twice.

You probably only need one io_service -- you can create additional threads which will process events occurring within the io_service by providing boost::asio::io_service::run as the thread function. This should scale well for receiving 8-byte messages from clients over the network socket.
For storing the messages in the database, it depends on the database & interface. If it's multi-threaded, then you might as well just send each message to the DB from the thread that received it. Otherwise, I'd probably set up a boost::lockfree::queue where a single reader thread pulls items off and sends them to the database, and the io_service threads append new messages to the queue when they arrive.
Is that the most efficient approach? I dunno. It's definitely simple, and gives you a baseline that you can profile if it's not fast enough for your situation. But I would recommend against designing something more complicated at first: you don't know whether you'll need it at all, and unless you know a lot about your system, it's practically impossible to say whether a complicated approach would perform any better than the simple one.

void Consumer( lockfree::queue<uint64_t> &message_queue ) {
    // Connect to database...
    while (!Finished) {
        // add_to_database is a functor that takes a message.
        message_queue.consume_all( add_to_database );
        // Use a timed wait to avoid missing a signal. It's OK to
        // consume_all() even if there's nothing in the queue.
        cond_var.wait_for( ... );
    }
}

void Producer( lockfree::queue<uint64_t> &message_queue ) {
    while (!Finished) {
        uint64_t m = receive_from_network( );
        message_queue.push( m );
        cond_var.notify_all( );
    }
}

Assuming that the constraint of using C++11 is not too hard in your situation, I would try to use std::async to make an asynchronous call to the embedded DB.
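A rough sketch of what that could look like. FakeStore is a hypothetical stand-in for the embedded database (it is not the RocksDB API), and save_async is an invented helper name:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <future>
#include <mutex>
#include <vector>

// Hypothetical stand-in for the embedded database; a real store would
// replace put() with its own write call.
class FakeStore {
public:
    bool put(std::uint64_t message) {
        std::lock_guard<std::mutex> lock(mutex_);
        rows_.push_back(message);
        return true;
    }
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return rows_.size();
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::uint64_t> rows_;
};

// Starts the write on another thread and returns immediately. The caller must
// keep the future: a future returned by std::async blocks in its destructor
// until the task has finished, so discarding it makes the call synchronous.
std::future<bool> save_async(FakeStore& store, std::uint64_t message) {
    return std::async(std::launch::async,
                      [&store, message] { return store.put(message); });
}
```

Note that std::launch::async typically starts a new thread per call, so for a steady stream of 8-byte messages a long-lived consumer thread with a queue is likely to be cheaper. That is worth profiling before committing to either approach.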

Can I use Boost::Asio and not worry about network programming problems?

I have to make a server in my current project, but I have little or no experience in this area. My question is: can I just use Asio in my project and have it handle the problems a normal server has to face (partial reads, multithreading problems, ...)?
(My server will have to handle hundreds of clients at the same time)

ASIO takes care of the low-level socket programming and polling code. You still have to provide all the functionality to process raw network data. Ultimately, you get an unpredictable number of bytes from the network any time a read callback is called, and it is up to you to take those bytes and reconstruct your application message from them.
But indeed, as far as receiving an unspecified number of bytes is concerned, you won't have to worry about how that is implemented.
Multithreading is "easy" in the sense that you can run the ASIO processor multiple times concurrently, but it is your responsibility to provide a read callback that can deal with being run multiple times at once.

Asio is intentionally not multithreaded. It handles concurrency by multiplexing via the operating system's select(), kqueue, epoll, or other mechanism.
As for partial receives, there is no automatic way to get TCP to respect message boundaries. Asio can't do anything about that, so you'll need some technique at the application level to indicate completion. HTTP traditionally handles this by closing the socket when it's finished, though it's also possible to pre-send the size of the message.
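As an illustration of the "pre-send the size" technique, here is a minimal sketch of length-prefixed framing. The wire format (a 4-byte big-endian length followed by the payload) and the names encode_frame and FrameDecoder are assumptions made for this example; they are not part of Asio.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Assumed wire format: 4-byte big-endian payload length, then the payload.
std::string encode_frame(const std::string& payload) {
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    std::string frame;
    for (int shift = 24; shift >= 0; shift -= 8)
        frame.push_back(static_cast<char>((n >> shift) & 0xFF));
    frame += payload;
    return frame;
}

// Accumulates whatever each read returns and hands back complete messages.
class FrameDecoder {
public:
    std::vector<std::string> feed(const char* data, std::size_t len) {
        buffer_.append(data, len);
        std::vector<std::string> messages;
        for (;;) {
            if (buffer_.size() < 4) break;      // header not complete yet
            const std::uint32_t n = (byte_at(0) << 24) | (byte_at(1) << 16) |
                                    (byte_at(2) << 8) | byte_at(3);
            if (buffer_.size() - 4 < n) break;  // payload not complete yet
            messages.push_back(buffer_.substr(4, n));
            buffer_.erase(0, 4 + static_cast<std::size_t>(n));
        }
        return messages;
    }

private:
    std::uint32_t byte_at(std::size_t i) const {
        return static_cast<unsigned char>(buffer_[i]);
    }
    std::string buffer_;
};
```

The read callback would pass each received chunk to feed() and process whatever messages come back, regardless of how TCP split or merged the bytes. A production version should also reject lengths above some maximum, so that a bad header cannot make the buffer grow without bound.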

C++ Sockets Send() Thread-Safety

I am coding a socket server for my game, for a maximum of 1000 clients. I'm using non-blocking sockets and about 10 threads that receive data simultaneously from different sockets (the first thread receives from sockets 0-100, the second from 101-200, and so on).
But if thread 1 wants to send data to all 1000 clients and thread 2 also wants to send data to all 1000 clients at the same time, is that safe? Is there any chance of the data getting mixed up on the other (client) side?
If yes, I guess the only problem that can happen is that sometimes a client would receive 2 or 10 packets as 1 packet, is that correct? If so, is there any solution to that? :(

The usual pattern of dealing with many sockets is to have a dedicated thread polling for I/O events with select(2), poll(2), or better kqueue(2) or epoll(7) (depending on the platform) acting as socket event dispatcher. The sockets are usually handled in non-blocking mode. Then one might have a pool of threads reacting to the events and doing reads and writes either directly or via lower-level buffers/queues.
All sorts of techniques are applicable here - from queues to event subscription whiteboards. It gets tricky with multiplexing accepts/reads/writes/EOFs on the I/O level and with event arbitration on the application level. Several libraries like libevent and boost::asio help structure the lower level (the ACE library is also in this space, but I'd hate recommending it to anybody). You would have to come up with application-level protocols and state machines yourself (again boost::statechart might be of help).
Some good links to get better understanding of what you are up against (this is probably the millionth time they are mentioned here on SO):
The C10K problem
High-Performance Server Architecture
Apologies for not offering a concrete solution, but this is a very wide design question and most decisions depend heavily on the context (lots of fun though). Hope this helps a bit.

Since you are sending data using different sockets, there should not be any problem. Rather, you have to ensure data integrity when these different threads access the same data.

Are you using UDP or TCP sockets?
If UDP, each write should be encapsulated in a separate packet and should be carried to the other side intact. The order may be swapped (as it may for any UDP packet) but they should be whole.
If TCP, there's no concept of packets on the transport layer and any 10 writes on one side may be bundled up on the other side in one read. TCP writes may also only accept part of your buffer so even if the send() function is atomic, your write isn't necessarily. In this case you'd need to synchronize it.
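A minimal sketch of that synchronization for the TCP case: one mutex per connection, held while a loop pushes out the whole message, so two threads cannot interleave their partial writes. LockedSender is a hypothetical name, and the sketch assumes a blocking socket. With the non-blocking sockets from the question, send() can also fail with EAGAIN, in which case the remainder would have to be buffered until the socket becomes writable.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <mutex>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical per-connection wrapper: serializes whole messages on one socket.
class LockedSender {
public:
    explicit LockedSender(int fd) : fd_(fd) {}

    // Sends the complete buffer or returns false. The lock is held for the
    // entire message, so messages from different threads never interleave.
    bool send_all(const char* data, std::size_t len) {
        std::lock_guard<std::mutex> lock(mutex_);
        std::size_t sent = 0;
        while (sent < len) {
            ssize_t n = ::send(fd_, data + sent, len - sent, MSG_NOSIGNAL);
            if (n < 0 && errno == EINTR) continue;  // interrupted, retry
            if (n <= 0) return false;               // error (EAGAIN included)
            sent += static_cast<std::size_t>(n);    // partial write, keep going
        }
        return true;
    }

private:
    int fd_;
    std::mutex mutex_;
};
```

Each client would own one LockedSender, and every thread that wants to send to that client goes through it. The receiver still sees a byte stream, so message boundaries have to be marked by the application protocol.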

send() is not atomic in most implementations, so sending to 1000 different sockets from multiple threads could lead to mixed-up messages arriving on the client side, and all kinds of weirdness. (I know nothing; see Nicolai's and Robert's comments below. The rest of my comment still stands, though, as a solution to your problem.)
What I would do is use threads for sending like you use them for receiving. One thread to manage sending to one (or more) sockets that ensures that you don't write to one socket from multiple threads at the same time.
Also look here for some additional discussion and more interesting links.
If you're on windows, the winsock programmers faq is an invaluable resource, for your issue see here.
