Receive/Send in single thread or separate threads?

I was in a discussion about Multiple Threads in a client application and was told that using a separate thread for receiving data and another thread for sending data is not the way to go.
Why?
From what I know, TCP is full-duplex, so wouldn't this be a performance improvement?

Having a dedicated send thread and a dedicated receive thread is bad for two reasons.
First, it means that a context switch is required every time you go from receiving to sending unless you are doing both at the same time.
Second, it means that in the typical path where you receive a query, formulate a response, and then send that response, data will need to be handed from one thread to another, blowing out caches.
That said, if performance isn't super-critical and it fits into your design well, it certainly works. It's just that there's usually no advantage.
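The receive, formulate, send path described above can be kept on one thread, so the request never changes hands. The following is only a minimal sketch assuming POSIX sockets; serve_one_request and the "echo:" reply format are hypothetical names chosen for illustration, not something from the answer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: read one request from fd, build a reply, and send it
// back, all on the calling thread. Returns false on EOF or error.
bool serve_one_request(int fd) {
    assert(fd >= 0);
    char buf[512];
    ssize_t n = recv(fd, buf, sizeof buf, 0);
    if (n <= 0) return false;  // peer closed the connection, or an error

    // "Formulate a response": the request is still in this thread's buffer,
    // so nothing has to be handed to another thread.
    std::string reply = "echo:" + std::string(buf, static_cast<std::size_t>(n));

    // send() may accept fewer bytes than requested, so loop until done.
    std::size_t sent = 0;
    while (sent < reply.size()) {
        ssize_t w = send(fd, reply.data() + sent, reply.size() - sent, 0);
        if (w <= 0) return false;
        sent += static_cast<std::size_t>(w);
    }
    return true;
}
```

A real server would call something like this from an event loop whenever the socket becomes readable.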

I suppose it depends on the scale of your application. If you are doing a small app for a class project, it might be enough to have the send and receive on the same thread. Then you don't have to worry about threading issues.
However, I worked on an application that had to listen for several thousand incoming connections, and each connection might be sending a significant amount of data. We had a thread whose sole purpose was listening for socket connections and putting the new connections into a pool, and a variable number of threads (depending on how busy the app was) just for reading off of sockets, and a different pool of threads for writing.
The problem is that if your application isn't reading the data off of the wire fast enough and the socket buffer fills up, an error is returned. With thousands of clients, that caused a lot of reconnects and re-sends of data, which compounded the original problem that the data was not being read fast enough.
So it comes back to what I said in the first place - it depends on the scale of your application, but why not add in the ability now? Just make sure that you are thread safe, and you should be OK.

Related

Should I use multiple threads for a multi socket client?

I understand that for most cases using threads in Qt networking is overkill and unnecessary, especially if you do it the proper way and use the readyRead() signal. However, my "client" application will have multiple sockets open (about 5) at one time. It is possible for there to be data coming in on all sockets at the same time. I am really not going to be doing any intense processing with the incoming data. Simply reading it in and then sending out a signal to update the GUI with the newly received data. Do you think a single thread application should be able to handle all of the data coming in?
I understand that I haven't shown you any code and that my description is pretty vague and it could very well depend on how it performs once implemented, but from a general design perspective and your guys' expertise, what is your opinion?

Unless you are receiving really high-bandwidth streams (e.g. megabytes per second rather than kilobytes per second), a single-threaded design should be sufficient. Keep in mind that the OS's networking stack is running "in the background" at all times, receiving TCP packets and storing the received data inside fixed-size in-kernel memory buffers. This happens in parallel with your program's execution, so in most cases the fact that your program is single-threaded and busy dealing with a GUI update (or another socket) won't hamper your computer's reception of TCP packets.
The case where a single-threaded design would cause a slowdown of TCP traffic is if your program (via Qt) didn't call recv() quickly enough, such that the kernel's TCP-receive buffer for a socket became entirely filled with data. At that point the kernel would have no choice but to start dropping incoming TCP packets for that socket, which would cause the server to have to re-send those TCP packets, and that would cause the socket's TCP receive rate to slow down, at least temporarily. However, that problem can be avoided by making sure the buffers never (or at least rarely) get full.
The obvious way to do that is to ensure that your program reads all of the incoming data as quickly as possible -- something that QTcpSocket does by default. The only thing you need to do is make sure that your GUI updates don't take an inordinate amount of time -- and Qt's widget-update routines are fairly efficient, so they shouldn't, unless you have a really elaborate GUI, an inefficient custom paintEvent() routine, or the like.
If that's not sufficient, the next thing you could do (if necessary) is tell the OS's TCP stack to increase the size of its in-kernel TCP receive buffer, e.g. by doing:
int fd = static_cast<int>(myQTcpSocketObject.socketDescriptor()); // native socket handle; -1 if the socket is not open
int newBufSizeBytes = 128*1024; // request 128kB kernel recv-buffer for this socket
if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &newBufSizeBytes, sizeof(newBufSizeBytes)) != 0) perror("setsockopt");
Doing that would give your (single) thread more time to react before incoming packets start getting dropped for lack of in-kernel buffer space.
If, after trying all that, you still aren't getting the network performance you need, then you can try going multithreaded. I doubt it will come to that, but if it does, it needn't affect your program's design too much; you'd just write a wrapper class (called SocketThread or something) that holds your QTcpSocket object and runs an internal thread that handles the reading from the socket, and emits a bytesReceived(QByteArray) signal whenever the thread reads data from the socket. The rest of your code would remain approximately the same; just modify it to hold the SocketThread object instead of a QTcpSocket, and connect the SocketThread's bytesReceived(QByteArray) signal to a corresponding slot (via a QueuedConnection, of course, for thread-safety) and use that instead of responding directly to readyRead().

Implement it without threads, using a thread-considerate design(*), measure the delay your data experiences, decide if it is within acceptable bounds. Then decide if you need to use threads to capture it more rapidly.
From your description, the key bottleneck is going to be the GUI receiving the "data ready" signal and rendering the data. If you use the approach of sending lots of these signals, your GUI is going to be doing more re-renders.
If you use a single-thread approach, you can marshal the network reads and get all the updates and then refresh the GUI directly. As you've described it, this sounds like it will have the least degree of contention.
(* try to avoid constructs which will require an entire rewrite if you go threaded, but don't put so much effort into making it thread-proof that it will actually need threads to make it efficient, e.g. don't wrap everything with mutex calls)

I do not know much about Qt, but this could be a typical scenario where you use select() to multiplex multiple socket accesses with a single thread.
If the selecting thread is used mainly for moving data to and from the sockets, it will be very fast (as you will have fewer context switches). So unless you are transferring really huge amounts of data, a single-threaded solution is likely to be faster.
That being said, I would go with the solution that best fits your needs, something that you can implement in a reasonable amount of time. Implementing select (async) can be quite a hassle, and may be overkill.
It's a C-like approach, but I hope this helps anyway.
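For readers who have not used select() before, here is a rough sketch of one pass of such a loop, assuming POSIX sockets. The function name read_ready and its return shape are made up for this illustration; a Qt program would normally rely on readyRead() instead of calling select() itself.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

// Wait up to timeout_ms for any of the sockets to become readable, then do
// one recv() on each ready socket. Returns fd -> bytes read in this pass.
std::map<int, std::string> read_ready(const std::vector<int>& fds, int timeout_ms) {
    fd_set readable;
    FD_ZERO(&readable);
    int max_fd = -1;
    for (int fd : fds) {
        assert(fd >= 0 && fd < FD_SETSIZE);  // select() cannot watch larger fds
        FD_SET(fd, &readable);
        if (fd > max_fd) max_fd = fd;
    }
    timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    std::map<int, std::string> result;
    if (select(max_fd + 1, &readable, nullptr, nullptr, &tv) <= 0)
        return result;  // timeout or error: nothing to read this pass

    for (int fd : fds) {
        if (!FD_ISSET(fd, &readable)) continue;
        char buf[1024];
        ssize_t n = recv(fd, buf, sizeof buf, 0);
        if (n > 0) result[fd].assign(buf, static_cast<std::size_t>(n));
    }
    return result;
}
```

With about five sockets, a single thread calling this in a loop never blocks on one quiet socket while another has data waiting.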

Problems implementing a multi-threaded UDP server (threadpool?)

I am writing an audio streamer (client-server) as a project of mine (C/C++),
and I decided to make a multi threaded UDP server for this project.
The logic behind this is that each client will be handled in his own thread.
The problem I'm having is that the threads interfere with one another.
The first thing my server does is create a sort of a thread-pool; it creates 5
threads that all are blocked automatically by a recvfrom() function,
though it seems that, on most of the times when I connect another device
to the server, more than one thread is responding and later on
that causes the server to be blocked entirely and not operate further.
It's pretty difficult to debug this as well so I write here in order
to get some advice on how usually multi-threaded UDP servers are implemented.
Should I use a mutex or semaphore in part of the code? If so, where?
Any ideas would be extremely helpful.

Take a step back: you say
"each client will be handled in his own thread"
but UDP isn't connection-oriented. If all clients use the same multicast address, there is no natural way to decide which thread should handle a given packet.
If you're wedded to the idea that each client gets its own thread (which I would generally counsel against, but it may make sense here), you need some way to figure out which client each packet came from.
That means either:
- using TCP (since you seem to be trying for connection-oriented behaviour anyway), or
- reading each packet, figuring out which logical client connection it belongs to, and sending it to the right thread. Note that since the routing information is global/shared state, these two are equivalent:
  - keep a source IP -> thread mapping, protected by a mutex, read & accessed from all threads
  - do all the reads in a single thread, and use a local source IP -> thread mapping
The first seems to be what you're angling for, but it's poor design. When a packet comes in you'll wake up one thread, then it locks the mutex and does the lookup, and potentially wakes another thread. The thread you want to handle this connection may also be blocked reading, so you need some mechanism to wake it.
The second at least gives a separation of concerns (read/dispatch vs. processing).
Sensibly, your design should depend on:
- number of clients
- I/O load
- amount of non-I/O processing (or IO:CPU ratio, or ...)
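As an illustration of the single-reader variant (a local source -> thread mapping that needs no mutex), here is a small sketch. ClientRouter, the "ip:port" string key, and the round-robin assignment are all assumptions made for the example; the answer does not prescribe how new clients are assigned to threads.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Owned by the single reader thread only, so the table needs no mutex.
class ClientRouter {
public:
    explicit ClientRouter(std::size_t worker_count) : worker_count_(worker_count) {
        assert(worker_count > 0);
    }

    // Returns the worker index for a packet from `source` ("ip:port").
    // A new source is assigned round-robin; a known one keeps its worker.
    std::size_t route(const std::string& source) {
        auto it = table_.find(source);
        if (it != table_.end()) return it->second;
        std::size_t worker = next_++ % worker_count_;
        table_.emplace(source, worker);
        return worker;
    }

private:
    std::size_t worker_count_;
    std::size_t next_ = 0;
    std::unordered_map<std::string, std::size_t> table_;
};
```

The reader thread would call route() with the address filled in by recvfrom() and then hand the packet to that worker, for instance through a queue.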

"The first thing my server does is create a sort of a thread-pool; it creates 5 threads that all are blocked automatically by a recvfrom() function, though it seems that, on most of the times when I connect another device to the server, more than one thread is responding and later on that causes the server to be blocked entirely and not operate further"
Rather than having all your threads sit on a recvfrom() on the same socket, you should protect the socket with a semaphore, and have your worker threads wait on the semaphore. When a thread acquires the semaphore, it can call recvfrom(), and when that returns with a packet, the thread can release the semaphore (for another thread to acquire) and handle the packet itself. When it's done servicing the packet, it can return to waiting on the semaphore. This way you avoid having to transfer data between threads.

Your recvfrom() should be in the master thread, and when it gets data you should pass the UDP client's address (IP:port) and the data to the helper threads.
Passing the IP:port and data can be done by spawning a new thread every time the master thread receives a UDP packet, or by handing them to the helper threads through a message queue.
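The message-queue option could look roughly like the sketch below, using only the C++ standard library. Datagram and DatagramQueue are hypothetical names for this example, and the close() method is an added convenience so helper threads can exit; none of this comes from the answer itself.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <utility>

struct Datagram {
    std::string source;   // "ip:port" of the sender, as reported by recvfrom()
    std::string payload;  // the bytes of one UDP packet
};

class DatagramQueue {
public:
    // Called by the master thread after each recvfrom().
    void push(Datagram d) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(std::move(d));
        }
        cv_.notify_one();
    }

    // Tells waiting helpers that no more datagrams will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lock(m_);
            closed_ = true;
        }
        cv_.notify_all();
    }

    // Called by helper threads. Blocks until a datagram is available;
    // returns false once the queue is closed and fully drained.
    bool pop(Datagram& out) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty() || closed_; });
        if (q_.empty()) return false;
        out = std::move(q_.front());
        q_.pop();
        return true;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<Datagram> q_;
    bool closed_ = false;
};
```

Each helper thread would then run a loop of the form `Datagram d; while (queue.pop(d)) { /* handle d */ }`, while only the master thread touches the socket.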

I think that your main problem is the non-persistent UDP connection. UDP does not keep your connections alive; it exchanges only two datagrams per session. Depending on your application, in the worst case you will have concurrent threads reading whatever arrives first, i.e., recvfrom() will unblock in a thread even when it is not its turn.
I think the way to go is to use select() in the main thread and, with a concurrent buffer, manage what each thread will do.
In this solution, you can have one thread per client, or one thread per file, assuming that you keep the client information needed to make sure you're sending the right part of the file.
TCP is another way to do it, since it keeps a connection alive for every thread you run, but it is not the best transport for applications that can tolerate data loss.

Boost asio - reading and writing at the same time

I'm trying to implement two-way communication using boost::asio. I'm writing the server that will communicate with multiple clients.
I want the writes and reads to and from clients to happen without any synchronization and order - the client can send a command to the server at any time and it still receives some data in a loop. Of course access to shared resources must be protected.
What is the best way to achieve this? Is having two threads - one for reading and one for writing a good option? What about accepting the connections and managing many clients?
Edit: By "no synchronization and order" I mean that the server should stream its data to the clients all the time and that it can respond (change its behaviour) to clients' requests at any time, regardless of what is now being sent to them.

One key idea behind asio is exactly that you don't need multiple threads to deal with multiple client sessions. Your description is a bit generic, and I'm not sure I understand what you mean by 'I want the writes and reads to and from clients to happen without any synchronization and order'.
A good starting point would be the asio chat server example. Notice how in this example an instance of the class chat_session is created for each connected client. Objects of that class keep on posting asynchronous reads as long as the connection is alive and at the same time they can write data to the connected clients. In the mean time an object of class chat_server keeps accepting new incoming client connections.

At work we're doing something conceptually very similar, and there I noticed the big impact a heavy handler has on performance. The writing side of the code (the write handler) does too much work and occupies a worker thread for too long, thereby jeopardizing the program flow. In particular, RST packets (closed connections) weren't detected quickly enough by the read handler because the write actions were taking their sweet time and hogging most of the processing time in the worker thread. For now I fixed that by creating two worker threads so that one line of code was not starved of processing time. Admittedly, this is far from ideal and it is on my lengthy to-do list of optimizations.
Long story short, you can get away with using a single thread for reading and writing if your handlers are light-weight while a second thread handles the rest of your program. Once you notice weird synchronization issues it's time to either lighten your network handlers or add an extra thread to the worker pool.

Writing a server application that Pushes to clients (TCP)

I'm writing a client-server application and one of the requirements is the Server, upon receiving an update from one of the clients, be able to Push out new data to all the other clients. This is a C++ (Qt) application meant to run on Linux (both client and server), but I'm more looking for high-level conceptual ideas of how this should work (though low-level thoughts are good, too).
Server:
It needs to (among its other duties) keep a socket open listening for incoming packets from potentially n different clients, presumably on a background thread (I haven't written much in terms of socket code other than some rinky-dink examples in school). Upon getting this data from a client, it processes it and then spits it out to all its clients, right?
Of course, I'm not sure how it actually does this. I'm guessing this means it has to keep persistent connections with every single client (at least the active clients), but I don't understand even conceptually how to maintain this connection (or the list of these connections).
So, how should I approach this?

In general when you have multiple clients, there are a few ways to handle this.
First of all, in TCP, when a client connects to you they're placed into a queue until they can be serviced. This is a given; you don't need to do anything except call the accept system call to receive a new client. Once the client is accepted, you'll be given a socket which you use to read and write. Who reads / writes first is entirely dependent on your protocol, but both sides need to know the protocol (which is up to you to define).
Once you've got the socket, you can do a few things. In a simple case, you just read some data, process it, write back to the socket, close the socket, and serve the next client. Unfortunately this means you can only serve one client at a time, thus no "push" updates are possible. Another strategy is to keep a list of all the open sockets. Any "updates" simply iterate over the list and write to each socket. This may present a problem though because it only allows push updates (if a client sent a request, who would be watching for it?)
The more advanced approach is to assign one thread to each socket. In this scenario, each time a socket is created, you spin up a new thread whose whole purpose is to serve exactly one client. This cuts down on latency and utilizes multiple cores (if available), but is far more difficult to program. Also if you have 10,000 clients connecting, that's 10,000 threads which gets to be too much. Pushing an update to a single client (in this scenario) is very simple (a thread just writes to its respective socket). Pushing to all of them at once is a little more tricky (requires either a thread event or a producer / consumer queue, neither of which are very fun to implement)
There are, of course, a million other ways to handle this (one process per client, a thread pool, a load-balancing proxy, you name it). Suffice it to say there's no way to cover all of these in one answer. I hope this answers your basic questions, let me know if you need me to clarify anything. It's a very large subject. However if I might make a suggestion, handling multiple clients is a wheel that has been re-invented a million times. There are very good libraries out there that are far more efficient and programmer-friendly than raw socket IO. I suggest libevent, which turns network requests into an event-driven paradigm (much more like GUI programming, which might be nice for you), and is incredibly efficient.
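To make the "keep a list of all the open sockets and write to each one" strategy from this answer concrete, here is a minimal sketch assuming POSIX sockets on Linux (MSG_NOSIGNAL is used so a disconnected client produces an error instead of SIGPIPE). push_update and send_all are hypothetical helper names, and the sockets are assumed to be blocking, so one slow client would delay the rest.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>
#include <sys/socket.h>
#include <sys/types.h>

// Write all of msg to fd; false if the peer is gone or another error occurs.
static bool send_all(int fd, const std::string& msg) {
    std::size_t sent = 0;
    while (sent < msg.size()) {
        ssize_t n = send(fd, msg.data() + sent, msg.size() - sent, MSG_NOSIGNAL);
        if (n <= 0) return false;
        sent += static_cast<std::size_t>(n);
    }
    return true;
}

// Push msg to every connected client except the one that sent the update.
// Returns how many clients received the complete message.
std::size_t push_update(const std::vector<int>& clients, int from, const std::string& msg) {
    std::size_t delivered = 0;
    for (int fd : clients) {
        assert(fd >= 0);
        if (fd == from) continue;        // don't echo back to the originator
        if (send_all(fd, msg)) ++delivered;
    }
    return delivered;
}
```

A fuller server would also remove sockets whose send fails from the list, and would watch the same list for incoming requests with select(), poll(), or a library such as libevent.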

From what I understand, I think you need to keep an infinite loop going (at least until the program terminates) that answers connection requests from your clients. It would be best to add them to an array of some sort. Use an event to see when a new client is added to that array, and wait for one of them to give data. Then you do what you have to do with that data and spit it back.

C++ Sockets Send() Thread-Safety

I am coding a socket server for my game, with a maximum of 1000 clients. I'm using non-blocking sockets and about 10 threads that receive data simultaneously from different sockets (the first thread receives from sockets 0-100, the second from 101-200, and so on).
But if thread 1 wants to send data to all 1000 clients and thread 2 also wants to send data to all 1000 clients at the same time, is that safe? Is there any chance of the data getting mixed up on the client side?
If yes, I guess the only problem that can happen is that sometimes a client would receive 2 or 10 packets as 1 packet. Is that correct? If so, is there any solution to that?

The usual pattern of dealing with many sockets is to have a dedicated thread polling for I/O events with select(2), poll(2), or better kqueue(2) or epoll(7) (depending on the platform) acting as socket event dispatcher. The sockets are usually handled in non-blocking mode. Then one might have a pool of threads reacting to the events and either doing reads and writes directly or via lower-level buffers/queues.
All sorts of techniques are applicable here - from queues to event subscription whiteboards. It gets tricky with multiplexing accepts/reads/writes/EOFs on the I/O level and with event arbitration on the application level. Several libraries like libevent and boost::asio help structure the lower level (the ACE library is also in this space, but I'd hate recommending it to anybody). You would have to come up with application-level protocols and state machines yourself (again boost::statechart might be of help).
Some good links to get better understanding of what you are up against (this is probably the millionth time they are mentioned here on SO):
The C10K problem
High-Performance Server Architecture
Apologies for not offering a concrete solution, but this is a very wide design question and most decisions depend heavily on the context (lots of fun though). Hope this helps a bit.

Since you are sending data using different sockets, there should not be any problem. Rather, you have to ensure data integrity when these different threads access the same data.

Are you using UDP or TCP sockets?
If UDP, each write should be encapsulated in a separate packet and should be carried to the other side intact. The order may be swapped (as it may for any UDP packet) but they should be whole.
If TCP, there's no concept of packets on the transport layer and any 10 writes on one side may be bundled up on the other side in one read. TCP writes may also only accept part of your buffer so even if the send() function is atomic, your write isn't necessarily. In this case you'd need to synchronize it.
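One possible way to address both TCP points (partial writes and several writes arriving as one read) is to serialize writers per socket and to prefix each message with its length. The sketch below assumes POSIX sockets on Linux and a blocking socket; FramedSender and the 4-byte big-endian length header are illustrative choices, not part of the answer. A non-blocking socket would additionally need EAGAIN handling, which is not shown.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <mutex>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>

class FramedSender {
public:
    explicit FramedSender(int fd) : fd_(fd) { assert(fd >= 0); }

    // Sends one length-prefixed message; returns false on error.
    bool send_message(const std::string& msg) {
        // Frame = 4-byte big-endian length + payload, so the receiver can
        // split the TCP byte stream back into individual messages.
        std::uint32_t len = htonl(static_cast<std::uint32_t>(msg.size()));
        char header[sizeof len];
        std::memcpy(header, &len, sizeof len);
        std::string frame(header, sizeof len);
        frame += msg;

        // The lock keeps another thread's frame from landing between two
        // partial writes of this one.
        std::lock_guard<std::mutex> lock(write_mutex_);
        std::size_t sent = 0;
        while (sent < frame.size()) {
            ssize_t n = send(fd_, frame.data() + sent, frame.size() - sent, MSG_NOSIGNAL);
            if (n <= 0) return false;
            sent += static_cast<std::size_t>(n);
        }
        return true;
    }

private:
    int fd_;
    std::mutex write_mutex_;
};
```

The server would keep one such object per client socket and share it between the threads that send to that client; the client reads 4 bytes, converts with ntohl(), and then reads exactly that many payload bytes.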

send() is not atomic in most implementations, so sending to 1000 different sockets from multiple threads could lead to mixed-up messages arriving on the client side, and all kinds of weirdness. (I know nothing; see Nicolai's and Robert's comments below. The rest of my comment still stands, though, in terms of being a solution to your problem.)
What I would do is use threads for sending like you use them for receiving. One thread to manage sending to one (or more) sockets that ensures that you don't write to one socket from multiple threads at the same time.
Also look here for some additional discussion and more interesting links.
If you're on Windows, the Winsock Programmer's FAQ is an invaluable resource; for your issue see here.
