Separating messages in a simple TCP echo server using Winsock DLL


Please consider a simple echo server using TCP and the Winsock DLL. The client application sends messages from multiple threads. The recv call on the server sometimes returns with multiple messages stored in the passed buffer. At that point, the server has no way of knowing whether this is one huge message or multiple small messages.
I've read that one could use setsockopt in combination with the TCP_NODELAY option. Apart from the fact that MSDN states that this option is implemented for backward compatibility only, it doesn't even change the behavior described above.
Of course, I could introduce some kind of delimiter at the end of each message and split the messages on the server side. But I don't think that's the way one should do it. So, what is the right way to do it?

Firstly, TCP_NODELAY was not the right way to do this. TCP is a byte stream protocol, and any given connection only maintains the byte ordering, not necessarily the boundaries of any given send/write. It's inherently broken to rely on multiple unsynchronised threads being able to keep the messages they want to send together on the stream. For example, say thread 1 wants to send the two-byte message "AB" and thread 2 wants to send "XY". Say thread 1 starts first and the output buffer only has room for one byte: send will enqueue "A" and let thread 1 know it has only sent one byte (so it should loop and retry, preferably after waiting for notification that the output queue has more space). Then thread 2 might get some or all of "XY" into the queue before thread 1 can get "B" in. These sorts of problems become more severe on slower connections and on slow or loaded machines (e.g. perhaps a low-powered phone that's playing video and multitasking while your app runs over 3G).
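Since the stream only preserves byte order, the receiver has to recover message boundaries itself. One common way to do that is to prefix each message with its length. The following is only a minimal sketch of that idea: the 4-byte big-endian prefix and the names frame_message and FrameDecoder are assumptions made up for illustration, not part of Winsock.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Prepend a 4-byte big-endian length to the payload (assumed wire format).
std::string frame_message(const std::string& payload) {
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    std::string out;
    out.push_back(static_cast<char>((n >> 24) & 0xFF));
    out.push_back(static_cast<char>((n >> 16) & 0xFF));
    out.push_back(static_cast<char>((n >> 8) & 0xFF));
    out.push_back(static_cast<char>(n & 0xFF));
    out += payload;
    return out;
}

// Collects the bytes returned by successive recv() calls and hands back
// every message that has arrived completely.
class FrameDecoder {
public:
    std::vector<std::string> feed(const char* data, std::size_t len) {
        buffer_.append(data, len);
        std::vector<std::string> messages;
        while (buffer_.size() >= 4) {
            const std::uint32_t n = (byte_at(0) << 24) | (byte_at(1) << 16) |
                                    (byte_at(2) << 8) | byte_at(3);
            if (buffer_.size() - 4 < n) break;  // body not complete yet
            messages.push_back(buffer_.substr(4, n));
            buffer_.erase(0, 4 + static_cast<std::size_t>(n));
        }
        return messages;
    }

private:
    // Read one buffered byte as an unsigned value.
    std::uint32_t byte_at(std::size_t i) const {
        return static_cast<unsigned char>(buffer_[i]);
    }

    std::string buffer_;  // bytes received but not yet consumed
};
```

The decoder does not care how recv splits or merges the data: a call may deliver half a header, or several messages at once, and feed simply returns whatever is complete so far.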
The ways to ensure the logical messages stay together over TCP include:
have a single sending thread that picks up messages sequentially from a shared queue (a mutex might be used to let the threads enqueue messages)
contest a lock (mutex) so the threads' sends have an uninterrupted ability to loop to send until a complete message is sent (this wouldn't suit some apps because any of the threads could be held up for quite a while doing comms work)
use a separate TCP connection per thread

Related

Simultaneous read/write on the same serial port

I am building an application that intercepts a serial communication line by receiving the transmission, modifying the data, and echoing the changed result.
The transmitted data consists of status sentences sent at a high baud rate, with a lot of data.
I have created two threads: one reads the sentences and pushes a pointer to each new sentence into a queue, and the other pops the pointers from the queue, manipulates the sentences, sends them to the serial port, and deletes the pointer.
The queue operations are in external functions guarded by CriticalSection locks, so that part works fine.
To make sure the queue doesn't overflow quickly, I need to send the messages quickly and not wait for the receiving to end.
To my understanding, serial ports can receive and transmit simultaneously, but trying to do so gives an error about access restrictions.
The other solution is to split the system across two different ports, but I am trying to avoid that because of the hardware changes and the need for another USB port and converter.
I read about OVERLAPPED structures but didn't fully understand what they are used for; as I understood it, they manage asynchronous operations, whereas my issue is parallel operation.
Sorry for my poor English; any help or explanation is welcome.
I used this class for the serial communication, enabling overlapped mode when opening the COM port to allow wait-event timeouts:
http://www.codeproject.com/Articles/992/Serial-library-for-C
Thanks in advance.
Roman.
Clarification:
I'm not opening the port twice; I open it once in the main program and pass the handle to both threads (writing it down now highlights the problem with this approach).
More details:
The error comes from the Cserial library:
"Cserial::read overlapped complete without result." If I comment out the send-back-to-serial command in the sending thread, no error is raised and the queue fills and displays correctly.
I'm on a classified system without internet access, so I can't upload the sample; I'm writing from my tablet. The error occurs after I get the first sentence, which triggers the first send command as soon as the queue's size changes. The receiving thread then exits because the receive fails, so the queue stops filling and nothing is sent out.
This is probably because both threads use the same serial handle, but what is the alternative for accessing the same port simultaneously without locking one thread or the other?
Ignoring error 996, which is the error ID of "read overlapped completed without results", and not exiting the thread when it is detected, makes both the received and the transmitted data wrong (missing bytes).
The bottom line, after asking a lot of questions:
Why is a read operation interrupted by a write operation if these are two separate communication lines? Can I use two handles, one for each task, on the same port?
Are D+ and D- in USB transmit and receive respectively, or are both lines used for both transmit and receive?

":read overlapped complete without result"
Are you preventing the read from being interrupted by the OS switching execution to the write thread? You need to protect this from happening by using a mutex or similar.
The real solution is to switch to an asynchronous library, such as boost::asio.
Why is a read operation interrupted by a write operation if these are two separate communication lines?
Here is a possible hand-waving visualization of what happens if you use synchronous operations in two threads without locking them against each other. (I am guessing at the details of how you arranged your software.)
Your app receives a read request from the port.
Your app requests the OS to start the read thread.
OS agrees, and your read thread completes the read.
Your app does its processing.
Your app asks the OS to start the write thread.
The OS agrees, and your write thread starts a write.
A second read request arrives on the port. This does not interrupt anything, it just waits.
The write is not yet finished, but the OS decides that the write thread has had enough time. It decides to switch context to the read thread which is waiting.
The read thread starts reading
Again the OS decides that the running thread ( read ) has had a fair crack at the CPU . It switches context back to the write thread. This crashes the unfinished read. Note that this happens in your software, not in the hardware, or the hardware driver.
This should give you a general insight into the sort of problems that occur unless you keep the OS from running the reads and writes over the top of each other. It is a matter of opinion whether it is better to use multithreading with mutexes (or equivalent) or asynchronous event-driven designs.

Two threads can't operate on a single port / file descriptor. Depending on which library you used, you should try to do this asynchronously, or by checking how many bytes can be read or written without blocking the thread. (If it is a raw Linux file descriptor, you should look at poll / select.)

Looking for best approach to sending the same data to multiple destinations using sockets

Looking for the best approach to sending the same message to multiple destinations using TCP/IP sockets. I'm working with an existing VS 2010 C++ application on Windows. Hoping to use a standard library/design pattern approach that has many of the complexities already worked out if possible.
Here's one approach I'm thinking about.. One main thread retrieves messages from a database and adds them to some sort of thread safe queue. The application also has one thread for each client socket connection to some destination server. Each one of these threads would read from the thread safe queue, and send the message over a tcp/ip socket.
There may be better/simpler/more robust approaches than this one though..
The issues I have to be concerned about mostly are latency. The destinations could be anywhere, and there may be significant latency between one socket connection and another.
The messages must go in an exact FIFO order to all the destinations.
Also one destination will be considered the primary destination.. all messages must get to this destination, no exceptions. For the other destinations, i.e. non-primary, the messages are just copies and it's not absolutely critical if the non-primary destinations do not receive a few messages. At any point, one of the non-primary destinations could become the primary destination. If one of the destinations falls too far behind, then that thread would need to catch up to the primary destination, but skipping some messages.
Looking for any suggestions. Preliminary research so far, my situation appears to be something akin to a single producer and multiple consumers pattern, or possibly master-worker pattern in Java.
I need to implement this in C++ on Windows, and the application must use tcp/ip sockets using an existing defined protocol.
Any help at all would be greatly appreciated.

You need exactly two threads, one that saturates the IO channel to the database and another that saturates the IO channel to the network leading to the 12 servers. Unless you have multiple network interfaces (which you should think about!) you don't send things faster by using multiple threads. Also, since you don't have multiple threads taking care of the network, you don't have to sync them.
What you definitely need to know about is select(). In the case of WinSock, also take a look at WSAEventSelect/WaitForMultipleObjects. Basically, you take a message from the queue and then send it to all clients when they're ready. select() tells you when one of a set of sockets is ready to accept data, so you don't waste time waiting or block trying to send data. What you need to come up with is a schema to reconnect after broken connections, when to drop messages to lagging clients etc. Also, in case the throughput to the different targets varies a lot, you need to think about handling multiple messages in parallel. If they are small (less than a network packet's payload) it makes sense combining them anyway to avoid overhead.
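The "when to drop messages to lagging clients" part can be kept separate from the socket code. Below is a rough sketch of the bookkeeping only, under assumptions of my own: FanoutLog, max_lag and the other names are made up for illustration, and the log is never trimmed. The single network thread would call next_for when select() reports a destination as writable and mark_sent once the message has gone out completely.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical bookkeeping for one sender thread serving several destinations.
// Every destination has its own cursor into a shared, ordered message log.
class FanoutLog {
public:
    FanoutLog(std::size_t destinations, std::size_t primary, std::size_t max_lag)
        : next_(destinations, 0), primary_(primary), max_lag_(max_lag) {}

    // Append a message fetched from the database.
    void publish(const std::string& message) { log_.push_back(message); }

    // Promote another destination to primary (it stops skipping from now on).
    void set_primary(std::size_t d) { primary_ = d; }

    // Copies the next message for destination d into out.
    // Returns false when d has nothing left to send.
    bool next_for(std::size_t d, std::string& out) {
        const std::size_t end = log_.size();
        // A non-primary destination that has fallen too far behind skips
        // ahead; the primary never skips.
        if (d != primary_ && end - next_[d] > max_lag_) {
            next_[d] = end - max_lag_;
        }
        if (next_[d] == end) return false;
        out = log_[next_[d]];
        return true;
    }

    // Call after the message returned by next_for(d, ...) was sent completely.
    void mark_sent(std::size_t d) { ++next_[d]; }

private:
    std::vector<std::string> log_;   // grows without bound in this sketch
    std::vector<std::size_t> next_;  // index of the next message per destination
    std::size_t primary_;
    std::size_t max_lag_;
};
```

Each destination still sees messages in FIFO order; a lagging copy destination just sees a gap. Real code would also trim log entries that every destination has passed.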
I hope this short overview helps getting you started, otherwise I can elaborate on the details.

Multi-reader IPC solution?

I'm working on a framework in C++ (just for fun for now) that lets the user write plugins that use a standard API to stream data between each other. There are going to be three basic transport mechanisms for the data: files, sockets, and some kind of IPC piping system. The system is set up so that for the non-file transports, each stream can have multiple readers, i.e. once a server socket is set up, multiple computers can connect and stream the data. I'm a little stuck at the multi-reader IPC system though.
All my plugins run in threads (though I may want to go to a process-based system eventually), so they live in the same address space and some kind of shared memory system would work fine. I was thinking I'd write my own circular buffer with a write pointer and read pointers chasing it around the buffer, but I have my doubts that I can achieve the same performance as something like Linux pipes.
I'm curious what people would suggest for a multi-reader solution to something like this? Is the overhead for pipes or domain sockets low enough that I could just open a connection to each reader and issue separate writes to each reader? This is intended to be significant volumes of data (tens of mega-samples/sec), so performance is a must.

I develop a media server, and I usually use a single reader for a group of all active sockets of the same class. You can use the select() function (in blocking or non-blocking mode) for each group to read the sockets that have become ready to be read. When socket data is ready or a new connection occurs, I just call a notify callback function to manage it.
Each reader (that controls a group of sockets) could be managed by a separate thread, avoiding your main threads to block while waiting for new connections or socket data.

If I understand the description correctly, it seems to me that using a circular queue as you mention would be a good IPC solution. I think it could scale very well and would ultimately be better than individual pipes or individual shared memory for each client. One (of several) of the issues of using a single queue/buffer for multiple clients is to synchronize access to the buffers. A client needs to be able to successfully read an entry in the queue without the server changing it. Here is a possible mechanism for implementing that.
This requires that the server know how many active clients there are. That, I assume, would be possible as long as the clients are doing some kind of registration/login with the server (almost certainly true if they are in-process but not necessarily true for out-of-process clients).
Suppose there are N clients. For this example, assume 100 active clients.
Maintain two counting semaphores for each entry in the circular queue. If using out-of-process clients, these need to be shared between processes. Call the semaphores SemReady and SemDone.
Use SemReady to indicate that the buffer is ready for clients to read. The server writes to the buffer entry and then sets the value of the semaphore to the number of clients (100 in this case). More on this in a bit.
When a client wants to read an entry in the queue, it waits on the associated SemReady semaphore. If the initial value is at 100, then all 100 clients can successfully get the semaphore and “concurrently” read the data.
When a client is done reading/using the entry, it increments/releases the SemDone semaphore.
When a server wants to write to a buffer entry, it needs to make sure of two things: a) no clients are currently reading it, and b) no clients start to read it once the server is writing to it.
Therefore, first, block any further access to the buffer by waiting on the SemReady semaphore until the count is zero (obviously, use a zero timeout). When it hits zero, the server knows that no additional clients will start reading it.
To know that clients are done with the buffer, the server uses the SemDone semaphore. It checks SemDone and waits until its value is N minus the number of waits it did on SemReady. In other words, if SemReady was at zero, then it means all clients read the buffer entry, therefore SemDone should be at N (100) when they are done. If, though, the server waited 10 times on SemReady, then SemDone should be at 90 (N-10) when all clients are done.
The above step needs some kind of timeout and status check on client “liveness” in case a client crashes/quits after getting SemReady and before releasing SemDone. Also, it would need to account for the possibility of new client registering during that step as well in order to keep the semaphore count values in sync.
Once the server has found no more clients are reading the buffer, it can reset SemDone to zero, write new data to the entry, and set SemReady to N (100).
Rinse and repeat.
Note 1 There are other synchronization issues to maintain the head/tail of the circular queue so that clients know where it is.
Note 2 SemDone could probably be an integer counter handled with atomic increments… I think it could anyway. Needs a bit of thought.
Note 3 It might make sense to have multiple threads in the server writing to the buffer entries. That way, if the server has to wait/timeout a bit on a crashed client that started reading but did not finish, it would not block subsequent queue entries that other clients might already be waiting for.

How can I slow down a TCP connection on Windows?

I am developing a Windows proxy program where two TCP sockets, connected through different adapters are bridged by my program. That is, my program reads from one socket and writes to the other, and vice versa. Each socket is handled by its own thread. When one socket reads data it is queued for the other socket to write it. The problem I have is the case when one link runs at 100Mb and the other runs at 10Mb. I read data from the 100Mb link faster than I can write it to the 10Mb link. How can I "slow down" the faster connection so that it is essentially running at the slower link speed? Changing the faster link to a slower speed is not an option. --Thanks

Create a fixed length queue between reading and writing threads. Block on the enqueue when queue is full and on dequeue when it's empty. Regular semaphore or mutex/condition variable should work. Play with the queue size so the slower thread is always busy.

If this is a problem, then you're writing your program incorrectly.
You can't put more than 10mbps on a 10mbps link, so your thread that is writing on the slower link should start to block as you write. So as long as your thread uses the same size read buffer as write buffer, the thread should only consume data as quickly as it can throw it back out the 10mbps pipe. Any flow control needed to keep the remote sender from putting more than 10mbps into the 100mbps pipe to you will be taken care of automatically by the TCP protocol.
So it just shouldn't be an issue as long as your read and write buffers are the same size in that thread (or any thread).

Stop reading the data when you are not able to write it.
There is a queue of bytes coming into your program from the 100Mb/s link, and a queue out of your program to the 10Mb/s link. When the outgoing queue is full, stop reading from the incoming queue and TCP will throttle back the client on the 100Mb/s link.
You can use an internal queue between the reader and the writer to implement this cleanly.

A lot of complicated - and correct - solutions have been expounded. But really, to get to the crux of the matter - why do you have two threads? If you did the socket-100 read, socket-10 write in a single thread, it would naturally block on the write and you wouldn't have to design anything complicated.

If you are doing a non-blocking, select()-style event loop: only call FD_SET(readSocket, &readSet) if your outgoing-data queue is smaller than some hard-coded maximum size.
That way, when the outgoing socket falls behind, your proxy will stop reading data from the faster client until it catches back up. The TCP protocol will take care of the rest (in particular, it will tell your faster client to slow down for a while)

Producer/Consumer For Talking to Devices Serially

Here is my problem: I have to be able to send and receive to a device over serial. This has to be done in a multi-threaded fashion. The flow is as follows:
Wait for device to send me something - or if idle, then query status to see if online with device
If device sends me something, then process message, acknowledge, and tell device to perform other commands as necessary
Right now, I have a receive thread and transmit thread. The receive thread has a while loop that keeps checking the serial port via ReadFile(...) for one byte. If I have a byte, then I begin building my buffer and then parse the data to determine what was sent to me.
The send thread takes the next command defined by the read thread and sends it via WriteFile to the same COM port. The key is that there is a receive/send relationship between myself and the device.
My question is: do I have a nested Producer/Consumer model here? If my receive thread is consuming from the device and the send thread is producing to the device, the threads inherently need to talk so that they are synchronized, right? What is the best way to synchronize my efforts to talk to the device efficiently and quickly? Note: I am using C++ Builder 5, which has TThreads and can use critical sections and mutexes.
Edit: I am also using polling so I am open to using WaitCommEvent as well if this will work better!

What resources are you sharing that you think you need to synchronize?
If you have something like a queue in between the two threads, then that is a pretty classic producer/consumer model. E.g., if you just have one thread reading and then putting commands in a queue while another thread extracts from the queue, processes the command and writes to the device, then you need to synchronize access to the queue with a mutex or semaphore.
Perhaps I'm missing something but this should only get complicated if you have multiple threads reading from the queue and the commands which need to be transmitted need to stay in order. So try to keep it simple.
