How can I slow down a TCP connection on Windows? - c++

I am developing a Windows proxy program where two TCP sockets, connected through different adapters, are bridged by my program. That is, my program reads from one socket and writes to the other, and vice versa. Each socket is handled by its own thread. When one socket reads data, it is queued for the other socket to write. The problem I have is the case where one link runs at 100Mb and the other runs at 10Mb. I read data from the 100Mb link faster than I can write it to the 10Mb link. How can I "slow down" the faster connection so that it is essentially running at the slower link speed? Changing the faster link to a slower speed is not an option. --Thanks

Create a fixed-length queue between the reading and writing threads. Block on enqueue when the queue is full and on dequeue when it is empty. A regular semaphore or a mutex/condition variable should work. Play with the queue size so the slower thread is always busy.
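A minimal sketch of such a queue, where the chunk type, depth, and names are assumptions of mine:

    #include <condition_variable>
    #include <deque>
    #include <mutex>
    #include <vector>

    // Fixed-depth queue of data chunks shared by the two socket threads.
    class BoundedQueue {
        std::deque<std::vector<char>> q_;
        size_t maxChunks_;
        std::mutex m_;
        std::condition_variable notFull_, notEmpty_;
    public:
        explicit BoundedQueue(size_t maxChunks) : maxChunks_(maxChunks) {}

        // Reader thread: blocks while the queue is full, which paces the
        // fast link and lets TCP throttle the remote sender.
        void push(std::vector<char> chunk) {
            std::unique_lock<std::mutex> lock(m_);
            notFull_.wait(lock, [&] { return q_.size() < maxChunks_; });
            q_.push_back(std::move(chunk));
            notEmpty_.notify_one();
        }

        // Writer thread: blocks while the queue is empty.
        std::vector<char> pop() {
            std::unique_lock<std::mutex> lock(m_);
            notEmpty_.wait(lock, [&] { return !q_.empty(); });
            std::vector<char> chunk = std::move(q_.front());
            q_.pop_front();
            notFull_.notify_one();
            return chunk;
        }
    };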

If this is a problem, then you're writing your program incorrectly.
You can't put more than 10mbps on a 10mbps link, so your thread that is writing on the slower link should start to block as you write. So as long as your thread uses the same size read buffer as write buffer, the thread should only consume data as quickly as it can throw it back out the 10mbps pipe. Any flow control needed to keep the remote sender from putting more than 10mbps into the 100mbps pipe to you will be taken care of automatically by the TCP protocol.
So it just shouldn't be an issue as long as your read and write buffers are the same size in that thread (or any thread).
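A minimal sketch of that relay, assuming Winsock and a 4 KB buffer:

    #include <winsock2.h>

    // Pump one direction of the bridge; the blocking send() ensures recv()
    // is never called faster than the slow link can drain.
    void relay(SOCKET from, SOCKET to) {
        char buf[4096];
        for (;;) {
            int n = recv(from, buf, (int)sizeof(buf), 0);  // up to one buffer
            if (n <= 0) break;                             // closed or error
            int sent = 0;
            while (sent < n) {                             // send may be partial
                int w = send(to, buf + sent, n - sent, 0);
                if (w <= 0) return;
                sent += w;
            }
        }
    }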

Stop reading the data when you are not able to write it.
There is a queue of bytes coming into your program from the 100Mb/s link, and a queue out of your program to the 10Mb/s link. When the outgoing queue is full, stop reading from the incoming queue, and TCP will throttle back the client on the 100Mb/s link.
You can use an internal queue between the reader and the writer to implement this cleanly.

A lot of complicated - and correct - solutions have been expounded. But really, to get to the crux of the matter - why do you have two threads? If you did the socket-100 read, socket-10 write in a single thread, it would naturally block on the write and you wouldn't have to design anything complicated.

If you are doing a non-blocking, select()-style event loop: only call FD_SET(readSocket, &readSet) if your outgoing-data queue is smaller than some hard-coded maximum size.
That way, when the outgoing socket falls behind, your proxy will stop reading data from the faster client until it catches back up. The TCP protocol will take care of the rest (in particular, it will tell your faster client to slow down for a while).
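As a sketch of that loop (Winsock; the 64 KB cap, buffer size, and names are assumptions of mine):

    #include <winsock2.h>
    #include <algorithm>
    #include <deque>
    #include <vector>

    const size_t MAX_QUEUE_BYTES = 64 * 1024;  // tune to taste

    void proxyLoop(SOCKET fastSock, SOCKET slowSock) {
        std::deque<char> outQueue;    // bytes waiting for the slow link
        std::vector<char> buf(4096);
        for (;;) {
            fd_set readSet, writeSet;
            FD_ZERO(&readSet);
            FD_ZERO(&writeSet);
            if (outQueue.size() < MAX_QUEUE_BYTES)
                FD_SET(fastSock, &readSet);   // only read while there is room
            if (!outQueue.empty())
                FD_SET(slowSock, &writeSet);
            if (select(0, &readSet, &writeSet, NULL, NULL) <= 0)
                break;
            if (FD_ISSET(fastSock, &readSet)) {
                int n = recv(fastSock, buf.data(), (int)buf.size(), 0);
                if (n <= 0) break;
                outQueue.insert(outQueue.end(), buf.begin(), buf.begin() + n);
            }
            if (FD_ISSET(slowSock, &writeSet)) {
                // send() needs contiguous memory, so copy the queue front out
                size_t len = std::min(outQueue.size(), buf.size());
                std::copy(outQueue.begin(), outQueue.begin() + len, buf.begin());
                int n = send(slowSock, buf.data(), (int)len, 0);
                if (n <= 0) break;
                outQueue.erase(outQueue.begin(), outQueue.begin() + n);
            }
        }
    }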

Related

simultaneous read/write on the same serial port

I am building an application that intercepts a serial communication line by receiving the transmission, modifying the data, and echoing the changed result.
The transmitted data consists of status sentences at a high baud rate with a lot of data.
I have created two threads: one reads the sentences and pushes a pointer to each new sentence into a queue, and the other pops the pointers out of the queue, manipulates them, sends them to the serial port, and deletes the pointer.
The queue operations are in external functions with CriticalSection locks, so that part works fine.
To make sure the queue doesn't overflow, I need to send the messages quickly and not wait for the receiving to end.
To my understanding, serial ports can receive and transmit simultaneously, but trying to do so gives errors about access restrictions.
The other solution is to split the system across two different ports, but I am trying to avoid that because of the hardware changes and the need for another USB port and converter.
I read about OVERLAPPED structures but didn't fully understand their usage; as I understand it, they manage asynchronous operation, whereas my issue is parallel operation.
Sorry for my poor English; any help or explanation will help.
I used this class for the serial communication, setting overlapped to enabled when opening the COM port to allow wait-event timeouts:
http://www.codeproject.com/Articles/992/Serial-library-for-C
Thanks in advance.
Roman.
Clarification:
I'm not opening the port twice; I open it once in the main program and pass the handle to both threads (writing this out now highlights the problem with this approach).
More details:
The error comes from the Cserial library:
"Cserial::read overlapped complete without result." Commenting the send back to serial command in the sending thread will not raise an error and the queue is filled and displays correctly–
I'm on a classified system without internet access, so I can't upload the sample; I'm writing from my tablet. The error occurs after I get the first sentence, which triggers the first send command as soon as the queue size changes; then the receiving thread exits because the receive fails, so the queue stops filling and nothing is sent out.
Probably because both threads use the same serial handle, but what is the alternative for accessing the same port simultaneously without locking out one thread or the other?
Ignoring error 996, which is the error id of "read overlapped completed without results", and not exiting the thread when it is detected makes both the received and the transmitted data wrong (missing bytes).
The bottom line, after asking a lot of questions:
Why is a read operation interrupted by a write operation if these are two separate communication lines? Can I use two handles, one for each task, on the same port?
Are the D+/D- lines in USB dedicated to transmit and receive respectively, or are both lines used for both transmit and receive?
":read overlapped complete without result"
Are you preventing the read from being interrupted by the OS switching execution to the write thread? You need to protect this from happening by using a mutex or similar.
The real solution is to switch to an asynchronous library, such as boost::asio.
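As a sketch of the mutex idea, assuming the port handle was opened for synchronous I/O (the wrapper names are mine; with blocking reads you would also want COMMTIMEOUTS configured so the lock is not held indefinitely):

    #include <windows.h>
    #include <mutex>

    std::mutex portMutex;  // serializes all access to the shared port handle

    // Only one thread at a time touches the port, so a write can no longer
    // land in the middle of an unfinished read.
    DWORD lockedRead(HANDLE port, char* buf, DWORD len) {
        std::lock_guard<std::mutex> lock(portMutex);
        DWORD bytesRead = 0;
        ReadFile(port, buf, len, &bytesRead, NULL);   // synchronous read
        return bytesRead;
    }

    DWORD lockedWrite(HANDLE port, const char* buf, DWORD len) {
        std::lock_guard<std::mutex> lock(portMutex);
        DWORD bytesWritten = 0;
        WriteFile(port, buf, len, &bytesWritten, NULL);  // synchronous write
        return bytesWritten;
    }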
Why is a read operation interrupted by a write operation if these are two separate communication lines?
Here is a possible hand-waving visualization of what happens if you use synchronous operations in two threads without locking them against each other. (I am guessing at the details of how you arranged your software.)
1. Your app receives a read request from the port.
2. Your app requests the OS to start the read thread.
3. The OS agrees, and your read thread completes the read.
4. Your app does its processing.
5. Your app asks the OS to start the write thread.
6. The OS agrees, and your write thread starts a write.
7. A second read request arrives on the port. This does not interrupt anything; it just waits.
8. The write is not yet finished, but the OS decides that the write thread has had enough time. It decides to switch context to the read thread, which is waiting.
9. The read thread starts reading.
10. Again the OS decides that the running thread (read) has had a fair crack at the CPU. It switches context back to the write thread. This breaks the unfinished read. Note that this happens in your software, not in the hardware or the hardware driver.
This should give you a general insight into the sort of problems that occur unless you keep the OS from running the reads and writes over the top of each other. It is a matter of opinion whether it is better to use multithreading with mutexes (or equivalent) or asynchronous event-driven designs.
Two threads can't safely operate on a single port / file descriptor. Depending on what library you used, you should try to do this asynchronously, or by checking how many bytes can be read/written without blocking the thread. (If it is a raw Linux file descriptor, you should look at poll/select.)
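For the POSIX case, a rough sketch of that readiness check (the function name and the 100 ms timeout are placeholders):

    #include <poll.h>
    #include <unistd.h>
    #include <cstddef>

    // Check which directions are ready before touching the descriptor, so
    // neither a read nor a write can block the other.
    void pumpOnce(int fd, char* inBuf, size_t inLen,
                  const char* outBuf, size_t outLen) {
        struct pollfd pfd;
        pfd.fd = fd;
        pfd.events = POLLIN | POLLOUT;
        pfd.revents = 0;
        if (poll(&pfd, 1, 100) > 0) {        // 100 ms timeout
            if (pfd.revents & POLLIN)
                read(fd, inBuf, inLen);      // data waiting: won't block
            if (pfd.revents & POLLOUT)
                write(fd, outBuf, outLen);   // buffer space: won't block
        }
    }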

Non-blocking data sharing through OpenMPI

I'm trying to spread data across multiple workers using OpenMPI; however, I'm doing the data division in a fairly custom way that is not amenable to MPI_Scatter or MPI_Bcast. What I would like to do is give each processor some work in a queue (or some other async mechanism) such that they can do their work on the first chunk of data, take the next chunk, and repeat until no more chunks remain.
I know of MPI_Isend; however, if I send data with MPI_Isend I can't modify it until it's finished sending, forcing me to use MPI_Wait and thus having to wait until the thread is finished receiving the data anyway!
Is there a standard solution to this problem, or must I rethink my approach?
Using MPI_ISEND doesn't necessarily mean that the message is received on the remote end. It just means that the buffer is available for reuse. It could be that the message has been buffered internally by Open MPI or that the message actually has been received on the other end. It depends on your message size.
Another option would be to have your workers ask the master process for work when they need it instead of having it pushed to them. Then the master can hand out work only as needed. You could do an MPI_SCATTER for the first message since everyone will be receiving some data. Then after that, have the master do an MPI_RECV(MPI_ANY_SOURCE) to get a message from one of the worker processes.
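A rough sketch of that pull model (the tags, double-typed chunks, and rank 0 as master are assumptions of mine):

    #include <mpi.h>
    #include <vector>

    const int TAG_REQUEST = 1, TAG_WORK = 2, TAG_DONE = 3;

    // Rank 0: hand out chunks as workers ask for them.
    void master(int numWorkers, const std::vector<std::vector<double> >& chunks) {
        size_t next = 0;
        int finished = 0, dummy = 0;
        while (finished < numWorkers) {
            MPI_Status status;
            MPI_Recv(&dummy, 1, MPI_INT, MPI_ANY_SOURCE, TAG_REQUEST,
                     MPI_COMM_WORLD, &status);
            if (next < chunks.size()) {
                MPI_Send((void*)chunks[next].data(), (int)chunks[next].size(),
                         MPI_DOUBLE, status.MPI_SOURCE, TAG_WORK, MPI_COMM_WORLD);
                ++next;
            } else {  // no chunks left: tell this worker to stop
                MPI_Send(&dummy, 0, MPI_DOUBLE, status.MPI_SOURCE, TAG_DONE,
                         MPI_COMM_WORLD);
                ++finished;
            }
        }
    }

    // Workers: request, receive, process, repeat until TAG_DONE arrives.
    void worker(int maxChunkSize) {
        std::vector<double> buf(maxChunkSize);
        for (;;) {
            int dummy = 0;
            MPI_Send(&dummy, 1, MPI_INT, 0, TAG_REQUEST, MPI_COMM_WORLD);
            MPI_Status status;
            MPI_Recv(buf.data(), maxChunkSize, MPI_DOUBLE, 0, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &status);
            if (status.MPI_TAG == TAG_DONE) break;
            // MPI_Get_count(&status, MPI_DOUBLE, &n) gives the real chunk size
            // ... process the chunk in buf here ...
        }
    }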

Unix IPC sockets send/recv sync

I'm using a local Unix socket to communicate between two different processes. The thing is, some parts of the code on both ends take different amounts of time to run, and I need recv and send to be synced across both processes. Is there a way to force send and recv to wait for the next corresponding line on the opposite process?
You must implement a protocol. After all, you cannot be sure that the sockets are in sync. For example, you could send one packet of 100 bytes and then receive it as two or even more packets that add up to the same data.
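One common protocol, sketched here with POSIX sockets and illustrative helper names, is to prefix every message with its length so that recv boundaries stop mattering:

    #include <arpa/inet.h>
    #include <sys/socket.h>
    #include <cstdint>
    #include <string>

    // Loop until exactly len bytes arrive; recv may return partial reads.
    static bool recvAll(int fd, void* buf, size_t len) {
        char* p = static_cast<char*>(buf);
        while (len > 0) {
            ssize_t n = recv(fd, p, len, 0);
            if (n <= 0) return false;   // peer closed or error
            p += n; len -= n;
        }
        return true;
    }

    static bool sendAll(int fd, const void* buf, size_t len) {
        const char* p = static_cast<const char*>(buf);
        while (len > 0) {
            ssize_t n = send(fd, p, len, 0);
            if (n <= 0) return false;
            p += n; len -= n;
        }
        return true;
    }

    // Each message: 4-byte big-endian length prefix, then the payload.
    bool sendMessage(int fd, const std::string& msg) {
        uint32_t len = htonl(static_cast<uint32_t>(msg.size()));
        return sendAll(fd, &len, sizeof(len)) &&
               sendAll(fd, msg.data(), msg.size());
    }

    bool recvMessage(int fd, std::string& msg) {
        uint32_t len = 0;
        if (!recvAll(fd, &len, sizeof(len))) return false;
        msg.resize(ntohl(len));
        return recvAll(fd, &msg[0], msg.size());
    }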
By default, recv() will block (wait) until there are data to read, while send() will block until there is space in the buffer to write to. For most applications, this is enough synchronisation (if you design your protocol sanely).
So I recommend you just think about the details of how your communication will work, and try it out. Then if there is still a problem, come back with a question that is as specific as possible.

network delays and Application->ProcessMessages()

I am writing a networking DLL that I use in my C++Builder project. This DLL works with remote FTP servers. I noticed a strange behavior when recv() is called. Sometimes it returns 0. But in another thread when recv() is called on the same socket, data is received as expected.
What does this mean? I also noticed that calling Application->ProcessMessages() inside the DLL thread speeds up data receiving.
But what is wrong? Doesn't ProcessMessages() just process window messages, or am I missing something?
Thank you
If I understood you correctly and you are trying to recv on the same SOCKET in parallel threads, then don't do that; there is nothing to gain from it. The data you recv is already buffered by the underlying system, and you are accessing that buffer. What you could do is make multiple buffers for the recv calls, so that when one returns data you can pass that buffer to the "upper levels" for processing and use another one for the next recv call. You can also use just one large buffer with bookkeeping that records which part is ready for processing and which part is being used for receiving. The system probably has locks that forbid multiple reads on the same socket, which is why the result of one recv is 0. If it didn't have them, you would probably end up with almost randomly split data.
EDIT: Full and long explanation
I think that using multiple threads to read from a single socket is not useful.
Sockets are a software-regulated thing. Your network device doesn't create any "connections"; it just processes the data received and wraps/unwraps it into IP (or any other supported Internet-layer) packets. (This depends on the network device; some of them are almost entirely software-emulated by the OS and actually perform just the basic "write to tx, read rx" services, but to us it's the same deal.) The WinSock2 service recognizes packets with specific data (as you have already noticed), so that you may use one network device to communicate with multiple peers simultaneously. WinSock2 actively monitors the traffic before handing it out to you. In other words: by the time you get a successful recv, the data was already there, and the underlying system has checked the socket you passed as a parameter to recv and handed you only the data already marked as belonging to that socket.
Reading with multiple threads from one socket (without the almost useless MSG_PEEK) would make the system, if it didn't have locks, copy an unknown number of bytes to the location supplied in recv in thread one and permanently advance its internal data pointer by the number of copied bytes; then, before all the data available to that recv is copied to location 1, the other thread would kick in and also copy an unknown number of bytes, again advancing the internal data pointer. The result of this type of reading would, ideally, be half of the data stored at the location supplied by thread 1 and the other half at the location supplied by thread 2. Since even that ideal result is uncertain (the time the system allocates to these two threads is not guaranteed to be equal), you would end up with unsorted data and no means of sorting it, since the information the underlying system uses to know which data belongs to which socket is not available to you.
Given that your system is most likely faster than your network device, I stand by my two solutions, the first one preferred, as I have been using that method for both big and small chunks of data transfer:
1. Make one reading thread per connected socket and one circular buffer per thread. The size of the buffer depends on the size of the chunks you expect to receive and on how long the further processing takes. Keep a current read position and a "to process" count. When data is received, notify the thread/threads that are supposed to process the data in the buffer, and record the position of the data being read. Continue to recv while there is buffer space that is not being processed; otherwise wait until there is (you must implement this in case your computer chokes somewhere; in normal situations it shouldn't happen). You must synchronize the receiving thread with the processing thread/threads when they access the "to process" count and "current read position" variables, as those tell you which bytes you can reuse in the circular buffer. A sketch follows this list.
2. Create and connect one socket per desired reading thread, so that the system knows how to regulate the data on its own.
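Here is a condensed sketch of option 1's bookkeeping (the sizes, names, and byte-oriented interface are my assumptions):

    #include <algorithm>
    #include <condition_variable>
    #include <mutex>
    #include <vector>

    // Circular buffer shared by one recv thread and one processing thread;
    // used_ is the "to process" count, readPos_ the current read position.
    class RingBuffer {
        std::vector<char> buf_;
        size_t writePos_ = 0, readPos_ = 0, used_ = 0;
        std::mutex m_;
        std::condition_variable notEmpty_, notFull_;
    public:
        explicit RingBuffer(size_t size) : buf_(size) {}

        // recv thread: append len bytes, blocking while the buffer is full.
        void push(const char* data, size_t len) {
            std::unique_lock<std::mutex> lock(m_);
            while (len > 0) {
                notFull_.wait(lock, [&] { return used_ < buf_.size(); });
                size_t n = std::min(len, buf_.size() - used_);
                for (size_t i = 0; i < n; ++i) {
                    buf_[writePos_] = data[i];
                    writePos_ = (writePos_ + 1) % buf_.size();
                }
                data += n; len -= n; used_ += n;
                notEmpty_.notify_one();  // wake the processor before waiting
            }
        }

        // processing thread: take up to maxLen bytes, blocking while empty.
        size_t pop(char* out, size_t maxLen) {
            std::unique_lock<std::mutex> lock(m_);
            notEmpty_.wait(lock, [&] { return used_ > 0; });
            size_t n = std::min(maxLen, used_);
            for (size_t i = 0; i < n; ++i) {
                out[i] = buf_[readPos_];
                readPos_ = (readPos_ + 1) % buf_.size();
            }
            used_ -= n;
            notFull_.notify_one();  // these bytes can now be reused
            return n;
        }
    };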
The thing you are referring to as random threads reading from a single socket is maybe achievable through the following scenarios:
- One thread polls the socket to see if there is data available. When data is available, it uses some mutex to wait if another thread is already in the reading state, then starts a new thread to read and process the existing data.
- Or something like this: a thread does its recv, and as soon as it has done a successful recv (yay, the data is in the buffer) it starts another thread from some thread pool to do the next recv, then continues to process the data and ends itself.
These are the only ways I can imagine that "reading with multiple threads on a single socket" is achievable. And yes, even then there won't be multiple threads calling recv at the same time.
Sorry for the long post and the spelling and grammar errors; I hope this helps you a bit.
Ensure that the socket is properly bound to the handle you are using in the recv function.
You cannot speed up data reception unless there is a channel to receive the data.

Multi-reader IPC solution?

I'm working on a framework in C++ (just for fun for now) that lets the user write plugins that use a standard API to stream data between each other. There are going to be three basic transport mechanisms for the data: files, sockets, and some kind of IPC piping system. The system is set up so that, for the non-file transports, each stream can have multiple readers. I.e., once a server socket is set up, multiple computers can connect and stream the data. I'm a little stuck on the multi-reader IPC system, though.
All my plugins run in threads (though I may want to go to a process-based system eventually), so they live in the same address space, and some kind of shared-memory system would work fine. I was thinking I'd write my own circular buffer with a write pointer and read pointers chasing it around the buffer, but I have my doubts that I can achieve the same performance as something like Linux pipes.
I'm curious what people would suggest for a multi-reader solution to something like this? Is the overhead for pipes or domain sockets low enough that I could just open a connection to each reader and issue separate writes to each reader? This is intended to be significant volumes of data (tens of mega-samples/sec), so performance is a must.
I develop a media server, and I usually use a single reader for a group of all active sockets of the same class. You can use the select() function (in blocking or non-blocking mode) on each group to read the sockets that have become ready to be read. When socket data is ready or a new connection occurs, I just call a notify callback function to manage it.
Each reader (controlling a group of sockets) can be managed by a separate thread, keeping your main threads from blocking while waiting for new connections or socket data.
If I understand the description correctly, it seems to me that using a circular queue as you mention would be a good IPC solution. I think it could scale very well and would ultimately be better than individual pipes or individual shared memory for each client. One (of several) of the issues of using a single queue/buffer for multiple clients is to synchronize access to the buffers. A client needs to be able to successfully read an entry in the queue without the server changing it. Here is a possible mechanism for implementing that.
This requires that the server know how many active clients there are. That, I assume, would be possible as long as the clients are doing some kind of registration/login with the server (almost certainly true if they are in-process but not necessarily true for out-of-process clients).
Suppose there are N clients. For this example, assume 100 active clients.
Maintain two counting semaphores for each entry in the circular queue. If using out-of-process clients, these need to be shared between processes. Call the semaphores SemReady and SemDone.
Use SemReady to indicate that the buffer is ready for clients to read. The server writes to the buffer entry and then sets the value of the semaphore to the number of clients (100 in this case). More on this in a bit.
When a client wants to read an entry in the queue, it waits on the associated SemReady semaphore. If the initial value is at 100, then all 100 clients can successfully get the semaphore and “concurrently” read the data.
When a client is done reading/using the entry, it increments/releases the SemDone semaphore.
When a server wants to write to a buffer entry, it needs to make sure of two things: a) no clients are currently reading it, and b) no clients start to read it once the server is writing to it.
Therefore, first, block any further access to the buffer by waiting on the SemReady semaphore until the count is zero (obviously, use a zero timeout). When it hits zero, the server knows that no additional clients will start reading it.
To know that clients are done with the buffer, the server uses the SemDone semaphore. It checks SemDone and waits until its value is at N minus the number of waits it did on SemReady. In other words, if SemReady was already at zero, then all clients read the buffer entry, and therefore SemDone should be at N (100) when they are done. If, though, the server waited 10 times on SemReady, then SemDone should be at 90 (N-10) when all clients are done.
The above step needs some kind of timeout and status check on client “liveness” in case a client crashes/quits after getting SemReady and before releasing SemDone. Also, it would need to account for the possibility of new client registering during that step as well in order to keep the semaphore count values in sync.
Once the server has found no more clients are reading the buffer, it can reset SemDone to zero, write new data to the entry, and set SemReady to N (100).
Rinse and repeat.
Note 1 There are other synchronization issues to maintain the head/tail of the circular queue so that clients know where it is.
Note 2 SemDone could probably be an integer counter handled with atomic increments… I think it could anyway. Needs a bit of thought.
Note 3 It might make sense to have multiple threads in the server writing to the buffer entries. That way, if the server has to wait/timeout a bit on a crashed client that started reading but did not finish, it would not block subsequent queue entries that other clients might already be waiting for.
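Putting the scheme together for a single queue entry, here is a compressed sketch (C++20 counting semaphore, with SemDone as an atomic counter per Note 2; N, the payload type, and the liveness timeout from the steps above are simplified away or assumed):

    #include <atomic>
    #include <semaphore>
    #include <thread>

    const int N_CLIENTS = 100;  // known number of registered clients

    struct Entry {
        std::counting_semaphore<N_CLIENTS> semReady{0};  // read slots left
        std::atomic<int> semDone{0};  // readers finished (atomic, per Note 2)
        int data = 0;                 // stand-in payload
    };

    // Client: claim a read slot, read, then signal completion.
    int clientRead(Entry& e) {
        e.semReady.acquire();    // blocks once the server drains the slots
        int value = e.data;      // safe: server won't write while slots exist
        e.semDone.fetch_add(1);
        return value;
    }

    // Server: close the entry, wait out active readers, then rewrite it.
    void serverWrite(Entry& e, int newValue) {
        int drained = 0;
        while (e.semReady.try_acquire())   // (a) no new readers can start
            ++drained;
        // (b) wait until every client that did start reading has finished
        while (e.semDone.load() < N_CLIENTS - drained)
            std::this_thread::yield();     // real code: timeout + liveness check
        e.semDone.store(0);
        e.data = newValue;                 // exclusive access now
        e.semReady.release(N_CLIENTS);     // reopen the entry for all clients
    }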