Client-server design - C++

I want to develop a fairly basic client-server program.
One piece of software reads XML (or any data) and sends it to the server, which in turn will manipulate it a little and eventually write it to disk.
The thing is, if I have many XML files on disk (on my client side), I want to open multiple connections to the server instead of sending the files one by one.
My first question is: let's say I have one thread that keeps all the file handles and calls WaitForMultipleObjects on them, so it knows when one of them is ready to be read from disk. For every file I also have a corresponding socket that is supposed to send that specific file to the server. For the sockets I can use the select function to know which ones are ready for sending. But is there a way to know that both the file and its corresponding socket are ready at the same time?
Second, is there a more efficient way to design the client? In my current design I'm using just one thread, which on a multiprocessor machine is not really efficient enough
(though I'm sure it is still better than launching a new thread for every socket connection).
Third, for the server I read about the reactor pattern. It seems appropriate, but, as with my second question, it does not seem efficient enough when using only one thread.
Maybe I can use something with completion ports? I think they are pretty efficient, but I have never really used them, so I don't know exactly how.
Any answers and general suggestions would be great.

Take a look at boost::asio. It uses a proactor pattern (see the docs) built on top of the OS wait operations (WaitForSingleObject/WaitForMultipleObjects, select, epoll, etc.) to make very efficient use of a single thread in a system like the one you're looking at implementing.
asio can read and write files as well as sockets. You could submit an async read for the file using asio; it would call your callback on completion, and you would then submit that read buffer as an async write to the socket. asio takes care of delivering all the queued write buffers as the socket completes each pending write operation.
Each of these operations is done asynchronously, so the thread is only really busy initiating reads and writes and sits idle the rest of the time.
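For illustration, a minimal sketch of that read-then-write chaining could look like the following. It assumes Boost 1.78 or newer (which adds asio::stream_file; on Linux it also requires io_uring support at build time), and the file name, address and port are placeholders, not anything from the question.

    // Sketch only: chains an async file read into an async socket write with Boost.Asio.
    // Assumes Boost >= 1.78 (asio::stream_file); "example.xml", address and port are placeholders.
    #include <boost/asio.hpp>
    #include <array>
    #include <memory>
    #include <string>

    namespace asio = boost::asio;
    using asio::ip::tcp;

    class FileSender : public std::enable_shared_from_this<FileSender> {
    public:
        FileSender(asio::io_context& io, const std::string& path)
            : file_(io, path.c_str(), asio::stream_file::read_only), socket_(io) {}

        void start(const tcp::endpoint& server) {
            auto self = shared_from_this();
            socket_.async_connect(server, [self](boost::system::error_code ec) {
                if (!ec) self->read_chunk();
            });
        }

    private:
        void read_chunk() {
            auto self = shared_from_this();
            file_.async_read_some(asio::buffer(buf_),
                [self](boost::system::error_code ec, std::size_t n) {
                    if (!ec) self->write_chunk(n);       // got data, forward it
                    // asio::error::eof here means the whole file has been sent
                });
        }

        void write_chunk(std::size_t n) {
            auto self = shared_from_this();
            asio::async_write(socket_, asio::buffer(buf_, n),
                [self](boost::system::error_code ec, std::size_t) {
                    if (!ec) self->read_chunk();         // next chunk
                });
        }

        asio::stream_file file_;
        tcp::socket socket_;
        std::array<char, 4096> buf_;
    };

    int main() {
        asio::io_context io;
        auto sender = std::make_shared<FileSender>(io, "example.xml");
        sender->start({asio::ip::make_address("127.0.0.1"), 5000});
        io.run();   // a single thread drives every pending file and socket operation
    }

A single io.run() thread drives all of these operations; if you want to use more cores, you can call run() from several threads against the same io_context.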

Related

C++ socket programming: creating multiple streams

I am working on an app that starts multiple streams in listener and caller modes after creating sockets. Right now, if I start one stream, the process more or less hangs because the stream is waiting for data. So it is clear to me that I need to start the stream in some asynchronous way, so that the rest of the app keeps working.
Do I start the stream in:
separate threads
separate processes using fork
I also read about select; will that work?
Do blocking/non-blocking sockets solve this problem?
This app is being written in C++.
You can either use a library like Boost.Asio or the C function poll() (or select() which does basically the same thing) to wait on multiple sockets at once. Either way, you want to "multiplex" the sockets, meaning you block until any of them has data available, then you read from that one. This is how many network applications are built, and is usually more efficient, more scalable, and less error-prone than having a thread or process for each connection.
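As a rough sketch (POSIX; error handling, the listening-socket setup, and the actual protocol parsing are trimmed), such a multiplexing loop with poll() can look like this:

    // Rough sketch of multiplexing several sockets with poll() on POSIX.
    // listen_fd is assumed to be an already bound and listening socket.
    #include <poll.h>
    #include <sys/socket.h>
    #include <unistd.h>
    #include <cstddef>
    #include <vector>

    void event_loop(int listen_fd) {
        std::vector<pollfd> fds{{listen_fd, POLLIN, 0}};

        for (;;) {
            // Block until at least one socket has data or a new client connects.
            if (poll(fds.data(), fds.size(), -1) < 0)
                break;

            for (std::size_t i = 0; i < fds.size(); ) {
                if (fds[i].revents & POLLIN) {
                    if (fds[i].fd == listen_fd) {
                        // New connection: start watching it too.
                        int client = accept(listen_fd, nullptr, nullptr);
                        if (client >= 0)
                            fds.push_back({client, POLLIN, 0});
                    } else {
                        char buf[4096];
                        ssize_t n = read(fds[i].fd, buf, sizeof buf);
                        if (n <= 0) {
                            // Peer closed (or error): stop watching this socket.
                            close(fds[i].fd);
                            fds.erase(fds.begin() + static_cast<std::ptrdiff_t>(i));
                            continue;        // the next entry has shifted into slot i
                        }
                        // hand buf[0 .. n) to the application here
                    }
                }
                ++i;
            }
        }
    }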

Boost asio - synchronous write / read - how to do?

First, I want to say that I'm new to Boost asio, and even after looking at a lot of examples there are still things I don't understand.
I want to create a server that will accept two clients (it will use two sockets). The first client will send messages to the server and the server will forward each message to the other client (yes, it is pointless to use a server for this, but that's not the point here; I want to understand how all of this works). This goes on until one of the clients closes.
So, I created a server, the server waits for the clients, and then it has to wait for the first client to send some message. And this is my question: what do I do after that?
I figured I need to read from the first socket and then write to the second, and so on, but how do I know whether the first client has written to its socket? Likewise, how do I know whether the second client has read from the second socket?
I don't need code, I just want to know the right way to do this.
Thanks a lot for reading!
When you perform an async_read you specify a callback that is called whenever data has been read into the buffer (you also provide the buffer; check async_read's documentation). Likewise, you provide a callback for async_write to know when your data has actually been sent. So, from the server's perspective, for the client that 'writes' you should do an async_read, and for the second client that 'reads' you should do an async_write. With the proposed data flow client1 -> server -> client2, the server cannot tell on its own which client it should read from and which one it should write to; that is up to you. You can, for example, treat the first connected client as the writer and the second as the reader.
You might want to start with asio iostreams. It's a high-level, iostream-like abstraction on top of asynchronous sockets.
P.S.: also, don't forget to run the io_service.run() loop somewhere, because all the asio callbacks are executed within that loop.
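To make the flow concrete, here is a stripped-down sketch of that relay, assuming the two clients have already been accepted and ignoring most error and lifetime handling (newer Boost spells io_service as io_context; the loop is the same):

    // Sketch of the relay described above: read from client 1, forward to client 2.
    // Buffer and lifetime management are reduced to the bare minimum.
    #include <boost/asio.hpp>
    #include <array>

    namespace asio = boost::asio;
    using asio::ip::tcp;

    std::array<char, 1024> buf;

    void relay(tcp::socket& writer, tcp::socket& reader) {
        // Wait until client 1 has written something to its socket...
        writer.async_read_some(asio::buffer(buf),
            [&](boost::system::error_code ec, std::size_t n) {
                if (ec) return;                           // client 1 closed: stop relaying
                // ...then push exactly those bytes to client 2.
                asio::async_write(reader, asio::buffer(buf, n),
                    [&](boost::system::error_code ec2, std::size_t) {
                        if (!ec2) relay(writer, reader);  // wait for the next message
                    });
            });
    }

    int main() {
        asio::io_context io;
        tcp::acceptor acceptor(io, tcp::endpoint(tcp::v4(), 5000));
        tcp::socket writer(io), reader(io);
        acceptor.accept(writer);   // first client to connect is treated as the sender
        acceptor.accept(reader);   // second client is treated as the receiver
        relay(writer, reader);
        io.run();                  // all the callbacks above execute inside this loop
    }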

Unix IPC sockets send/recv sync

I'm using a local Unix socket to communicate between two different processes. The thing is, some parts of the code on both ends take different amounts of time to run, and I need recv and send to stay in sync across both processes. Is there a way to force send and recv to wait for the corresponding line of code in the opposite process?
You must implement a protocol. After all, you cannot be sure that the sockets stay in sync; for example, you could send one packet of 100 bytes and then receive it as two or even more packets that add up to the same data.
By default, recv() will block (wait) until there are data to read, while send() will block until there is space in the buffer to write to. For most applications, this is enough synchronisation (if you design your protocol sanely).
So I recommend you just think about the details of how your communication will work, and try it out. Then if there is still a problem, come back with a question that is as specific as possible.
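For example, one common protocol choice is to prefix each message with its length, so the receiver knows exactly how many bytes belong together no matter how the kernel splits them up. The helper names below (send_msg/recv_msg) are made up for illustration; fd is any connected stream socket, such as an AF_UNIX SOCK_STREAM socket:

    // Illustrative length-prefixed message helpers for a stream socket.
    #include <sys/socket.h>
    #include <arpa/inet.h>
    #include <cstdint>
    #include <string>

    // Loop until the whole buffer is transferred; send()/recv() may do partial I/O.
    static bool send_all(int fd, const void* data, size_t len) {
        const char* p = static_cast<const char*>(data);
        while (len > 0) {
            ssize_t n = send(fd, p, len, 0);
            if (n <= 0) return false;
            p += n; len -= static_cast<size_t>(n);
        }
        return true;
    }

    static bool recv_all(int fd, void* data, size_t len) {
        char* p = static_cast<char*>(data);
        while (len > 0) {
            ssize_t n = recv(fd, p, len, 0);
            if (n <= 0) return false;        // 0 means the peer closed the socket
            p += n; len -= static_cast<size_t>(n);
        }
        return true;
    }

    bool send_msg(int fd, const std::string& msg) {
        uint32_t len = htonl(static_cast<uint32_t>(msg.size()));   // 4-byte length header
        return send_all(fd, &len, sizeof len) && send_all(fd, msg.data(), msg.size());
    }

    bool recv_msg(int fd, std::string& msg) {
        uint32_t len = 0;
        if (!recv_all(fd, &len, sizeof len)) return false;
        msg.resize(ntohl(len));
        return recv_all(fd, &msg[0], msg.size());
    }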

Asynchronous event loop design and issues

I'm designing an event loop for asynchronous socket IO using epoll/devpoll/kqueue/poll/select (including Windows select).
I have two options for performing an IO operation:
Non-blocking mode, poll on EAGAIN
Set socket to non-blocking mode.
Read/Write to socket.
If operation succeeds, post completion notification to event loop.
If I get EAGAIN, add socket to "select list" and poll socket.
Polling mode: poll and then execute
Add socket to select list and poll it.
Wait for the notification that it is readable/writable
Read/write
Post a completion notification to the event loop if the operation succeeds
To me it looks like the first would require fewer system calls in the normal case,
especially for writing to the socket (the buffers are quite big).
Also, it looks like it would be possible to reduce the overhead in the number of "select"
calls, which is especially nice when you do not have something that scales as well
as epoll/devpoll/kqueue.
Questions:
Are there any advantages of the second approach?
Are there any portability issues with non-blocking operations on sockets/file descriptors over numerous operating systems: Linux, FreeBSD, Solaris, MacOSX, Windows.
Notes: Please do not suggest using existing event-loop/socket-api implementations
I'm not sure there's any cross-platform problem; at most you would have to use the Windows Sockets API, but with the same results.
Otherwise, you seem to be polling in either case (avoiding blocking waits), so both approaches are fine. As long as you don't put yourself in a position to block (e.g. reading when there's no data, writing when the buffer is full), it makes no difference at all.
Maybe the first approach is easier to code/understand; so go with that.
It might be of interest to you to check out the documentation of libev and the c10k problem for interesting ideas/approaches on this topic.
The first design is the Proactor pattern; the second is the Reactor pattern.
One advantage of the reactor pattern is that you can design your API such that you don't have to allocate read buffers until the data is actually there to be read. This reduces memory usage while you're waiting for I/O.
From my experience with low-latency socket apps:
For writes: try to write directly into the socket from the writing thread (you need to hold the event-loop mutex for that); if the write is incomplete, subscribe for write readiness with the event loop (select/WaitForMultipleObjects) and finish the write from the event-loop thread when the socket becomes writable.
For reads: always stay "subscribed" for read readiness on all sockets, so you always read from within the event-loop thread when a socket becomes readable.
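A rough sketch of that write path, on a non-blocking socket; register_for_writability is a made-up hook standing in for whatever your event loop uses to watch the fd for writability:

    // Sketch of "write directly, fall back to the event loop on a partial write".
    #include <sys/socket.h>
    #include <cerrno>
    #include <mutex>
    #include <string>

    struct Connection {
        int fd;                  // non-blocking socket
        std::string pending;     // bytes not yet accepted by the kernel
        std::mutex loop_mutex;   // the event-loop mutex mentioned above
    };

    // Placeholder: add c.fd to the event loop's write-interest set (select/epoll).
    void register_for_writability(Connection& c) { (void)c; }

    // Called from the writing thread.
    void send_bytes(Connection& c, const char* data, size_t len) {
        std::lock_guard<std::mutex> lock(c.loop_mutex);

        if (c.pending.empty()) {                       // nothing queued: try directly
            ssize_t n = send(c.fd, data, len, 0);
            if (n == static_cast<ssize_t>(len))
                return;                                // fast path: everything went out
            if (n < 0) {
                if (errno != EAGAIN && errno != EWOULDBLOCK)
                    return;                            // real error: handle/close elsewhere
                n = 0;
            }
            data += n; len -= static_cast<size_t>(n);
        }
        c.pending.append(data, len);                   // queue what is left
        register_for_writability(c);                   // event loop finishes the write
    }

    // Called from the event-loop thread when the fd is reported writable.
    void on_writable(Connection& c) {
        std::lock_guard<std::mutex> lock(c.loop_mutex);
        ssize_t n = send(c.fd, c.pending.data(), c.pending.size(), 0);
        if (n > 0) c.pending.erase(0, static_cast<size_t>(n));
        if (!c.pending.empty()) register_for_writability(c);
    }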

Multi-reader IPC solution?

I'm working on a framework in C++ (just for fun for now) that lets users write plugins that use a standard API to stream data between each other. There are going to be three basic transport mechanisms for the data: files, sockets, and some kind of IPC piping system. The system is set up so that, for the non-file transports, each stream can have multiple readers. I.e. once a server socket is set up, multiple computers can connect and stream the data. I'm a little stuck on the multi-reader IPC system, though.
All my plugins run in threads (though I may want to go to a process-based system eventually), so they live in the same address space, and some kind of shared-memory system would work fine. I was thinking I'd write my own circular buffer with a write pointer and read pointers chasing it around the buffer, but I have my doubts that I can achieve the same performance as something like Linux pipes.
I'm curious what people would suggest for a multi-reader solution to something like this. Is the overhead for pipes or domain sockets low enough that I could just open a connection to each reader and issue separate writes to each one? This is intended to carry significant volumes of data (tens of mega-samples/sec), so performance is a must.
I develop a media server, and I usually use a single reader for a group of all active sockets of the same class. You can use the select() function (in blocking or non-blocking mode) on each group to find the sockets that have become ready to be read. When socket data is ready or a new connection occurs, I just call a notify callback function to handle it.
Each reader (which controls a group of sockets) can be managed by a separate thread, so your main threads never block while waiting for new connections or socket data.
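A compact sketch of one such reader thread, with on_data standing in for the notify callback (connection handling and errors omitted):

    // One reader thread watching a group of sockets with select() and
    // invoking a notify callback when a socket becomes readable.
    #include <sys/select.h>
    #include <algorithm>
    #include <functional>
    #include <vector>

    void reader_thread(std::vector<int> fds, std::function<void(int)> on_data) {
        for (;;) {
            fd_set readable;
            FD_ZERO(&readable);
            int max_fd = -1;
            for (int fd : fds) {
                FD_SET(fd, &readable);
                max_fd = std::max(max_fd, fd);
            }

            // Blocks only this group's thread; other groups keep running.
            if (select(max_fd + 1, &readable, nullptr, nullptr, nullptr) < 0)
                break;

            for (int fd : fds)
                if (FD_ISSET(fd, &readable))
                    on_data(fd);    // socket is ready: notify the owning plugin
        }
    }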
If I understand the description correctly, it seems to me that using a circular queue as you mention would be a good IPC solution. I think it could scale very well and would ultimately be better than individual pipes or individual shared memory for each client. One of the (several) issues of using a single queue/buffer for multiple clients is synchronizing access to the buffers: a client needs to be able to successfully read an entry in the queue without the server changing it. Here is a possible mechanism for implementing that.
This requires that the server know how many active clients there are. That, I assume, would be possible as long as the clients are doing some kind of registration/login with the server (almost certainly true if they are in-process but not necessarily true for out-of-process clients).
Suppose there are N clients. For this example, assume 100 active clients.
Maintain two counting semaphores for each entry in the circular queue. If using out-of-process clients, these need to be shared between processes. Call the semaphores SemReady and SemDone.
Use SemReady to indicate that the buffer is ready for clients to read. The server writes to the buffer entry and then sets the value of the semaphore to the number of clients (100 in this case). More on this in a bit.
When a client wants to read an entry in the queue, it waits on the associated SemReady semaphore. If the initial value is at 100, then all 100 clients can successfully get the semaphore and “concurrently” read the data.
When a client is done reading/using the entry, it increments/releases the SemDone semaphore.
When a server wants to write to a buffer entry, it needs to make sure of two things: a) no clients are currently reading it, and b) no clients start to read it once the server is writing to it.
Therefore, first, block any further access to the buffer by waiting on the SemReady semaphore until the count is zero (obviously, use a zero timeout). When it hits zero, the server knows that no additional clients will start reading it.
To know that the clients are done with the buffer, the server uses the SemDone semaphore. It checks SemDone and waits until its value is N minus the number of waits it did on SemReady. In other words, if SemReady was already at zero, all clients got to read the buffer entry, so SemDone should be at N (100) when they are done. If, though, the server had to wait 10 times on SemReady, then SemDone should be at 90 (N-10) when all clients are done.
The above step needs some kind of timeout and client liveness check in case a client crashes/quits after getting SemReady and before releasing SemDone. It would also need to account for the possibility of a new client registering during that step, in order to keep the semaphore counts in sync.
Once the server has found no more clients are reading the buffer, it can reset SemDone to zero, write new data to the entry, and set SemReady to N (100).
Rinse and repeat.
Note 1 There are other synchronization issues to maintain the head/tail of the circular queue so that clients know where it is.
Note 2 SemDone could probably be an integer counter handled with atomic increments… I think it could anyway. Needs a bit of thought.
Note 3 It might make sense to have multiple threads in the server writing to the buffer entries. That way, if the server has to wait/timeout a bit on a crashed client that started reading but did not finish, it would not block subsequent queue entries that other clients might already be waiting for.
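For in-process clients, a bare-bones sketch of the per-entry handshake could use C++20 counting semaphores as below; N_CLIENTS and Entry are illustrative names, and the crash/timeout handling from the notes above is omitted:

    // Sketch of the SemReady/SemDone handshake for one circular-queue entry.
    #include <semaphore>
    #include <array>
    #include <cstddef>

    constexpr int N_CLIENTS = 100;

    struct Entry {
        std::array<std::byte, 4096> data{};
        // Readers wait on sem_ready; the server "publishes" by releasing it N_CLIENTS times.
        std::counting_semaphore<N_CLIENTS> sem_ready{0};
        // Readers release sem_done when finished; it starts "full" so the very
        // first server_write does not wait for releases that never happened.
        std::counting_semaphore<N_CLIENTS> sem_done{N_CLIENTS};
    };

    // Client side: read one entry.
    void client_read(Entry& e /*, copy target */) {
        e.sem_ready.acquire();     // take one of the N "ready" slots
        // ... copy e.data out; the server will not overwrite it while slots remain ...
        e.sem_done.release();      // tell the server this client is finished
    }

    // Server side: overwrite an entry with fresh data.
    void server_write(Entry& e /*, new data */) {
        // 1) Stop new readers: drain whatever "ready" slots are left (zero-timeout waits).
        int drained = 0;
        while (e.sem_ready.try_acquire())
            ++drained;

        // 2) Wait until every client that did start reading has finished.
        //    (N_CLIENTS - drained) clients took a slot, so expect that many releases;
        //    acquiring them also resets sem_done back to zero for the next round.
        for (int i = 0; i < N_CLIENTS - drained; ++i)
            e.sem_done.acquire();

        // 3) The entry now belongs exclusively to the server: write the new data.
        // e.data = ...;

        // 4) Publish: allow all N clients to read the new contents.
        e.sem_ready.release(N_CLIENTS);
    }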