About write buffer in general network programming - c++

I'm writing a server using Boost.Asio. I have a read buffer and a write buffer for each connection, and I use the asynchronous read/write functions (async_write_some / async_read_some).
With the read buffer and async_read_some there's no problem. Just invoking async_read_some is fine, because the read buffer is only read in the read handler (which usually means in the same thread).
But the write buffer needs to be accessed from several threads, so it needs to be locked for modification.
FIRST QUESTION!
Is there any way to avoid locking the write buffer?
I write my packet into a stack buffer and copy it to the write buffer. Then I call async_write_some to send the packet. Done this way, if I send two packets in a row, is it okay to invoke async_write_some twice?
SECOND QUESTION!
What is the common way to do asynchronous writes in socket programming?
Thanks for reading.

Sorry, but you have two choices:
Serialise the writes, either with locks or, better, with a separate writer thread that reads requests from a queue. Other threads can then push requests onto the queue without much contention (some mutexing is still required).
Give each writing thread its own socket! This is actually the better solution if the program at the other end of the wire can support it.

Answer #1:
You are correct that locking is a viable approach, but there is a much simpler way to do all of this. Boost has a nice little construct in ASIO called a strand. Any callback that has been wrapped using the strand will be serialized, guaranteed, no matter which thread executes the callback. Basically, it handles any locking for you.
This means that you can have as many writers as you want, and if they are all wrapped by the same strand (so, share your single strand among all of your writers) they will execute serially. One thing to watch out for is to make sure that you aren't trying to use the same actual buffer in memory for doing all of the writes. For example, this is what to avoid:
char buffer_to_write[256]; // shared among threads
/* ... in thread 1 ... */
memcpy(buffer_to_write, packet_1, std::min(sizeof(packet_1), sizeof(buffer_to_write)));
my_socket.async_write_some(boost::asio::buffer(buffer_to_write, sizeof(buffer_to_write)), &my_callback);
/* ... in thread 2 ... */
memcpy(buffer_to_write, packet_2, std::min(sizeof(packet_2), sizeof(buffer_to_write)));
my_socket.async_write_some(boost::asio::buffer(buffer_to_write, sizeof(buffer_to_write)), &my_callback);
There, you're sharing your actual write buffer (buffer_to_write), so one thread can overwrite it while the other thread's write is still in flight. If you do something like this instead, you'll be okay:
#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <boost/shared_ptr.hpp>
#include <vector>

/* A utility class that you can use */
class PacketWriter
{
private:
    typedef std::vector<char> buffer_type;
    // op_buffer is bound into the handler, so the data stays alive until the write completes
    static void WriteIsComplete(boost::shared_ptr<buffer_type> op_buffer, const boost::system::error_code& error, std::size_t bytes_transferred)
    {
        // Handle your write completion here
    }
public:
    template<class IO>
    static bool WritePacket(const std::vector<char>& packet_data, IO& asio_object)
    {
        // Each write gets its own heap copy of the packet
        boost::shared_ptr<buffer_type> op_buffer(new buffer_type(packet_data));
        if (!op_buffer)
        {
            return (false);
        }
        asio_object.async_write_some(boost::asio::buffer(*op_buffer), boost::bind(&PacketWriter::WriteIsComplete, op_buffer, boost::asio::placeholders::error, boost::asio::placeholders::bytes_transferred));
        return (true);
    }
};
/* ... in thread 1 ... */
PacketWriter::WritePacket(packet_1, my_socket);
/* ... in thread 2 ... */
PacketWriter::WritePacket(packet_2, my_socket);
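Here, it would help to pass your strand into WritePacket as well, so that both the call to async_write_some and the completion handler go through it (for example with strand.post() and strand.wrap()). You get the idea, though.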
Answer #2:
I think you are already taking a very good approach. One suggestion I would offer is to use async_write instead of async_write_some so that you are guaranteed the whole buffer is written before your callback gets called.
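To make the difference concrete, here is a minimal sketch of the kind of loop a composed write runs on top of a partial-write primitive. It uses only the standard library and is synchronous for brevity: FakeStream and write_all are hypothetical names invented for this illustration, not Boost.Asio types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a socket: accepts at most `chunk` bytes per call,
// the way write_some / async_write_some may accept only part of a buffer.
struct FakeStream {
    std::size_t chunk;
    std::string wire;   // everything "sent" so far
    std::size_t calls;  // number of write_some calls made

    explicit FakeStream(std::size_t c) : chunk(c), calls(0) {
        assert(c > 0);  // a zero-sized chunk would never make progress
    }

    std::size_t write_some(const char* data, std::size_t size) {
        const std::size_t n = std::min(size, chunk);
        wire.append(data, n);
        ++calls;
        return n;
    }
};

// Keeps calling write_some until the whole buffer has been accepted.
template <class Stream>
std::size_t write_all(Stream& s, const char* data, std::size_t size) {
    std::size_t done = 0;
    while (done < size) {
        done += s.write_some(data + done, size - done);
    }
    return done;
}
```

With a FakeStream limited to 4 bytes per call, one write_some on a 12-byte message sends only the first 4 bytes, while write_all makes three calls and sends all of it. That loop is also why two overlapping composed writes on the same stream can interleave their chunks.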

You could queue your modifications and perform them on the data in the write handler.
The network will most probably be the slowest part of the pipe (assuming your modifications are not computationally expensive), so you could perform the modifications while the socket layer is sending the previous data.
In case you are handling a large number of clients with frequent connects/disconnects, take a look at I/O completion ports or a similar mechanism.

Related

Will async_receive_from write into the buffer if the io_service is busy handling a callback?

So, suppose I have the following callback for async_recv_from
void recv_callback(error_code&, std::size_t len) {
socket.async_recv_from(buffer,endpoint,recv_callback);
handle(buffer);
}
So, the first thing I do in the callback is request more receives, but since the io_service is busy handling the callback, I thought that maybe my buffer would not be overwritten before the callback is finished. Is that correct?
This depends.
It depends on the way the underlying IO operations are actually implemented. I think some OSes might actually write directly into user-space memory.
I'd always hand-off the actual buffer.

Boost Asio - Message content transmitted wrong

I am building a client-server communication. The server sends Header+Data (using async_write and a separate I/O thread). The client receives a header of fixed size and from it knows how much data it has to read.
The Problem: Sometimes the client receives wrong data. It seems like the server sends the wrong data.
void Session::do_write(std::shared_ptr<DataItem> data)
{
std::lock_guard<std::mutex> lk(doWrite_mutex);
std::vector<boost::asio::const_buffer> buffers;
buffers.push_back(boost::asio::buffer(&data->length, sizeof(uint32_t)));
buffers.push_back(boost::asio::buffer(&data->callbackID, sizeof(uint8_t)));
buffers.push_back(boost::asio::buffer(&data->isString, sizeof(bool)));
//Get the data to send into the buffer and make sure the given shared ptr to the data item keeps living until this function is finished.
buffers.push_back(boost::asio::buffer(data->getData(), data->length));
boost::asio::async_write(*socket_, buffers, boost::bind(&Session::onSend, this, data, _1,_2));
}
void Session::onSend(std::shared_ptr<DataItem> data,const boost::system::error_code ec, std::size_t length)
{ //Some logging, nothing special here
}
The data item is a polymorphic class to handle different kinds of data (vectors, strings,...). The getData() method returns a const void* to the actual data (e.g. myData->data() in case of vector). The data is stored as a shared_ptr inside the DataItem (to keep it from being destroyed).
In most cases the data is transmitted correctly.
I don't know where to debug or what I am doing wrong.
Invoking a write operation on a stream that has an outstanding async_write() operation fails to meet a requirement of async_write(), which can result in interwoven data. Additionally, if multiple threads are servicing the io_service event loop or Session::do_write() is invoked from a thread that is not processing the event loop, then the use of a mutex will fail to meet the thread safety requirement of the stream. This answer demonstrates using a queue to serialize multiple async_write() operations, and processing the queue with an asynchronous call chain within a strand, fulfilling both the requirements of async_write() and the stream's thread safety.
For further details, the async_write() function is a composed operation, resulting in zero or more calls to the stream's async_write_some() function. Therefore, if the program does not ensure that the stream performs no other write operations until the outstanding operation completes, the intermediate write operations can be mixed between other write operations, resulting in interwoven data. Furthermore, these intermediate operations invoke async_write_some() on the stream without having acquired doWrite_mutex, potentially violating the thread safety requirement for the stream. For more information about composed operations and strand usage, consider reading this answer.

Boost mutex usage for multipart processes

I have a c++ program with a socket communications class. Each socket has a large dedicated
buffer for assembling an output message, so usage would be like:
class CSocketClass {
public:
void SetMsgHeader(int n) { Mutex_.lock(); DoWhateverIsNeededToSetHeaderInBuffer(n); } // n is the message type; takes the lock
void SetMsgField(double a) { DoWhateverIsNeededToSetDataInBuffer(a); } // a is some arbitrary content
void SendMsg() { DoWhateverIsNeededToSendBuffer(); Mutex_.unlock(); } // sends the bytes added to the buffer since the header was set; releases the lock
private:
char buffer[reallylarge];
MiscSocketApparatus...
boost::mutex Mutex_;
};
Multiple threads could be trying to send messages, each consisting of three or more calls to set the header, set the content, and finally send the message on its way. To keep them from conflicting, I've tried to allow only a single writer at a time by using the mutex. The desired behavior would be for a second-to-arrive writer to be blocked until the first-to-arrive writer unlocked the mutex. Then the blocked writer would be able to proceed.
This seems to work most of the time, but on rare occasions (not every day), deadlocks still seem to occur.
I'm much more familiar with simpler lock issues using scoped locks, but those concepts may not translate perfectly to this problem, where the lock needs to be persistent across a number of calls to the object owning the lock.
From reading the Boost synchronization tutorial, I think there are better ways to do this, but it's not clear what would be best.
Any recommendations would be greatly appreciated.
Since each thread has its own buffer, have each build the complete message in its own buffer, then lock the mutex and send the message.
Better still, have one thread to actually dispatch messages, and N threads to create them. Put a thread-safe queue in between, so a thread creates a message, puts it in the queue, then (if needed) goes back to creating another message. The message sender just constantly waits for a message in the queue, retrieves it, sends it, and repeats.
You probably also want a thread-safe collection of buffers, so when a message has been sent, the sending thread can put the buffer where a message-builder thread can use it again when needed.
As an aside: for the buffer I'd use an std::string or a std::vector, instead of a raw array.
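The builder/sender split described above could be sketched roughly as follows. This is a standard-library illustration under stated assumptions, not the asker's class: MessageQueue and run_pipeline are hypothetical names, messages are plain std::string values, and the "socket" is just a vector that only the sender thread touches.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical thread-safe FIFO of complete messages.
class MessageQueue {
public:
    void push(std::string msg) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!closed_);
            queue_.push_back(std::move(msg));
        }
        ready_.notify_one();
    }

    // Signals that no more messages will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Blocks until a message is available; returns false once closed and drained.
    bool pop(std::string& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !queue_.empty(); });
        if (queue_.empty()) return false;
        out = std::move(queue_.front());
        queue_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<std::string> queue_;
    bool closed_ = false;
};

// Runs `producers` builder threads that each enqueue `per_producer` complete
// messages, plus one sender thread that is the only one touching the "socket".
std::vector<std::string> run_pipeline(int producers, int per_producer) {
    MessageQueue queue;
    std::vector<std::string> sent;  // stands in for the socket

    std::thread sender([&] {
        std::string msg;
        while (queue.pop(msg)) sent.push_back(msg);
    });

    std::vector<std::thread> builders;
    for (int p = 0; p < producers; ++p) {
        builders.emplace_back([&queue, p, per_producer] {
            for (int i = 0; i < per_producer; ++i) {
                // Each message is fully built before it is handed over.
                queue.push("p" + std::to_string(p) + ":" + std::to_string(i));
            }
        });
    }
    for (auto& t : builders) t.join();
    queue.close();
    sender.join();
    return sent;
}
```

The mutex here is only ever held for a push or a pop, never across several calls, so the multi-call lock/unlock pattern from the question disappears. Messages from different builders may interleave with each other, but each message is sent whole and each builder's messages stay in order.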

boost::asio asynchronous operations and resources

So I've made a socket class that uses boost::asio library to make asynchronous reads and writes. It works, but I have a few questions.
Here's a basic code example:
class Socket
{
public:
void doRead()
{
m_sock->async_receive_from(boost::asio::buffer(m_recvBuffer), m_from, boost::bind(&Socket::handleRecv, this, boost::asio::placeholders::error, boost::asio::placeholders::bytes_transferred));
}
void handleRecv(boost::system::error_code e, int bytes)
{
if (e.value() || !bytes)
{
handle_error();
return;
}
//do something with data read
do_something(m_recvBuffer);
doRead(); //read another packet
}
protected:
boost::array<char, 1024> m_recvBuffer;
boost::asio::ip::udp::endpoint m_from;
};
It seems that the program will read a packet, handle it, then prepare to read another. Simple.
But what if I set up a thread pool? Should the next call to doRead() be before or after handling the read data? It seems that if it is put before do_something(), the program can immediately begin reading another packet, and if it is put after, the thread is tied up doing whatever do_something() does, which could possibly take a while. If I put the doRead() before the handling, does that mean the data in m_recvBuffer might change while I'm handling it?
Also, if I'm using async_send_to(), should I copy the data to be sent into a temporary buffer, because the actual send might not happen until after the data has fallen out of scope? i.e.
void send()
{
char data[] = {1, 2, 3, 4, 5};
m_sock->async_send_to(boost::asio::buffer(&data[0], 5), someEndpoint, someHandler);
} //"data" gets deallocated, but the write might not have happened yet!
Additionally, when the socket is closed, the handleRecv will be called with an error indicating it was interrupted. If I do
Socket* mySocket = new Socket()...
...
mySocket->close();
delete mySocket;
could it cause an error, because there is a chance that mySocket will be deleted before handleRecv() gets called/finished?
Lots of questions here, I'll try to address them one at a time.
But what if I set up a thread pool?
The traditional way to use a thread pool with Boost.Asio is to invoke io_service::run() from multiple threads. Beware this isn't a one-size-fits-all answer though, there can be scalability or performance issues, but this methodology is by far the easiest to implement. There are many similar questions on Stackoverflow with more information.
Should the next call to doRead be before or after handling the read
data? It seems that if it is put before do_something(), the program
can immediately begin reading another packet, and if it is put after,
the thread is tied up doing whatever do_something does, which could
possibly take a while.
This really depends on what do_something() needs to do with m_recvBuffer. If you wish to invoke do_something() in parallel with doRead() using io_service::post() you will likely need to make a copy of m_recvBuffer.
If I put the doRead() before the handling, does
that mean the data in m_readBuffer might change while I'm handling it?
As I mentioned previously, yes, this can and will happen.
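One way to avoid that is to copy the received bytes before starting the next read. The sketch below shows only the ordering, using the standard library: Receiver, rearm and process are hypothetical names, with rearm standing in for the call that starts the next asynchronous receive.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical receiver: `rearm` stands in for starting the next asynchronous
// receive, which is allowed to overwrite `buffer` at any time afterwards.
struct Receiver {
    std::array<char, 1024> buffer;
    std::function<void()> rearm;
    std::function<void(const std::vector<char>&)> process;

    // Called when `bytes` bytes of a datagram have arrived in `buffer`.
    void on_receive(std::size_t bytes) {
        assert(bytes <= buffer.size());
        // 1. Snapshot the received bytes while nothing else can touch them.
        std::vector<char> packet(buffer.begin(), buffer.begin() + bytes);
        // 2. Only now start the next receive; `buffer` may change from here on.
        rearm();
        // 3. Work on the private copy.
        process(packet);
    }
};
```

The copy costs one allocation per packet. If that matters, a pool of reusable buffers handed off between the reader and the workers is a common alternative.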
Also, if I'm using async_send_to(), should I copy the data to be sent
into a temporary buffer, because the actual send might not happen
until after the data has fallen out of scope?
As the documentation describes, it is up to the caller (you) to ensure the buffer remains in scope for the duration of the asynchronous operation. As you suspected, your current example invokes undefined behavior because data[] will go out of scope.
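A common way to meet that requirement is to copy the payload into a heap buffer owned by a shared_ptr and let the completion handler hold that shared_ptr. The sketch below models the idea with the standard library only: DeferredSender is a hypothetical stand-in for the I/O layer (it stores a raw pointer and "sends" later), and send_copy is an invented helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an I/O layer: it remembers a pointer to the bytes
// and only "sends" them later, as an asynchronous send would.
struct DeferredSender {
    struct Pending {
        const char* data;
        std::size_t size;
        std::function<void()> on_complete;
    };
    std::vector<Pending> pending;
    std::string wire;  // everything "sent" so far

    void async_send(const char* data, std::size_t size, std::function<void()> on_complete) {
        assert(data != nullptr || size == 0);
        pending.push_back(Pending{data, size, std::move(on_complete)});
    }

    // Performs the deferred sends, then runs each completion handler.
    void run() {
        std::vector<Pending> batch;
        batch.swap(pending);
        for (Pending& op : batch) {
            wire.append(op.data, op.size);
            op.on_complete();
        }
    }
};

// Copies the payload into a buffer owned by a shared_ptr. The handler captures
// that shared_ptr, so the bytes stay valid until the handler itself is destroyed.
void send_copy(DeferredSender& io, const std::string& payload, int& completed) {
    auto owned = std::make_shared<std::vector<char>>(payload.begin(), payload.end());
    io.async_send(owned->data(), owned->size(), [owned, &completed] { ++completed; });
}
```

Here the caller's string can be destroyed right after send_copy returns, because the pending operation keeps its own copy alive.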
Additionally, when the socket is closed, the handleRecv() will be called
with an error indicating it was interrupted.
If you wish to continue to use the socket, use cancel() to interrupt outstanding asynchronous operations. Otherwise, close() will work. The error passed to outstanding asynchronous operations in either scenario is boost::asio::error::operation_aborted.

concurrent async_write. is there a wait-free solution?

async_write() is forbidden to be called concurrently from different threads. It sends data by chunks using async_write_some and such chunks can be interleaved. So it is up to the user to take care of not calling async_write() concurrently.
Is there a nicer solution than this pseudocode?
void send(shared_ptr<char> p) {
boost::mutex::scoped_lock lock(m_write_mutex);
async_write(p, handler);
}
I do not like the idea of blocking other threads for quite a long time (there are ~50Mb sends in my application).
Maybe something like this would work?
void handler(const boost::system::error_code& e) {
if(!e) {
bool empty = lockfree_pop_front(m_queue);
if(!empty) {
shared_ptr<char> p = lockfree_queue_get_first(m_queue);
async_write(p, handler);
}
}
}
void send(shared_ptr<char> p) {
bool q_was_empty = lockfree_queue_push_back(m_queue, p);
if(q_was_empty)
async_write(p, handler);
}
I'd prefer to find a ready-to-use cookbook recipe. Dealing with lock-free is not easy, a lot of subtle bugs can appear.
async_write() is forbidden to be
called concurrently from different
threads
This statement is not quite correct. Applications can freely invoke async_write concurrently, as long as they are on different socket objects.
Is there a nicer solution than this
pseudocode?
void send(shared_ptr<char> p) {
boost::mutex::scoped_lock lock(m_write_mutex);
async_write(p, handler);
}
This likely isn't accomplishing what you intend since async_write returns immediately. If you intend the mutex to be locked for the entire duration of the write operation, you will need to keep the scoped_lock in scope until the completion handler is invoked.
There are nicer solutions for this problem, the library has built-in support using the concept of a strand. It fits this scenario nicely.
A strand is defined as a strictly
sequential invocation of event
handlers (i.e. no concurrent
invocation). Use of strands allows
execution of code in a multithreaded
program without the need for explicit
locking (e.g. using mutexes).
Using an explicit strand here will ensure your handlers are never invoked concurrently, no matter how many threads have invoked io_service::run(). With your example, the m_queue member would be protected by a strand, ensuring serialized access to the outgoing message queue. After adding an entry to the queue, if the size is 1, it means no outstanding async_write operation is in progress and the application can initiate one wrapped through the strand. If the queue size is greater than 1, the application should wait for the async_write to complete. In the async_write completion handler, pop off an entry from the queue and handle any errors as necessary. If the queue is not empty, the completion handler should initiate another async_write from the front of the queue.
This is a much cleaner design than sprinkling mutexes in your classes since it uses the built-in Asio constructs as they are intended. This other answer I wrote has some code implementing this design.
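The queue logic on its own can be sketched without any Asio types. In this standard-library illustration, WriteQueue is a hypothetical class and start_write stands in for initiating one async_write. The class is not thread-safe by itself: in the design described above, send() and on_write_complete() would both run inside the same strand.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Hypothetical outgoing-message queue. The caller must invoke
// on_write_complete() each time a write started via `start_write` finishes.
class WriteQueue {
public:
    explicit WriteQueue(std::function<void(const std::string&)> start_write)
        : start_write_(std::move(start_write)) {}

    void send(std::string msg) {
        queue_.push_back(std::move(msg));
        // Size 1 means nothing was in flight, so start a write now.
        if (queue_.size() == 1) start_write_(queue_.front());
    }

    void on_write_complete() {
        assert(!queue_.empty());
        queue_.pop_front();  // the message that just finished
        // If more messages are waiting, start the next write.
        if (!queue_.empty()) start_write_(queue_.front());
    }

    std::size_t pending() const { return queue_.size(); }

private:
    std::function<void(const std::string&)> start_write_;
    std::deque<std::string> queue_;
};
```

The message being written stays at the front of the deque until its write completes, so the bytes handed to the write remain valid (push_back on a std::deque does not invalidate references to existing elements). At most one write is ever outstanding, which is the condition async_write needs.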
We've solved this problem by having a separate queue of data to be written held in our socket object. When the first piece of data to be written is "queued", we start an async_write(). In our async_write's completion handler, we start subsequent async_write operations if there is still data to be transmitted.