Boost Asio - message content transmitted incorrectly - C++

I am building a client/server communication. The server sends header + data (using async_write and a separate IO thread); the client receives the fixed-size header and from it knows how much data it has to read.
The problem: sometimes the client receives the wrong data. It looks as though the server sends the wrong data.
void Session::do_write(std::shared_ptr<DataItem> data)
{
    std::lock_guard<std::mutex> lk(doWrite_mutex);
    std::vector<boost::asio::const_buffer> buffers;
    buffers.push_back(boost::asio::buffer(&data->length, sizeof(uint32_t)));
    buffers.push_back(boost::asio::buffer(&data->callbackID, sizeof(uint8_t)));
    buffers.push_back(boost::asio::buffer(&data->isString, sizeof(bool)));
    // Get the data to send into the buffer and make sure the given shared_ptr
    // to the data item keeps living until this function is finished.
    buffers.push_back(boost::asio::buffer(data->getData(), data->length));
    boost::asio::async_write(*socket_, buffers,
        boost::bind(&Session::onSend, this, data, _1, _2));
}
void Session::onSend(std::shared_ptr<DataItem> data, const boost::system::error_code ec, std::size_t length)
{
    // Some logging, nothing special here
}
The DataItem is a polymorphic class used to handle different kinds of data (vectors, strings, ...). The getData() method returns a const void* to the actual data (e.g. myData->data() in the case of a vector). The data is stored as a shared_ptr inside the DataItem (to keep it from being destroyed).
In most cases the data is transmitted correctly.
I don't know where to start debugging or what I am doing wrong.

Invoking a write operation on a stream that already has an outstanding async_write() operation violates a requirement of async_write(), which can result in interwoven data. Additionally, if multiple threads are servicing the io_service event loop, or Session::do_write() is invoked from a thread that is not processing the event loop, then the use of a mutex fails to meet the thread-safety requirement of the stream. The fix is to queue outgoing messages and process the queue with an asynchronous call chain within a strand, fulfilling both the requirements of async_write() and the stream's thread safety.
To elaborate, async_write() is a composed operation: it is implemented as zero or more calls to the stream's async_write_some() function. Therefore, if the program does not ensure that the stream performs no other write operations until the outstanding operation completes, the intermediate write operations of one composed operation can be mixed with those of another, resulting in interwoven data. Furthermore, these intermediate operations invoke async_write_some() on the stream without having acquired doWrite_mutex, potentially violating the thread-safety requirement of the stream. For more information about composed operations and strand usage, consider reading this answer.
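For reference, here is a minimal sketch of that queue-and-strand design adapted to the question's code. The Session members (strand_, queue_) and the simplified DataItem stand-in are illustrative assumptions, not the asker's actual types:

#include <boost/asio.hpp>
#include <cstdint>
#include <deque>
#include <memory>
#include <vector>

// Minimal stand-in for the asker's polymorphic DataItem.
struct DataItem
{
    uint32_t length = 0;
    uint8_t callbackID = 0;
    bool isString = false;
    std::vector<char> payload;
    const void* getData() const { return payload.data(); }
};

class Session : public std::enable_shared_from_this<Session>
{
public:
    Session(boost::asio::io_service& io, boost::asio::ip::tcp::socket socket)
        : strand_(io), socket_(std::move(socket)) {}

    // May be called from any thread: hop onto the strand first, so that
    // queue_ is only ever touched from within the strand.
    void write(std::shared_ptr<DataItem> data)
    {
        auto self = shared_from_this();
        strand_.post([self, data]
        {
            self->queue_.push_back(data);
            if (self->queue_.size() == 1) // no write in flight: start the chain
                self->do_write();
        });
    }

private:
    // Invoked only within the strand, and only while queue_ is non-empty.
    void do_write()
    {
        auto self = shared_from_this();
        auto data = queue_.front();
        std::vector<boost::asio::const_buffer> buffers;
        buffers.push_back(boost::asio::buffer(&data->length, sizeof(uint32_t)));
        buffers.push_back(boost::asio::buffer(&data->callbackID, sizeof(uint8_t)));
        buffers.push_back(boost::asio::buffer(&data->isString, sizeof(bool)));
        buffers.push_back(boost::asio::buffer(data->getData(), data->length));
        boost::asio::async_write(socket_, buffers,
            strand_.wrap([self, data](const boost::system::error_code& ec,
                                      std::size_t /*bytes*/)
            {
                self->queue_.pop_front();
                if (!ec && !self->queue_.empty())
                    self->do_write(); // continue the chain with the next item
            }));
    }

    boost::asio::io_service::strand strand_;
    boost::asio::ip::tcp::socket socket_;
    std::deque<std::shared_ptr<DataItem>> queue_; // guarded by strand_
};

Because queue_ is only touched within the strand and at most one async_write() is outstanding at a time, both the composed-operation requirement and the stream's thread-safety requirement are met without a mutex.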

Related

boost::asio::async_write_some - sequential function call

I am writing an application using Boost.Asio. I have an object of type boost::asio::ip::tcp::socket and (of course) a boost::asio::io_context whose run() function is called from only one thread. There are a couple of ways to write data to the socket, but currently I use the socket's async_write_some function, something like the code below:
void tcp_connection::write(packet_ptr packet)
{
    m_socket.async_write_some(boost::asio::buffer(packet->data(), packet->size()),
        std::bind(&tcp_connection::on_write, this,
                  std::placeholders::_1, std::placeholders::_2, packet));
}
There is another function in the boost::asio namespace, async_write, and its documentation says:
This operation is implemented in terms of zero or more calls to the stream's async_write_some function, and is known as a composed operation. The program must ensure that the stream performs no other write operations (such as async_write, the stream's async_write_some function, or any other composed operations that perform writes) until this operation completes.
In async_write_some's documentation there is no such caution.
That is a little confusing to me, and I have the following questions:
1. Is it safe to call async_write_some without waiting for the previous call to finish? As far as I understood from Boost's documentation I shouldn't do that with async_write, but what about async_write_some?
2. If yes, is the data written to the socket in the same order in which the functions were called? I mean, if I call async_write_some(packet1) and then async_write_some(packet2), are the packets going to be written to the socket in the same order?
3. Which function should I use? What is the difference between them?
4. What is the reason that it's not safe to call async_write while the previous one hasn't finished yet?
1. No; the reason for that is probably documented with the underlying sockets API (BSD/WinSock).
2. Not applicable. Note that the order in which handlers are invoked is guaranteed to match the order in which they were posted, so you could solve it with an asynchronous chain of async_write_some calls in which each completion handler posts the next write (see the sketch below). This is known as an implicit strand (see https://www.boost.org/doc/libs/master/doc/html/boost_asio/overview/core/async.html and Why do I need strand per connection when using boost::asio?).
3. 99% of the time, use the free function. The difference is that it implements a composed operation that sends a "unit" of information, i.e. an entire buffer or message, or writes until a given completion condition is met. async_write_some is the lowest-level building block, which does not even guarantee to write all of the data; its remarks state:
The write operation may not transmit all of the data to the peer. Consider using the async_write function if you need to ensure that all data is written before the asynchronous operation completes.
4. It's not unsafe¹ in the strictest sense. It just will not lead to correct results: the order in which handlers are invoked causes data to be written to the socket in mixed-up order.
¹ (unless you access the shared IO objects concurrently without synchronization)
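For illustration, here is a minimal sketch of such a chain under the question's single-threaded assumption. The m_queue member and the packet_ptr alias are invented for the sketch, and lifetime handling (e.g. shared_from_this) is omitted for brevity:

#include <boost/asio.hpp>
#include <deque>
#include <memory>
#include <vector>

using packet_ptr = std::shared_ptr<std::vector<char>>; // illustrative alias

class tcp_connection
{
public:
    explicit tcp_connection(boost::asio::io_context& io) : m_socket(io) {}

    // Called only from the one thread running io_context::run().
    void write(packet_ptr packet)
    {
        m_queue.push_back(packet);
        if (m_queue.size() == 1) // no write in flight: start the chain
            write_front();
    }

private:
    void write_front()
    {
        boost::asio::async_write(m_socket, boost::asio::buffer(*m_queue.front()),
            [this](const boost::system::error_code& ec, std::size_t /*n*/)
            {
                m_queue.pop_front();
                if (!ec && !m_queue.empty())
                    write_front(); // the completion handler posts the next write
            });
    }

    boost::asio::ip::tcp::socket m_socket;
    std::deque<packet_ptr> m_queue;
};

Since only the completion handler initiates the next async_write, writes never overlap and packets go out in the order write() was called.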

How strands guarantee correct execution of pending events in boost.asio

Consider an echo server implemented using Boost.Asio. Read events from connected clients result in blocks of data being placed on an arrival event queue. A pool of threads works through these events: for each event, a thread takes the data in the event and echoes it back to the connected client.
There could be multiple events in the event queue, all from a single client. In order to ensure that the events for a given client are executed and delivered in order, strands are used. In this case, all events from a given connected client will be executed in a strand for that client.
My question is: how do strands guarantee the correct order of processing of events? I presume there must be some kind of lock-per-strand, but even that won't be sufficient, so there must be more to it, and I was hoping someone could explain it or point me to some code which does this.
I found this document:
How strands work and why you should use them
It sheds some light on the mechanism, but says that in a strand "Handler execution order is not guaranteed". Does that mean that we could end up receiving back "Strawberry forever. fields"?
Also, whenever a new client connects, do we have to create a new strand, so that there is one strand per client?
Finally, when a read event arrives, how do we know which strand to add it to? Does the strand have to be looked up from all strands using the connection as a key?
strand provides a guarantee for non-concurrency and the invocation order of handlers; strand does not control the order in which operations are executed and demultiplexed. Use a strand if you have either:
multiple threads accessing a shared object that is not thread safe
a need for a guaranteed sequential ordering of handlers
The io_service will fill or use buffers in the order in which the operations were initiated, as desired and expected. For instance, if the socket has "Strawberry fields forever." available to be read, then given:
buffer1.resize(11); // buffer is a std::vector managed elsewhere
buffer2.resize(7); // buffer is a std::vector managed elsewhere
buffer3.resize(8); // buffer is a std::vector managed elsewhere
socket.async_read_some(boost::asio::buffer(buffer1), handler1);
socket.async_read_some(boost::asio::buffer(buffer2), handler2);
socket.async_read_some(boost::asio::buffer(buffer3), handler3);
When the operations complete:
handler1 is invoked, buffer1 will contain "Strawberry "
handler2 is invoked, buffer2 will contain "fields "
handler3 is invoked, buffer3 will contain "forever."
However, the order in which the completion handlers are invoked is unspecified. This unspecified ordering remains true even with a strand.
Operation Demultiplexing
Asio uses the Proactor design pattern [1] to demultiplex operations. On most platforms, this is implemented in terms of a Reactor. The official documentation mentions the components and their responsibilities. Consider the following example:
socket.async_read_some(buffer, handler);
The caller is the initiator, starting an async_read_some asynchronous operation and creating the handler completion handler. The asynchronous operation is executed by the StreamSocketService operation processor:
Within the initiating function, if the socket has no other outstanding asynchronous read operations and data is available, then StreamSocketService will read from the socket and enqueue the handler completion handler into the io_service
Otherwise, the read operation is queued onto the socket, and the reactor is informed to notify Asio once data becomes available on the socket. When the io_service is run and data is available on the socket, the reactor will inform Asio. Next, Asio will dequeue an outstanding read operation from the socket, execute it, and enqueue the handler completion handler into the io_service
The io_service proactor will dequeue a completion handler, demultiplex the handler to threads that are running the io_service, from which the handler completion handler will be executed. The order of invocation of the completion handlers is unspecified.
Multiple Operations
If multiple operations of the same type are initiated on a socket, it is currently unspecified as to the order in which the buffers will be used or filled. However, in the current implementation, each socket uses a FIFO queue for each type of pending operation (e.g. a queue for read operations; a queue for write operations; etc). The networking-ts draft, which is based partially on Asio, specifies:
the buffers are filled in the order in which these operations were issued. The order of invocation of the completion handlers for these operations is unspecified.
Given:
socket.async_read_some(buffer1, handler1); // op1
socket.async_read_some(buffer2, handler2); // op2
As op1 was initiated before op2, then buffer1 is guaranteed to contain data that was received earlier in the stream than the data contained in buffer2, but handler2 may be invoked before handler1.
Composed Operations
Composed operations are composed of zero or more intermediate operations. For example, the async_read() composed asynchronous operation is composed of zero or more intermediate stream.async_read_some() operations.
The current implementation uses operation chaining to create a continuation, where a single async_read_some() operation is initiated, and within its internal completion handler, it determines whether or not to initiate another async_read_some() operation or to invoke the user's completion handler. Because of the continuation, the async_read documentation requires that no other reads occur until the composed operation completes:
The program must ensure that the stream performs no other read operations (such as async_read, the stream's async_read_some function, or any other composed operations that perform reads) until this operation completes.
If a program violates this requirement, one may observe interwoven data, because of the aforementioned order in which buffers are filled.
For a concrete example, consider the case where an async_read() operation is initiated to read 26 bytes of data from a socket:
buffer.resize(26); // buffer is a std::vector managed elsewhere
boost::asio::async_read(socket, boost::asio::buffer(buffer), handler);
If the socket receives "Strawberry ", "fields ", and then "forever.", then the async_read() operation may be composed of one or more socket.async_read_some() operations. For instance, it could be composed of 3 intermediate operations:
The first async_read_some() operation reads 11 bytes containing "Strawberry " into the buffer starting at an offset of 0. The completion condition of reading 26 bytes has not been satisfied, so another async_read_some() operation is initiated to continue the operation
The second async_read_some() operation reads 7 bytes containing "fields " into the buffer starting at an offset of 11. The completion condition of reading 26 bytes has not been satisfied, so another async_read_some() operation is initiated to continue the operation
The third async_read_some() operation reads 8 bytes containing "forever." into the buffer starting at an offset of 18. The completion condition of reading 26 bytes has been satisfied, so handler is enqueued into the io_service
When the handler completion handler is invoked, buffer contains "Strawberry fields forever."
Strand
strand is used to provide serialized execution of handlers in a guaranteed order. Given:
a strand object s
a function object f1 that is added to strand s via s.post(), or s.dispatch() when s.running_in_this_thread() == false
a function object f2 that is added to strand s via s.post(), or s.dispatch() when s.running_in_this_thread() == false
then the strand provides a guarantee of ordering and non-concurrency, such that f1 and f2 will not be invoked concurrently. Furthermore, if the addition of f1 happens before the addition of f2, then f1 will be invoked before f2.
With:
auto wrapped_handler1 = strand.wrap(handler1);
auto wrapped_handler2 = strand.wrap(handler2);
socket.async_read_some(buffer1, wrapped_handler1); // op1
socket.async_read_some(buffer2, wrapped_handler2); // op2
As op1 was initiated before op2, then buffer1 is guaranteed to contain data that was received earlier in the stream than the data contained in buffer2, but the order in which the wrapped_handler1 and wrapped_handler2 will be invoked is unspecified. The strand guarantees that:
handler1 and handler2 will not be invoked concurrently
if wrapped_handler1 is invoked before wrapped_handler2, then handler1 will be invoked before handler2
if wrapped_handler2 is invoked before wrapped_handler1, then handler2 will be invoked before handler1
Similar to the composed operation implementation, the strand implementation uses operation chaining to create a continuation. The strand manages all handlers posted to it in a FIFO queue. When the queue is empty and a handler is posted to the strand, the strand posts an internal handler to the io_service. Within the internal handler, a handler is dequeued from the strand's FIFO queue and executed; then, if the queue is not empty, the internal handler posts itself back to the io_service.
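To make the mechanism concrete, here is a simplified, illustrative serializer in the spirit of that description. It is not Asio's actual implementation; the toy_strand name and its members are invented:

#include <boost/asio.hpp>
#include <deque>
#include <functional>
#include <mutex>

// A toy serializer mimicking the strand's FIFO-queue-plus-continuation scheme.
class toy_strand
{
public:
    explicit toy_strand(boost::asio::io_service& io) : io_(io) {}

    void post(std::function<void()> handler)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push_back(std::move(handler));
        if (!running_) // queue was idle: schedule the internal handler
        {
            running_ = true;
            io_.post([this] { run_one(); });
        }
    }

private:
    void run_one()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        std::function<void()> handler = std::move(queue_.front());
        queue_.pop_front();
        lock.unlock();

        handler(); // never concurrent: only one run_one is ever in flight

        lock.lock();
        if (queue_.empty())
            running_ = false;                // go idle
        else
            io_.post([this] { run_one(); }); // continuation: repost itself
    }

    boost::asio::io_service& io_;
    std::mutex mutex_;
    std::deque<std::function<void()>> queue_;
    bool running_ = false;
};

Real strands also support dispatch() and handler-invocation hooks, but this queue-plus-repost continuation is the core idea behind the ordering and non-concurrency guarantees.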
Consider reading this answer to find out how a composed operation uses asio_handler_invoke() to wrap intermediate handlers within the same context (i.e. strand) of the completion handler. The implementation details can be found in the comments on this question.
1. [POSA2] D. Schmidt et al, Pattern Oriented Software Architecture, Volume 2. Wiley, 2000.
A strand is an execution context which executes handlers within a critical section, on a correct thread.
That critical section is implemented (more or less) with a mutex.
It's a little cleverer than that, because if a dispatcher detects that a thread is already in the strand, it appends the handler to a queue of handlers to be executed after the current handler has completed but before the critical section has been left.
Thus, in this case, the new handler is 'sort of' posted to the currently executing thread.
There are some guarantees in ordering:
strand::post/dispatch(x);
strand::post/dispatch(y);
will always result in x happening before y. But if x dispatches a handler z during its execution, then the execution order will be: x, z, y.
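A small demonstration of that ordering, assuming a single thread runs the io_service:

#include <boost/asio.hpp>
#include <iostream>

int main()
{
    boost::asio::io_service io;
    boost::asio::io_service::strand strand(io);

    strand.post([&strand]
    {
        std::cout << "x\n";
        // Dispatched while already running in the strand, so z is not
        // delayed behind y.
        strand.dispatch([] { std::cout << "z\n"; });
    });
    strand.post([] { std::cout << "y\n"; });

    io.run(); // prints x, z, y
}

Whether z runs inline within x or immediately after it, the observable order remains x, z, y.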
Note that the idiomatic way to handle IO completion handlers with strands is not to post work to a strand from the completion handler, but to wrap the completion handler in the strand and do the work there.
Asio contains code to detect this and will do the right thing, ensuring correct ordering and eliding unnecessary intermediate posts.
e.g.:
async_read(sock, buf, mystrand.wrap([](const auto& ec, auto transferred)
{
    // this code runs in the correct strand, in the correct order
}));

is it valid to async send data before completion handler of the previous one was invoked?

I'm sending data asynchronously to TCP socket. Is it valid to send the next data piece before the previous one was reported as sent by completion handler?
As far as I know it's not allowed when the sending is done from different threads. In my case all sending is done from the same thread.
Different modules of my client send data to the same socket. E.g. module1 sent some data and will continue when the corresponding completion handler is invoked. Before that, the io_service invoked module2's deadline_timer handler, which leads to another async_write call. Should I expect any problems here?
Is it valid to send the next data piece before the previous one was reported as sent by completion handler?
No, it is not valid to interleave write operations. This is very clear in the documentation:
This operation is implemented in terms of zero or more calls to the stream's async_write_some function, and is known as a composed operation. The program must ensure that the stream performs no other write operations (such as async_write, the stream's async_write_some function, or any other composed operations that perform writes) until this operation completes.
(emphasis added by me)
As far as I know it's not allowed when the sending is done from different threads. In my case all sending is done from the same thread.
Your problem has nothing to do with threads.
Yes, you can do that, as long as the underlying memory (the buffer) is not modified until the write handler is called. Calling async_write means you hand over ownership of the buffer to Asio; when the write handler is called, ownership is given back to you.
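One common way to meet that lifetime requirement is to tie the buffer to the completion handler with a shared_ptr. A minimal sketch (the send_copy name is invented); note that the single-outstanding-write requirement from the answer above still applies:

#include <boost/asio.hpp>
#include <memory>
#include <string>

// Copies the payload into a heap buffer whose lifetime is tied to the
// completion handler, so the memory stays valid until the write finishes.
void send_copy(boost::asio::ip::tcp::socket& socket, const std::string& payload)
{
    auto buffer = std::make_shared<std::string>(payload);
    boost::asio::async_write(socket, boost::asio::buffer(*buffer),
        [buffer](const boost::system::error_code& ec, std::size_t /*n*/)
        {
            // 'buffer' is released here, after Asio is done with it.
        });
}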

concurrent async_write. is there a wait-free solution?

async_write() is forbidden to be called concurrently from different threads. It sends data by chunks using async_write_some and such chunks can be interleaved. So it is up to the user to take care of not calling async_write() concurrently.
Is there a nicer solution than this pseudocode?
void send(shared_ptr<char> p) {
    boost::mutex::scoped_lock lock(m_write_mutex);
    async_write(p, handler);
}
I do not like the idea of blocking other threads for quite a long time (there are ~50Mb sends in my application).
Maybe something like this would work?
void handler(const boost::system::error_code& e) {
    if (!e) {
        bool empty = lockfree_pop_front(m_queue);
        if (!empty) {
            shared_ptr<char> p = lockfree_queue_get_first(m_queue);
            async_write(p, handler);
        }
    }
}

void send(shared_ptr<char> p) {
    bool q_was_empty = lockfree_queue_push_back(m_queue, p);
    if (q_was_empty)
        async_write(p, handler);
}
I'd prefer to find a ready-to-use cookbook recipe; dealing with lock-free code is not easy, and a lot of subtle bugs can appear.
async_write() is forbidden to be called concurrently from different threads
This statement is not quite correct. Applications can freely invoke async_write concurrently, as long as the calls are on different socket objects.
Is there a nicer solution than this pseudocode?
void send(shared_ptr<char> p) {
    boost::mutex::scoped_lock lock(m_write_mutex);
    async_write(p, handler);
}
This likely isn't accomplishing what you intend, since async_write returns immediately. If you intend the mutex to be locked for the entire duration of the write operation, you will need to keep the scoped_lock in scope until the completion handler is invoked.
There are nicer solutions for this problem; the library has built-in support via the concept of a strand, which fits this scenario nicely.
A strand is defined as a strictly sequential invocation of event handlers (i.e. no concurrent invocation). Use of strands allows execution of code in a multithreaded program without the need for explicit locking (e.g. using mutexes).
Using an explicit strand here will ensure your handlers are only invoked by a single thread that has invoked io_service::run(). With your example, the m_queue member would be protected by a strand, ensuring atomic access to the outgoing message queue. After adding an entry to the queue, if the size is 1, it means no outstanding async_write operation is in progress and the application can initiate one, wrapped through the strand. If the queue size is greater than 1, the application should wait for the async_write to complete. In the async_write completion handler, pop an entry off the queue and handle any errors as necessary; if the queue is not empty, the completion handler should initiate another async_write from the front of the queue.
This is a much cleaner design than sprinkling mutexes throughout your classes, since it uses the built-in Asio constructs as they are intended. This other answer I wrote has some code implementing this design.
We've solved this problem by having a separate queue of data to be written, held in our socket object. When the first piece of data to be written is queued, we start an async_write(). In our async_write completion handler, we start subsequent async_write operations if there is still data to be transmitted.

About write buffer in general network programming

I'm writing a server using Boost.Asio. I have a read and a write buffer for each connection and use the asynchronous read/write functions (async_write_some / async_read_some).
With the read buffer and async_read_some there's no problem: just invoking async_read_some is okay, because the read buffer is only used in the read handler (which usually means the same thread).
But the write buffer needs to be accessed from several threads, so it needs to be locked for modification.
FIRST QUESTION!
Is there any way to avoid a LOCK for the write buffer?
I write my own packet into a stack buffer and copy it to the write buffer, then call async_write_some to send the packet. If I send two packets in series this way, is it okay to invoke async_write_some twice?
SECOND QUESTION!
What is the common way to do asynchronous writing in socket programming?
Thanks for reading.
Sorry, but you have two choices:
1. Serialise the write statements, either with locks, or better: start a separate writer thread which reads requests from a queue; other threads can then stack up requests on the queue without too much contention (some mutexing would be required).
2. Give each writing thread its own socket! This is actually the better solution if the program at the other end of the wire can support it.
Answer #1:
You are correct that locking is a viable approach, but there is a much simpler way to do all of this. Boost has a nice little construct in ASIO called a strand. Any callback that has been wrapped using the strand will be serialized, guaranteed, no matter which thread executes the callback. Basically, it handles any locking for you.
This means that you can have as many writers as you want, and if they are all wrapped by the same strand (so, share your single strand among all of your writers) they will execute serially. One thing to watch out for is to make sure that you aren't trying to use the same actual buffer in memory for doing all of the writes. For example, this is what to avoid:
char buffer_to_write[256]; // shared among threads
/* ... in thread 1 ... */
memcpy(buffer_to_write, packet_1, std::min(sizeof(packet_1), sizeof(buffer_to_write)));
my_socket.async_write_some(boost::asio::buffer(buffer_to_write, sizeof(buffer_to_write)), &my_callback);
/* ... in thread 2 ... */
memcpy(buffer_to_write, packet_2, std::min(sizeof(packet_2), sizeof(buffer_to_write)));
my_socket.async_write_some(boost::asio::buffer(buffer_to_write, sizeof(buffer_to_write)), &my_callback);
There, you're sharing your actual write buffer (buffer_to_write). If you did something like this instead, you'll be okay:
/* A utility class that you can use */
class PacketWriter
{
private:
    typedef std::vector<char> buffer_type;

    static void WriteIsComplete(boost::shared_ptr<buffer_type> op_buffer,
                                const boost::system::error_code& error,
                                std::size_t bytes_transferred)
    {
        // Handle your write completion here. op_buffer keeps the data
        // alive until this point.
    }

public:
    template<class IO>
    static bool WritePacket(const std::vector<char>& packet_data, IO& asio_object)
    {
        boost::shared_ptr<buffer_type> op_buffer(new buffer_type(packet_data));
        if (!op_buffer)
        {
            return false;
        }
        asio_object.async_write_some(boost::asio::buffer(*op_buffer),
            boost::bind(&PacketWriter::WriteIsComplete, op_buffer,
                        boost::asio::placeholders::error,
                        boost::asio::placeholders::bytes_transferred));
        return true;
    }
};
/* ... in thread 1 ... */
PacketWriter::WritePacket(packet_1, my_socket);
/* ... in thread 2 ... */
PacketWriter::WritePacket(packet_2, my_socket);
Here, it would help if you passed your strand into WritePacket as well. You get the idea, though.
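For instance, a sketch of that variation, adding a strand parameter (not present in the answer's class) and wrapping the completion handler in it:

template<class IO>
static bool WritePacket(const std::vector<char>& packet_data, IO& asio_object,
                        boost::asio::io_service::strand& strand)
{
    boost::shared_ptr<buffer_type> op_buffer(new buffer_type(packet_data));
    asio_object.async_write_some(boost::asio::buffer(*op_buffer),
        strand.wrap(boost::bind(&PacketWriter::WriteIsComplete, op_buffer,
                                boost::asio::placeholders::error,
                                boost::asio::placeholders::bytes_transferred)));
    return true;
}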
Answer #2:
I think you are already taking a very good approach. One suggestion I would offer is to use async_write instead of async_write_some so that you are guaranteed the whole buffer is written before your callback gets called.
You could queue your modifications and perform them on the data in the write handler.
The network would most probably be the slowest part of the pipe (assuming your modifications are not computationally expensive), so you can perform modifications while the socket layer is sending the previous data.
In case you are handling a large number of clients with frequent connects/disconnects, take a look at IO completion ports or similar mechanisms.