How to stop boost::asio async reads from getting mixed up? - c++

I am using boost::asio::ip::tcp::socket to receive data. I need an interface which allows me to specify a buffer and call a completion handler once this buffer is filled asynchronously.
When reading from sockets, we can use the async_read_some method.
However, the async_read_some method may read less than the requested number of bytes, so it must call itself with the rest of the buffer if this happens. Here is my current approach:
template<typename CompletionHandler>
void read(boost::asio::ip::tcp::socket* sock, char* target, size_t size, CompletionHandler completionHandler){
struct ReadHandler {
boost::asio::ip::tcp::socket* sock;
char* target;
size_t size;
CompletionHandler completionHandler;
ReadHandler(ip::tcp::socket* sock, char* target, size_t size, CompletionHandler completionHandler)
: sock(sock),target(target),size(size),completionHandler(completionHandler){}
// either request the remaining bytes or call the completion handler
void operator()(const boost::system::error_code& error, std::size_t bytes_transferred){
if(error){
return;
}
if(bytes_transferred < size){
// Read incomplete
char* newTarg =target+bytes_transferred;
size_t newSize = size-bytes_transferred;
sock->async_read_some(boost::asio::buffer(newTarg, newSize), ReadHandler(sock,newTarg,newSize,completionHandler));
return;
} else {
// Read complete, call handler
completionHandler();
}
}
};
// start first read
sock->async_read_some(boost::asio::buffer(target, size), ReadHandler(this,target,size,completionHandler));
}
So basically, we call async_read_some until the whole buffer is filled, then we call the completion handler. So far so good. However, I think that things get mixed up once I call this method more than once before the first call finishes a receive:
void thisMayFail(boost::asio::ip::tcp::socket* sock){
char* buffer1 = new char[128];
char* buffer2 = new char[128];
read(sock, buffer1, 128,[](){std::cout << "Buffer 1 filled";});
read(sock, buffer2, 128,[](){std::cout << "Buffer 2 filled";});
}
of course, the first 128 received bytes should go into the first buffer and the second 128 should go into the second. But in my understanding, it may be the case that this does not happen here:
Suppose the first async_read_some returns only 70 bytes, then it would issue a second async_read_some with the remaining 58 bytes. However, this read will be queued behind the second 128 byte read(!), so the first buffer will receive the first 70 bytes, the next 128 will go into the second buffer and the final 50 go into the first. I.e., in this case the second buffer would even be filled before the first is filled completely. This may not happen.
How to solve this? I know there is the async_read method, but its documentation says it is simply implemented by calling async_read_some multiple times, so it is basically the same as my read implementation and will not fix the problem.

You simply can't have two asynchronous read operations active at the same time: that's undefined behaviour.
You can
use the free function async_read_until or async_read functions, which already have the higher-level semantics and loop callling the socket's async_read_some until a condition is matched or the buffer is full.
use asynchronous operation chaining to sequence the next async read after the first. In short, you initiate the second boost::asio::async_read* call in the completion handler of the first.
Note:
Gives you the opportunity to act on transport errors first too.
together the free function interface will both raise the abstraction level of the code and solve the problem (the problem was initiating two simultaneous read operations)
use a strand in case you run multiple IO service threads; See Why do I need strand per connection when using boost::asio?

Related

use member function of socket class or common function in Boost::asio?

I am learning Boost::asio socket; I saw some examples, where they use the member function of socket class to read and receive messages, or use the boost::asio common function which passes the socket as the first param.
So I am wondering what the difference is between the two approaches? thanks!
//one kind
tcp::socket sock;
sock.async_write_some(***);
//another kind
boost::asio::async_write(socket,***);
async_write as static function guarantees that all data in buffer
is written before this function returns.
async_write_some as function member guarantees that at least one byte
is written from buffer before this function ends.
So if you want to use async_write_some you need to provide more code
to handle the situation when not all data from buffer was written.
Suppose you have string with 10 bytes, it is your buffer and you want to ensure
all buffer is send:
// Pseudocode:
string str;
// add 10 bytes to str
SIZE = 10;
int writtenBytes = 0;
socket.async_write_some (buffer(str.c_str(),SIZE), makeCallback());
void callback (
error_code ec,
size_t transferredBytes)
{
// check errors
// [1]
writtenBytes += transferredBytes;
if (writtenBytes == SIZE)
return;
// call async_write_some
socket.async_write_some (buffer(str.c_str()+writtenBytes,SIZE-writtenBytes),makeCallback());
}
in callback [1] you need to check how many bytes were written,
if result is different from SIZE you need to call async_write_some again
to send the remainder of data and so on, your callback may be invoked many times.
The use of async_write is simpler:
string str; // add 10 bytes
async_write (buffer(str.c_str(),SIZE),makeCallback());
void callback() {
// check errors
[1]
}
if no errors occured in [1] you know that all data was sent.

How can we sequentially receive multiple data from boost::asio::tcp::ip::read_some calls?

Let us suppose that a client holds two different big objects (in terms of byte size) and serializes those followed by sending the serialized objects
to a server over TCP/IP network connection using boost::asio.
For client side implementation, I'm using boost::asio::write to send binary data (const char*) to the server.
For server side implementation, I'm using read_some rather than boost::asio::ip::tcp::iostream for future improvement for efficiency. I built the following recv function at the server side. The second parameter std::stringstream &is holds a big received data (>65536 bytes) in the end of the function.
When the client side calls two sequential boost::asio::write in order to send two different binary objects separately, the server side sequentially calls two corresponding recv as well.
However, the first recv function absorbs all of two incoming big data while the second call receives nothing ;-(.
I am not sure why this happens and how to solve it.
Since each of two different objects has its own (De)Serialization function, I'd like to send each data separately. In fact, since there are more than 20 objects (not just 2) that have to be sent over the network.
void recv (
boost::asio::ip::tcp::socket &socket,
std::stringstream &is) {
boost::array<char, 65536> buf;
for (;;) {
boost::system::error_code error;
size_t len = socket.read_some(boost::asio::buffer(buf), error);
std::cout << " read "<< len << " bytes" << std::endl; // called multiple times for debugging!
if (error == boost::asio::error::eof)
break;
else if (error)
throw boost::system::system_error(error); // Some other error.
std::stringstream buf_ss;
buf_ss.write(buf.data(), len);
is << buf_ss.str();
}
}
Client main file:
int main () {
... // some 2 different big objects are constructed.
std::stringstream ss1, ss2;
... // serializing bigObj1 -> ss1 and bigObj2-> ss2, where each object is serialized into a string. This is due to the dependency of our using some external library
const char * big_obj_bin1 = reinterpret_cast<const char*>(ss1.str().c_str());
const char * big_obj_bin2 = reinterpret_cast<const char*>(ss2.str().c_str());
boost::system::error_code ignored_error;
boost::asio::write(socket, boost::asio::buffer(big_obj_bin1, ss1.str().size()), ignored_error);
boost::asio::write(socket, boost::asio::buffer(big_obj_bin2, ss2.str().size()), ignored_error);
... // do something
return 0;
}
Server main file:
int main () {
... // socket is generated. (communication established)
std::stringstream ss1, ss2;
recv(socket,ss1); // this guy absorbs all of incoming data
recv(socket,ss2); // this guy receives 0 bytes ;-(
... // deserialization to two bib objects
return 0;
}
recv(socket,ss1); // this guy absorbs all of incoming data
Of course it absorbs everything. You explicitly coded recv to do an infinite loop until eof. That's the end of the stream, which means "whenever the socket is closed on the remote end".
So the essential thing missing from the protocol is framing. The most common way to address it are:
sending data length before data, this way the server knows how much to read
sending a "special sequence" to delimit frames. In text, a common special delimiter would be '\0'. However, for binary data it is (very) hard to arrive at a delimiter that cannot naturally occur in the payload.
Of course, if you know extra characteristics of your payload you can use that. E.g. if your payload is compressed, you know you won't regularly find a block of 512 identical bytes (they would have been compressed). Alternatively you resort to encoding the binary data in ways that removes the ambiguity. yEnc, Base122 et al. come to mind (see Binary Data in JSON String. Something better than Base64 for inspiration).
Notes:
Regardless of that
it's clumsy to handwrite the reading loop. Next it is very unnecessary to do that and also copy the blocks into a stringstream anyways. If you're doing all that copying anyways, just use boost::asio::[async_]read with boost::asio::streambuf directly.
This is clear UB:
const char * big_obj_bin1 = reinterpret_cast<const char*>(ss1.str().c_str());
const char * big_obj_bin2 = reinterpret_cast<const char*>(ss2.str().c_str());
str() returns a temporary copy of the buffer - which not only is wasteful, but means that the const char* are dangling the moment they have been initialized.

C++ weird async behaviour

Note that I'm using boost async, due to the lack of threading classes support in MinGW.
So, I wanted to send a packet every 5 seconds and decided to use boost::async (std::async) for this purpose.
This is the function I use to send the packet (this is actually copying to the buffer and sending in the main application loop - nvm - it's working fine outside async method!)
m_sendBuf = new char[1024]; // allocate buffer
[..]
bool CNetwork::Send(const void* sourceBuffer, size_t size) {
size_t bufDif = m_sendBufSize - m_sendInBufPos;
if (size > bufDif) {
return false;
}
memcpy(m_sendBuf + m_sendInBufPos, sourceBuffer, size);
m_sendInBufPos += size;
return true;
}
Packet sending code:
struct TestPacket {
unsigned char type;
int code;
};
void SendPacket() {
TestPacket myPacket{};
myPacket.type = 10;
myPacket.code = 1234;
Send(&TestPacket, sizeof(myPacket));
}
Async code:
void StartPacketSending() {
SendPacket();
std::this_thread::sleep_for(std::chrono::seconds{5});
StartPacketSending(); // Recursive endless call
}
boost::async(boost::launch::async, &StartPacketSending);
Alright. So the thing is, when I call SendPacket() from the async method, received packet is malformed on the server side and the data is different than specified. This doesn't happend when called outside the async call.
What is going on here? I'm out of ideas.
I think I have my head wrapped around what you are doing here. You are loading all unsent in to buffer in one thread and then flushing it in a different thread. Even thought the packets aren't overlapping (assuming they are consumed quickly enough), you still to synchronize all the shared data.
m_sendBuf, m_sendInPos, and m_sendBufSize are all being read from the main thread, likely while memcpy or your buffer size logic is running. I suspect you will have to use a proper queue to get your program to work as intended in the long run, but try protecting those variables with a mutex.
Also as other commenters have pointed out, infinite recursion is not supported in C++, but that probably does not contribute to your malformed packets.

boost read_some function lost data

I'm implementing a tcp server with boost asio library.
In the server, I use asio::async_read_some to get data, and use asio::write to write data. The server code is something like that.
std::array<char, kBufferSize> buffer_;
std::string ProcessMessage(const std::string& s) {
if (s == "msg1") return "resp1";
if (s == "msg2") return "resp2";
return "";
}
void HandleRead(const boost::system::error_code& ec, size_t size) {
std::string message(buffer_.data(), size);
std::string resp = ProcessMessage(message);
if (!resp.empty()) {
asio::write(socket, boost::asio::buffer(message), WriteCallback);
}
socket.async_read_some(boost::asio::buffer(buffer_));
}
Then I write a client to test the server, the code is something like
void MessageCallback(const boost::system::error_code& ec, size_t size) {
std::cout << string(buffer_.data(), size) << std::endl;
}
//Init socket
asio::write(socket, boost::asio::buffer("msg1"));
socket.read_some(boost::asio::buffer(buffer_), MessageCallback);
// Or async_read
//socket.async_read_some(boost::asio::buffer(buffer_), MessageCallback);
asio::write(socket, boost::asio::buffer("msg1"));
socket.read_some(boost::asio::buffer(buffer_), MessageCallback);
// Or async_read
//socket.async_read_some(boost::asio::buffer(buffer_), MessageCallback);
If I run the client, the code will be waiting at second read_some, and output is:resp1.
If I remove the first read_some, the ouput is resp1resp2, that means the server done the right thing.
It seems the first read_some EAT the second response but don't give the response to MessageCallback function.
I've read the quesion at What is a message boundary?, I think if this problem is a "Message Boundary" problem, the second read_some should print something as the first read_some only get part of stream from the tcp socket.
How can I solve this problem?
UPDATE:
I've try to change the size of client buffer to 4, that output will be:
resp
resp
It seems the read_some function will do a little more than read from the socket, I'll read the boost code to find out is that true.
The async_read_some() member function is very likely not doing what you intend, pay special attention to the Remarks section of the documentation
The read operation may not read all of the requested number of bytes.
Consider using the async_read function if you need to ensure that the
requested amount of data is read before the asynchronous operation
completes.
Note that async_read() free function does offer the guarantee that you are looking for
This operation is implemented in terms of zero or more calls to the
stream's async_read_some function, and is known as a composed
operation. The program must ensure that the stream performs no other
read operations (such as async_read, the stream's async_read_some
function, or any other composed operations that perform reads) until
this operation completes.

How can I get the amount of transferred bytes on asynchronous reading boost asio c++

As I've seen in Boost::asio the async read functions does not return the amount of bytes transferred but normal read functions does. How can I get the amount of bytes transferred when I use async_read_some? (Params: buffer, handler)
All forms of async_read expect a "ReadHandler" callback of the form
void handler(
const boost::system::error_code& error, // Result of operation.
std::size_t bytes_transferred // Number of bytes copied into the
// buffers. If an error occurred,
// this will be the number of
// bytes successfully transferred
// prior to the error.
);
The second parameter of your callback will be the number of bytes read.
The asynchronous read functions call a "handler" function (or function object) once the read is complete. The number of bytes transferred is passed to that function; the signature of the function must be:
void handler(
const boost::system::error_code& error, // Result of operation.
std::size_t bytes_transferred // Number of bytes read.
);
The requirements for read handlers are documented here