Boost ASIO TCP separation of messages - c++

I just started working with the Boost ASIO library, version 1.52.0. I am using TCP/SSL encryption with async sockets. From other questions asked here about ASIO, it seems that ASIO does not support receiving a variable length message and then passing the data for that message to a handler.
I'm guessing that ASIO puts the data into a cyclical buffer and loses all track of each separate message. If I have missed something and ASIO does provide a way to pass individual messages, then please advise as to how.
My question is: assuming I can't somehow obtain just the bytes associated with an individual message, can I use transfer_exactly in async_read to obtain just the first 4 bytes, in which our protocol always places the length of the message, and then call either read or async_read (if read won't work with async sockets) to read in the rest of the message? Will this work? Are there better ways to do it?
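To illustrate, here is a rough sketch of the header-then-body read I have in mind (names are made up: m_Header would be a 4-byte array, m_Body a std::vector<char> and m_Socket the async SSL stream):
void Session::readNextMessage()
{
    // read exactly the 4-byte length prefix...
    boost::asio::async_read(m_Socket,
        boost::asio::buffer(m_Header),
        boost::asio::transfer_exactly(m_Header.size()),
        [this](const boost::system::error_code& ec, std::size_t)
        {
            if (ec)
                return;
            uint32_t bodyLength = 0;
            std::memcpy(&bodyLength, m_Header.data(), sizeof(bodyLength)); // adjust for byte order
            m_Body.resize(bodyLength);
            // ...then exactly that many body bytes
            boost::asio::async_read(m_Socket,
                boost::asio::buffer(m_Body),
                boost::asio::transfer_exactly(bodyLength),
                [this](const boost::system::error_code& ec2, std::size_t)
                {
                    if (ec2)
                        return;
                    handleMessage(m_Body);  // one complete message
                    readNextMessage();      // start on the next header
                });
        });
}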

Typically I like to take the data I receive in an async_read and put it in a boost::circular_buffer and then let my message parser layer decide when a message is complete and pull the data out.
http://www.boost.org/doc/libs/1_52_0/libs/circular_buffer/doc/circular_buffer.html
Partial code snippets below
boost::circular_buffer<char> TCPSessionThread::m_CircularBuffer(BUFFSIZE);

void TCPSessionThread::handle_read(const boost::system::error_code& e, std::size_t bytes_transferred)
{
    // ignore aborts - they are a result of our actions like stopping
    if (e == boost::asio::error::operation_aborted)
        return;
    if (e == boost::asio::error::eof)
    {
        m_Socket.close();
        m_IoService.stop();
        return;
    }
    // if there is not room in the circular buffer to hold the new data then warn of overflow error
    if (m_CircularBuffer.reserve() < bytes_transferred)
    {
        ERROR_OCCURRED("Buffer Overflow");
        m_CircularBuffer.clear();
    }
    // now place the new data in the circular buffer (overwrite old data if needed)
    // note: if data copying is too expensive you could read directly into
    // the circular buffer with a little extra effort
    const char* pData = m_pBuffer->data(); // the buffer handed to async_read_some below
    m_CircularBuffer.insert(m_CircularBuffer.end(), pData, pData + bytes_transferred);
    boost::shared_ptr<MessageParser> pParser = m_pParser.lock(); // lock the weak pointer
    if (pParser && bytes_transferred)
        pParser->HandleInboundPacket(m_CircularBuffer); // takes a reference so that the parser can consume data from the circ buf
    // start the next read
    m_Socket.async_read_some(boost::asio::buffer(*m_pBuffer),
        boost::bind(&TCPSessionThread::handle_read, this,
            boost::asio::placeholders::error,
            boost::asio::placeholders::bytes_transferred));
}

Related

Receiving large binary data over Boost::Beast websocket

I am trying to receive a large amount of data using a boost::beast::websocket, fed by another boost::beast::websocket. Normally, this data is sent to a connected browser but I'd like to set up a purely C++ unit test validating certain components of the traffic. I set the auto fragmentation to true from the sender with a max size of 1MB but after a few messages, the receiver spits out:
Read 258028 bytes of binary
Read 1547176 bytes of binary
Read 168188 bytes of binary
"Failed read: The WebSocket message exceeded the locally configured limit"
Now, I have no expectation that my possibly poorly architected unit test will exhibit the same characteristics as a fully developed and well supported browser, and it does not: the browser has no issue reading 25 MB messages over the websocket, while my boost::beast::websocket hits a limit.
So before I go down a rabbit hole, I'd like to see if anyone has any thoughts on this. My read section looks like this:
void on_read(boost::system::error_code ec, std::size_t bytes_transferred)
{
    boost::ignore_unused(bytes_transferred);
    if (ec)
    {
        m_log.error("Failed read: " + ec.message());
        // Stop the websocket
        stop();
        return;
    }
    std::string data(boost::beast::buffers_to_string(m_buffer.data()));
    // Yes I know this looks dangerous. The sender always sends as binary but occasionally sends JSON
    if (data.at(0) == '{')
        m_log.debug("Got message: " + data);
    else
        m_log.debug("Read " + utility::to_string(m_buffer.size()) + " bytes of binary data");
    // Do the things with the incoming data
    for (auto&& callback : m_read_callbacks)
        callback(data);
    // Toss the data
    m_buffer.consume(bytes_transferred);
    // Wait for some more data
    m_websocket.async_read(
        m_buffer,
        std::bind(
            &WebsocketClient::on_read,
            shared_from_this(),
            std::placeholders::_1,
            std::placeholders::_2));
}
I saw in a separate example that instead of doing an async read, you can do a for/while loop reading some data until the message is done (https://www.boost.org/doc/libs/1_67_0/libs/beast/doc/html/beast/using_websocket/send_and_receive_messages.html). Would this be the right approach for an always open websocket that could send some pretty massive messages? Would I have to send some indicator to the client that the message is indeed done? And would I run into the exceeded buffer limit issue using this approach?
If your use pattern is fixed:
std::string data(boost::beast::buffers_to_string(m_buffer.data()));
And then, in particular
callback(data);
Then there is no point in reading block-wise, since you will be allocating the same memory anyway. Instead, you can raise the "locally configured limit":
ws.read_message_max(20ull << 20); // sets the limit to 20 MiB
The default value is 16 MiB (as of Boost 1.75).
Side Note
You can probably also use ws.got_binary() to detect whether the last message received was binary or not.
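For example, with the members from your snippet, something along these lines should work (check got_binary() before consuming the buffer):
if (m_websocket.got_binary())
    m_log.debug("Read " + utility::to_string(m_buffer.size()) + " bytes of binary data");
else
    m_log.debug("Got message: " + boost::beast::buffers_to_string(m_buffer.data()));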

Truncated data (if more than 512 bytes) when using boost::asio::async_read_until from serial port

I'm using the boost::asio::async_read_until function to read from a serial port in Windows 10. The delimiter is a regex pattern. It works as expected as long as the data received is not larger than 512 bytes.
If the data received is larger than 512 bytes, it is simply truncated and the "readComplete" function is not called again. However, if I then send more data (1 byte is enough), the missing data is received together with the new data.
I have used the same implementation on a tcp/socket and that works flawlessly. Is there any limit in the native serial interface in Windows causing this behaviour?
EDIT 1: I have noted that if the baud rate is lowered from 115200 to 28800 no data is missing.
// from .h-file: boost::asio::streambuf streamBuf_;
void RS232Instrument::readAsyncChars()
{
    boost::asio::async_read_until(
        serial_,
        streamBuf_,
        boost::regex(regexStr_.substr(6, regexStr_.length() - 7)),
        boost::bind(
            &RS232Instrument::readComplete,
            this,
            boost::asio::placeholders::error,
            boost::asio::placeholders::bytes_transferred));
}

void RS232Instrument::readComplete(const boost::system::error_code& error, size_t bytes_transferred)
{
    if(error)
    {
        // Error handling
    }
    else
    {
        std::string rawStr(
            boost::asio::buffers_begin(streamBuf_.data()),
            boost::asio::buffers_begin(streamBuf_.data()) + bytes_transferred);
        // Log the data in rawStr....
        // Remove data from beginning until all data sent to log
        streamBuf_.consume(bytes_transferred);
        if(abort_ == false)
        {
            readAsyncChars();
        }
    }
}
Since I have found out what caused this problem, I'll answer the question myself.
I had left out some code above for the sake of clarity, code which I did not realise was actually the problem.
Example of code left out:
LOG_DEBUG("Rs232Data received");
I use the boost::log functionality and I have added more "sinks" to the log framework. The sink used in this case logs to a vector in RAM and prints to the console when triggered by user input.
It turns out that the log framework consumes about 1 ms before the "consume" function in the sink is called. That is enough to cause loss of data from the serial port when using async_read_until.
Lesson learned: do not perform any time-consuming work in the handler function of async_read_until.
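For illustration, one way to keep the handler cheap is to hand the logging off to an io_context running on its own thread (a rough sketch; logContext_ is a made-up member and boost::asio::post needs Boost 1.66 or newer):
void RS232Instrument::readComplete(const boost::system::error_code& error, size_t bytes_transferred)
{
    if(error)
        return; // error handling omitted
    std::string rawStr(
        boost::asio::buffers_begin(streamBuf_.data()),
        boost::asio::buffers_begin(streamBuf_.data()) + bytes_transferred);
    streamBuf_.consume(bytes_transferred);
    // defer the slow logging so the next read can be started immediately
    boost::asio::post(logContext_, [s = std::move(rawStr)]()
    {
        LOG_DEBUG("Rs232Data received");
        // ...log / inspect s here, off the serial read path
    });
    if(abort_ == false)
        readAsyncChars();
}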

How can I convert serialized data in boost::beast to a string so that I could process it in a FIFO manner?

I have a client application where I need to receive HTTP "long running requests" from a server. I send a command, and after getting the header of the response, I have to just receive JSON data separated by \r\n until the connection is terminated.
I managed to adapt the Boost.Beast client example to send the message, then receive and parse the header, and receive responses from the server. However, I failed at finding a way to serialize the data so that I could process the JSON messages.
The closest demonstration of the problem can be found in this relay example. In that example (p is a parser, sr is a serializer, input is a socket input stream and output is a socket output stream), after reading the HTTP header, we have a loop that reads continuously from the server:
do
{
    if(! p.is_done())
    {
        // Set up the body for writing into our small buffer
        p.get().body().data = buf;
        p.get().body().size = sizeof(buf);
        // Read as much as we can
        read(input, buffer, p, ec);
        // This error is returned when buffer_body uses up the buffer
        if(ec == error::need_buffer)
            ec = {};
        if(ec)
            return;
        // Set up the body for reading.
        // This is how much was parsed:
        p.get().body().size = sizeof(buf) - p.get().body().size;
        p.get().body().data = buf;
        p.get().body().more = ! p.is_done();
    }
    else
    {
        p.get().body().data = nullptr;
        p.get().body().size = 0;
    }
    // Write everything in the buffer (which might be empty)
    write(output, sr, ec);
    // This error is returned when buffer_body uses up the buffer
    if(ec == error::need_buffer)
        ec = {};
    if(ec)
        return;
}
while(! p.is_done() && ! sr.is_done());
A few things I don't understand here:
We're done reading the header. Why do we need Boost.Beast and not Boost.Asio to read a raw TCP message? When I tried to do that (with both async_read/async_read_some) I got infinite reads of zero size.
The documentation of parser says (at the end of the page) that a new instance is needed for every message, but I don't see that in the example.
Since tcp message reading is not working, is there a way to convert the parser/serializer data to some kind of string? Even write it to a text file in a FIFO manner, so that I could process it with some json library? I don't want to use another socket like the example.
The function boost::beast::buffers() failed to compile for the parser and the serializer, and for the parser there's no consume function, and the serializer's consume seems to be for particular http parts of the message, which fires an assert if I do it for body().
Besides that, I also failed at getting consistent chunks of data from the parser and the buffer with old-school std::copy. I don't seem to understand how to combine the data to get the stream of data. Consuming the buffer with .consume() at any point while receiving data leads to a need_buffer error.
I would really appreciate someone explaining the logic of how all this should work together.
Where is buf? You could read directly into the std::string instead. Call string.resize(N), and set the pointer and size in the buffer_body::value_type to string.data() and string.size().
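Roughly like this (an untested sketch, reusing p, buffer, input and ec from the relay example above):
std::string chunk;
chunk.resize(4096);                        // working block, size is arbitrary
p.get().body().data = &chunk[0];
p.get().body().size = chunk.size();
boost::beast::http::read(input, buffer, p, ec);
if(ec == boost::beast::http::error::need_buffer)
    ec = {};                               // expected: the block was simply filled
if(ec)
    return;
// body().size now holds the unused remainder, so keep only what was parsed:
chunk.resize(chunk.size() - p.get().body().size);
// append chunk to your FIFO / split it on "\r\n" and feed it to a JSON library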

Asynchronous processing of streaming HTTP with boost::beast

I'm implementing a client which accesses a REST endpoint and then begins processing an SSE stream and monitoring events as they occur. To this end, I'm using Boost::Beast version 124 with Boost 1.63 and attempting to use async_read_some to incrementally read the body of the response.
Here's my code thus far:
namespace http = boost::beast::http;

http::response_parser<http::string_body> sse_client::m_parser;
http::response<http::string_body> sse_client::m_response;
boost::beast::flat_buffer m_buffer;

void sse_client::monitor_sse()
{
    http::request<http::empty_body> req{http::verb::get, m_target, 11};
    req.set(http::field::host, m_host);
    req.set(http::field::user_agent, BOOST_BEAST_VERSION_STRING);
    req.set(http::field::accept, "text/event-stream");
    http::async_write(m_socket, req,
        std::bind(
            &sse_client::process_sse,
            shared_from_this(),
            std::placeholders::_1,
            std::placeholders::_2));
}

void sse_client::process_sse(boost::system::error_code ec, std::size_t byte_count)
{
    http::read_header(m_socket, m_buffer, m_parser);
    http::async_read_some(m_socket, m_buffer, m_parser,
        std::bind(
            &sse_client::read_event,
            shared_from_this(),
            std::placeholders::_1));
}

void sse_client::read_event(boost::system::error_code ec)
{
    // TODO: process event
    http::async_read_some(m_socket, m_buffer, m_parser,
        std::bind(
            &sse_client::read_event,
            shared_from_this(),
            std::placeholders::_1));
}
My questions are:
Is this the right approach for this particular use case?
Is there a more appropriate type to use with response_parser and response than http::string_body?
When the read_event handler is invoked, how does it access the content retrieved by async_read_some? Should it be pulled from the buffer?
I'll answer your questions first and then provide explanation.
Yes, you want to read the header and then call read_some (or read, see below) until the parser returns true from is_complete(). However, in your code I notice you are mixing synchronous and asynchronous calls (read_header followed by async_read_some). It would be best to stick to just one model instead of mixing them.
For your purposes you probably want buffer_body instead of string_body. There is an example in the documentation which shows how to do this (http://www.boost.org/doc/libs/1_66_0/libs/beast/doc/html/beast/using_http/parser_stream_operations/incremental_read.html)
The "buffer" you refer to is the dynamic buffer argument passed to the HTTP stream operation. While this buffer will hold the message data, it is not for the application to inspect. This buffer is used to hold additional data past the end of the current message that the stream algorithm can read (this is explained in http://www.boost.org/doc/libs/1_66_0/libs/beast/doc/html/beast/using_http/message_stream_operations.html#beast.using_http.message_stream_operations.reading). You will access the content by inspecting the body of the message when using buffer_body
http::response_parser::get() will provide you with access to the message being read in.
The best solution for you is to use buffer_body as in the example, provide an area of memory to point it to and then call read or async_read in a loop. Every time the buffer is full, the read will return with the error beast::http::error::need_buffer, indicating that further calls are required.
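For example, a rough synchronous sketch (m_socket and m_buffer as in your code, but with a buffer_body parser; process_sse_chunk is a placeholder for your event handling):
namespace http = boost::beast::http;

http::response_parser<http::buffer_body> parser;
http::read_header(m_socket, m_buffer, parser);

char chunk[4096];
while(!parser.is_done())
{
    parser.get().body().data = chunk;
    parser.get().body().size = sizeof(chunk);
    boost::system::error_code ec;
    http::read(m_socket, m_buffer, parser, ec);
    if(ec == http::error::need_buffer)
        ec = {};                           // chunk filled, not a real error
    if(ec)
        break;
    std::size_t n = sizeof(chunk) - parser.get().body().size;
    process_sse_chunk(chunk, n);           // handle the bytes received so far
}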
Hope this helps!

boost asio find beginning of message in tcp based protocol

I want to implement a client for a sensor that sends data over tcp and uses the following protocol:
the message-header starts with the byte sequence 0xAFFEC0C2 of type uint32
the header in total is 24 bytes long (including the start sequence) and contains the size in bytes of the message-body as a uint32
the message-body is sent directly after the header and is not terminated by a delimiter
Currently I have the following code (assume a connected socket exists):
typedef unsigned char byte;
boost::system::error_code error;
boost::asio::streambuf buf;
std::string magic_word_s = {static_cast<char>(0xAF), static_cast<char>(0xFE),
                            static_cast<char>(0xC0), static_cast<char>(0xC2)};
ssize_t n = boost::asio::read_until(socket_, buf, magic_word_s, error);
if(error)
    std::cerr << boost::system::system_error(error).what() << std::endl;
buf.consume(n);
n = boost::asio::read(socket_, buf, boost::asio::transfer_exactly(20));
const byte * p = boost::asio::buffer_cast<const byte*>(buf.data());
uint32_t size_of_body = *reinterpret_cast<const uint32_t*>(p);
unfortunately the documentation for read_until remarks:
After a successful read_until operation, the streambuf may contain additional data beyond the delimiter. An application will typically leave that data in the streambuf for a subsequent read_until operation to examine.
which means that I lose synchronization with the described protocol.
Is there an elegant way to solve this?
Well... as it says... you just "leave" it in the object, or temporarily store it in another, and handle the whole message (below called 'packet') once it is complete.
I have a similar approach in one of my projects. I'll explain a little how I did it, that should give you a rough idea how you can handle the packets correctly.
In my read handler (callback) I keep checking whether the packet is complete. The meta-data information (the header, in your case) is temporarily stored in a map associated with the remote partner (map<RemoteAddress, InfoStructure>).
For example it can look like this:
4 byte identifier
4 byte message-length
n byte message
Handle incoming data, check if identifier + message-length are received already, continue to check if message-data is completed with received data.
Leave the rest of the packet in the temporary buffer and erase the old data.
Continue with handling when next packet arrives or check if received data completes next packet already...
This approach may sound a little slow, but even with SSL I get 10 MB/s+ on a slow machine.
Without SSL much higher transfer-rates are possible.
With this approach, you may also take a look into read_some or its asynchronous version.
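For illustration, a rough sketch of that bookkeeping (names are made up; m_Pending is a per-connection std::vector<byte>):
void Connection::onRead(const byte* data, std::size_t n)
{
    // append whatever arrived to the per-connection buffer
    m_Pending.insert(m_Pending.end(), data, data + n);
    for(;;)
    {
        if(m_Pending.size() < 8)
            break;                                           // identifier + length not complete yet
        uint32_t bodyLength = 0;
        std::memcpy(&bodyLength, &m_Pending[4], sizeof(bodyLength)); // adjust for byte order
        if(m_Pending.size() < 8 + bodyLength)
            break;                                           // body not complete yet
        handlePacket(&m_Pending[0], 8 + bodyLength);         // one complete packet
        m_Pending.erase(m_Pending.begin(), m_Pending.begin() + 8 + bodyLength);
    }
}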