Sending a large text via Boost ASIO - c++

I am trying to send a very large string to one of my clients. I am mostly following code in HTTP server example: https://www.boost.org/doc/libs/1_78_0/doc/html/boost_asio/examples/cpp11_examples.html
The write callback returns with error code 14, which probably means EFAULT ("bad address") according to this link:
https://mariadb.com/kb/en/operating-system-error-codes/
Note that I could not use the message() member function of error_code to read the error message, because that caused a segmentation fault. (I am using Boost 1.53, and the error might be due to this: https://github.com/boostorg/system/issues/50)
When I try to send small strings, say of size 10, the write callback does not return an error.
Here is how I am using async_write:
void Connection::do_write(const std::string& write_buffer)
{
    auto self(shared_from_this());
    boost::asio::async_write(socket_, boost::asio::buffer(write_buffer, write_buffer.size()),
        [this, self, write_buffer](boost::system::error_code ec, std::size_t transfer_size)
        {
            if (!ec)
            {
            } else {
                // code enters here when I am sending a large text.
                // transfer_size always prints 65535
            }
        });
}
Here is how I am using async_read_some:
void Connection::do_read()
{
    auto self(shared_from_this());
    socket_.async_read_some(boost::asio::buffer(buffer_),
        [this, self](boost::system::error_code ec, std::size_t bytes_transferred)
        {
            if (!ec)
            {
                do_write(VERY_LARGE_STRING);
                do_read();
            } else if (ec != boost::asio::error::operation_aborted) {
                connection_manager_.stop(shared_from_this());
            }
        });
}
What could be causing the write callback to return an error for a large string?

The segfault indicates likely Undefined Behaviour to me.
Of course there's too little code to tell, but one strong smell is that you are using a reference to a non-member as the buffer:
boost::asio::buffer(write_buffer, write_buffer.size())
Besides the fact that this could simply be spelled boost::asio::buffer(write_buffer), there's not much hope that write_buffer stays around for the duration of the asynchronous operation that depends on it. Note that capturing write_buffer by value in the handler does not help: the buffer still points at the caller's string, not at the handler's copy.
As the documentation states:
Although the buffers object may be copied as necessary, ownership of the underlying memory blocks is retained by the caller, which must guarantee that they remain valid until the handler is called.
I would check that you're doing that correctly.
Another potential cause for UB is when you cause overlapping writes on the same socket/stream object:
This operation is implemented in terms of zero or more calls to the stream's async_write_some function, and is known as a composed operation. The program must ensure that the stream performs no other write operations (such as async_write, the stream's async_write_some function, or any other composed operations that perform writes) until this operation completes.
If you have checked both of these causes for concern and still find that something must be wrong, please post a new question including a fully self-contained example (SSCCE or MCVE).
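Both rules quoted above (the memory must outlive the operation, and only one write may be in flight) can be sketched without Boost. This is a minimal illustration under stated assumptions, not the HTTP server example's code: FakeSocket, async_write_stub and Sender are hypothetical stand-ins, and in real code the deferred step would be boost::asio::async_write with a handler taking an error_code and a byte count.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a socket: it only remembers a pointer/size pair
// (like an Asio buffer) and completes the "write" later, when run() is called.
struct FakeSocket {
    struct Pending {
        const char* data;
        std::size_t size;
        std::function<void(std::size_t)> handler;
    };
    std::vector<Pending> pending;
    std::string wire;               // bytes that "reached the peer"
    std::size_t max_in_flight = 0;  // most writes ever outstanding at once

    void async_write_stub(const char* data, std::size_t size,
                          std::function<void(std::size_t)> handler) {
        pending.push_back({data, size, std::move(handler)});
        if (pending.size() > max_in_flight) max_in_flight = pending.size();
    }

    // Plays the role of the event loop: completes operations one at a time.
    void run() {
        while (!pending.empty()) {
            Pending op = std::move(pending.front());
            pending.erase(pending.begin());
            wire.append(op.data, op.size);  // reads the caller-owned memory
            op.handler(op.size);
        }
    }
};

// Owns every outgoing message until its write has completed, and starts the
// next write only from the completion handler of the previous one.
class Sender {
public:
    explicit Sender(FakeSocket& socket) : socket_(socket) {}

    void send(std::string message) {
        const bool write_in_progress = !queue_.empty();
        queue_.push_back(std::move(message));
        if (!write_in_progress) start_next();
    }

private:
    void start_next() {
        assert(!queue_.empty());
        const std::string& front = queue_.front();  // the deque keeps it alive
        socket_.async_write_stub(front.data(), front.size(),
            [this](std::size_t) {
                queue_.pop_front();                 // release only after completion
                if (!queue_.empty()) start_next();
            });
    }

    FakeSocket& socket_;
    std::deque<std::string> queue_;
};
```

Because send() takes the string by value and parks it in the queue, the caller's original can go out of scope immediately. The same queue is what prevents a second write from starting while the first is still pending.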

Related

Boost asio:async_read() using boost::asio::use_future

When calling asio::async_read() using a future, is there a way to get the number of bytes transferred when a boost::asio::error::eof exception occurs? It would seem that there are many cases when one would want to get the data transferred even if the peer disconnects.
For example:
namespace ba = boost::asio;
int32_t Session::read (unsigned char* pBuffer, uint32_t bufferSizeToRead)
{
    // Create a mutable buffer
    ba::mutable_buffer buffer (pBuffer, bufferSizeToRead);
    int32_t result = 0;
    // We do an async call using a future. A thread from the io_context pool does the
    // actual read while the thread calling this method blocks on
    // std::future::get()
    std::future<std::size_t> future =
        ba::async_read(m_socket, buffer, ba::bind_executor(m_sessionStrand, ba::use_future));
    try
    {
        // We block the calling thread here until we get the results of the async_read()...
        result = future.get();
    }
    catch (boost::system::system_error &ex) // boost::system::system_error
    {
        auto exitCode = ex.code().value();
        if ( exitCode == ba::error::eof )
        {
            log ("Connection closed by the peer");
        }
    }
    return result; // This is zero if eof occurs
}
The code sample above represents our issue. It was designed to support a 3rd-party library. The library expects a blocking call. The new code under development is using ASIO with a minimal number of network threads. The expectation is that this 3rd party library calls session::read using its dedicated thread and we adapt the call to an asynchronous call. The network call must be async since we are supporting many such calls from different libraries with minimal threads.
What was unexpected and discovered late is that ASIO treats a connection closed as an error. Without the future, using a handler we could get the bytes transferred up to the point where the disconnect occurred. However, using a future, the exception is thrown and the bytes transferred becomes unknown.
void handler (const boost::system::error_code& ec,
std::size_t bytesTransferred );
Is there a way to do the above with a future and also get the bytes transferred?
Or is there an alternative approach where we can provide the library with a blocking call while still using asio::async_read or similar?
Our expectation is that we could get the bytes transferred even if the client closed the connection. We're puzzled that when using a future this does not seem possible.
It's an implementation limitation of futures.
Modern async_result<> specializations (that use the initiate member approach) can be used together with as_tuple, e.g.:
ba::awaitable<std::tuple<boost::system::error_code, size_t>> a =
ba::async_read(m_socket, buffer, ba::as_tuple(ba::use_awaitable));
Or, more typical:
auto [ec, n] = co_await async_read(m_socket, buffer, ba::as_tuple(ba::use_awaitable));
However, the corresponding:
auto future = ba::async_read(m_socket, buffer, ba::as_tuple(ba::use_future));
isn't currently supported. It arguably could be, but you'd have to create your own completion token, or ask the Asio developers to add support to use_future: https://github.com/chriskohlhoff/asio/issues
Side-note: if you construct m_socket from the m_sessionStrand executor, you do not need to bind_executor to the strand:
namespace net = boost::asio;
using net::ip::tcp;
using Executor = net::io_context::executor_type;
struct Session {
    int32_t read(unsigned char* pBuffer, uint32_t bufferSizeToRead);
    net::io_context m_ioc;
    net::strand<Executor> m_sessionStrand{m_ioc.get_executor()};
    tcp::socket m_socket{m_sessionStrand};
};
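One way to keep a blocking, future-based call and still see the byte count is to skip use_future and let an ordinary handler fulfil a std::promise that carries both values. The sketch below uses only the standard library so it runs without Boost: async_read_stub and g_loop are hypothetical stand-ins for ba::async_read and the io_context, and std::error_code stands in for boost::system::error_code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <system_error>
#include <utility>
#include <vector>

using ReadResult  = std::pair<std::error_code, std::size_t>;
using ReadHandler = std::function<void(std::error_code, std::size_t)>;

// Hypothetical stand-in for the io_context: handlers queued here run later.
std::vector<std::function<void()>> g_loop;

// Hypothetical stand-in for ba::async_read: pretends the peer disconnected
// after `bytes_before_error` bytes and reports that through the handler.
void async_read_stub(std::size_t bytes_before_error, ReadHandler handler) {
    assert(handler != nullptr);
    g_loop.push_back([bytes_before_error, handler] {
        handler(std::make_error_code(std::errc::connection_reset),
                bytes_before_error);
    });
}

// Returns a future that carries the error code *and* the byte count, so an
// error does not discard the number of bytes transferred.
std::future<ReadResult> read_as_future(std::size_t bytes_before_error) {
    auto promise = std::make_shared<std::promise<ReadResult>>();
    std::future<ReadResult> future = promise->get_future();
    async_read_stub(bytes_before_error,
        [promise](std::error_code ec, std::size_t n) {
            promise->set_value(ReadResult(ec, n));  // never throws on error
        });
    return future;
}
```

The promise sits in a shared_ptr only because std::function requires copyable callables. Asio handlers may be move-only, so with the real async_read the promise could be moved straight into the lambda capture instead.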

Using lambdas with auto declaration vs in-place?

I'm trying to learn modern C++ and I'm using Boost.Asio for networking. I wrote a TCP connection class, which uses Asio's asynchronous operations. This is currently my method for reading data from a socket:
template<class T>
inline auto connection<T>::read(size_t length) -> void
{
    auto handler = [&](const boost::system::error_code& error, size_t bytes_transferred) {
        if (error == boost::asio::error::eof or error == boost::asio::error::connection_reset) {
            close();
        } else {
            on_read(bytes_transferred);
        }
    };
    socket.async_read_some(boost::asio::buffer(read_buffer, length), handler);
}
Here I declared the read handler separately with auto, because I think it looks more readable than an in-place lambda, i.e.
template<class T>
inline auto connection<T>::read(size_t length) -> void
{
    socket.async_read_some(boost::asio::buffer(read_buffer, length),
        [&](const boost::system::error_code& error, size_t bytes_transferred) {
            if (error == boost::asio::error::eof or error == boost::asio::error::connection_reset) {
                close();
            } else {
                on_read(bytes_transferred);
            }
        });
}
However, I ran into a segmentation fault with the first version, and I believe this is because the handler lambda is lost when the method goes out of scope. Then I tried to move the handler with std::move:
socket.async_read_some(boost::asio::buffer(read_buffer, length), std::move(handler));
which seems to fix the segfault.
Now my question is: Are there any performance or other issues with using the first version (with std::move) vs in-place? Which one do you think is better practice?
Both of these code examples should work. The first example passes the handler as an lvalue, in which case the implementation will make a copy. The second example passes a lambda as a prvalue, in which case the implementation will perform a move-construction. As both the lvalue and prvalue are trivial, the two operations are the same.
Asynchronous initiating functions in Networking TS (and by extension, Asio and Boost.Asio) take ownership of handlers by performing a "decay-copy." That means the handler is either copied or moved from depending on whether the argument is an lvalue or not.
I am not sure why your first example crashes, but it has nothing to do with the lifetime of the lambda. For obvious reasons, asynchronous initiating functions never receive the handler by reference, and always take ownership by decay-copy.
There must be some other problem with your code, in the part that you haven't pasted. For example, what is keeping the connection object alive after the function returns?
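The decay-copy behaviour can be illustrated with a small standard-library sketch. async_op_stub, g_pending, run_one and DemoConnection are hypothetical names invented for this illustration; the stub only mimics how an initiating function stores its own copy of the handler before returning.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for the event loop's queue of pending handlers.
std::vector<std::function<void()>> g_pending;

// Hypothetical initiating function: it takes the handler by forwarding
// reference and stores its own decayed copy before returning.
template <class Handler>
void async_op_stub(Handler&& handler) {
    typename std::decay<Handler>::type owned(std::forward<Handler>(handler));
    g_pending.emplace_back(std::move(owned));
}

// Runs the oldest pending handler, as the event loop would.
void run_one() {
    assert(!g_pending.empty());
    std::function<void()> handler = std::move(g_pending.front());
    g_pending.erase(g_pending.begin());
    handler();
}

// Hypothetical connection whose handler refers to the object itself, which
// is what a [&] capture inside a member function amounts to.
struct DemoConnection {
    int reads = 0;
    void read() {
        auto handler = [this] { ++reads; };  // named local, passed as an lvalue
        async_op_stub(handler);              // the operation keeps its own copy
    }                                        // `handler` dies here; the copy lives on
};
```

Running the stored handler after read() has returned is safe as long as the DemoConnection object still exists. If the object were destroyed first, the captured this pointer would dangle, which is the lifetime problem the answer points to.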

What is the advantage of class member functions to call each other in C++?

I am new to C++. I found that the following programming style is quite interesting to me. I wrote a simplified version here.
#include <iostream>
using namespace std;
class MyClass {
public :
    MyClass(int id_) : id(id_) {
        cout<<"I am a constructor"<<endl;
    }
    bool error = false;
    void run() {
        //do something ...
        if (!error) {
            read();
        }
    }
    void read() {
        //do something ...
        if (!error) {
            write();
        }
    }
    void write() {
        //do something ...
        if (!error) {
            read();
        }
    }
private :
    int id;
};
int main() {
    MyClass mc(1);
    mc.run();
    return 0;
}
The example here compiles, but I didn't run it because it would go into an infinite loop. I hope to use it as a reference. The read() and write() functions call each other. I first encountered this programming style in Boost.Asio. When the server receives a message in do_read(), it calls do_write() to echo it to the client, then it calls do_read() again at the end of do_write().
I have two questions regarding this type of coding.
Will this cause a stack overflow? The functions keep calling each other, and the chain ends only when an error occurs.
What is the advantage of it? Why can't I use a function that loops over them in order and breaks out of the loop whenever it encounters an error?
bool replied = true;
while (!error) {
    if (replied) read();
    else {
        write();
        replied = !replied;
    }
}
Your simplified version leaves out the most important aspect: the write() and read() calls are asynchronous.
Therefore, the functions don't actually cause recursion, see this recent answer: Do "C++ boost::asio Recursive timer callback" accumulate callstack?
The "unusual" thing about async_read(...) and async_write(...) is that the functions return before the IO operation has actually been performed, let alone completed. The actual execution is done on a different schedule¹.
To signal completion back to the "caller", the async calls typically take a completion handler, which gets called with the result of the IO operation.
In that completion handler, it's typical to see either the end of the communication channel, or the next IO operation being scheduled. This is known as asynchronous call chaining and is very prominently present in many languages that support asynchronous operations ².
It takes some getting used to, but the pattern soon becomes familiar.
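To see why chained handlers do not pile up on the call stack, here is a minimal sketch with a toy event loop. MiniLoop and EchoChain are hypothetical types written for this illustration, with MiniLoop playing the role of io_service::run(); no Boost is involved.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical miniature event loop standing in for io_service::run().
struct MiniLoop {
    std::deque<std::function<void()>> queue;

    void post(std::function<void()> task) { queue.push_back(std::move(task)); }

    void run() {
        while (!queue.empty()) {
            std::function<void()> task = std::move(queue.front());
            queue.pop_front();
            task();  // every handler is called from here, never from another handler
        }
    }
};

// Echo-style chain: the read handler schedules a write, the write handler
// schedules the next read.
struct EchoChain {
    MiniLoop& loop;
    int rounds_left;
    int depth = 0;      // handlers currently on the call stack
    int max_depth = 0;  // deepest nesting ever observed

    EchoChain(MiniLoop& l, int rounds) : loop(l), rounds_left(rounds) {}

    // Tracks how deeply handlers are nested while one is running.
    struct Scope {
        EchoChain& chain;
        explicit Scope(EchoChain& c) : chain(c) {
            if (++chain.depth > chain.max_depth) chain.max_depth = chain.depth;
        }
        ~Scope() {
            assert(chain.depth > 0);
            --chain.depth;
        }
    };

    void do_read() {
        loop.post([this] {  // returns immediately, like async_read
            Scope scope(*this);
            if (rounds_left > 0) do_write();
        });
    }

    void do_write() {
        loop.post([this] {  // returns immediately, like async_write
            Scope scope(*this);
            --rounds_left;
            do_read();
        });
    }
};
```

Starting the chain with do_read() and then calling run() executes as many rounds as requested, yet max_depth never exceeds 1: each handler returns to the loop before the next one starts.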
With this in mind, revisit one of the boost samples and see if the penny drops:
Documentation sample Chat Client
void handle_connect(const boost::system::error_code& error)
{
    if (!error)
    {
        boost::asio::async_read(socket_,
            boost::asio::buffer(read_msg_.data(), chat_message::header_length),
            boost::bind(&chat_client::handle_read_header, this,
                boost::asio::placeholders::error));
    }
}
void handle_read_header(const boost::system::error_code& error)
{
    if (!error && read_msg_.decode_header())
    {
        boost::asio::async_read(socket_,
            boost::asio::buffer(read_msg_.body(), read_msg_.body_length()),
            boost::bind(&chat_client::handle_read_body, this,
                boost::asio::placeholders::error));
    }
    else
    {
        do_close();
    }
}
void handle_read_body(const boost::system::error_code& error)
{
    if (!error)
    {
        std::cout.write(read_msg_.body(), read_msg_.body_length());
        std::cout << "\n";
        boost::asio::async_read(socket_,
            boost::asio::buffer(read_msg_.data(), chat_message::header_length),
            boost::bind(&chat_client::handle_read_header, this,
                boost::asio::placeholders::error));
    }
    else
    {
        do_close();
    }
}
void do_write(chat_message msg)
{
    bool write_in_progress = !write_msgs_.empty();
    write_msgs_.push_back(msg);
    if (!write_in_progress)
    {
        boost::asio::async_write(socket_,
            boost::asio::buffer(write_msgs_.front().data(),
                write_msgs_.front().length()),
            boost::bind(&chat_client::handle_write, this,
                boost::asio::placeholders::error));
    }
}
void handle_write(const boost::system::error_code& error)
{
    if (!error)
    {
        write_msgs_.pop_front();
        if (!write_msgs_.empty())
        {
            boost::asio::async_write(socket_,
                boost::asio::buffer(write_msgs_.front().data(),
                    write_msgs_.front().length()),
                boost::bind(&chat_client::handle_write, this,
                    boost::asio::placeholders::error));
        }
    }
    else
    {
        do_close();
    }
}
void do_close()
{
    socket_.close();
}
Benefit Of Asynchronous Operations
Asynchronous IO is useful for a more event-based model of IO. It also removes the first "ceiling" when scaling to large volumes of IO operations. In traditional, imperative code patterns many clients/connections would require many threads in order to be able to serve them simultaneously. In practice, though, threads fail to scale (since a typical server has a smallish number of logical CPUs) and it would mean that IO operations block each other ³.
With asynchronous IO you can often do all IO operations on a single thread, greatly improving efficiency and thereby simplifying some aspects of the program design (because fewer threading issues need to be involved).
¹ Many choices exist, but imagine that io_service::run() is running on a separate thread, that would lead to the IO operations being actually executed, potentially resumed when required and completed on that thread
² I'd say javascript is infamous for this pattern
³ A classical example is when a remote procedure call keeps a thread occupied while waiting for e.g. a database query to complete
This is my opinion:
Regarding recursion
One way to cause a stack overflow is to have a function calling itself recursively, overflowing the call stack. A set of functions calling each other in a circular manner would be equivalent to that, so yes, your intuition is correct.
An iterative version of the algorithm, such as the loop you describe, could prevent that.
Another thing that can prevent a stack overflow is tail-call optimization, which most major compilers implement but which the C++ standard does not guarantee. The Boost.Asio code you mention does not depend on it, though: its asynchronous functions return immediately and the handlers are invoked later from the event loop, so the calls do not nest.
Regarding code design
Now, C++ implements many programming paradigms. These paradigms are also implemented by many other programming languages. The programming paradigms relevant to what you are discussing would be:
Structured programming
Object oriented programming
From a structured programming point of view, you should try to emphasize code reuse as much as possible by dividing the code into subroutines that minimize redundant code.
From an object oriented point of view, you should model classes in a way that encapsulates their logic as much as possible.
The logic you present so far seems encapsulated enough. However, you may need to review whether the methods write and read should remain public or be private instead. Minimizing the number of public methods helps achieve a higher level of encapsulation.

boost read_some function lost data

I'm implementing a tcp server with boost asio library.
In the server, I use async_read_some to get data and asio::write to write data. The server code is something like this:
std::array<char, kBufferSize> buffer_;
std::string ProcessMessage(const std::string& s) {
    if (s == "msg1") return "resp1";
    if (s == "msg2") return "resp2";
    return "";
}
void HandleRead(const boost::system::error_code& ec, size_t size) {
    std::string message(buffer_.data(), size);
    std::string resp = ProcessMessage(message);
    if (!resp.empty()) {
        asio::write(socket, boost::asio::buffer(message), WriteCallback);
    }
    socket.async_read_some(boost::asio::buffer(buffer_));
}
Then I wrote a client to test the server; the code is something like this:
void MessageCallback(const boost::system::error_code& ec, size_t size) {
std::cout << string(buffer_.data(), size) << std::endl;
}
//Init socket
asio::write(socket, boost::asio::buffer("msg1"));
socket.read_some(boost::asio::buffer(buffer_), MessageCallback);
// Or async_read
//socket.async_read_some(boost::asio::buffer(buffer_), MessageCallback);
asio::write(socket, boost::asio::buffer("msg1"));
socket.read_some(boost::asio::buffer(buffer_), MessageCallback);
// Or async_read
//socket.async_read_some(boost::asio::buffer(buffer_), MessageCallback);
If I run the client, the code waits at the second read_some, and the output is: resp1.
If I remove the first read_some, the output is resp1resp2, which means the server did the right thing.
It seems the first read_some eats the second response but doesn't pass it to the MessageCallback function.
I've read the question at What is a message boundary?. I think that if this were a "message boundary" problem, the second read_some should print something, since the first read_some only gets part of the stream from the TCP socket.
How can I solve this problem?
UPDATE:
I've tried changing the size of the client buffer to 4; the output is then:
resp
resp
It seems the read_some function does a little more than read from the socket. I'll read the Boost code to find out whether that is true.
The async_read_some() member function is very likely not doing what you intend. Pay special attention to the Remarks section of the documentation:
The read operation may not read all of the requested number of bytes.
Consider using the async_read function if you need to ensure that the
requested amount of data is read before the asynchronous operation
completes.
Note that the async_read() free function does offer the guarantee that you are looking for:
This operation is implemented in terms of zero or more calls to the
stream's async_read_some function, and is known as a composed
operation. The program must ensure that the stream performs no other
read operations (such as async_read, the stream's async_read_some
function, or any other composed operations that perform reads) until
this operation completes.
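The difference between the two calls can be sketched without a network. The code below assumes fixed 5-byte messages, as with "resp1" and "resp2" in the question; ChunkyStream and read_exactly are hypothetical names for this illustration, not Asio APIs. The loop in read_exactly is the kind of work the composed read performs on top of read_some.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stream that, like a TCP socket, hands back whatever bytes are
// available, in pieces unrelated to how the sender wrote them.
struct ChunkyStream {
    std::string incoming;   // bytes already received from the peer
    std::size_t max_chunk;  // most bytes a single read_some may return

    std::size_t read_some(char* out, std::size_t capacity) {
        assert(max_chunk > 0);
        const std::size_t n =
            std::min(std::min(capacity, max_chunk), incoming.size());
        std::copy(incoming.begin(), incoming.begin() + n, out);
        incoming.erase(0, n);
        return n;           // 0 means nothing is left to read
    }
};

// Keeps calling read_some until exactly `length` bytes have arrived, or the
// stream runs dry first.
std::string read_exactly(ChunkyStream& stream, std::size_t length) {
    std::string message(length, '\0');
    std::size_t got = 0;
    while (got < length) {
        const std::size_t n = stream.read_some(&message[got], length - got);
        if (n == 0) break;  // peer closed before the full message arrived
        got += n;
    }
    message.resize(got);
    return message;
}
```

With both responses already sitting in the stream, two calls to read_exactly(stream, 5) return one response each, however the bytes happen to be chunked. A protocol with variable-length messages would need a length prefix or a delimiter instead of the fixed size assumed here.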

Am I getting a race condition with my boost asio async_read?

bool Connection::Receive(){
    std::vector<uint8_t> buf(1000);
    boost::asio::async_read(socket_,boost::asio::buffer(buf,1000),
        boost::bind(&Connection::handler, this, boost::asio::placeholders::error, boost::asio::placeholders::bytes_transferred));
    int rcvlen=buf.size();
    ByteBuffer b((std::shared_ptr<uint8_t>)buf.data(),rcvlen);
    if(rcvlen <= 0){
        buf.clear();
        return false;
    }
    OnReceived(b);
    buf.clear();
    return true;
}
The method works, but only when I set a breakpoint inside it. Is there a timing issue while it waits to receive? Without the breakpoint, nothing is received.
You are trying to read from the receive buffer immediately after starting the asynchronous operation, without waiting for it to complete. That is why it works when you set a breakpoint.
The code after your async_read belongs in Connection::handler, since that is the callback you told async_read to invoke after receiving the data.
What you usually want is a start_read and a handle_read_some function:
void connection::start_read()
{
    socket_->async_read_some(boost::asio::buffer(read_buffer_),
        boost::bind(&connection::handle_read_some, shared_from_this(),
            boost::asio::placeholders::error,
            boost::asio::placeholders::bytes_transferred));
}
void connection::handle_read_some(const boost::system::error_code& error, size_t bytes_transferred)
{
    if (!error)
    {
        // Use the data here!
        start_read();
    }
}
Note the shared_from_this, it's important if you want the lifetime of your connection to be automatically taken care of by the number of outstanding I/O requests. Make sure to derive your class from boost::enable_shared_from_this<connection> and to only create it with make_shared<connection>.
To enforce this, your constructor should be private and you can add a friend declaration (C++0x version; if your compiler does not support this, you will have to insert the correct number of arguments yourself):
template<typename T, typename... Arg> friend boost::shared_ptr<T> boost::make_shared(const Arg&...);
Also make sure your receive buffer is still alive by the time the callback is invoked, preferably by using a statically sized buffer member variable of your connection class.
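The lifetime advice above can be tried out in a small sketch that needs no Boost. SketchConnection and g_io_queue are hypothetical names, std::enable_shared_from_this stands in for the Boost equivalent, and a static create() function is used here as one alternative to the friend declaration for keeping the constructor private.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the event loop: operations complete when run.
std::vector<std::function<void()>> g_io_queue;

class SketchConnection : public std::enable_shared_from_this<SketchConnection> {
public:
    static std::shared_ptr<SketchConnection> create() {
        return std::shared_ptr<SketchConnection>(new SketchConnection);
    }

    // Starts the "read" and returns at once; nothing has been received yet.
    void start_read() {
        auto self = shared_from_this();        // keeps *this alive until completion
        g_io_queue.push_back([self] {
            const std::string data = "hello";  // pretend these bytes arrived
            std::copy(data.begin(), data.end(), self->read_buffer_.begin());
            self->handle_read(data.size());
        });
    }

    std::string received;                      // filled in by the handler only

private:
    SketchConnection() = default;              // force creation through create()

    void handle_read(std::size_t bytes_transferred) {
        assert(bytes_transferred <= read_buffer_.size());
        received.assign(read_buffer_.data(), bytes_transferred);
    }

    std::array<char, 1000> read_buffer_{};     // member, so it outlives start_read()
};
```

Right after start_read() returns, received is still empty; the data only becomes visible once the queued operation runs. The shared_ptr captured by that operation keeps the connection and its buffer alive in the meantime, even if the caller has already dropped its own pointer.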