I'm working on an RS485 communication class and I'm trying to write a function that reads until a certain char is on the line, but with a timeout. The problem is that my system timer returns immediately, no matter which timeout I enter. I tried changing the timer to be a member variable of the class, so it doesn't go out of scope, but that wasn't the problem. I tried different implementations of timers (mostly deadline_timer), but that didn't help. If I remove the timer from the code, the read succeeds, but when I add it, even with a timeout of 10 seconds (which should be way more than enough), it responds with an immediate timeout.
I tried making a simple version of the class here, but I guess that the options mostly depend on the type of machine you're talking to:
class RS485CommunicationLayer final {
public:
RS485CommunicationLayer(
const std::string& path,
/* options */
): io(), port(io), timer(port.get_io_service()) {
open(/* options */);
};
std::size_t write(const char* const buffer, const size_t size) {
/*impl*/
}
// THIS FUNCTION --v
void readUntil(std::vector<char>& buffer, char delim, std::chrono::microseconds timeout) {
boost::optional<boost::system::error_code> timer_result;
boost::optional<boost::system::error_code> read_result;
port.get_io_service().reset();
timer.expires_from_now(timeout);
boost::asio::async_read_until(port, asio::dynamic_buffer(buffer), delim, [&read_result] (const boost::system::error_code& error, size_t) { read_result.reset(error); });
timer.async_wait([&timer_result] (const boost::system::error_code& error) { timer_result.reset(error); });
while (port.get_io_service().run_one())
{
if (read_result)
timer.cancel();
else if (timer_result) {
port.cancel();
}
}
if (read_result)
throw boost::system::system_error(*read_result);
};
private:
asio::io_context io;
asio::serial_port port;
boost::asio::system_timer timer;
void open(/*args*/) {
port.open(path);
/*set options*/
}
};
Edit:
I also tried the following implementation after finding out that run_for() exists. But then, weirdly enough, the buffer stays empty.
void RS485CommunicationLayer::readUntil(std::vector<char>& buffer, char delim, std::chrono::microseconds timeout) {
boost::optional<boost::system::error_code> read_result;
boost::asio::async_read_until(port, asio::dynamic_buffer(buffer), delim, [&read_result] (const boost::system::error_code& error, size_t) { read_result.reset(error); });
port.get_io_service().run_for(timeout);
if (read_result)
throw boost::system::system_error(*read_result);
}
First off, get_io_service() indicates a Very Old(TM) boost version. Also, it just returns io.
Secondly, why so complicated? I don't even really have the energy to see whether there is a subtle problem with the run_one() loop (it looks fine at a glance).
I'd simplify:
size_t readUntil(std::vector<char>& buffer, char delim,
std::chrono::microseconds timeout) {
error_code read_result;
size_t msglen = 0;
io.reset();
asio::system_timer timer(io, timeout);
asio::async_read_until(port, asio::dynamic_buffer(buffer), delim,
[&](error_code ec, size_t n) {
timer.cancel();
read_result = ec;
msglen = n;
});
timer.async_wait([&](error_code ec) { if (!ec) port.cancel(); });
io.run();
if (read_result)
boost::throw_with_location(boost::system::system_error(read_result),
read_result.location());
return msglen;
}
You can just cancel the complementary IO object from the respective completion handlers.
The timer is per-op and local to the readUntil, so it doesn't have to be a member.
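From the caller's side, a timeout then surfaces as the read completing with operation_aborted (the timer handler cancels the port, which aborts the pending read), and readUntil rethrows that as a system_error. A usage sketch, assuming a comm object of this class:

std::vector<char> buf;
try {
    size_t n = comm.readUntil(buf, '\n', std::chrono::milliseconds(500));
    // first message is buf[0]..buf[n-1], delimiter included
} catch (boost::system::system_error const& se) {
    if (se.code() == boost::asio::error::operation_aborted)
        std::cerr << "read timed out\n"; // the timer fired and cancelled the port
    else
        throw; // a genuine I/O error
}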
Let's also throw in the write side, which is all of:
size_t write(char const* const data, const size_t size) {
return asio::write(port, asio::buffer(data, size));
}
And I can demo it working:
Live On Coliru
#include <boost/asio.hpp>
#include <iomanip>
#include <iostream>
namespace asio = boost::asio;
using boost::system::error_code;
using namespace std::chrono_literals;
class RS485CommunicationLayer final {
public:
RS485CommunicationLayer(std::string const& path) : io(), port(io) { open(path); };
size_t write(char const* const data, const size_t size) {
return asio::write(port, asio::buffer(data, size));
}
size_t readUntil(std::vector<char>& buffer, char delim,
std::chrono::microseconds timeout) {
error_code read_result;
size_t msglen = 0;
io.reset();
asio::system_timer timer(io, timeout);
asio::async_read_until(port, asio::dynamic_buffer(buffer), delim,
[&](error_code ec, size_t n) {
timer.cancel();
read_result = ec;
msglen = n;
});
timer.async_wait([&](error_code ec) { if (!ec) port.cancel(); });
io.run();
if (read_result)
boost::throw_with_location(boost::system::system_error(read_result),
read_result.location());
return msglen;
}
private:
asio::io_context io;
asio::serial_port port;
void open(std::string path) {
port.open(path);
/*set options*/
}
void close();
};
int main(int argc, char** argv) {
RS485CommunicationLayer comm(argc > 1 ? argv[1] : "");
comm.write("Hello world\n", 12);
for (std::vector<char> response_buffer;
auto len = comm.readUntil(response_buffer, '\n', 100ms);) //
{
std::cout << "Received " << response_buffer.size() << " bytes, next "
<< quoted(std::string_view(response_buffer.data(), len - 1))
<< std::endl;
// consume
response_buffer.erase(begin(response_buffer), begin(response_buffer) + len);
}
}
Demo locally with a socat PTS tunnel:
socat -d -d pty,raw,echo=0 pty,raw,echo=0
And throwing dictionaries at the other end:
while true; do cat /etc/dictionaries-common/words ; done | pv > /dev/pts/10
For the most part my program runs fine, but occasionally it crashes. If I pause the program mid-run, it also crashes. Any insight as to why would be greatly appreciated! I think it could be due to async_read_some being called multiple times before it actually executes.
Main.cpp:
while(true)
{
sensor->update();
if (sensor->processNow == 1)
{
sensor->process(4);
sensor->processNow = 0;
sensorReadyForUpdate = 1;
}
}
Constructor:
sensorHandler::sensorHandler(std::string host, int port, std::string name) :
socket_(ioservice_),
sensorAddress_(boost::asio::ip::address::from_string(host), port),
dataRequested_(false),
dataReady_(false)
{
}
Update Function:
bool sensorHandler::update()
{
ioservice_.poll_one();
if (inOperation == false)
{
inOperation = true;
socket_.async_read_some(boost::asio::buffer(receiveBuffer, receiveBuffer.size()), boost::bind(&sensorHandler::receiveCallback, this, _1, _2));
return true;
}
return false;
}
Receive Callback Function:
bool sensorHandler::receiveCallback(const boost::system::error_code& error, std::size_t bytes_transferred)
{
std::cout << "success - in receiveCallBack" << std::endl;
processNow = 1;
inOperation = false;
}
Includes:
#include "sensorHandler.h"
#include <boost\bind.hpp>
#include <boost\asio\write.hpp>
#include <iostream>
#include <windows.h>
Header File:
class sensorHandler
{
public:
sensorHandler(std::string host, int port, std::string name);
~sensorHandler();
bool connect();
bool update();
boost::array<char, 400000> receiveBuffer; // was 50000
};
I may be missing the point of the question, but is the question that you want to run an async operation in a loop?
The classical way to achieve that is via call chaining, so in the completion handler you enqueue the next operation:
bool IFMHandler::receiveCallback(const boost::system::error_code& error, std::size_t bytes_transferred)
{
/*code to process buffer here - ends with processNow = 1 and inOperation = false*/
if (!error) {
socket_.async_read_some(boost::asio::buffer(receiveBuffer,
receiveBuffer.size()), boost::bind(&IFMHandler::receiveCallback, this, _1, _2));
}
}
So, now you can simply call
ioservice_.run();
and the chain will run itself.
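To make that concrete, here is a minimal sketch using the question's sensorHandler names; the start() helper and the dedicated I/O thread are my additions, not part of the original code:

// Kick off the chain once; each completed read enqueues the next one.
void sensorHandler::start()
{
    socket_.async_read_some(
        boost::asio::buffer(receiveBuffer, receiveBuffer.size()),
        boost::bind(&sensorHandler::receiveCallback, this, _1, _2));
}

// Then, instead of calling poll_one() from update(), run the event loop
// until the chain stops (e.g. on error):
// std::thread io_thread([&] { ioservice_.run(); });

This also removes the risk of posting a second async_read_some while one is already outstanding.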
According to the documentation:
"The program must ensure that the stream performs no other write operations (such as async_write, the stream's async_write_some function, or any other composed operations that perform writes) until this operation completes."
Does this mean I cannot call boost::asio::async_write a second time until the handler for the first is called? How does one achieve this and still be asynchronous?
If I have a method Send:
//--------------------------------------------------------------------
void Connection::Send(const std::vector<char> & data)
{
auto callback = boost::bind(&Connection::OnSend, this, boost::asio::placeholders::error, boost::asio::placeholders::bytes_transferred);
boost::asio::async_write(m_socket, boost::asio::buffer(data), callback);
}
Do I have to change it to something like:
//--------------------------------------------------------------------
void Connection::Send(const std::vector<char> & data)
{
// Issue a send
std::lock_guard<std::mutex> lock(m_numPostedSocketIOMutex);
++m_numPostedSocketIO;
m_numPostedSocketIOConditionVariable.wait(lock, [this]() {return m_numPostedSocketIO == 0; });
auto callback = boost::bind(&Connection::OnSend, this, boost::asio::placeholders::error, boost::asio::placeholders::bytes_transferred);
boost::asio::async_write(m_socket, boost::asio::buffer(data), callback);
}
and if so, then aren't I blocking after the first call again?
The async in async_write() refers to the fact that the function returns immediately while the writing happens in the background. There should still be only one outstanding write at any given time.
If you have an asynchronous producer, you need a buffer to set aside the new chunk of data until the currently active write completes, then issue a new async_write in the completion handler.
That is, Connection::Send must only call async_write once to kick off the process; in subsequent calls it should instead buffer its data, which will be picked up in the completion handler of the currently executing async_write.
For performance reasons you want to avoid copying the data into the buffer, and instead append the new chunk to a list of buffers and use the scatter-gather overload of async_write that accepts a ConstBufferSequence. It is also possible to use one large streambuf as a buffer and append directly into it.
Of course the buffer needs to be synchronized unless both Connection::Send and the io_service run in the same thread. An empty buffer can be reused as an indication that no async_write is in progress.
Here's some code to illustrate what I mean:
struct Connection
{
void Send(std::vector<char>&& data)
{
std::lock_guard<std::mutex> lock(buffer_mtx);
buffers[active_buffer ^ 1].push_back(std::move(data)); // move input data to the inactive buffer
doWrite();
}
private:
void doWrite()
{
if (buffer_seq.empty()) { // empty buffer sequence == no writing in progress
active_buffer ^= 1; // switch buffers
for (const auto& data : buffers[active_buffer]) {
buffer_seq.push_back(boost::asio::buffer(data));
}
boost::asio::async_write(m_socket, buffer_seq, [this] (const boost::system::error_code& ec, size_t bytes_transferred) {
std::lock_guard<std::mutex> lock(buffer_mtx);
buffers[active_buffer].clear();
buffer_seq.clear();
if (!ec) {
if (!buffers[active_buffer ^ 1].empty()) { // have more work
doWrite();
}
}
});
}
}
std::mutex buffer_mtx;
std::vector<std::vector<char>> buffers[2]; // a double buffer
std::vector<boost::asio::const_buffer> buffer_seq;
int active_buffer = 0;
. . .
};
The complete working source can be found in this answer.
Yes, you need to wait for the completion handler before calling async_write again. Are you sure you'll be blocked? Of course it depends on how fast you generate your data, but even if so, there's no way to send it faster than your network can handle. If it's really an issue, consider sending bigger chunks.
Here is a complete, compilable, and tested example that I researched and got to work through trial and error after reading the answer and subsequent edits from RustyX.
Connection.h
#pragma once
#include <boost/asio.hpp>
#include <atomic>
#include <condition_variable>
#include <memory>
#include <mutex>
//--------------------------------------------------------------------
class ConnectionManager;
//--------------------------------------------------------------------
class Connection : public std::enable_shared_from_this<Connection>
{
public:
typedef std::shared_ptr<Connection> SharedPtr;
// Ensure all instances are created as shared_ptr in order to fulfill requirements for shared_from_this
static Connection::SharedPtr Create(ConnectionManager * connectionManager, boost::asio::ip::tcp::socket & socket);
//
static std::string ErrorCodeToString(const boost::system::error_code & errorCode);
Connection(const Connection &) = delete;
Connection(Connection &&) = delete;
Connection & operator = (const Connection &) = delete;
Connection & operator = (Connection &&) = delete;
~Connection();
// We have to defer the start until we are fully constructed because we share_from_this()
void Start();
void Stop();
void Send(const std::vector<char> & data);
private:
static size_t m_nextClientId;
size_t m_clientId;
ConnectionManager * m_owner;
boost::asio::ip::tcp::socket m_socket;
std::atomic<bool> m_stopped;
boost::asio::streambuf m_receiveBuffer;
mutable std::mutex m_sendMutex;
std::vector<char> m_sendBuffers[2]; // Double buffer
int m_activeSendBufferIndex;
bool m_sending;
std::vector<char> m_allReadData; // Strictly for test purposes
Connection(ConnectionManager * connectionManager, boost::asio::ip::tcp::socket socket);
void DoReceive();
void DoSend();
};
//--------------------------------------------------------------------
Connection.cpp
#include "Connection.h"
#include "ConnectionManager.h"
#include <boost/bind.hpp>
#include <algorithm>
#include <cstdlib>
//--------------------------------------------------------------------
size_t Connection::m_nextClientId(0);
//--------------------------------------------------------------------
Connection::SharedPtr Connection::Create(ConnectionManager * connectionManager, boost::asio::ip::tcp::socket & socket)
{
return Connection::SharedPtr(new Connection(connectionManager, std::move(socket)));
}
//--------------------------------------------------------------------------------------------------
std::string Connection::ErrorCodeToString(const boost::system::error_code & errorCode)
{
std::ostringstream debugMsg;
debugMsg << " Error Category: " << errorCode.category().name() << ". "
<< " Error Message: " << errorCode.message() << ". ";
// IMPORTANT - These comparisons only work if you dynamically link boost libraries
// Because boost chose to implement boost::system::error_category::operator == by comparing addresses
// The addresses are different in one library and the other when statically linking.
//
// We use make_error_code macro to make the correct category as well as error code value.
// Error code value is not unique and can be duplicated in more than one category.
if (errorCode == boost::asio::error::make_error_code(boost::asio::error::connection_refused))
{
debugMsg << " (Connection Refused)";
}
else if (errorCode == boost::asio::error::make_error_code(boost::asio::error::eof))
{
debugMsg << " (Remote host has disconnected)";
}
else
{
debugMsg << " (boost::system::error_code has not been mapped to a meaningful message)";
}
return debugMsg.str();
}
//--------------------------------------------------------------------
Connection::Connection(ConnectionManager * connectionManager, boost::asio::ip::tcp::socket socket)
:
m_clientId (m_nextClientId++)
, m_owner (connectionManager)
, m_socket (std::move(socket))
, m_stopped (false)
, m_receiveBuffer ()
, m_sendMutex ()
, m_sendBuffers ()
, m_activeSendBufferIndex (0)
, m_sending (false)
, m_allReadData ()
{
printf("Client connection with id %zd has been created.", m_clientId);
}
//--------------------------------------------------------------------
Connection::~Connection()
{
// Boost uses RAII, so we don't have anything to do. Let their destructors take care of business
printf("Client connection with id %zd has been destroyed.", m_clientId);
}
//--------------------------------------------------------------------
void Connection::Start()
{
DoReceive();
}
//--------------------------------------------------------------------
void Connection::Stop()
{
// The entire connection class is only kept alive, because it is a shared pointer and always has a ref count
// as a consequence of the outstanding async receive call that gets posted every time we receive.
// Once we stop posting another receive in the receive handler and once our owner release any references to
// us, we will get destroyed.
m_stopped = true;
m_owner->OnConnectionClosed(shared_from_this());
}
//--------------------------------------------------------------------
void Connection::Send(const std::vector<char> & data)
{
std::lock_guard<std::mutex> lock(m_sendMutex);
// Append to the inactive buffer
std::vector<char> & inactiveBuffer = m_sendBuffers[m_activeSendBufferIndex ^ 1];
inactiveBuffer.insert(inactiveBuffer.end(), data.begin(), data.end());
//
DoSend();
}
//--------------------------------------------------------------------
void Connection::DoSend()
{
// Check if there is an async send in progress
// An empty active buffer indicates there is no outstanding send
if (m_sendBuffers[m_activeSendBufferIndex].empty())
{
m_activeSendBufferIndex ^= 1;
std::vector<char> & activeBuffer = m_sendBuffers[m_activeSendBufferIndex];
auto self(shared_from_this());
boost::asio::async_write(m_socket, boost::asio::buffer(activeBuffer),
[self](const boost::system::error_code & errorCode, size_t bytesTransferred)
{
std::lock_guard<std::mutex> lock(self->m_sendMutex);
self->m_sendBuffers[self->m_activeSendBufferIndex].clear();
if (errorCode)
{
printf("An error occured while attemping to send data to client id %zd. %s", self->m_clientId, ErrorCodeToString(errorCode).c_str());
// An error occurred
// We do not stop or close on sends, but instead let the receive error out and then close
return;
}
// Check if there is more to send that has been queued up on the inactive buffer,
// while we were sending what was on the active buffer
if (!self->m_sendBuffers[self->m_activeSendBufferIndex ^ 1].empty())
{
self->DoSend();
}
});
}
}
//--------------------------------------------------------------------
void Connection::DoReceive()
{
auto self(shared_from_this());
boost::asio::async_read_until(m_socket, m_receiveBuffer, '#',
[self](const boost::system::error_code & errorCode, size_t bytesRead)
{
if (errorCode)
{
// Check if the other side hung up
if (errorCode == boost::asio::error::make_error_code(boost::asio::error::eof))
{
// This is not really an error. The client is free to hang up whenever they like
printf("Client %zd has disconnected.", self->m_clientId);
}
else
{
printf("An error occured while attemping to receive data from client id %zd. Error Code: %s", self->m_clientId, ErrorCodeToString(errorCode).c_str());
}
// Notify our masters that we are ready to be destroyed
self->m_owner->OnConnectionClosed(self);
// An error occurred
return;
}
// Grab the read data
std::istream stream(&self->m_receiveBuffer);
std::string data;
std::getline(stream, data, '#');
data += "#";
printf("Received data from client %zd: %s", self->m_clientId, data.c_str());
// Issue the next receive
if (!self->m_stopped)
{
self->DoReceive();
}
});
}
//--------------------------------------------------------------------
ConnectionManager.h
#pragma once
#include "Connection.h"
// Boost Includes
#include <boost/asio.hpp>
// Standard Includes
#include <thread>
#include <vector>
//--------------------------------------------------------------------
class ConnectionManager
{
public:
ConnectionManager(unsigned port, size_t numThreads);
ConnectionManager(const ConnectionManager &) = delete;
ConnectionManager(ConnectionManager &&) = delete;
ConnectionManager & operator = (const ConnectionManager &) = delete;
ConnectionManager & operator = (ConnectionManager &&) = delete;
~ConnectionManager();
void Start();
void Stop();
void OnConnectionClosed(Connection::SharedPtr connection);
protected:
boost::asio::io_service m_io_service;
boost::asio::ip::tcp::acceptor m_acceptor;
boost::asio::ip::tcp::socket m_listenSocket;
std::vector<std::thread> m_threads;
mutable std::mutex m_connectionsMutex;
std::vector<Connection::SharedPtr> m_connections;
boost::asio::deadline_timer m_timer;
void IoServiceThreadProc();
void DoAccept();
void DoTimer();
};
//--------------------------------------------------------------------
ConnectionManager.cpp
#include "ConnectionManager.h"
#include <boost/bind.hpp>
#include <boost/date_time/posix_time/posix_time.hpp>
#include <system_error>
#include <cstdio>
//------------------------------------------------------------------------------
ConnectionManager::ConnectionManager(unsigned port, size_t numThreads)
:
m_io_service ()
, m_acceptor (m_io_service, boost::asio::ip::tcp::endpoint(boost::asio::ip::tcp::v4(), port))
, m_listenSocket(m_io_service)
, m_threads (numThreads)
, m_timer (m_io_service)
{
}
//------------------------------------------------------------------------------
ConnectionManager::~ConnectionManager()
{
Stop();
}
//------------------------------------------------------------------------------
void ConnectionManager::Start()
{
if (m_io_service.stopped())
{
m_io_service.reset();
}
DoAccept();
for (auto & thread : m_threads)
{
if (!thread.joinable())
{
thread = std::thread(&ConnectionManager::IoServiceThreadProc, this);
}
}
DoTimer();
}
//------------------------------------------------------------------------------
void ConnectionManager::Stop()
{
{
std::lock_guard<std::mutex> lock(m_connectionsMutex);
m_connections.clear();
}
// TODO - Will the stopping of the io_service be enough to kill all the connections and ultimately have them get destroyed?
// Because remember they have outstanding ref counts to their shared_ptr in the async handlers
m_io_service.stop();
for (auto & thread : m_threads)
{
if (thread.joinable())
{
thread.join();
}
}
}
//------------------------------------------------------------------------------
void ConnectionManager::IoServiceThreadProc()
{
try
{
// Log that we are starting the io_service thread
{
printf("io_service socket thread starting.");
}
// Run the asynchronous callbacks from the socket on this thread
// Until the io_service is stopped from another thread
m_io_service.run();
}
catch (std::system_error & e)
{
printf("System error caught in io_service socket thread. Error Code: %d", e.code().value());
}
catch (std::exception & e)
{
printf("Standard exception caught in io_service socket thread. Exception: %s", e.what());
}
catch (...)
{
printf("Unhandled exception caught in io_service socket thread.");
}
{
printf("io_service socket thread exiting.");
}
}
//------------------------------------------------------------------------------
void ConnectionManager::DoAccept()
{
m_acceptor.async_accept(m_listenSocket,
[this](const boost::system::error_code errorCode)
{
if (errorCode)
{
printf("An error occured while attemping to accept connections. Error Code: %s", Connection::ErrorCodeToString(errorCode).c_str());
return;
}
// Create the connection from the connected socket
std::lock_guard<std::mutex> lock(m_connectionsMutex);
Connection::SharedPtr connection = Connection::Create(this, m_listenSocket);
m_connections.push_back(connection);
connection->Start();
DoAccept();
});
}
//------------------------------------------------------------------------------
void ConnectionManager::OnConnectionClosed(Connection::SharedPtr connection)
{
std::lock_guard<std::mutex> lock(m_connectionsMutex);
auto itConnection = std::find(m_connections.begin(), m_connections.end(), connection);
if (itConnection != m_connections.end())
{
m_connections.erase(itConnection);
}
}
//------------------------------------------------------------------------------
void ConnectionManager::DoTimer()
{
if (!m_io_service.stopped())
{
// Send messages every 30 seconds
m_timer.expires_from_now(boost::posix_time::seconds(30));
m_timer.async_wait(
[this](const boost::system::error_code & errorCode)
{
std::lock_guard<std::mutex> lock(m_connectionsMutex);
for (auto connection : m_connections)
{
connection->Send(std::vector<char>{'b', 'e', 'e', 'p', '#'});
}
DoTimer();
});
}
}
main.cpp
#include "ConnectionManager.h"
#include <cstring>
#include <iostream>
#include <string>
int main()
{
// Start up the server
ConnectionManager connectionManager(5000, 2);
connectionManager.Start();
// Pretend we are doing other things or just waiting for shutdown
std::this_thread::sleep_for(std::chrono::minutes(5));
// Stop the server
connectionManager.Stop();
return 0;
}
Could we use 2 strands for this question by posting write(...) as an asynchronous operation to strand1 and handler(...) to strand2?
Your advice on the code would be highly appreciated.
boost::asio::strand<boost::asio::io_context::executor_type> strand1, strand2;
std::vector<char> empty_vector(0);
void Connection::Send(const std::vector<char> & data)
{
boost::asio::post(boost::asio::bind_executor(strand1, std::bind(&Connection::write, this, true, data)));
}
void Connection::write(bool has_data, const std::vector<char> & data)
{
// Append to the inactive buffer
std::vector<char> & inactiveBuffer = m_sendBuffers[m_activeSendBufferIndex ^ 1];
if (has_data)
{
inactiveBuffer.insert(inactiveBuffer.end(), data.begin(), data.end());
}
//
if (!inactiveBuffer.empty() && m_sendBuffers[m_activeSendBufferIndex].empty())
{
m_activeSendBufferIndex ^= 1;
std::vector<char> & activeBuffer = m_sendBuffers[m_activeSendBufferIndex];
boost::asio::async_write(m_socket, boost::asio::buffer(activeBuffer), boost::asio::bind_executor(strand2, std::bind(&Connection::handler, this, boost::asio::placeholders::error, boost::asio::placeholders::bytes_transferred)));
}
}
void Connection::handler(const boost::system::error_code & errorCode, size_t bytesTransferred)
{
self->m_sendBuffers[self->m_activeSendBufferIndex].clear();
if (errorCode)
{
printf("An error occured while attemping to send data to client id %zd. %s", self->m_clientId, ErrorCodeToString(errorCode).c_str());
// An error occurred
// We do not stop or close on sends, but instead let the receive error out and then close
return;
}
boost::asio::post(boost::asio::bind_executor(strand1, std::bind(&Connection::write, this, false, empty_vector)));
}
I am implementing a small distributed system that consists of N machines. Each of them receives some data from some remote server and then propagates the data to the other n-1 fellow machines. I am using the Boost Asio async_read and async_write to implement this. I set up a test cluster of N=30 machines. When I tried smaller datasets (receiving 75KB to 750KB per machine), the program always worked. But when I moved on to a slightly larger dataset (7.5MB), I observed strange behavior: at the beginning, reads and writes happened as expected, but after a while, some machines hung while others finished; the number of machines that hung varied with each run. I tried to print out some messages in each handler and found that for those machines that hung, async_read basically could not successfully read after a while, so nothing could proceed afterwards. I checked the remote servers, and they all finished writing. I have tried using strand to control the order of execution of async reads and writes, and I also tried using different io_services for read and write. None of them solved the problem. I am pretty desperate. Can anyone help me?
Here is the code for the class that does the read and propagation:
const int TRANS_TUPLE_SIZE=15;
const int TRANS_BUFFER_SIZE=5120/TRANS_TUPLE_SIZE*TRANS_TUPLE_SIZE;
class Asio_Trans_Broadcaster
{
private:
char buffer[TRANS_BUFFER_SIZE];
int node_id;
int mpi_size;
int mpi_rank;
boost::asio::ip::tcp::socket* dbsocket;
boost::asio::ip::tcp::socket** sender_sockets;
int n_send;
boost::mutex mutex;
bool done;
public:
Asio_Trans_Broadcaster(boost::asio::ip::tcp::socket* dbskt, boost::asio::ip::tcp::socket** senderskts,
int msize, int mrank, int id)
{
dbsocket=dbskt;
count=0;
node_id=id;
mpi_size=mpi_rank=-1;
sender_sockets=senderskts;
mpi_size=msize;
mpi_rank=mrank;
n_send=-1;
done=false;
}
static std::size_t completion_condition(const boost::system::error_code& error, std::size_t bytes_transferred)
{
int remain=bytes_transferred%TRANS_TUPLE_SIZE;
if(remain==0 && bytes_transferred>0)
return 0;
else
return TRANS_BUFFER_SIZE-bytes_transferred;
}
void write_handler(const boost::system::error_code &ec, std::size_t bytes_transferred)
{
int n=-1;
mutex.lock();
n_send--;
n=n_send;
mutex.unlock();
fprintf(stdout, "~~~~~~ #%d, write_handler: %d bytes, copies_to_send: %d\n",
node_id, bytes_transferred, n);
if(n==0 && !done)
boost::asio::async_read(*dbsocket,
boost::asio::buffer(buffer, TRANS_BUFFER_SIZE),
Asio_Trans_Broadcaster::completion_condition, boost::bind(&Asio_Trans_Broadcaster::broadcast_handler, this,
boost::asio::placeholders::error,
boost::asio::placeholders::bytes_transferred));
}
void broadcast_handler(const boost::system::error_code &ec, std::size_t bytes_transferred)
{
fprintf(stdout, "#%d, broadcast_handler: %d bytes, mpi_size:%d, mpi_rank: %d\n", node_id, bytes_transferred, mpi_size, mpi_rank);
if (!ec)
{
int pos=0;
while(pos<bytes_transferred && pos<TRANS_BUFFER_SIZE)
{
int id=-1;
memcpy(&id, &buffer[pos], 4);
if(id<0)
{
done=true;
fprintf(stdout, "#%d, broadcast_handler: done!\n", mpi_rank);
break;
}
pos+=TRANS_TUPLE_SIZE;
}
mutex.lock();
n_send=mpi_size-1;
mutex.unlock();
for(int i=0; i<mpi_size; i++)
if(i!=mpi_rank)
{
boost::asio::async_write(*sender_sockets[i], boost::asio::buffer(buffer, bytes_transferred),
boost::bind(&Asio_Trans_Broadcaster::write_handler, this,
boost::asio::placeholders::error,
boost::asio::placeholders::bytes_transferred));
}
}
else
{
cerr<<mpi_rank<<" error: "<<ec.message()<<endl;
delete this;
}
}
void broadcast()
{
boost::asio::async_read(*dbsocket,
boost::asio::buffer(buffer, TRANS_BUFFER_SIZE),
Asio_Trans_Broadcaster::completion_condition, boost::bind(&Asio_Trans_Broadcaster::broadcast_handler, this,
boost::asio::placeholders::error,
boost::asio::placeholders::bytes_transferred));
}
};
Here is the main code running on each machine:
int N=30;
boost::asio::io_service* sender_io_service=new boost::asio::io_service();
boost::asio::io_service::work* p_work=new boost::asio::io_service::work(*sender_io_service);
boost::thread_group send_thread_pool;
for(int i=0; i<NUM_THREADS; i++)
{
send_thread_pool.create_thread( boost::bind( & boost::asio::io_service::run, sender_io_service ) );
}
boost::asio::io_service* receiver_io_service=new boost::asio::io_service();
shared_ptr<boost::asio::io_service::work> p_work2(new boost::asio::io_service::work(*receiver_io_service));
boost::thread_group thread_pool2;
thread_pool2.create_thread( boost::bind( & boost::asio::io_service::run, receiver_io_service) );
boost::asio::ip::tcp::socket* receiver_socket;
//establish nonblocking connection with remote server
AsioConnectToRemote(5000, 1, receiver_io_service, receiver_socket, true);
boost::asio::ip::tcp::socket* send_sockets[N];
//establish blocking connection with other machines
hadoopNodes = SetupAsioConnectionsWIthOthers(sender_io_service, send_sockets, hostFileName, mpi_rank, mpi_size, 3000, false);
Asio_Trans_Broadcaster* db_receiver=new Asio_Trans_Broadcaster(receiver_socket, send_sockets,
mpi_size, mpi_rank, mpi_rank);
db_receiver->broadcast();
p_work2.reset();
thread_pool2.join_all();
delete p_work;
send_thread_pool.join_all();
I don't know what your code is trying to achieve. There are too many missing bits.
Of course, if the task is to asynchronously send/receive traffic on network sockets, Asio is just the thing for that. It's hard to see what's special about your code.
I'd suggest to clean up the more obvious problems:
there's (almost) no error handling (check your error_code-s!)
unless you're on a funny platform, your format strings should use %lu for size_t
why do you mess around with raw arrays, with possibly bad sizes, when you can just have a vector?
never assume the size of objects if you can use sizeof:
memcpy(&id, &trans_buffer[pos], sizeof(id));
come to think of it, it looks like the indexing of buffer is unsafe anyways:
while(pos < bytes_transferred && pos < TRANS_BUFFER_SIZE)
{
int id = -1;
memcpy(&id, &buffer[pos], sizeof(id));
If e.g. pos == TRANS_BUFFER_SIZE-1 here, the memcpy invokes Undefined Behaviour...
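A guard that accounts for the size of the copy avoids that; a minimal sketch using the question's buffer and constants:

while (pos + sizeof(int) <= bytes_transferred && pos + sizeof(int) <= TRANS_BUFFER_SIZE)
{
    int id = -1;
    memcpy(&id, &buffer[pos], sizeof(id));
    // ... rest of the tuple handling ...
    pos += TRANS_TUPLE_SIZE;
}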
why is there so much new going on? You're inviting a hairy class of bugs into your code. As if memory management wasn't the Achilles heel of low-level coding. Use values, or shared pointers. Never delete this. Ever.[1]
why is there so much repeated code? Why is one thread pool named after sender and the other thread_pool2? Which contains 1 thread. Eh? Why do you have one work item as a raw pointer, the other as a shared_ptr?
You could just write:
struct service_wrap {
service_wrap(int threads) {
while(threads--)
pool.create_thread(boost::bind(&boost::asio::io_service::run, boost::ref(io_service)));
}
~service_wrap() {
io_service.post(boost::bind(&service_wrap::stop, this));
pool.join_all();
}
private: // mind the initialization order!
boost::asio::io_service io_service;
boost::optional<boost::asio::io_service::work> work;
boost::thread_group pool;
void stop() {
work = boost::none;
}
};
So you can simply write:
service_wrap senders(NUM_THREADS);
service_wrap receivers(1);
Wow. Did you see that? No more chance of error. If you fix one pool, you fix the other automatically. No more delete the first, .reset() the second work item. In short: no more messy code, and less complexity.
Use exception safe locking guards:
int local_n_send = -1; // not clear naming
{
boost::lock_guard<boost::mutex> lk(mutex);
n_send--;
local_n_send = n_send;
}
the body of broadcast is completely repeated in write_handler(). Why not just call it:
if(local_n_send == 0 && !done)
broadcast();
I think there's still a race condition - not a data race on the access to n_send itself, but the decision to re-broadcast might be wrong if n_send reaches zero after the lock is released. Now, since broadcast() only does an async operation, you can just do it under the lock and get rid of the race condition:
void write_handler(const error_code &ec, size_t bytes_transferred) {
boost::lock_guard<boost::mutex> lk(mutex);
if(!(done || --n_send))
broadcast();
}
Woop woop. That's three lines of code now. Less code is less bugs.
My guess would be that if you diligently scrub the code like this, you will inevitably find your clues. Think of it like you would look for a lost wedding-ring: you wouldn't leave a mess lying around. Instead, you'd go from room to room and tidy it all up. Throw everything "out" first if need be.
Iff you can make this thing self-contained /and/ reproducible, I'll even debug it further for you!
Cheers
Here's a starting point that I made while looking at the code: Compiling on Coliru
#include <boost/asio.hpp>
#include <boost/thread.hpp>
#include <boost/array.hpp>
#include <boost/make_shared.hpp>
#include <boost/ptr_container/ptr_vector.hpp>
#include <iostream>
const/*expr*/ int TRANS_TUPLE_SIZE = 15;
const/*expr*/ int TRANS_BUFFER_SIZE = 5120 / TRANS_TUPLE_SIZE * TRANS_TUPLE_SIZE;
namespace AsioTrans
{
using boost::system::error_code;
using namespace boost::asio;
typedef ip::tcp::socket socket_t;
typedef boost::ptr_vector<socket_t> socket_list;
class Broadcaster
{
private:
boost::array<char, TRANS_BUFFER_SIZE> trans_buffer;
int node_id;
int mpi_rank;
socket_t& dbsocket;
socket_list& sender_sockets;
int n_send;
boost::mutex mutex;
bool done;
public:
Broadcaster(
socket_t& dbskt,
socket_list& senderskts,
int mrank,
int id) :
node_id(id),
mpi_rank(mrank),
dbsocket(dbskt),
sender_sockets(senderskts),
n_send(-1),
done(false)
{
// count=0;
}
static size_t completion_condition(const error_code& error, size_t bytes_transferred)
{
// TODO FIXME handle error_code here
int remain = bytes_transferred % TRANS_TUPLE_SIZE;
if(bytes_transferred && !remain)
{
return 0;
}
else
{
return TRANS_BUFFER_SIZE - bytes_transferred;
}
}
void write_handler(const error_code &ec, size_t bytes_transferred)
{
// TODO handle errors
// TODO check bytes_transferred
boost::lock_guard<boost::mutex> lk(mutex);
if(!(done || --n_send))
broadcast();
}
void broadcast_handler(const error_code &ec, size_t bytes_transferred)
{
fprintf(stdout, "#%d, broadcast_handler: %lu bytes, mpi_size:%lu, mpi_rank: %d\n", node_id, bytes_transferred, sender_sockets.size(), mpi_rank);
if(!ec)
{
for(size_t pos = 0; (pos < bytes_transferred && pos < TRANS_BUFFER_SIZE); pos += TRANS_TUPLE_SIZE)
{
int id = -1;
memcpy(&id, &trans_buffer[pos], sizeof(id));
if(id < 0)
{
done = true;
fprintf(stdout, "#%d, broadcast_handler: done!\n", mpi_rank);
break;
}
}
{
boost::lock_guard<boost::mutex> lk(mutex);
n_send = sender_sockets.size() - 1;
}
for(int i = 0; size_t(i) < sender_sockets.size(); i++)
{
if(i != mpi_rank)
{
async_write(
sender_sockets[i],
buffer(trans_buffer, bytes_transferred),
boost::bind(&Broadcaster::write_handler, this, placeholders::error, placeholders::bytes_transferred));
}
}
}
else
{
std::cerr << mpi_rank << " error: " << ec.message() << std::endl;
delete this;
}
}
void broadcast()
{
async_read(
dbsocket,
buffer(trans_buffer),
Broadcaster::completion_condition,
boost::bind(&Broadcaster::broadcast_handler, this,
placeholders::error,
placeholders::bytes_transferred));
}
};
struct service_wrap {
service_wrap(int threads) {
while(threads--)
_pool.create_thread(boost::bind(&io_service::run, boost::ref(_service)));
}
~service_wrap() {
_service.post(boost::bind(&service_wrap::stop, this));
_pool.join_all();
}
io_service& service() { return _service; }
private: // mind the initialization order!
io_service _service;
boost::optional<io_service::work> _work;
boost::thread_group _pool;
void stop() {
_work = boost::none;
}
};
extern void AsioConnectToRemote(int, int, io_service&, socket_t&, bool);
extern void SetupAsioConnectionsWIthOthers(io_service&, socket_list&, std::string, int, bool);
}
int main()
{
using namespace AsioTrans;
// there's no use in increasing #threads unless there are blocking operations
service_wrap senders(boost::thread::hardware_concurrency());
service_wrap receivers(1);
socket_t receiver_socket(receivers.service());
AsioConnectToRemote(5000, 1, receivers.service(), receiver_socket, true);
socket_list send_sockets(30);
/*hadoopNodes =*/ SetupAsioConnectionsWIthOthers(senders.service(), send_sockets, "hostFileName", 3000, false);
int mpi_rank = send_sockets.size();
AsioTrans::Broadcaster db_receiver(receiver_socket, send_sockets, mpi_rank, mpi_rank);
db_receiver.broadcast();
}
[1] No exceptions. Except when there's an exception to the no-exceptions rule. Exception-ception.
Hey all, I'm new to asio and boost. I've been trying to implement a TCP server & client so that I could transmit an std::vector - but I've failed so far. I'm finding the Boost documentation of Asio lacking (to say the least) and hard to understand (English is not my primary language).
In any case, I've been looking at the iostreams examples and I've been trying to implement an object oriented solution - but I've failed.
The server that I'm trying to implement should be able to accept connections from multiple clients (how do I do that?).
The server should receive the std::vector, /* Do something */ and then return it to the client so that the client can tell that the server received the data intact.
*.h file
class TCP_Server : private boost::noncopyable
{
typedef boost::shared_ptr<TCP_Connection> tcp_conn_pointer;
public :
TCP_Server(ba::io_service &io_service, int port);
virtual ~TCP_Server() {}
virtual void Start_Accept();
private:
virtual void Handle_Accept(const boost::system::error_code& e);
private :
int m_port;
ba::io_service& m_io_service; // IO Service
bi::tcp::acceptor m_acceptor; // TCP Connections acceptor
tcp_conn_pointer m_new_tcp_connection; // New connection pointer
};
*.cpp file
TCP_Server::TCP_Server(boost::asio::io_service &io_service, int port) :
m_io_service(io_service),
m_acceptor(io_service, bi::tcp::endpoint(bi::tcp::v4(), port)),
m_new_tcp_connection(TCP_Connection::Create(io_service))
{
m_port = port;
Start_Accept();
}
void TCP_Server::Start_Accept()
{
std::cout << "[TCP_Server][Start_Accept] => Listening on port : " << m_port << std::endl;
//m_acceptor.async_accept(m_new_tcp_connection->Socket(),
// boost::bind(&TCP_Server::Handle_Accept, this,
// ba::placeholders::error));
m_acceptor.async_accept(*m_stream.rdbuf(),
boost::bind(&TCP_Server::Handle_Accept,
this,
ba::placeholders::error));
}
void TCP_Server::Handle_Accept(const boost::system::error_code &e)
{
if(!e)
{
/*boost::thread T(boost::bind(&TCP_Connection::Run, m_new_tcp_connection));
std::cout << "[TCP_Server][Handle_Accept] => Accepting incoming connection. Launching Thread " << std::endl;
m_new_tcp_connection = TCP_Connection::Create(m_io_service);
m_acceptor.async_accept(m_new_tcp_connection->Socket(),
boost::bind(&TCP_Server::Handle_Accept,
this,
ba::placeholders::error));*/
m_stream << "Server Response..." << std::endl;
}
}
How should the client look ?
How do I keep the connection alive while both apps "talk" ?
AFAIK ASIO iostreams are only for synchronous I/O. But your example gives me a hint that you want to use asynchronous I/O.
Here is a small example of a server which uses async I/O to read a request comprising an array of integers preceded by a 4-byte count of the integers in the request.
So in effect I am serializing a vector of integers as
count(4 bytes)
int
int
...
etc
If reading the vector of ints is successful, the server will write a 4-byte response code (=1) and then issue a read for a new request from the client. Enough said, code follows.
#include <iostream>
#include <vector>
#include <boost/bind.hpp>
#include <boost/function.hpp>
#include <boost/shared_ptr.hpp>
#include <boost/thread.hpp>
#include <boost/asio.hpp>
using namespace boost::asio;
using boost::asio::ip::tcp;
class Connection
{
public:
Connection(tcp::acceptor& acceptor)
: acceptor_(acceptor), socket_(acceptor.get_io_service(), tcp::v4())
{
}
void start()
{
acceptor_.get_io_service().post(boost::bind(&Connection::start_accept, this));
}
private:
void start_accept()
{
acceptor_.async_accept(socket_,boost::bind(&Connection::handle_accept, this,
placeholders::error));
}
void handle_accept(const boost::system::error_code& err)
{
if (err)
{
//Failed to accept the incoming connection.
disconnect();
}
else
{
count_ = 0;
async_read(socket_, buffer(&count_, sizeof(count_)),
boost::bind(&Connection::handle_read_count,
this, placeholders::error, placeholders::bytes_transferred));
}
}
void handle_read_count(const boost::system::error_code& err, std::size_t bytes_transferred)
{
if (err || (bytes_transferred != sizeof(count_)))
{
//Failed to read the element count.
disconnect();
}
else
{
elements_.assign(count_, 0);
async_read(socket_, buffer(elements_),
boost::bind(&Connection::handle_read_elements, this,
placeholders::error, placeholders::bytes_transferred));
}
}
void handle_read_elements(const boost::system::error_code& err, std::size_t bytes_transferred)
{
if (err || (bytes_transferred != count_ * sizeof(int)))
{
//Failed to read the request elements.
disconnect();
}
else
{
response_ = 1;
async_write(socket_, buffer(&response_, sizeof(response_)),
boost::bind(&Connection::handle_write_response, this,
placeholders::error, placeholders::bytes_transferred));
}
}
void handle_write_response(const boost::system::error_code& err, std::size_t bytes_transferred)
{
if (err)
disconnect();
else
{
//Start a fresh read
count_ = 0;
async_read(socket_, buffer(&count_, sizeof(count_)),
boost::bind(&Connection::handle_read_count,
this, placeholders::error, placeholders::bytes_transferred));
}
}
void disconnect()
{
socket_.shutdown(tcp::socket::shutdown_both);
socket_.close();
socket_.open(tcp::v4());
start_accept();
}
tcp::acceptor& acceptor_;
tcp::socket socket_;
std::vector<int> elements_;
long count_;
long response_;
};
class Server : private boost::noncopyable
{
public:
Server(unsigned short port, unsigned short thread_pool_size, unsigned short conn_pool_size)
: acceptor_(io_service_, tcp::endpoint(tcp::v4(), port), true)
{
unsigned short i = 0;
for (i = 0; i < conn_pool_size; ++i)
{
ConnectionPtr conn(new Connection(acceptor_));
conn->start();
conn_pool_.push_back(conn);
}
// Start the pool of threads to run all of the io_services.
for (i = 0; i < thread_pool_size; ++i)
{
thread_pool_.create_thread(boost::bind(&io_service::run, &io_service_));
}
}
~Server()
{
io_service_.stop();
thread_pool_.join_all();
}
private:
io_service io_service_;
tcp::acceptor acceptor_;
typedef boost::shared_ptr<Connection> ConnectionPtr;
std::vector<ConnectionPtr> conn_pool_;
boost::thread_group thread_pool_;
};
boost::function0<void> console_ctrl_function;
BOOL WINAPI console_ctrl_handler(DWORD ctrl_type)
{
switch (ctrl_type)
{
case CTRL_C_EVENT:
case CTRL_BREAK_EVENT:
case CTRL_CLOSE_EVENT:
case CTRL_SHUTDOWN_EVENT:
console_ctrl_function();
return TRUE;
default:
return FALSE;
}
}
void stop_server(Server* pServer)
{
delete pServer;
pServer = NULL;
}
int main()
{
Server *pServer = new Server(10255, 4, 20);
console_ctrl_function = boost::bind(stop_server, pServer);
SetConsoleCtrlHandler(console_ctrl_handler, TRUE);
while(true)
{
Sleep(10000);
}
}
I believe the code you have posted is a little incomplete/incorrect. Nonetheless, here is some guidance.
1)
Your async_accept() call seems wrong. It should be something like,
m_acceptor.async_accept(m_new_tcp_connection->socket(),...)
2)
Take note that the Handle_Accept() function will be called after the socket is accepted. In other words, when control reaches Handle_Accept(), you simply have to write to the socket. Something like
void TCP_Server::Handle_Accept(const boost::system::error_code& error)
{
if(!error)
{
//send data to the client
string message = "hello there!\n";
//Write data to the socket and then call the handler AFTER that
//Note, you will need to define a Handle_Write() function in your TCP_Connection class.
async_write(m_new_tcp_connection->socket(),buffer(message),bind(&TCP_Connection::Handle_Write, this,placeholders::error,placeholders::bytes_transferred));
//accept the next connection
Start_Accept();
}
}
3)
As for the client, you should take a look here:
http://www.boost.org/doc/libs/1_39_0/doc/html/boost_asio/tutorial/tutdaytime1.html
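For completeness, here is a minimal blocking client sketch that matches the wire format of the first answer above (a 4-byte element count, then the raw integers, then a 4-byte response code read back); the address and port are assumptions taken from that example:

#include <boost/asio.hpp>
#include <iostream>
#include <vector>

int main()
{
    using boost::asio::ip::tcp;
    boost::asio::io_service io_service;

    tcp::socket socket(io_service);
    socket.connect(tcp::endpoint(boost::asio::ip::address::from_string("127.0.0.1"), 10255));

    std::vector<int> elements = { 1, 2, 3, 4, 5 };
    long count = static_cast<long>(elements.size());

    // Write the count, then the elements, mirroring the server's two reads.
    boost::asio::write(socket, boost::asio::buffer(&count, sizeof(count)));
    boost::asio::write(socket, boost::asio::buffer(elements));

    // Read back the 4-byte response code the server sends on success.
    long response = 0;
    boost::asio::read(socket, boost::asio::buffer(&response, sizeof(response)));
    std::cout << "server response: " << response << std::endl;
}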
If your communication on both ends is realized in C++ you can use the Boost Serialization library to serialize the vector into bytes and transfer these to the other machine. On the opposite end you will use the Boost Serialization lib to deserialize the object. I saw at least two approaches doing so.
Advantage of Boost Serialization: this approach works when transferring objects between 32bit and 64bit systems as well.
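For illustration, round-tripping a std::vector<int> through a text archive might look like this (a sketch; framing the resulting string over the socket, e.g. with a length prefix, is still up to you):

#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/serialization/vector.hpp>
#include <sstream>
#include <string>
#include <vector>

std::string serialize(const std::vector<int>& v)
{
    std::ostringstream oss;
    boost::archive::text_oarchive oa(oss);
    oa << v;          // vector support comes from boost/serialization/vector.hpp
    return oss.str(); // send these bytes over the socket
}

std::vector<int> deserialize(const std::string& bytes)
{
    std::istringstream iss(bytes);
    boost::archive::text_iarchive ia(iss);
    std::vector<int> v;
    ia >> v;
    return v;
}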
Below are the links:
code project article
boost mailing list ideas
Regards,
Ovanes