EOF in boost::async_read with thread_pull and boost 1.54 - c++

I have a strange problem with my server application. My system is simple: I have 1+ devices and one server app that communicate over a network. Protocol has binary packets with variable length, but fixed header (that contain info about current packet size). Example of packet:
char pct[maxSize] = {}
pct[0] = 0x5a //preambule
pct[1] = 0xa5 //preambule
pct[2] = 0x07 //packet size
pct[3] = 0x0A //command
... [payload]
The protocol is built on the principle of a command-answer.
I use boost::asio for communication - io_service with thread pull (4 threads) + async read/write operation (code example below) and create a "query cycle" - each 200ms by timer:
query one value from device
get result, query second value
get result, start timer again
This work very well on boost 1.53 (Debug and Release). But then i switch to boost 1.54 (especially in Release mode) magic begins. My server successfuly starts, connects to device and starts "query cycle". For about 30-60 seconds everything work well (I receive data, data is correct), but then I start receive asio::error on last read handle (always in one place). Error type: EOF. After recieving the error, I must disconnect from device.
Some time of googling give me info about EOF indicate that other side (device in my case) initiated disconnect procedure. But, according to the logic of the device it can not be true.
May somebody explain what's going on? May be i need set some socket option or defines? I see two possible reason:
my side init disconnect (with some reason, that i don't know) and EOF is answer of this action.
some socket timeout firing.
My environment:
OS: Windows 7/8
Compiler: MSVC 2012 Update 3
Sample code of main "query cycle". Is adapted from official boost chat example All code simplified for reduce space :)
SocketWorker - low level wrapper for sockets
DeviceWorker - class for device communication
ERes - internal struct for error store
ProtoCmd and ProtoAnswer - wrapper for raw array command and answer (chat_message
analog from boost chat example)
lw_service_proto namespace - predefined commands and max sizes of packets
So, code samples. Socket wrapper:
namespace b = boost;
namespace ba = boost::asio;
typedef b::function<void(const ProtoAnswer answ)> DataReceiverType;
class SocketWorker
{
private:
typedef ba::ip::tcp::socket socketType;
typedef std::unique_ptr<socketType> socketPtrType;
socketPtrType devSocket;
ProtoCmd sendCmd;
ProtoAnswer rcvAnsw;
//[other definitions]
public:
//---------------------------------------------------------------------------
ERes SocketWorker::Connect(/*[connect settings]*/)
{
ERes res(LGS_RESULT_ERROR, "Connect to device - Unknow Error");
using namespace boost::asio::ip;
boost::system::error_code sock_error;
//try to connect
devSocket->connect(tcp::endpoint(address::from_string(/*[connect settings ip]*/), /*[connect settings port]*/), sock_error);
if(sock_error.value() > 0) {
//[work with error]
devSocket->close();
}
else {
//[res code ok]
}
return res;
}
//---------------------------------------------------------------------------
ERes SocketWorker::Disconnect()
{
if (devSocket->is_open())
{
boost::system::error_code ec;
devSocket->shutdown(bi::tcp::socket::shutdown_send, ec);
devSocket->close();
}
return ERes(LGS_RESULT_OK, "OK");
}
//---------------------------------------------------------------------------
//query any cmd
void SocketWorker::QueryCommand(const ProtoCmd cmd, DataReceiverType dataClb)
{
sendCmd = std::move(cmd); //store command
if (sendCmd .CommandLength() > 0)
{
ba::async_write(*devSocket.get(), ba::buffer(sendCmd.Data(), sendCmd.Length()),
b::bind(&SocketWorker::HandleSocketWrite,
this, ba::placeholders::error, dataClb));
}
else
{
cerr << "Send command error: nothing to send" << endl;
}
}
//---------------------------------------------------------------------------
// boost socket handlers
void SocketWorker::HandleSocketWrite(const b::system::error_code& error,
DataReceiverType dataClb)
{
if (error)
{
cerr << "Send cmd error: " << error.message() << endl;
//[send error to other place]
return;
}
//start reading header of answer (lw_service_proto::headerSize == 3 bytes)
ba::async_read(*devSocket.get(),
ba::buffer(rcvAnsw.Data(), lw_service_proto::headerSize),
b::bind(&SocketWorker::HandleSockReadHeader,
this, ba::placeholders::error, dataClb));
}
//---------------------------------------------------------------------------
//handler for read header
void SocketWorker::HandleSockReadHeader(const b::system::error_code& error, DataReceiverType dataClb)
{
if (error)
{
//[error working]
return;
}
//decode header (check preambule and get full packet size) and read answer payload
if (rcvAnsw.DecodeHeaderAndGetCmdSize())
{
ba::async_read(*devSocket.get(),
ba::buffer(rcvAnsw.Answer(), rcvAnsw.AnswerLength()),
b::bind(&SocketWorker::HandleSockReadBody,
this, ba::placeholders::error, dataClb));
}
}
//---------------------------------------------------------------------------
//handler for andwer payload
void SocketWorker::HandleSockReadBody(const b::system::error_code& error, DataReceiverType dataClb)
{
//if no error - send anwser to 'master'
if (!error){
if (dataClb != nullptr)
dataClb(rcvAnsw);
}
else{
//[error process]
//here i got EOF in release mode
}
}
};
Device worker
class DeviceWorker
{
private:
const static int LW_QUERY_TIME = 200;
LWDeviceSocketWorker sockWorker;
ba::io_service& timerIOService;
typedef std::shared_ptr<ba::deadline_timer> TimerPtr;
TimerPtr queryTimer;
bool queryCycleWorking;
//[other definitions]
public:
ERes DeviceWorker::Connect()
{
ERes intRes = sockWorker.Connect(/*[connect settings here]*/);
if(intRes != LGS_RESULT_OK) {
//[set result to error]
}
else {
//[set result to success]
//start "query cycle"
StartNewCycleQuery();
}
return intRes;
}
//---------------------------------------------------------------------------
ERes DeviceWorker::Disconnect()
{
return sockWorker.Disconnect();
}
//---------------------------------------------------------------------------
void DeviceWorker::StartNewCycleQuery()
{
queryCycleWorking = true;
//start timer
queryTimer = make_shared<ba::deadline_timer>(timerIOService, bt::milliseconds(LW_QUERY_TIME));
queryTimer->async_wait(boost::bind(&DeviceWorker::HandleQueryTimer,
this, boost::asio::placeholders::error));
}
//---------------------------------------------------------------------------
void DeviceWorker::StopCycleQuery()
{
//kill timer
if (queryTimer)
queryTimer->cancel();
queryCycleWorking = false;
}
//---------------------------------------------------------------------------
//timer handler
void DeviceWorker::HandleQueryTimer(const b::system::error_code& error)
{
if (!error)
{
ProtoCmd cmd;
//query for first value
cmd.EncodeCommandCore(lw_service_proto::cmdGetAlarm, 1);
sockWorker.QueryCommand(cmd, boost::bind(&DeviceWorker::ReceiveAlarmCycle,
this, _1));
}
}
//---------------------------------------------------------------------------
//receive first value
void DeviceWorker::ReceiveAlarmCycle(ProtoAnswer adata)
{
//check and fix last bytes (remove \r\n from some commands)
adata.CheckAndFixFooter();
//[working with answer]
if (queryCycleWorking)
{
//query for second value
ProtoCmd cmd;
cmd.EncodeCommandCore(lw_service_proto::cmdGetEnergyLevel, 1);
sockWorker.QueryCommand(cmd, b::bind(&DeviceWorker::ReceiveEnergyCycle,
this, _1));
}
}
//---------------------------------------------------------------------------
//receive second value
void DeviceWorker::ReceiveEnergyCycle(ProtoAnswer edata)
{
//check and fix last bytes (remove \r\n from some commands)
edata.CheckAndFixFooter();
//[working with second value]
//start new "query cycle"
if (queryCycleWorking)
StartNewCycleQuery();
}
};
Any ideas are welcome :)
edit:
After several test I see anower picture:
this issue reproduce on boost 1.54 only (Debug and Release mode, Release - much more faster), with boost 1.53 no more error (maybe i poorly clean my code then rebuild first times....)
with boost 1.54 and 1 thread (instead of 4) all work well
I also spend some time with debugger and boost source and making some conclusion:
When i receive EOF my data is already fully received.
This EOF indicate that is nothing to transfer in this operation, i.e. socket result flag is 0 (no error), but boost operation flag if EOF (transfer bytes == 0)
At this moment I am forced to switch on boost 1.53...

I had the exact same problem and I am quite sure that this is a bug of boost::asio 1.54.0
Here is the bug report.
The solution is effectively to get back to 1.53, although there is a patch available for 1.54 in the bug report page.

If your application works fine with a single thread invoking io_service::run() but fails with four threads, you very likely have a race condition. This type of problem is difficult to diagnose. Generally speaking you should ensure your devSocket has at most one outstanding async_read() and async_write() operation. Your current implementation of SocketWorker::QueryCommand() unconditionally invokes async_write() which may violate the ordering assumption documented as such
This operation is implemented in terms of zero or more calls to the
stream's async_write_some function, and is known as a composed
operation. The program must ensure that the stream performs no other
write operations (such as async_write, the stream's async_write_some
function, or any other composed operations that perform writes) until
this operation completes.
The classic solution to this problem is to maintain a queue of outgoing messages. If a previous write is outstanding, append the next outgoing message to the queue. When the previous write completes, initiate the async_write() for the next message in the queue. When using multiple threads invoking io_service::run() you may need to use a strand as the linked answer does.

Related

boost::asio write: Broken pipe

I have a TCP server that handles new connections, when there's a new connection two threads will be created (std::thread, detached).
void Gateway::startServer(boost::asio::io_service& io_service, unsigned short port) {
tcp::acceptor TCPAcceptor(io_service, tcp::endpoint(tcp::v4(), port));
bool UARTToWiFiGatewayStarted = false;
for (;;) { std::cout << "\nstartServer()\n";
auto socket(std::shared_ptr<tcp::socket>(new tcp::socket(io_service)));
/*!
* Accept a new connected WiFi client.
*/
TCPAcceptor.accept(*socket);
socket->set_option( tcp::no_delay( true ) );
// This will set the boolean `Gateway::communicationSessionStatus` variable to true.
Gateway::enableCommunicationSession();
// start one thread
std::thread(WiFiToUARTWorkerSession, socket, this->SpecialUARTPort, this->SpecialUARTPortBaud).detach();
// start the second thread
std::thread(UARTToWifiWorkerSession, socket, this->UARTport, this->UARTbaud).detach();
}
}
The first of two worker functions look like this (here I'm reading using the shared socket):
void Gateway::WiFiToUARTWorkerSession(std::shared_ptr<tcp::socket> socket, std::string SpecialUARTPort, unsigned int baud) {
std::cout << "\nEntered: WiFiToUARTWorkerSession(...)\n";
std::shared_ptr<FastUARTIOHandler> uart(new FastUARTIOHandler(SpecialUARTPort, baud));
try {
while(true == Gateway::communicationSessionStatus) { std::cout << "WiFi->UART\n";
unsigned char WiFiDataBuffer[max_incoming_wifi_data_length];
boost::system::error_code error;
/*!
* Read the TCP data.
*/
size_t length = socket->read_some(boost::asio::buffer(WiFiDataBuffer), error);
/*!
* Handle possible read errors.
*/
if (error == boost::asio::error::eof) {
// this will set the shared boolean variable from "true" to "false", causing the while loop (from the both functions and threads) to stop.
Gateway::disableCommunicationSession();
break; // Connection closed cleanly by peer.
}
else if (error) {
Gateway::disableCommunicationSession();
throw boost::system::system_error(error); // Some other error.
}
uart->write(WiFiDataBuffer, length);
}
}
catch (std::exception &exception) {
std::cerr << "[APP::exception] Exception in thread: " << exception.what() << std::endl;
}
std::cout << "\nExiting: WiFiToUARTWorkerSession(...)\n";
}
And the second one (here I'm writing using the thread-shared socket):
void Gateway::UARTToWifiWorkerSession(std::shared_ptr<tcp::socket> socket, std::string UARTport, unsigned int baud) {
std::cout << "\nEntered: UARTToWifiWorkerSession(...)\n";
/*!
* Buffer used for storing the UART-incoming data.
*/
unsigned char UARTDataBuffer[max_incoming_uart_data_length];
std::vector<unsigned char> outputBuffer;
std::shared_ptr<FastUARTIOHandler> uartHandler(new FastUARTIOHandler(UARTport, baud));
while(true == Gateway::communicationSessionStatus) { std::cout << "UART->WiFi\n";
/*!
* Read the UART-available data.
*/
auto bytesReceived = uartHandler->read(UARTDataBuffer, max_incoming_uart_data_length);
/*!
* If there was some data, send it over TCP.
*/
if(bytesReceived > 0) {
boost::asio::write((*socket), boost::asio::buffer(UARTDataBuffer, bytesReceived));
std::cout << "\nSending data to app...\n";
}
}
std::cout << "\nExited: UARTToWifiWorkerSession(...)\n";
}
For stopping this two threads I do the following thing: from the WiFiToUARTWorkerSession(...) function, if the read(...) fails (there's an error like boost::asio::error::eof, or any other error) I set the Gateway::communicationSessionStatus boolean switch (which is shared (global) by the both functions) to false, this way the functions should return, and the threads should be killed gracefully.
When I'm connecting for the first time, this works well, but when I'm disconnecting from the server, the execution flow from the WiFiToUARTWorkerSession(...) goes through else if (error) condition, it sets the while condition variable to false, and then it throws boost::system::system_error(error) (which actually means Connection reset by peer).
Then when I'm trying to connect again, I got the following exception and the program terminates:
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::system::system_error> >'
what(): write: Broken pipe
What could be the problem?
EDIT: From what I found about this error, it seems that I write(...) after the client disconnects, but how could this be possible?
EDIT2: I have debugged the code even more and it seems that one thread (on which runs the UARTToWifiWorkerSession(...) function) won't actually exit (because there's a blocking read(...) function call at where the execution flow stops). This way that one thread will hang until there's some data received by the read(...) function, and when I'm reconnecting there will be created another two threads, this causing some data racing problems.
Can someone confirm me that this could be the problem?
The actual problem was that the function UARTToWifiWorkerSession(...) didn't actually exit (because of a blocking read(...) function, this causing two threads (the hanging one, and one of the latest two created ones) to write(...) (without any concurrency control) using the same socket.
The solution was to set a read(...) timeout, so I can return from the function (and thus destroy the thread) without pending from some input.

TCP client in Boost asio

Im building a TCP client using Boost::asio Libs. My program has a write() thread that sends a command to the server
write(*_socket,boost::asio::buffer("sspi l1\n\n",sizeof("sspi l1\n\n")));
Then a read thread is started that reads from the buffer all the time, as there can be messages broadcasted from the server due to any other client
void TCP_IP_Connection::readTCP()
{
size_t l=0;
this->len=0;
boost::system::error_code error;
try
{//loop reading all values from router
while(1)
{
//wait for reply??
l=_socket->read_some(boost::asio::buffer(this->reply,sizeof(this->reply)),error);
if(error)
throw boost::system::system_error(error);
if(l>0)
{
this->dataProcess(l);
}
else
boost::this_thread::sleep(boost::posix_time::milliseconds(5000));
_io.run();
if(error==boost::asio::error::eof) //connection closed by router
std::cout<<"connection closed by router";
_io.reset();
}
}
catch (std::exception& e)
{
std::cerr << e.what() << std::endl;
}
}
This thread runs al time in a while(1) loop and is supposed to sleep when the received data length is less than zero. It reads all the data and calls the data parser function. After that the write thread is used to send another command, with read thread running. But instead of the required response the server sends back
? ""
ERROR: Unknown command
I tried using the wireshark. I can see the command being send properly. What can be mistake I'm doing here?
sizeof("sspi l1\n\n") returns 10, but I can only count 9 characters in that string.
Try this instead:
const std::string cmd("sspi l1\n\n");
write(*_socket,boost::asio::buffer(cmd, cmd.length()));
Or when you have it as a string it is enough to do
const std::string cmd("sspi l1\n\n");
write(*_socket,boost::asio::buffer(cmd));
The second argument specifies a maximum length of the string to use. But since it is a constant string, the second argument is not strictly necessary.

boost::asio async server design

Currently I'm using design when server reads first 4 bytes of stream then read N bytes after header decoding.
But I found that time between first async_read and second read is 3-4 ms. I just printed in console timestamp from callbacks for measuring. I sent 10 bytes of data in total. Why it takes so much time to read?
I running it in debug mode but I think that 1 connection for debug is
not so much to have a 3 ms delay between reads from socket. Maybe I need
another approach to cut TCP stream on "packets"?
UPDATE: I post some code here
void parseHeader(const boost::system::error_code& error)
{
cout<<"[parseHeader] "<<lib::GET_SERVER_TIME()<<endl;
if (error) {
close();
return;
}
GenTCPmsg::header result = msg.parseHeader();
if (result.error == GenTCPmsg::parse_error::__NO_ERROR__) {
msg.setDataLength(result.size);
boost::asio::async_read(*socket,
boost::asio::buffer(msg.data(), result.size),
(*_strand).wrap(
boost::bind(&ConnectionInterface::parsePacket, shared_from_this(), boost::asio::placeholders::error)));
} else {
close();
}
}
void parsePacket(const boost::system::error_code& error)
{
cout<<"[parsePacket] "<<lib::GET_SERVER_TIME()<<endl;
if (error) {
close();
return;
}
protocol->parsePacket(msg);
msg.flush();
boost::asio::async_read(*socket,
boost::asio::buffer(msg.data(), config::HEADER_SIZE),
(*_strand).wrap(
boost::bind(&ConnectionInterface::parseHeader, shared_from_this(), boost::asio::placeholders::error)));
}
As you see unix timestamps differ in 3-4 ms. I want to understand why so many time elapse between parseHeader and parsePacket. This is not a client problem, summary data is 10 bytes, but i cant sent much much more, delay is exactly between calls. I'm using flash client version 11. What i do is just send ByteArray through opened socket. I don't sure that delays on client. I send all 10 bytes at once. How can i debug where actual delay is?
There are far too many unknowns to identify the root cause of the delay from the posted code. Nevertheless, there are a few approaches and considerations that can be taken to help to identify the problem:
Enable handler tracking for Boost.Asio 1.47+. Simply define BOOST_ASIO_ENABLE_HANDLER_TRACKING and Boost.Asio will write debug output, including timestamps, to the standard error stream. These timestamps can be used to help filter out delays introduced by application code (parseHeader(), parsePacket(), etc.).
Verify that byte-ordering is being handled properly. For example, if the protocol defines the header's size field as two bytes in network-byte-order and the server is handling the field as a raw short, then upon receiving a message that has a body size of 10:
A big-endian machine will call async_read reading 10 bytes. The read operation should complete quickly as the socket already has the 10 byte body available for reading.
A little-endian machine will call async_read reading 2560 bytes. The read operation will likely remain outstanding, as far more bytes are trying to be read than is intended.
Use tracing tools such as strace, ltrace, etc.
Modify Boost.Asio, adding timestamps throughout the callstack. Boost.Asio is shipped as a header-file only library. Thus, users may modify it to provide as much verbosity as desired. While not the cleanest or easiest of approaches, adding a print statement with timestamps throughout the callstack may help provide visibility into timing.
Try duplicating the behavior in a short, simple, self contained example. Start with the simplest of examples to determine if the delay is systamtic. Then, iteratively expand upon the example so that it becomes closer to the real-code with each iteration.
Here is a simple example from which I started:
#include <iostream>
#include <boost/array.hpp>
#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <boost/date_time/posix_time/posix_time.hpp>
#include <boost/enable_shared_from_this.hpp>
#include <boost/make_shared.hpp>
#include <boost/shared_ptr.hpp>
class tcp_server
: public boost::enable_shared_from_this< tcp_server >
{
private:
enum
{
header_size = 4,
data_size = 10,
buffer_size = 1024,
max_stamp = 50
};
typedef boost::asio::ip::tcp tcp;
public:
typedef boost::array< boost::posix_time::ptime, max_stamp > time_stamps;
public:
tcp_server( boost::asio::io_service& service,
unsigned short port )
: strand_( service ),
acceptor_( service, tcp::endpoint( tcp::v4(), port ) ),
socket_( service ),
index_( 0 )
{}
/// #brief Returns collection of timestamps.
time_stamps& stamps()
{
return stamps_;
}
/// #brief Start the server.
void start()
{
acceptor_.async_accept(
socket_,
boost::bind( &tcp_server::handle_accept, this,
boost::asio::placeholders::error ) );
}
private:
/// #brief Accept connection.
void handle_accept( const boost::system::error_code& error )
{
if ( error )
{
std::cout << error.message() << std::endl;
return;
}
read_header();
}
/// #brief Read header.
void read_header()
{
boost::asio::async_read(
socket_,
boost::asio::buffer( buffer_, header_size ),
boost::bind( &tcp_server::handle_read_header, this,
boost::asio::placeholders::error,
boost::asio::placeholders::bytes_transferred ) );
}
/// #brief Handle reading header.
void
handle_read_header( const boost::system::error_code& error,
std::size_t bytes_transferred )
{
if ( error )
{
std::cout << error.message() << std::endl;
return;
}
// If no more stamps can be recorded, then stop the async-chain so
// that io_service::run can return.
if ( !record_stamp() ) return;
// Read data.
boost::asio::async_read(
socket_,
boost::asio::buffer( buffer_, data_size ),
boost::bind( &tcp_server::handle_read_data, this,
boost::asio::placeholders::error,
boost::asio::placeholders::bytes_transferred ) );
}
/// #brief Handle reading data.
void handle_read_data( const boost::system::error_code& error,
std::size_t bytes_transferred )
{
if ( error )
{
std::cout << error.message() << std::endl;
return;
}
// If no more stamps can be recorded, then stop the async-chain so
// that io_service::run can return.
if ( !record_stamp() ) return;
// Start reading header again.
read_header();
}
/// #brief Record time stamp.
bool record_stamp()
{
stamps_[ index_++ ] = boost::posix_time::microsec_clock::local_time();
return index_ < max_stamp;
}
private:
boost::asio::io_service::strand strand_;
tcp::acceptor acceptor_;
tcp::socket socket_;
boost::array< char, buffer_size > buffer_;
time_stamps stamps_;
unsigned int index_;
};
int main()
{
boost::asio::io_service service;
// Create and start the server.
boost::shared_ptr< tcp_server > server =
boost::make_shared< tcp_server >( boost::ref(service ), 33333 );
server->start();
// Run. This will exit once enough time stamps have been sampled.
service.run();
// Iterate through the stamps.
tcp_server::time_stamps& stamps = server->stamps();
typedef tcp_server::time_stamps::iterator stamp_iterator;
using boost::posix_time::time_duration;
for ( stamp_iterator iterator = stamps.begin() + 1,
end = stamps.end();
iterator != end;
++iterator )
{
// Obtain the delta between the current stamp and the previous.
time_duration delta = *iterator - *(iterator - 1);
std::cout << "Delta: " << delta.total_milliseconds() << " ms"
<< std::endl;
}
// Calculate the total delta.
time_duration delta = *stamps.rbegin() - *stamps.begin();
std::cout << "Total"
<< "\n Start: " << *stamps.begin()
<< "\n End: " << *stamps.rbegin()
<< "\n Delta: " << delta.total_milliseconds() << " ms"
<< std::endl;
}
A few notes about the implementation:
There is only one thread (main) and one asynchronous chain read_header->handle_read_header->handle_read_data. This should minimize the amount of time a ready-to-run handler spends waiting for an available thread.
To focus on boost::asio::async_read, noise is minimized by:
Using a pre-allocated buffer.
Not using shared_from_this() or strand::wrap.
Recording the timestamps, and perform processing post-collection.
I compiled on CentOS 5.4 using gcc 4.4.0 and Boost 1.50. To drive the data, I opted to send 1000 bytes using netcat:
$ ./a.out > output &
[1] 18623
$ echo "$(for i in {0..1000}; do echo -n "0"; done)" | nc 127.0.0.1 33333
[1]+ Done ./a.out >output
$ tail output
Delta: 0 ms
Delta: 0 ms
Delta: 0 ms
Delta: 0 ms
Delta: 0 ms
Delta: 0 ms
Total
Start: 2012-Sep-10 21:22:45.585780
End: 2012-Sep-10 21:22:45.586716
Delta: 0 ms
Observing no delay, I expanded upon the example by modifying the boost::asio::async_read calls, replacing this with shared_from_this() and wrapping the ReadHandlerss with strand_.wrap(). I ran the updated example and still observed no delay. Unfortunately, that is as far as I could get based on the code posted in the question.
Consider expanding upon the example, adding in a piece from the real implementation with each iteration. For example:
Start with using the msg variable's type to control the buffer.
Next, send valid data, and introduce parseHeader() and parsePacket functions.
Finally, introduce the lib::GET_SERVER_TIME() print.
If the example code is as close as possible to the real code, and no delay is being observed with boost::asio::async_read, then the ReadHandlers may be ready-to-run in the real code, but they are waiting on synchronization (the strand) or a resource (a thread), resulting in a delay:
If the delay is the result of synchronization with the strand, then consider Robin's suggestion by reading a larger block of data to potentially reduce the amount of reads required per-message.
If the delay is the result of waiting for a thread, then consider having an additional thread call io_service::run().
One thing that makes Boost.Asio awesome is using the async feature to the fullest. Relying on a specific number of bytes read in one batch, possibly ditching some of what could already been read, isn't really what you should be doing.
Instead, look at the example for the webserver especially this: http://www.boost.org/doc/libs/1_51_0/doc/html/boost_asio/example/http/server/connection.cpp
A boost triboolean is used to either a) complete the request if all data is available in one batch, b) ditch it if it's available but not valid and c) just read more when the io_service chooses to if the request was incomplete. The connection object is shared with the handler through a shared pointer.
Why is this superior to most other methods? You can possibly save the time between reads already parsing the request. This is sadly not followed through in the example but idealy you'd thread the handler so it can work on the data already available while the rest is added to the buffer. The only time it's blocking is when the data is incomplete.
Hope this helps, can't shed any light on why there is a 3ms delay between reads though.

async_receive_from stops receiving after a few packets under Linux

I have a setup with multiple peers broadcasting udp packets (containing images) every 200ms (5fps).
While receiving both the local stream as external streams works fine under Windows, the same code (except for the socket->cancel(); in Windows XP, see comment in code) produces rather strange behavior under Linux:
The first few (5~7) packets sent by another machine (when this machine starts streaming) are received as expected;
After this, the packets from the other machine are received after irregular, long intervals (12s, 5s, 17s, ...) or get a time out (defined after 20 seconds). At certain moments, there is again a burst of (3~4) packets received as expected.
The packets sent by the machine itself are still being received as expected.
Using Wireshark, I see both local as external packets arriving as they should, with correct time intervals between consecutive packages. The behavior also presents itself when the local machine is only listening to a single other stream, with the local stream disabled.
This is some code from the receiver (with some updates as suggested below, thanks!):
Receiver::Receiver(port p)
{
this->port = p;
this->stop = false;
}
int Receiver::run()
{
io_service io_service;
boost::asio::ip::udp::socket socket(
io_service,
boost::asio::ip::udp::endpoint(boost::asio::ip::udp::v4(),
this->port));
while(!stop)
{
const int bufflength = 65000;
int timeout = 20000;
char sockdata[bufflength];
boost::asio::ip::udp::endpoint remote_endpoint;
int rcvd;
bool read_success = this->receive_with_timeout(
sockdata, bufflength, &rcvd, &socket, remote_endpoint, timeout);
if(read_success)
{
std::cout << "read succes " << remote_endpoint.address().to_string() << std::endl;
}
else
{
std::cout << "read fail" << std::endl;
}
}
return 0;
}
void handle_receive_from(
bool* toset, boost::system::error_code error, size_t length, int* outsize)
{
if(!error || error == boost::asio::error::message_size)
{
*toset = length>0?true:false;
*outsize = length;
}
else
{
std::cout << error.message() << std::endl;
}
}
// Update: error check
void handle_timeout( bool* toset, boost::system::error_code error)
{
if(!error)
{
*toset = true;
}
else
{
std::cout << error.message() << std::endl;
}
}
bool Receiver::receive_with_timeout(
char* data, int buffl, int* outsize,
boost::asio::ip::udp::socket *socket,
boost::asio::ip::udp::endpoint &sender_endpoint, int msec_tout)
{
bool timer_overflow = false;
bool read_result = false;
deadline_timer timer( socket->get_io_service() );
timer.expires_from_now( boost::posix_time::milliseconds(msec_tout) );
timer.async_wait( boost::bind(&handle_timeout, &timer_overflow,
boost::asio::placeholders::error) );
socket->async_receive_from(
boost::asio::buffer(data, buffl), sender_endpoint,
boost::bind(&handle_receive_from, &read_result,
boost::asio::placeholders::error,
boost::asio::placeholders::bytes_transferred, outsize));
socket->get_io_service().reset();
while ( socket->get_io_service().run_one())
{
if ( read_result )
{
timer.cancel();
}
else if ( timer_overflow )
{
//not to be used on Windows XP, Windows Server 2003, or earlier
socket->cancel();
// Update: added run_one()
socket->get_io_service().run_one();
}
}
// Update: added run_one()
socket->get_io_service().run_one();
return read_result;
}
When the timer exceeds the 20 seconds, the error message "Operation canceled" is returned, but it is difficult to get any other information about what is going on.
Can anyone identify a problem or give me some hints to get some more information about what is going wrong? Any help is appreciated.
Okay, what you're doing is that when you call receive_with_timeout, you're setting up the two asynchronous requests (one for the recv, one for the timeout). When the first one completes, you cancel the other.
However, you never invoke ioservice::run_one() again to allow it's callback to complete. When you cancel an operation in boost::asio, it invokes the handler, usually with an error code indicating that the operation has been aborted or canceled. In this case, I believe you have a handler dangling once you destroy the deadline service, since it has a pointer onto the stack for it to store the result.
The solution is to call run_one() again to process the canceled callback result prior to exiting the function. You should also check the error code being passed to your timeout handler, and only treat it as a timeout if there was no error.
Also, in the case where you do have a timeout, you need to execute run_one so that the async_recv_from handler can execute, and report that it was canceled.
After a clean installation with Xubuntu 12.04 instead of an old install with Ubuntu 10.04, everything now works as expected. Maybe it is because the new install runs a newer kernel, probably with improved networking? Anyway, a re-install with a newer version of the distribution solved my problem.
If anyone else gets unexpected network behavior with an older kernel, I would advice to try it on a system with a newer kernel installed.

Two-way C++ communication over serial connection

I am trying to write a really simple C++ application to communicate with an Arduino. I would like to send the Arduino a character that it sends back immediately. The Arduino code that I took from a tutorial looks like this:
void setup()
{
Serial.begin(9600);
}
void loop()
{
//Have the Arduino wait to receive input
while (Serial.available()==0);
//Read the input
char val = Serial.read();
//Echo
Serial.println(val);
}
I can communicate with the Arduino easily using GNU screen, so I know that everything is working fine with the basic communication:
$ screen /dev/tty.usbmodem641 9600
The (broken) C++ code that I have looks like this:
#include <fstream>
#include <iostream>
int main()
{
std::cout << "Opening fstream" << std::endl;
std::fstream file("/dev/tty.usbmodem641");
std::cout << "Sending integer" << std::endl;
file << 5 << std::endl; // endl does flush, which may be important
std::cout << "Data Sent" << std::endl;
std::cout << "Awaiting response" << std::endl;
std::string response;
file >> response;
std::cout << "Response: " << response << std::endl;
return 0;
}
It compiles fine, but when running it, some lights flash on the Arduino and the terminal just hangs at:
Opening fstream
Where am I going wrong?
There are three points:
First: You don't initialize the serial port (TTY) on the Linux side. Nobody knows in what state it is.
Doing this in your program you must use tcgetattr(3) and tcsetattr(3). You can find the required parameters by using these keywords at this site, the Arduino site or on Google. But just for quick testing I propose to issue this command before you call your own command:
stty -F /dev/tty.usbmodem641 sane raw pass8 -echo -hupcl clocal 9600
Especially the the missing clocal might prevent you opening the TTY.
Second: When the device is open, you should wait a little before sending anything. By default the Arduino resets when the serial line is opened or closed. You have to take this into account.
The -hupcl part will prevent this reset most of the time. But at least one reset is always necessary, because -hupcl can be set only when the TTY is already open and at that time the Arduino has received the reset signal already. So -hupcl will "only" prevent future resets.
Third: There is NO error handling in your code. Please add code after each IO operation on the TTY which checks for errors and - the most important part - prints helpful error messages using perror(3) or similar functions.
I found a nice example by Jeff Gray of how to make a simple minicom type client using boost::asio. The original code listing can be found on the boost user group. This allows connection and communication with the Arduino like in the GNU Screen example mentioned in the original post.
The code example (below) needs to be linked with the following linker flags
-lboost_system-mt -lboost_thread-mt
...but with a bit of tweaking, some of the dependence on boost can be replaced with new C++11 standard features. I'll post revised versions as and when I get around to it. For now, this compiles and is a solid basis.
/* minicom.cpp
A simple demonstration minicom client with Boost asio
Parameters:
baud rate
serial port (eg /dev/ttyS0 or COM1)
To end the application, send Ctrl-C on standard input
*/
#include <deque>
#include <iostream>
#include <boost/bind.hpp>
#include <boost/asio.hpp>
#include <boost/asio/serial_port.hpp>
#include <boost/thread.hpp>
#include <boost/lexical_cast.hpp>
#include <boost/date_time/posix_time/posix_time_types.hpp>
#ifdef POSIX
#include <termios.h>
#endif
using namespace std;
class minicom_client
{
public:
minicom_client(boost::asio::io_service& io_service, unsigned int baud, const string& device)
: active_(true),
io_service_(io_service),
serialPort(io_service, device)
{
if (!serialPort.is_open())
{
cerr << "Failed to open serial port\n";
return;
}
boost::asio::serial_port_base::baud_rate baud_option(baud);
serialPort.set_option(baud_option); // set the baud rate after the port has been opened
read_start();
}
void write(const char msg) // pass the write data to the do_write function via the io service in the other thread
{
io_service_.post(boost::bind(&minicom_client::do_write, this, msg));
}
void close() // call the do_close function via the io service in the other thread
{
io_service_.post(boost::bind(&minicom_client::do_close, this, boost::system::error_code()));
}
bool active() // return true if the socket is still active
{
return active_;
}
private:
static const int max_read_length = 512; // maximum amount of data to read in one operation
void read_start(void)
{ // Start an asynchronous read and call read_complete when it completes or fails
serialPort.async_read_some(boost::asio::buffer(read_msg_, max_read_length),
boost::bind(&minicom_client::read_complete,
this,
boost::asio::placeholders::error,
boost::asio::placeholders::bytes_transferred));
}
void read_complete(const boost::system::error_code& error, size_t bytes_transferred)
{ // the asynchronous read operation has now completed or failed and returned an error
if (!error)
{ // read completed, so process the data
cout.write(read_msg_, bytes_transferred); // echo to standard output
read_start(); // start waiting for another asynchronous read again
}
else
do_close(error);
}
void do_write(const char msg)
{ // callback to handle write call from outside this class
bool write_in_progress = !write_msgs_.empty(); // is there anything currently being written?
write_msgs_.push_back(msg); // store in write buffer
if (!write_in_progress) // if nothing is currently being written, then start
write_start();
}
void write_start(void)
{ // Start an asynchronous write and call write_complete when it completes or fails
boost::asio::async_write(serialPort,
boost::asio::buffer(&write_msgs_.front(), 1),
boost::bind(&minicom_client::write_complete,
this,
boost::asio::placeholders::error));
}
void write_complete(const boost::system::error_code& error)
{ // the asynchronous read operation has now completed or failed and returned an error
if (!error)
{ // write completed, so send next write data
write_msgs_.pop_front(); // remove the completed data
if (!write_msgs_.empty()) // if there is anthing left to be written
write_start(); // then start sending the next item in the buffer
}
else
do_close(error);
}
void do_close(const boost::system::error_code& error)
{ // something has gone wrong, so close the socket & make this object inactive
if (error == boost::asio::error::operation_aborted) // if this call is the result of a timer cancel()
return; // ignore it because the connection cancelled the timer
if (error)
cerr << "Error: " << error.message() << endl; // show the error message
else
cout << "Error: Connection did not succeed.\n";
cout << "Press Enter to exit\n";
serialPort.close();
active_ = false;
}
private:
bool active_; // remains true while this object is still operating
boost::asio::io_service& io_service_; // the main IO service that runs this connection
boost::asio::serial_port serialPort; // the serial port this instance is connected to
char read_msg_[max_read_length]; // data read from the socket
deque<char> write_msgs_; // buffered write data
};
int main(int argc, char* argv[])
{
// on Unix POSIX based systems, turn off line buffering of input, so cin.get() returns after every keypress
// On other systems, you'll need to look for an equivalent
#ifdef POSIX
termios stored_settings;
tcgetattr(0, &stored_settings);
termios new_settings = stored_settings;
new_settings.c_lflag &= (~ICANON);
new_settings.c_lflag &= (~ISIG); // don't automatically handle control-C
tcsetattr(0, TCSANOW, &new_settings);
#endif
try
{
if (argc != 3)
{
cerr << "Usage: minicom <baud> <device>\n";
return 1;
}
boost::asio::io_service io_service;
// define an instance of the main class of this program
minicom_client c(io_service, boost::lexical_cast<unsigned int>(argv[1]), argv[2]);
// run the IO service as a separate thread, so the main thread can block on standard input
boost::thread t(boost::bind(&boost::asio::io_service::run, &io_service));
while (c.active()) // check the internal state of the connection to make sure it's still running
{
char ch;
cin.get(ch); // blocking wait for standard input
if (ch == 3) // ctrl-C to end program
break;
c.write(ch);
}
c.close(); // close the minicom client connection
t.join(); // wait for the IO service thread to close
}
catch (exception& e)
{
cerr << "Exception: " << e.what() << "\n";
}
#ifdef POSIX // restore default buffering of standard input
tcsetattr(0, TCSANOW, &stored_settings);
#endif
return 0;
}
You should check if you have access to /dev/tty.usbmodem641. The usual way in Linux is to add the user to the proper group with adduser.
By the way, I know that to access the serial port, one needs to open /dev/ttyS0 (for COM1), until /dev/ttyS3. See for example this example in C.