Handling "reset by peer" scenario with boost::asio - c++

I have a server method that waits for new incoming TCP connections. For each connection I create two detached threads to handle various tasks.
void MyClass::startServer(boost::asio::io_service& io_service, unsigned short port) {
tcp::acceptor TCPAcceptor(io_service, tcp::endpoint(tcp::v4(), port));
bool UARTToWiFiGatewayStarted = false;
for (;;) {
auto socket(std::shared_ptr<tcp::socket>(new tcp::socket(io_service)));
/*!
* Accept a new connected WiFi client.
*/
TCPAcceptor.accept(*socket);
socket->set_option( tcp::no_delay( true ) );
MyClass::enableCommunicationSession();
// start one worker thread.
std::thread(WiFiToUARTWorkerSession, socket, this->LINport, this->LINbaud).detach();
// only if this is the first connected client:
if(false == UARTToWiFiGatewayStarted) {
std::thread(UARTToWifiWorkerSession, socket, this->UARTport, this->UARTbaud).detach();
UARTToWiFiGatewayStarted = true;
}
}
}
This works fine for starting the communication, but the problem appears when the client disconnects and connects again (or at least tries to connect again).
When the current client disconnects, I stop the communication by ending the internal infinite loops in both worker functions, so that both functions return.
void Gateway::WiFiToUARTWorkerSession(std::shared_ptr<tcp::socket> socket, ...) {
/*!
* various code here...
*/
try {
while(true == MyClass::communicationSessionStatus) {
/*!
* Buffer used for storing the WiFi-incoming data.
*/
unsigned char WiFiDataBuffer[max_incoming_wifi_data_length];
boost::system::error_code error;
/*!
* Read the WiFi-available data.
*/
size_t length = socket->read_some(boost::asio::buffer(WiFiDataBuffer), error);
/*!
* Handle possible read errors.
*/
if (error == boost::asio::error::eof) {
break; // Connection closed cleanly by peer.
}
else if (error) {
// This stops the infinite loops in both worker functions; once the loops stop, the functions return.
MyClass::disableCommunicationSession();
sleep(1);
throw boost::system::system_error(error); // Some other error.
}
uart->write(WiFiDataBuffer, length);
}
}
catch (std::exception &exception) {
std::cerr << "[APP::exception] Exception in thread: " << exception.what() << std::endl;
}
}
I expect that when I reconnect, the communication should work again (MyClass::startServer(...) will again create and detach two worker threads that do the same things).
The problem is that when I connect the second time I get:
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::system::system_error> >'
what(): write: Broken pipe
From what I found about this error it seems that the server (this application) sends something via TCP to a client that was disconnected.
What am I doing wrong?
How can I solve this problem?

A read of length 0 with no error is also an indication of eof. The boost::asio::error::eof error code is normally more useful when you're checking the result of a composed operation.
When this error condition is missed, the code as presented will call write on a socket that has now been shut down. You have used the form of write that does not take a reference to an error_code. This form throws if there is an error, and there will be an error, because the read has already failed.
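The sequence described above can be reproduced below the Asio layer. The following is a minimal sketch using plain POSIX sockets on Linux, not Asio itself: `probe_peer_close` and `PeerCloseProbe` are hypothetical names, and `MSG_NOSIGNAL` is assumed to be available (it is on Linux). It shows that once the peer has closed, a read returns 0 and a subsequent write fails with `EPIPE`, which is the condition that the throwing form of write reports as "Broken pipe".

```cpp
#include <cassert>
#include <cerrno>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// What the surviving end of a connection observes after its peer closes.
struct PeerCloseProbe {
    ssize_t read_result;   // 0 means end of file: the peer closed the connection
    ssize_t write_result;  // -1 means the write failed
    int write_errno;       // errno captured right after the write
};

// Hypothetical helper: creates a connected socket pair, closes one end,
// then reads from and writes to the other end and records the results.
PeerCloseProbe probe_peer_close() {
    int fds[2];
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
    assert(rc == 0);
    (void)rc;

    close(fds[1]);  // the "client" disconnects

    PeerCloseProbe probe{};
    char buffer[16];
    probe.read_result = recv(fds[0], buffer, sizeof buffer, 0);

    // MSG_NOSIGNAL suppresses SIGPIPE so the failure is reported via errno.
    errno = 0;
    probe.write_result = send(fds[0], "x", 1, MSG_NOSIGNAL);
    probe.write_errno = errno;

    close(fds[0]);
    return probe;
}
```

In the worker loop, treating a zero-length read the same way as boost::asio::error::eof, and calling the boost::asio::write overload that takes a boost::system::error_code&, would let the thread stop cleanly instead of terminating the process.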

Related

Detect closed TCP connection during write with boost::asio immediately

I have a TCP-server with multiple clients/sessions. Each session has its own thread for receiving data from the client, but there is only one thread ("writeThread") to respond to all clients.
The problem: if a client closes the connection while the "writeThread" is writing to its socket, it takes multiple seconds until the write operation notices that the connection was closed remotely. Sometimes it is not noticed at all; only when I manually send a signal for which a signal handler is installed does the application detect it and abort the write operation.
The time is measured between Logger::trace("start write"); and Logger::trace("remote term, closed socket ");
Although this may not be the best design, is it possible to detect the closed connection immediately, or do I really have to redesign?
bool myWrite(UINT8 *pu8_buffer, UINT32 u32_size)
{
bool b_success = false;
try
{
Logger::trace("start write");
b_success = (u32_size == boost::asio::write(_x_socket, boost::asio::buffer(pu8_buffer, u32_size)));
}
catch (boost::system::system_error &er)
{
if (er.code() == boost::asio::error::eof ||
er.code() == boost::asio::error::connection_reset)
{
boost::system::error_code x_er;
_x_socket.close(x_er);
if (!x_er)
{
Logger::trace("remote term, closed socket ");
}
else
{
Logger::err("remote term, closed socket failed");
}
}
}
catch(std::exception &ex)
{
Logger::err("write exception\n\t",ex.what());
}
catch(...)
{
Logger::err("write unknown exception",(uint32_t)this);
}
return b_success;
}
If you use asynchronous write operations, you can multiplex writes on the same thread without one blocking the other. You can even do the same for reads.
Just to close this question, I will summarize Richard Critten's comments.
The issue was that the client did not disconnect gracefully from my server. After a proper disconnect, the write function fails immediately. To avoid long or indefinite blocking of a write operation, it's possible to configure a timeout for how long a write may take before reporting an error. This timeout can be set with the SO_SNDTIMEO socket option. http://man7.org/linux/man-pages/man7/socket.7.html
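As an illustration of the SO_SNDTIMEO option mentioned above, here is a minimal sketch at the POSIX level. `set_send_timeout` is a hypothetical helper name, not part of Boost.Asio; with Asio, the descriptor would presumably come from the socket's `native_handle()`. Whether Asio's synchronous write surfaces the timeout as an error or retries internally may depend on the Asio version, so that part should be verified on the target system.

```cpp
#include <cassert>
#include <chrono>
#include <sys/socket.h>
#include <sys/time.h>

// Hypothetical helper: limits how long a blocking send on `fd` may wait.
// Returns true if the option was applied, false if setsockopt failed.
bool set_send_timeout(int fd, std::chrono::milliseconds timeout) {
    assert(timeout.count() >= 0);

    // SO_SNDTIMEO takes a timeval: whole seconds plus microseconds.
    timeval tv{};
    tv.tv_sec = static_cast<time_t>(timeout.count() / 1000);
    tv.tv_usec = static_cast<suseconds_t>((timeout.count() % 1000) * 1000);

    return setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof tv) == 0;
}
```

A call such as `set_send_timeout(_x_socket.native_handle(), std::chrono::seconds(2))` right after accepting the connection would be one way to apply it in the code from the question.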

boost::asio write: Broken pipe

I have a TCP server that handles new connections. For each new connection, two threads are created (std::thread, detached).
void Gateway::startServer(boost::asio::io_service& io_service, unsigned short port) {
tcp::acceptor TCPAcceptor(io_service, tcp::endpoint(tcp::v4(), port));
bool UARTToWiFiGatewayStarted = false;
for (;;) { std::cout << "\nstartServer()\n";
auto socket(std::shared_ptr<tcp::socket>(new tcp::socket(io_service)));
/*!
* Accept a new connected WiFi client.
*/
TCPAcceptor.accept(*socket);
socket->set_option( tcp::no_delay( true ) );
// This will set the boolean `Gateway::communicationSessionStatus` variable to true.
Gateway::enableCommunicationSession();
// start one thread
std::thread(WiFiToUARTWorkerSession, socket, this->SpecialUARTPort, this->SpecialUARTPortBaud).detach();
// start the second thread
std::thread(UARTToWifiWorkerSession, socket, this->UARTport, this->UARTbaud).detach();
}
}
The first of the two worker functions looks like this (here I'm reading using the shared socket):
void Gateway::WiFiToUARTWorkerSession(std::shared_ptr<tcp::socket> socket, std::string SpecialUARTPort, unsigned int baud) {
std::cout << "\nEntered: WiFiToUARTWorkerSession(...)\n";
std::shared_ptr<FastUARTIOHandler> uart(new FastUARTIOHandler(SpecialUARTPort, baud));
try {
while(true == Gateway::communicationSessionStatus) { std::cout << "WiFi->UART\n";
unsigned char WiFiDataBuffer[max_incoming_wifi_data_length];
boost::system::error_code error;
/*!
* Read the TCP data.
*/
size_t length = socket->read_some(boost::asio::buffer(WiFiDataBuffer), error);
/*!
* Handle possible read errors.
*/
if (error == boost::asio::error::eof) {
// This sets the shared boolean variable from "true" to "false", causing the while loops in both functions (and threads) to stop.
Gateway::disableCommunicationSession();
break; // Connection closed cleanly by peer.
}
else if (error) {
Gateway::disableCommunicationSession();
throw boost::system::system_error(error); // Some other error.
}
uart->write(WiFiDataBuffer, length);
}
}
catch (std::exception &exception) {
std::cerr << "[APP::exception] Exception in thread: " << exception.what() << std::endl;
}
std::cout << "\nExiting: WiFiToUARTWorkerSession(...)\n";
}
And the second one (here I'm writing using the thread-shared socket):
void Gateway::UARTToWifiWorkerSession(std::shared_ptr<tcp::socket> socket, std::string UARTport, unsigned int baud) {
std::cout << "\nEntered: UARTToWifiWorkerSession(...)\n";
/*!
* Buffer used for storing the UART-incoming data.
*/
unsigned char UARTDataBuffer[max_incoming_uart_data_length];
std::vector<unsigned char> outputBuffer;
std::shared_ptr<FastUARTIOHandler> uartHandler(new FastUARTIOHandler(UARTport, baud));
while(true == Gateway::communicationSessionStatus) { std::cout << "UART->WiFi\n";
/*!
* Read the UART-available data.
*/
auto bytesReceived = uartHandler->read(UARTDataBuffer, max_incoming_uart_data_length);
/*!
* If there was some data, send it over TCP.
*/
if(bytesReceived > 0) {
boost::asio::write((*socket), boost::asio::buffer(UARTDataBuffer, bytesReceived));
std::cout << "\nSending data to app...\n";
}
}
std::cout << "\nExited: UARTToWifiWorkerSession(...)\n";
}
To stop these two threads I do the following: in WiFiToUARTWorkerSession(...), if the read fails (with boost::asio::error::eof or any other error), I set the Gateway::communicationSessionStatus boolean flag (a global shared by both functions) to false. This way both functions should return and the threads should exit gracefully.
When I connect for the first time, this works well. But when I disconnect from the server, the execution flow in WiFiToUARTWorkerSession(...) goes through the else if (error) branch: it sets the while condition variable to false and then throws boost::system::system_error(error) (which in this case means Connection reset by peer).
Then, when I try to connect again, I get the following exception and the program terminates:
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::system::system_error> >'
what(): write: Broken pipe
What could be the problem?
EDIT: From what I found about this error, it seems that I write(...) after the client disconnects, but how could this be possible?
EDIT2: I have debugged the code further, and it seems that the thread running UARTToWifiWorkerSession(...) never actually exits, because execution stops at a blocking read(...) call. That thread hangs until read(...) receives some data, and when I reconnect, two more threads are created, which causes data races.
Can someone confirm that this could be the problem?
The actual problem was that UARTToWifiWorkerSession(...) never exited because of the blocking read(...) call. As a result, two threads (the hanging one and one of the two newly created ones) called write(...) on the same socket without any concurrency control.
The solution was to set a timeout on read(...), so that the function can return (and the thread can end) without waiting indefinitely for input.
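The interface of FastUARTIOHandler is not shown, so the following is only a sketch of the idea, assuming the UART is reachable as a POSIX file descriptor. `read_with_timeout` is a hypothetical helper: it waits with poll() for a bounded time before reading, so the worker loop gets a chance to re-check its stop flag even when no data arrives.

```cpp
#include <cassert>
#include <cstddef>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: waits up to timeout_ms for `fd` to become readable,
// then reads at most `capacity` bytes into `buffer`.
// Returns the number of bytes read, 0 on timeout (or end of file), -1 on error.
ssize_t read_with_timeout(int fd, unsigned char* buffer, std::size_t capacity,
                          int timeout_ms) {
    assert(buffer != nullptr);

    pollfd pfd{};
    pfd.fd = fd;
    pfd.events = POLLIN;

    int ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0) {
        return -1;  // poll itself failed
    }
    if (ready == 0) {
        return 0;   // nothing arrived in time; caller can re-check its stop flag
    }
    return read(fd, buffer, capacity);
}
```

With a read like this, each iteration of the while loop returns within the timeout, so a cleared session flag is noticed promptly and the old thread ends before a new connection starts another writer. Declaring the shared flag as std::atomic<bool> would additionally avoid a data race on the flag itself.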

Boost.Asio: Why the timer is executed only once?

I have a function called read_packet. This function blocks until a connection request arrives or the timer fires.
The code is the following:
std::size_t read_packet(const std::chrono::milliseconds& timeout,
boost::system::error_code& error)
{
// m_timer_ --> boost::asio::high_resolution_timer
if(!m_is_first_time_) {
m_is_first_time_ = true;
// Set an expiry time relative to now.
m_timer_.expires_from_now( timeout );
} else {
m_timer_.expires_at( m_timer_.expires_at() + timeout );
}
// Start an asynchronous wait.
m_timer_.async_wait(
[ this ](const boost::system::error_code& error){
if(!error) m_is_timeout_signaled_ = true;
}
);
auto result = m_io_service_.run_one();
if( !m_is_timeout_signaled_ ) {
m_timer_.cancel();
}
m_io_service_.reset();
return result;
}
The function works correctly as long as no connection request is received. All requests are accepted asynchronously.
After a connection has been accepted, run_one() no longer blocks for the time set on the timer. It always returns 1 (one handler has been processed), and that handler belongs to the timer.
I do not understand why this happens.
Why doesn't the function block for the duration of the timer?
Cheers.
NOTE: This function is used in a loop.
UPDATE:
I have my own io_service::run() function. This function performs other actions and tasks. I want to listen and process the network level for a period of time:
If something comes on the network level, io_service::run_one() returns and read_packet() returns the control to my run() function.
Otherwise, the timer is fired and read_packet() returns the control to my run() function.
Everything that comes from the network level is stored in a data structure. Then my run() function operates on that data structure.
It also runs other options.
void run(duration timeout, boost::system::error_code& error)
{
time_point start = clock_type::now();
time_point deadline = start + timeout;
while( !stop() ) {
read_packet(timeout, error);
if(error) return;
if(is_timeout_expired( start, deadline, timeout )) return;
// processing network level
// other actions
}
}
In my case, the sockets are always active until a client requests the closing of the connection.
During a time slot, you manage the network level and for another slot you do other things.
After reading the question more closely I got the idea that you are actually trying to use Asio to get synchronous IO, but with a timeout on each read operation.
That's not what Asio was intended for (hence, the name "Asynchronous IO Library").
But sure, you can do it if you insist. Like I said, I feel you're overcomplicating things.
In the completion handler of your timer, just cancel the socket operation if the timer had expired. (Note that if it didn't, you'll get operation_aborted, so check the error code).
Small selfcontained example (which is what you should always do when trying to get help, by the way):
#include <boost/asio.hpp>
#include <boost/asio/high_resolution_timer.hpp>
#include <array>
#include <chrono>
#include <iostream>
struct Program {
Program() { sock_.connect({ boost::asio::ip::address_v4{}, 6771 }); }
std::size_t read_packet(const std::chrono::milliseconds &timeout, boost::system::error_code &error) {
m_io_service_.reset();
boost::asio::high_resolution_timer timer { m_io_service_, timeout };
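timer.async_wait([&](boost::system::error_code ec) {
if (!ec) sock_.cancel(); // timer really expired: abort the pending read
});
size_t transferred = 0;
boost::asio::async_read(sock_, boost::asio::buffer(buffer_), [&](boost::system::error_code ec, size_t tx) {
error = ec;
transferred = tx;
timer.cancel(); // read finished first: the timer handler then sees operation_aborted
});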
m_io_service_.run();
return transferred;
}
private:
boost::asio::io_service m_io_service_;
using tcp = boost::asio::ip::tcp;
tcp::socket sock_{ m_io_service_ };
std::array<char, 512> buffer_;
};
int main() {
Program client;
boost::system::error_code ec;
while (!ec) {
client.read_packet(std::chrono::milliseconds(100), ec);
}
std::cout << "Exited with '" << ec.message() << "'\n"; // operation canceled in case of timeout
}
If the socket operation succeeds you can see e.g.:
Exited with 'End of file'
Otherwise, if the operation didn't complete within 100 milliseconds, it will print:
Exited with 'Operation canceled'
See also await_operation in this previous answer, which generalizes this pattern a bit more:
boost::asio + std::future - Access violation after closing socket
OK, my code was incorrect. When the timer is canceled, its handler is still invoked (with the operation_aborted error code). That pending handler is why io_service::run_one() returned immediately on the next call instead of blocking.
More information: basic_waitable_timer::cancel
Thanks for the help.

TCP client in Boost asio

I'm building a TCP client using the Boost.Asio libraries. My program has a write() thread that sends a command to the server:
write(*_socket,boost::asio::buffer("sspi l1\n\n",sizeof("sspi l1\n\n")));
Then a read thread is started that reads from the socket continuously, because the server may broadcast messages triggered by any other client:
void TCP_IP_Connection::readTCP()
{
size_t l=0;
this->len=0;
boost::system::error_code error;
try
{//loop reading all values from router
while(1)
{
//wait for reply??
l=_socket->read_some(boost::asio::buffer(this->reply,sizeof(this->reply)),error);
if(error)
throw boost::system::system_error(error);
if(l>0)
{
this->dataProcess(l);
}
else
boost::this_thread::sleep(boost::posix_time::milliseconds(5000));
_io.run();
if(error==boost::asio::error::eof) //connection closed by router
std::cout<<"connection closed by router";
_io.reset();
}
}
catch (std::exception& e)
{
std::cerr << e.what() << std::endl;
}
}
This thread runs all the time in a while(1) loop and is supposed to sleep when no data is received. It reads all the data and calls the data parser function. After that, the write thread is used to send another command while the read thread keeps running. But instead of the required response, the server sends back
? ""
ERROR: Unknown command
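I tried using Wireshark, and I can see the command being sent properly. What mistake am I making here?
sizeof("sspi l1\n\n") returns 10, but I can only count 9 characters in that string. The tenth byte is the string literal's terminating null character, which is sent to the server along with the command.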
Try this instead:
const std::string cmd("sspi l1\n\n");
write(*_socket,boost::asio::buffer(cmd, cmd.length()));
Or when you have it as a string it is enough to do
const std::string cmd("sspi l1\n\n");
write(*_socket,boost::asio::buffer(cmd));
The second argument specifies a maximum length of the string to use. But since it is a constant string, the second argument is not strictly necessary.
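The size difference can be checked without any networking. This is a small sketch in which `kCommand` and `payload_size` are illustrative names, not part of the original program; it shows that sizeof on the literal counts the terminating '\0', while strlen and std::string::size() do not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// The command from the question, stored as a character array.
constexpr char kCommand[] = "sspi l1\n\n";

// sizeof covers the whole array: 9 visible characters plus the trailing '\0'.
static_assert(sizeof(kCommand) == 10, "sizeof counts the terminating null");

// Hypothetical helper: number of bytes that should go on the wire for a
// C-string command, excluding the terminating null character.
std::size_t payload_size(const char* command) {
    assert(command != nullptr);
    return std::strlen(command);
}
```

Here `payload_size(kCommand)` is 9, the same as `std::string(kCommand).size()`, which is why building the buffer from a std::string sends only the command text.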

async_receive_from stops receiving after a few packets under Linux

I have a setup with multiple peers broadcasting udp packets (containing images) every 200ms (5fps).
While receiving both the local stream and external streams works fine under Windows, the same code (except for the socket->cancel(); in Windows XP, see comment in code) produces rather strange behavior under Linux:
The first few (5~7) packets sent by another machine (when this machine starts streaming) are received as expected;
After this, the packets from the other machine are received after irregular, long intervals (12s, 5s, 17s, ...) or the read times out (the timeout is set to 20 seconds). At certain moments, there is again a burst of (3~4) packets received as expected.
The packets sent by the machine itself are still being received as expected.
Using Wireshark, I see both local and external packets arriving as they should, with correct time intervals between consecutive packets. The behavior also occurs when the local machine is only listening to a single other stream, with the local stream disabled.
This is some code from the receiver (with some updates as suggested below, thanks!):
Receiver::Receiver(port p)
{
this->port = p;
this->stop = false;
}
int Receiver::run()
{
io_service io_service;
boost::asio::ip::udp::socket socket(
io_service,
boost::asio::ip::udp::endpoint(boost::asio::ip::udp::v4(),
this->port));
while(!stop)
{
const int bufflength = 65000;
int timeout = 20000;
char sockdata[bufflength];
boost::asio::ip::udp::endpoint remote_endpoint;
int rcvd;
bool read_success = this->receive_with_timeout(
sockdata, bufflength, &rcvd, &socket, remote_endpoint, timeout);
if(read_success)
{
std::cout << "read succes " << remote_endpoint.address().to_string() << std::endl;
}
else
{
std::cout << "read fail" << std::endl;
}
}
return 0;
}
void handle_receive_from(
bool* toset, boost::system::error_code error, size_t length, int* outsize)
{
if(!error || error == boost::asio::error::message_size)
{
*toset = length>0?true:false;
*outsize = length;
}
else
{
std::cout << error.message() << std::endl;
}
}
// Update: error check
void handle_timeout( bool* toset, boost::system::error_code error)
{
if(!error)
{
*toset = true;
}
else
{
std::cout << error.message() << std::endl;
}
}
bool Receiver::receive_with_timeout(
char* data, int buffl, int* outsize,
boost::asio::ip::udp::socket *socket,
boost::asio::ip::udp::endpoint &sender_endpoint, int msec_tout)
{
bool timer_overflow = false;
bool read_result = false;
deadline_timer timer( socket->get_io_service() );
timer.expires_from_now( boost::posix_time::milliseconds(msec_tout) );
timer.async_wait( boost::bind(&handle_timeout, &timer_overflow,
boost::asio::placeholders::error) );
socket->async_receive_from(
boost::asio::buffer(data, buffl), sender_endpoint,
boost::bind(&handle_receive_from, &read_result,
boost::asio::placeholders::error,
boost::asio::placeholders::bytes_transferred, outsize));
socket->get_io_service().reset();
while ( socket->get_io_service().run_one())
{
if ( read_result )
{
timer.cancel();
}
else if ( timer_overflow )
{
//not to be used on Windows XP, Windows Server 2003, or earlier
socket->cancel();
// Update: added run_one()
socket->get_io_service().run_one();
}
}
// Update: added run_one()
socket->get_io_service().run_one();
return read_result;
}
When the timer exceeds the 20 seconds, the error message "Operation canceled" is returned, but it is difficult to get any other information about what is going on.
Can anyone identify a problem or give me some hints to get some more information about what is going wrong? Any help is appreciated.
Okay, what you're doing is that when you call receive_with_timeout, you're setting up the two asynchronous requests (one for the recv, one for the timeout). When the first one completes, you cancel the other.
However, you never invoke io_service::run_one() again to allow its callback to complete. When you cancel an operation in boost::asio, it invokes the handler, usually with an error code indicating that the operation has been aborted or canceled. In this case, I believe you have a handler dangling once the function returns and the deadline timer is destroyed, since the handler holds a pointer to a stack variable in which it stores the result.
The solution is to call run_one() again to process the canceled callback result prior to exiting the function. You should also check the error code being passed to your timeout handler, and only treat it as a timeout if there was no error.
Also, in the case where you do have a timeout, you need to execute run_one() so that the async_receive_from handler can execute and report that it was canceled.
After a clean installation with Xubuntu 12.04 instead of an old install with Ubuntu 10.04, everything now works as expected. Maybe it is because the new install runs a newer kernel, probably with improved networking? Anyway, a re-install with a newer version of the distribution solved my problem.
If anyone else sees unexpected network behavior with an older kernel, I would advise trying it on a system with a newer kernel installed.