Close Boost Websocket from Server side, C++, tcp::acceptor accept() timeout?

UPDATE:
Well, it appears that I need to address my issue with an asynchronous implementation. I will update my post with a new direction once I've completed testing.
Original:
I'm currently writing a multiserver application that will collect, share, and request information from multiple machines. In some cases, Machine A will request information from Machine B but will need to send it to Machine C, which will reply to A. Without getting too deep into what the application is going to do I need some help with my client application.
I have my client application designed with two threads. I used this example from boost, as the basis for my design.
Thread one opens a client WebSocket to Machine-A and streams a series of data points and commands. Here is a stripped-down version of my code:
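#include "Poco/Clock.h"
#include "Poco/Task.h"
#include "Poco/Thread.h"
#include <boost/asio.hpp>
#include <boost/beast.hpp>
#include <jsoncons/json.hpp>
#include <atomic>
#include <chrono>
#include <iostream>
#include <string>
#include <thread>
#include <vector>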
namespace beast = boost::beast; // from <boost/beast.hpp>
namespace http = beast::http; // from <boost/beast/http.hpp>
namespace websocket = beast::websocket; // from <boost/beast/websocket.hpp>
namespace net = boost::asio; // from <boost/asio.hpp>
using tcp = net::ip::tcp; // from <boost/asio/ip/tcp.hpp>
class ResponseChannel : public Poco::Runnable {
void do_session(tcp::socket socket)
{
try {
websocket::stream<tcp::socket> ws{std::move(socket)};
ws.set_option(websocket::stream_base::decorator(
[](websocket::response_type& res) {
res.set(http::field::server,
std::string(BOOST_BEAST_VERSION_STRING) +
" websocket-server-sync");
}));
ws.accept();
for (;;) {
beast::flat_buffer buffer;
ws.read(buffer);
if (ws.got_binary()) {
// do something
}
}
} catch (beast::system_error const& se) {
if (se.code() != websocket::error::closed) {
std::cerr << "do_session1 ->: " << se.code().message()
<< std::endl;
return;
}
} catch (std::exception const& e) {
std::cerr << "do_session2 ->: " << e.what() << std::endl;
return;
}
}
virtual void run()
{
auto const address = net::ip::make_address(host);
auto const port = static_cast<unsigned short>(respPort);
try {
net::io_context ioc{1};
tcp::acceptor acceptor{ioc, {address, port}};
tcp::socket socket{ioc};
for (; keep_running;) {
acceptor.accept(socket);
std::thread(&ResponseChannel::do_session, this,
std::move(socket))
.detach();
}
} catch (const std::exception& e) {
std::cout << "run: " << e.what() << std::endl;
}
}
void _terminate() { keep_running = false; }
public:
std::string host;
int respPort;
std::atomic<bool> keep_running{true}; // written by main(), read by run(): must not be a plain bool
int responseCount = 0;
std::vector<long long int> latency_times;
long long int time_sum;
Poco::Clock* responseClock;
};
int main()
{
using namespace std::chrono_literals;
Poco::Clock clock = Poco::Clock();
Poco::Thread response_thread;
ResponseChannel response_channel;
response_channel.responseClock = &clock;
response_channel.host = "0.0.0.0";
response_channel.respPort = 8080;
response_thread.start(response_channel);
response_thread.setPriority(Poco::Thread::Priority::PRIO_HIGH);
// doing some work here. work will vary depending on command-line arguments
std::this_thread::sleep_for(30s);
response_channel.keep_running = false;
response_thread.join();
}
The way I have designed the multiple machines works as expected regarding sending commands to Machine-B and receiving results from Machine-C.
The issue I'm facing is closing out Thread 2, which contains my local response channel.
I went back and forth between Poco::Thread and Poco::Task, but I decided that I do not want to use Task, as it would be a mistake to be able to close the 2nd thread/task from the main thread. I need to know that all packets have been received before closing down the 2nd thread.
So I need to shut things down only once I have received a websocket::error::closed flag from Machine-C. Shutting down the detached WebSocket thread is no issue: when the flag arrives, that takes care of it for me.
However, as part of the loop process for reconnecting after a closed socket, the thread just waits for a new connection.
acceptor.accept(socket);
It's blocking, and according to the documentation there doesn't seem to be a timeout feature. I see that there is a close option, but my attempt to use close simply threw an exception, which added complexity I didn't want.
Ultimately, I want the Server to continuously loop through a series of connections from both Machine-B and Machine-C, but only after my client application has ended. The last thing I do before waiting for the Poco::Thread to complete is to set the flag that I no longer want the Websocket server to run.
I check that flag just before the blocking accept() call. That only works with perfect timing: after the flag goes up, a new connection has to be opened and closed before the loop comes back around and sees the flag, instead of waiting for another connection.
Ideally there would be a timeout, so that the loop comes around periodically and I can check whether the thread should keep running.
Has anyone ever run into this?

Related

How to make a timeout at receiving in boost::asio udp::socket?

I am writing a single-threaded application that exchanges data with another application via UDP. When the other application disconnects, my socket::receive_from call blocks, and I don't know how to solve this without converting the whole program to multiple threads or asynchronous operations.
I thought the following might be a solution:
std::chrono::milliseconds timeout{4};
boost::system::error_code err;
data_t buffer(kPackageMaxSize);
std::size_t size = 0;
const auto status = std::async(std::launch::async,
[&]{
size = socket_.receive_from(boost::asio::buffer(buffer), dst_, 0, err);
}
).wait_for(timeout);
switch (status)
{
case std::future_status::timeout: /*...*/ break;
}
But this created a new problem: Qt Creator (GDB 11.1; I haven't been able to try anything else yet) started crashing when I debug. When run without the debugger, the solution still doesn't always work.
PS. As for "it doesn't work when debugging", debugging (specifically breakpoints) obviously changes timing. Also, keep in mind network operations have varying latency and UDP isn't a guaranteed protocol: messages may not be delivered.
Asio stands for "Asynchronous IO". As you might suspect, this means that asynchronous IO is a built-in feature, it's the entire purpose of the library. See overview/core/async.html: Concurrency Without Threads
It's not necessary to complicate with std::async. In your case I'd suggest using async_receive_from with use_future, as it is closest to the model you opted for:
Live On Coliru
#include <boost/asio.hpp>
#include <iomanip>
#include <iostream>
#include <string_view>
#include <vector>
namespace net = boost::asio;
using net::ip::udp;
using namespace std::chrono_literals;
constexpr auto kPackageMaxSize = 65520;
using data_t = std::vector<char>;
int main() {
net::thread_pool ioc;
udp::socket socket_(ioc, udp::v4());
socket_.bind({{}, 8989});
udp::endpoint ep;
data_t buffer(kPackageMaxSize);
auto fut =
socket_.async_receive_from(net::buffer(buffer), ep, net::use_future);
switch (fut.wait_for(4ms)) {
case std::future_status::ready: {
buffer.resize(fut.get()); // never blocks here
std::cout << "Received " << buffer.size() << " bytes: "
<< std::quoted(
std::string_view(buffer.data(), buffer.size()))
<< "\n";
break;
}
case std::future_status::timeout:
case std::future_status::deferred: {
std::cout << "Timeout\n";
socket_.cancel(); // stop the IO operation
// fut.get() would throw system_error(net::error::operation_aborted)
break;
}
}
ioc.join();
}
The Coliru output:
Received 12 bytes: "Hello World
"
A local run demonstrated both the timeout and the successful path (image not included).
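As for why the std::async attempt in the question cannot really time out: a future obtained from std::async with std::launch::async waits in its destructor for the task to finish. In the question that future is a temporary, so the statement still blocks until receive_from returns, whatever wait_for reported. The sketch below measures this with a sleeping task standing in for the blocking receive; time_async_wait and AsyncTiming are names made up for the illustration.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

struct AsyncTiming {
    std::future_status status;          // what wait_for reported
    std::chrono::milliseconds elapsed;  // how long the whole statement took
};

// Launches a task that sleeps for `task_time`, waits on the temporary future
// for `timeout`, and reports how long the complete statement took.
AsyncTiming time_async_wait(std::chrono::milliseconds task_time,
                            std::chrono::milliseconds timeout)
{
    assert(task_time.count() >= 0 && timeout.count() >= 0);
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();

    const std::future_status status =
        std::async(std::launch::async,
                   [task_time] { std::this_thread::sleep_for(task_time); })
            .wait_for(timeout);
    // The temporary future is destroyed at the end of the statement above,
    // and its destructor waits for the sleeping task to finish.

    const auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
        clock::now() - start);
    return {status, elapsed};
}
```

With a task much longer than the timeout, wait_for would normally report a timeout, yet elapsed comes out at roughly the task duration rather than the timeout. The use_future approach above avoids this because socket_.cancel() actually ends the pending operation.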

C++ multithreading closes TCP connection

I am working on a C++ server that waits for a network connection. When I get one, I pass the socket to a new thread and listen for further input. The problem is that as soon as the socket is in the new thread, the TCP connection is disconnected. I'm using the SFML library.
Here's some code:
main.cpp:
int main() {
std::list<std::thread> user_connections;
sf::TcpListener listener;
listener.listen(PORT);
while (true)
{
sf::TcpSocket client;
listener.accept(client);
Protocol user_connection;
std::thread new_con (&Protocol::connect, &user_connection, std::ref(client));
new_con.detach();
user_connections.push_back(std::move(new_con)); // user_connections is a list
}
protocol.cpp:
class Protocol {
public:
void connect(sf::TcpSocket& client)
{
std::cout << "Address: " << client.getRemoteAddress() << ":" << client.getRemotePort() << std::endl;
}
}
This prints out:
Address: 0.0.0.0:0
And if I try to send any kind of message I get status 4, which according to the documentation means disconnected.
EDIT:
According to @Ted Lyngmo, this is because I need to put client in a list; otherwise it goes out of scope. Now if I try to put it in a list via:
std::list<sf::TcpSocket> clients; // executed before while loop
// [...]
clients.push_back(client); // in the while loop
I get the error: (pastebin).
This is something built on your current threaded code. It may be a good idea to use a single threaded design and use the sf::SocketSelector to wait for events on the listener and all the connected clients instead.
In this lazy solution, disconnected clients will not be removed from the server's list of clients until a new client connects.
I've tried to explain it with comments in the code which is an echoing kind of server, so you can telnet to it, send messages and get them back.
#include <SFML/Network.hpp>
#include <atomic>
#include <iostream>
#include <list>
#include <thread>
constexpr uint16_t PORT = 2048; // what you have in your code.
// A simple struct to keep a client and thread
struct client_thread {
sf::TcpSocket client{};
std::thread thread{};
// The main thread can check "done" to remove this client_thread from its list:
std::atomic<bool> done{false};
~client_thread() {
// instead of detaching, join()
if(thread.joinable()) thread.join();
}
};
// the connect function gets a reference to a client_thread instead
void connect(client_thread& clith) {
constexpr std::size_t BufSize = 1024;
auto& [client, thread, done] = clith; // for convenience
std::cout << "thread: Address: " << client.getRemoteAddress() << ":"
<< client.getRemotePort() << std::endl;
std::string buffer(BufSize, '\0');
std::size_t received;
while(client.receive(buffer.data(), buffer.size(), received) == sf::Socket::Done) {
// remove ASCII control chars (cr and newline etc.)
while(received && buffer[received - 1] < ' ') --received;
buffer.resize(received);
std::cout << buffer << std::endl;
// send something back
buffer = "You sent >" + buffer + "<\n";
client.send(buffer.c_str(), buffer.size());
// restore the size
buffer.resize(BufSize);
}
std::cout << "thread: client disconnected\n";
client.disconnect();
// set done to true so the main thread can remove the client_thread
done = true;
}
int main() {
sf::TcpListener listener;
// check that listening actually works
if(listener.listen(PORT) != sf::Socket::Done) return 1;
// now a list of client_thread instead:
std::list<client_thread> user_connections;
while(true) {
// create a client_thread to use when listening
auto& clith = user_connections.emplace_back();
auto& [client, thread, _] = clith; // for convenience
std::cout << "main: listening ...\n";
sf::Socket::Status status = listener.accept(client);
if(status == sf::Socket::Done) {
std::cout << "main: got connection\n";
thread = std::thread(connect, std::ref(clith));
} else {
std::cout << "main: accept not done\n";
}
// remove disconnected clients, pre C++20
for(auto it = user_connections.begin(); it != user_connections.end();) {
// check the atomic bool in all threads
if(it->done) {
std::cout << "main: removing old connection\n";
it = user_connections.erase(it);
} else {
++it;
}
}
// remove disconnected clients, >= C++20
//
// std::erase_if(user_connections,
// [](auto& clith) -> bool { return clith.done; });
}
}
Edit regarding your edited question where you're trying to put the client in a list:
You're trying to copy the sf::TcpSocket and it's not copyable. What's worse, it's not even moveable. The reason the code in my answer works is because it avoids both copying and moving by using std::list::emplace_back to construct the element in place in the list.
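The difference between push_back and emplace_back can be reproduced without SFML. In this sketch, PinnedSocket is a hypothetical stand-in for a type that, like sf::TcpSocket as described above, can be neither copied nor moved; add_socket is likewise a made-up helper.

```cpp
#include <cassert>
#include <list>
#include <type_traits>

// Hypothetical stand-in for a socket type that cannot be copied or moved.
struct PinnedSocket {
    PinnedSocket() = default;
    PinnedSocket(const PinnedSocket&) = delete;
    PinnedSocket& operator=(const PinnedSocket&) = delete;
    int port = 0;
};

// Deleting the copy operations also suppresses the implicit move operations.
static_assert(!std::is_copy_constructible_v<PinnedSocket>);
static_assert(!std::is_move_constructible_v<PinnedSocket>);

// Constructs a new element directly inside the list (C++17 emplace_back
// returns a reference to it), so no copy or move is ever needed.
PinnedSocket& add_socket(std::list<PinnedSocket>& sockets, int port)
{
    assert(port > 0);
    PinnedSocket& s = sockets.emplace_back();
    s.port = port;
    return s;
}
```

By contrast, sockets.push_back(PinnedSocket{}) would not compile for this type. Because std::list never relocates its elements, the returned reference stays valid until that element is erased, which is what lets a worker thread keep using it.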
Apparently both the sf::TcpSocket client and the Protocol user_connection are destroyed. Keeping only the thread alive is no use: the thread holds only references to client and user_connection, and both are destroyed soon after the thread is created (possibly before it has even started running).
I read a little about the SFML library and unfortunately the client, which is an sf::TcpSocket object, is neither copyable nor movable. SFML must be a fairly old library. Any modern socket library would make its socket at least movable, meaning that you could move the socket into the thread, or into the std::list or std::vector you created.
So using SFML, which was written without C++11 support (move semantics were introduced in C++11), together with a C++11 facility such as std::thread will be quite painful.
You can probably use std::shared_ptr to hold a newly created Protocol and client, and pass the shared_ptr into the thread or into the list you created.
I don't know exactly what Protocol does, but a rough sketch is as follows:
auto client = std::make_shared<sf::TcpSocket>();
listener.accept(*client); // check the returned status in real code
auto protocol = std::make_shared<Protocol>();
// The lambda copies both shared_ptrs, so the objects are deleted only after the thread is done
std::thread new_con([client, protocol] { protocol->connect(*client); });
new_con.detach(); // or keep the thread object and join() it later
or, protocol can probably be defined in the thread,
auto client = std::make_shared<sf::TcpSocket>();
listener.accept(*client);
std::thread new_con([client] {
Protocol protocol;
protocol.connect(*client);
});
new_con.detach();

using boost::asio with select? blocking on TCP input OR file update

I had intended to have a thread in my program which would wait on two file descriptors, one for a socket and a second one for an FD describing the file system (specifically waiting to see if a new file is added to a directory). Since I expect to rarely see either a new file being added or new TCP messages coming in, I wanted to have one thread waiting for either input and handling whichever input is detected when it occurs, rather than bothering with separate threads.
I then (finally!) got permission from the 'boss' to use boost. So now I want to replace the basic sockets with boost::asio. Only I'm running into a small problem. It seems like asio implemented its own version of select rather than providing an FD I could use with select directly. This leaves me uncertain how I can block on both conditions, new file and TCP input, at the same time when one only works with select and the other doesn't seem to support the use of select. Is there an easy workaround that I'm missing?
ASIO is best used asynchronously (that's what it stands for): you can set up handlers for both TCP reads and the file descriptor activity, and the handlers would be called for you.
Here's a demo example to get you started (written for Linux with inotify support):
#include <iostream>
#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <sys/inotify.h>
namespace asio = boost::asio;
void start_notify_handler();
void start_accept_handler();
// this stuff goes into your class, only global for the simplistic demo
asio::streambuf buf(1024);
asio::io_service io_svc;
asio::posix::stream_descriptor stream_desc(io_svc);
asio::ip::tcp::socket sock(io_svc);
asio::ip::tcp::endpoint end(asio::ip::tcp::v4(), 1234);
asio::ip::tcp::acceptor acceptor(io_svc, end);
// this gets called on file system activity
void notify_handler(const boost::system::error_code&,
std::size_t transferred)
{
size_t processed = 0;
while(transferred - processed >= sizeof(inotify_event))
{
const char* cdata = processed
+ asio::buffer_cast<const char*>(buf.data());
const inotify_event* ievent =
reinterpret_cast<const inotify_event*>(cdata);
processed += sizeof(inotify_event) + ievent->len;
if(ievent->len > 0 && ievent->mask & IN_OPEN)
std::cout << "Someone opened " << ievent->name << '\n';
}
start_notify_handler();
}
// this gets called when someone connects to you on TCP port 1234
void accept_handler(const boost::system::error_code&)
{
std::cout << "Someone connected from "
<< sock.remote_endpoint().address() << '\n';
sock.close(); // dropping connection: this is just a demo
start_accept_handler();
}
void start_notify_handler()
{
stream_desc.async_read_some( buf.prepare(buf.max_size()),
boost::bind(&notify_handler, asio::placeholders::error,
asio::placeholders::bytes_transferred));
}
void start_accept_handler()
{
acceptor.async_accept(sock,
boost::bind(&accept_handler, asio::placeholders::error));
}
int main()
{
int raw_fd = inotify_init(); // error handling ignored
stream_desc.assign(raw_fd);
inotify_add_watch(raw_fd, ".", IN_OPEN);
start_notify_handler();
start_accept_handler();
io_svc.run();
}

Two-way C++ communication over serial connection

I am trying to write a really simple C++ application to communicate with an Arduino. I would like to send the Arduino a character that it sends back immediately. The Arduino code that I took from a tutorial looks like this:
void setup()
{
Serial.begin(9600);
}
void loop()
{
//Have the Arduino wait to receive input
while (Serial.available()==0);
//Read the input
char val = Serial.read();
//Echo
Serial.println(val);
}
I can communicate with the Arduino easily using GNU screen, so I know that everything is working fine with the basic communication:
$ screen /dev/tty.usbmodem641 9600
The (broken) C++ code that I have looks like this:
#include <fstream>
#include <iostream>
int main()
{
std::cout << "Opening fstream" << std::endl;
std::fstream file("/dev/tty.usbmodem641");
std::cout << "Sending integer" << std::endl;
file << 5 << std::endl; // endl does flush, which may be important
std::cout << "Data Sent" << std::endl;
std::cout << "Awaiting response" << std::endl;
std::string response;
file >> response;
std::cout << "Response: " << response << std::endl;
return 0;
}
It compiles fine, but when running it, some lights flash on the Arduino and the terminal just hangs at:
Opening fstream
Where am I going wrong?
There are three points:
First: You don't initialize the serial port (TTY) on the Linux side. Nobody knows in what state it is.
To do this from within your program you must use tcgetattr(3) and tcsetattr(3). You can find the required parameters by searching for these keywords on this site, the Arduino site, or Google. For quick testing, though, I suggest issuing this command before you run your own program:
stty -F /dev/tty.usbmodem641 sane raw pass8 -echo -hupcl clocal 9600
In particular, the missing clocal might prevent you from opening the TTY.
Second: When the device is open, you should wait a little before sending anything. By default the Arduino resets when the serial line is opened or closed. You have to take this into account.
The -hupcl part will prevent this reset most of the time. But at least one reset is always necessary, because -hupcl can be set only when the TTY is already open and at that time the Arduino has received the reset signal already. So -hupcl will "only" prevent future resets.
Third: There is NO error handling in your code. Please add code after each IO operation on the TTY which checks for errors and - the most important part - prints helpful error messages using perror(3) or similar functions.
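The first and third points can be combined in code. The following is a minimal sketch, assuming a POSIX system that provides cfmakeraw (Linux, macOS and the BSDs do); the function names configure_raw_9600 and open_serial are made up for this example. It approximates the stty command above with tcgetattr/tcsetattr and reports every failure through perror.

```cpp
#include <cassert>
#include <cstdio>
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

// Puts an already-open TTY into raw mode at 9600 baud, 8 data bits, no parity.
// Returns false, after printing the reason, if any step fails.
bool configure_raw_9600(int fd)
{
    termios tio{};
    if (tcgetattr(fd, &tio) != 0) {
        std::perror("tcgetattr");
        return false;
    }
    cfmakeraw(&tio);                               // no echo, no line editing, 8-bit characters
    tio.c_cflag |= (CLOCAL | CREAD);               // ignore modem control lines, enable the receiver
    tio.c_cflag &= ~static_cast<tcflag_t>(HUPCL);  // like -hupcl: do not hang up on close
    if (cfsetispeed(&tio, B9600) != 0 || cfsetospeed(&tio, B9600) != 0) {
        std::perror("cfsetispeed/cfsetospeed");
        return false;
    }
    if (tcsetattr(fd, TCSANOW, &tio) != 0) {
        std::perror("tcsetattr");
        return false;
    }
    return true;
}

// Opens `path` read/write and configures it.
// Returns the file descriptor, or -1 on failure.
int open_serial(const char* path)
{
    assert(path != nullptr);
    const int fd = ::open(path, O_RDWR | O_NOCTTY);
    if (fd < 0) {
        std::perror(path);
        return -1;
    }
    if (!configure_raw_9600(fd)) {
        ::close(fd);
        return -1;
    }
    return fd;
}
```

For the second point, a caller would pause briefly after open_serial returns (how long depends on the board) before writing the first byte, then use read() and write() on the returned descriptor and check their results in the same way.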
I found a nice example by Jeff Gray of how to make a simple minicom type client using boost::asio. The original code listing can be found on the boost user group. This allows connection and communication with the Arduino like in the GNU Screen example mentioned in the original post.
The code example (below) needs to be linked with the following linker flags
-lboost_system-mt -lboost_thread-mt
...but with a bit of tweaking, some of the dependence on boost can be replaced with new C++11 standard features. I'll post revised versions as and when I get around to it. For now, this compiles and is a solid basis.
/* minicom.cpp
A simple demonstration minicom client with Boost asio
Parameters:
baud rate
serial port (eg /dev/ttyS0 or COM1)
To end the application, send Ctrl-C on standard input
*/
#include <deque>
#include <iostream>
#include <boost/bind.hpp>
#include <boost/asio.hpp>
#include <boost/asio/serial_port.hpp>
#include <boost/thread.hpp>
#include <boost/lexical_cast.hpp>
#include <boost/date_time/posix_time/posix_time_types.hpp>
#ifdef POSIX
#include <termios.h>
#endif
using namespace std;
class minicom_client
{
public:
minicom_client(boost::asio::io_service& io_service, unsigned int baud, const string& device)
: active_(true),
io_service_(io_service),
serialPort(io_service, device)
{
if (!serialPort.is_open())
{
cerr << "Failed to open serial port\n";
return;
}
boost::asio::serial_port_base::baud_rate baud_option(baud);
serialPort.set_option(baud_option); // set the baud rate after the port has been opened
read_start();
}
void write(const char msg) // pass the write data to the do_write function via the io service in the other thread
{
io_service_.post(boost::bind(&minicom_client::do_write, this, msg));
}
void close() // call the do_close function via the io service in the other thread
{
io_service_.post(boost::bind(&minicom_client::do_close, this, boost::system::error_code()));
}
bool active() // return true if the socket is still active
{
return active_;
}
private:
static const int max_read_length = 512; // maximum amount of data to read in one operation
void read_start(void)
{ // Start an asynchronous read and call read_complete when it completes or fails
serialPort.async_read_some(boost::asio::buffer(read_msg_, max_read_length),
boost::bind(&minicom_client::read_complete,
this,
boost::asio::placeholders::error,
boost::asio::placeholders::bytes_transferred));
}
void read_complete(const boost::system::error_code& error, size_t bytes_transferred)
{ // the asynchronous read operation has now completed or failed and returned an error
if (!error)
{ // read completed, so process the data
cout.write(read_msg_, bytes_transferred); // echo to standard output
read_start(); // start waiting for another asynchronous read again
}
else
do_close(error);
}
void do_write(const char msg)
{ // callback to handle write call from outside this class
bool write_in_progress = !write_msgs_.empty(); // is there anything currently being written?
write_msgs_.push_back(msg); // store in write buffer
if (!write_in_progress) // if nothing is currently being written, then start
write_start();
}
void write_start(void)
{ // Start an asynchronous write and call write_complete when it completes or fails
boost::asio::async_write(serialPort,
boost::asio::buffer(&write_msgs_.front(), 1),
boost::bind(&minicom_client::write_complete,
this,
boost::asio::placeholders::error));
}
void write_complete(const boost::system::error_code& error)
{ // the asynchronous write operation has now completed or failed and returned an error
if (!error)
{ // write completed, so send next write data
write_msgs_.pop_front(); // remove the completed data
if (!write_msgs_.empty()) // if there is anything left to be written
write_start(); // then start sending the next item in the buffer
}
else
do_close(error);
}
void do_close(const boost::system::error_code& error)
{ // something has gone wrong, so close the socket & make this object inactive
if (error == boost::asio::error::operation_aborted) // the pending operation was cancelled, e.g. because the port was closed
return; // nothing more to do in that case
if (error)
cerr << "Error: " << error.message() << endl; // show the error message
else
cout << "Error: Connection did not succeed.\n";
cout << "Press Enter to exit\n";
serialPort.close();
active_ = false;
}
private:
bool active_; // remains true while this object is still operating
boost::asio::io_service& io_service_; // the main IO service that runs this connection
boost::asio::serial_port serialPort; // the serial port this instance is connected to
char read_msg_[max_read_length]; // data read from the socket
deque<char> write_msgs_; // buffered write data
};
int main(int argc, char* argv[])
{
// on Unix POSIX based systems, turn off line buffering of input, so cin.get() returns after every keypress
// On other systems, you'll need to look for an equivalent
#ifdef POSIX
termios stored_settings;
tcgetattr(0, &stored_settings);
termios new_settings = stored_settings;
new_settings.c_lflag &= (~ICANON);
new_settings.c_lflag &= (~ISIG); // don't automatically handle control-C
tcsetattr(0, TCSANOW, &new_settings);
#endif
try
{
if (argc != 3)
{
cerr << "Usage: minicom <baud> <device>\n";
return 1;
}
boost::asio::io_service io_service;
// define an instance of the main class of this program
minicom_client c(io_service, boost::lexical_cast<unsigned int>(argv[1]), argv[2]);
// run the IO service as a separate thread, so the main thread can block on standard input
boost::thread t(boost::bind(&boost::asio::io_service::run, &io_service));
while (c.active()) // check the internal state of the connection to make sure it's still running
{
char ch;
cin.get(ch); // blocking wait for standard input
if (ch == 3) // ctrl-C to end program
break;
c.write(ch);
}
c.close(); // close the minicom client connection
t.join(); // wait for the IO service thread to close
}
catch (exception& e)
{
cerr << "Exception: " << e.what() << "\n";
}
#ifdef POSIX // restore default buffering of standard input
tcsetattr(0, TCSANOW, &stored_settings);
#endif
return 0;
}
You should check if you have access to /dev/tty.usbmodem641. The usual way in Linux is to add the user to the proper group with adduser.
By the way, as far as I know, the serial ports are accessed by opening /dev/ttyS0 (for COM1) through /dev/ttyS3. See for example this example in C.

Intermittently no data delivered through boost::asio / io completion port

Problem
I am using boost::asio for a project where two processes on the same machine communicate using TCP/IP. One generates data to be read by the other, but I am encountering a problem where intermittently no data is being sent through the connection. I've boiled this down to a very simple example below, based on the async tcp echo server example.
The processes (source code below) start out fine, delivering data at a fast rate from the sender to the receiver. Then all of a sudden, no data at all is delivered for about five seconds. Then data is delivered again until the next inexplicable pause. During these five seconds, the processes eat 0% CPU and no other processes seem to do anything in particular. The pause is always the same length - five seconds.
I am trying to figure out how to get rid of these stalls and what causes them.
CPU usage during an entire run:
Notice how there are three dips of CPU usage in the middle of the run - a "run" is a single invocation of the server process and the client process. During these dips, no data was delivered. The number of dips and their timing differs between runs - sometimes no dips at all, sometimes many.
I am able to affect the "probability" of these stalls by changing the size of the read buffer - for instance if I make the read buffer a multiple of the send chunk size it appears that this problem almost goes away, but not entirely.
Source and test description
I've compiled the below code with Visual Studio 2005, using Boost 1.43 and Boost 1.45. I have tested on Windows Vista 64 bit (on a quad-core) and Windows 7 64 bit (on both a quad-core and a dual-core).
The server accepts a connection and then simply reads and discards data. Whenever a read is performed a new read is issued.
The client connects to the server, then puts a bunch of packets into a send queue. After this it writes the packets one at a time. Whenever a write has completed, the next packet in the queue is written. A separate thread monitors the queue size and prints this to stdout every second. During the io stalls, the queue size remains exactly the same.
I have tried to use scatter io (writing multiple packets in one system call), but the result is the same. If I disable IO completion ports in Boost using BOOST_ASIO_DISABLE_IOCP, the problem appears to go away but at the price of significantly lower throughput.
// Example is adapted from async_tcp_echo_server.cpp which is
// Copyright (c) 2003-2010 Christopher M. Kohlhoff (chris at kohlhoff dot com)
//
// Start program with -s to start as the server
#ifndef _WIN32_WINNT
#define _WIN32_WINNT 0x0501
#endif
#include <iostream>
#include <tchar.h>
#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <boost/thread.hpp>
#define PORT "1234"
using namespace boost::asio::ip;
using namespace boost::system;
class session {
public:
session(boost::asio::io_service& io_service) : socket_(io_service) {}
void do_read() {
socket_.async_read_some(boost::asio::buffer(data_, max_length),
boost::bind(&session::handle_read, this, _1, _2));
}
boost::asio::ip::tcp::socket& socket() { return socket_; }
protected:
void handle_read(const error_code& ec, size_t bytes_transferred) {
if (!ec) {
do_read();
} else {
delete this;
}
}
private:
tcp::socket socket_;
enum { max_length = 1024 };
char data_[max_length];
};
class server {
public:
explicit server(boost::asio::io_service& io_service)
: io_service_(io_service)
, acceptor_(io_service, tcp::endpoint(tcp::v4(), atoi(PORT)))
{
session* new_session = new session(io_service_);
acceptor_.async_accept(new_session->socket(),
boost::bind(&server::handle_accept, this, new_session, _1));
}
void handle_accept(session* new_session, const error_code& ec) {
if (!ec) {
new_session->do_read();
new_session = new session(io_service_);
acceptor_.async_accept(new_session->socket(),
boost::bind(&server::handle_accept, this, new_session, _1));
} else {
delete new_session;
}
}
private:
boost::asio::io_service& io_service_;
boost::asio::ip::tcp::acceptor acceptor_;
};
class client {
public:
explicit client(boost::asio::io_service &io_service)
: io_service_(io_service)
, socket_(io_service)
, work_(new boost::asio::io_service::work(io_service))
{
io_service_.post(boost::bind(&client::do_init, this));
}
~client() {
packet_thread_.join();
}
protected:
void do_init() {
// Connect to the server
tcp::resolver resolver(io_service_);
tcp::resolver::query query(tcp::v4(), "localhost", PORT);
tcp::resolver::iterator iterator = resolver.resolve(query);
socket_.connect(*iterator);
// Start packet generation thread
packet_thread_.swap(boost::thread(
boost::bind(&client::generate_packets, this, 8000, 5000000)));
}
typedef std::vector<unsigned char> packet_type;
typedef boost::shared_ptr<packet_type> packet_ptr;
void generate_packets(long packet_size, long num_packets) {
// Add a single dummy packet multiple times, then start writing
packet_ptr buf(new packet_type(packet_size, 0));
write_queue_.insert(write_queue_.end(), num_packets, buf);
queue_size = num_packets;
do_write_nolock();
// Wait until all packets are sent.
while (long queued = InterlockedExchangeAdd(&queue_size, 0)) {
std::cout << "Queue size: " << queued << std::endl;
Sleep(1000);
}
// Exit from run(), ignoring socket shutdown
work_.reset();
}
void do_write_nolock() {
const packet_ptr &p = write_queue_.front();
async_write(socket_, boost::asio::buffer(&(*p)[0], p->size()),
boost::bind(&client::on_write, this, _1));
}
void on_write(const error_code &ec) {
if (ec) { throw system_error(ec); }
write_queue_.pop_front();
if (InterlockedDecrement(&queue_size)) {
do_write_nolock();
}
}
private:
boost::asio::io_service &io_service_;
tcp::socket socket_;
boost::shared_ptr<boost::asio::io_service::work> work_;
long queue_size;
std::list<packet_ptr> write_queue_;
boost::thread packet_thread_;
};
int _tmain(int argc, _TCHAR* argv[]) {
try {
boost::asio::io_service io_svc;
bool is_server = argc > 1 && 0 == _tcsicmp(argv[1], _T("-s"));
std::auto_ptr<server> s(is_server ? new server(io_svc) : 0);
std::auto_ptr<client> c(is_server ? 0 : new client(io_svc));
io_svc.run();
} catch (std::exception& e) {
std::cerr << "Exception: " << e.what() << "\n";
}
return 0;
}
So my question is basically:
How do I get rid of these stalls?
What causes this to happen?
Update: Contrary to what I stated above, there does appear to be some correlation with disk activity: if I start a large directory copy on the disk while the test is running, the io stalls seem to become more frequent. Could this be Windows IO prioritization kicking in? Since the pauses are always the same length, it also sounds somewhat like a timeout somewhere in the OS io code...
Adjust boost::asio::socket_base::send_buffer_size and receive_buffer_size.
Adjust max_length to a larger number. Since TCP is stream oriented, don't think of it as receiving single packets. The small read buffer is most likely causing some sort of "gridlock" between the TCP send and receive windows.
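Because TCP delivers a byte stream, one read may return half a message or several messages at once, whatever max_length is. A common way to cope is to frame messages explicitly. The sketch below assumes a 4-byte big-endian length prefix in front of each payload, which is not part of the original program; FrameAssembler is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Reassembles length-prefixed messages from arbitrarily sized read chunks.
// Assumed wire format: 4-byte big-endian payload length, then the payload.
class FrameAssembler {
public:
    // Appends the bytes one read returned and hands back every message
    // that is now complete; leftover bytes are kept for the next call.
    std::vector<std::string> feed(const char* data, std::size_t n)
    {
        assert(data != nullptr || n == 0);
        pending_.append(data, n);
        std::vector<std::string> complete;
        while (pending_.size() >= 4) {
            std::uint32_t len = 0;
            for (int i = 0; i < 4; ++i)
                len = (len << 8) | static_cast<unsigned char>(pending_[i]);
            if (pending_.size() - 4 < len)
                break;  // payload not fully received yet
            complete.emplace_back(pending_, 4, len);
            pending_.erase(0, 4 + static_cast<std::size_t>(len));
        }
        return complete;
    }

private:
    std::string pending_;  // bytes received but not yet part of a full message
};
```

A read handler would pass each (data_, bytes_transferred) pair to feed() and process whatever comes back, so the read buffer size affects throughput only, not message boundaries.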
I recently encountered a very similar sounding problem, and have a solution that works for me. I have an asynchronous server/client written in asio that sends and receives video (and small request structures), and I was seeing frequent 5 second stalls just as you describe.
Our fix was to increase the size of the socket buffers on each end, and to disable the Nagle algorithm.
pSocket->set_option( boost::asio::ip::tcp::no_delay( true) );
pSocket->set_option( boost::asio::socket_base::send_buffer_size( s_SocketBufferSize ) );
pSocket->set_option( boost::asio::socket_base::receive_buffer_size( s_SocketBufferSize ) );
It might be that only one of the above options is critical, but I've not investigated this further.