websocketpp asio listen error - c++

I have a multi-threaded websocketpp server. When I quit the program and relaunch it with no clients connected, it works without issues.
However, when a client is connected and I quit and relaunch, the program throws this error:
[2017-08-06 15:36:05] [info] asio listen error: system:98 ()
terminate called after throwing an instance of 'websocketpp::exception'
what(): Underlying Transport Error
Aborted
I believe I have a proper disconnect sequence in place. When I initiate the quit sequence, I get the following messages (including my own debug output):
[2017-08-06 15:35:55] [control] Control frame received with opcode 8
on_close
[2017-08-06 15:35:55] [disconnect] Disconnect close local:[1000] remote:[1000]
Quitting :3
Waiting for thread
What does the asio error mean? I am hoping someone has seen this before so that I can begin troubleshooting. Thanks!
EDIT:
I am adapting the stock broadcast_server example where
typedef std::map<connection_hdl, connection_data, std::owner_less<connection_hdl> > con_list;
con_list m_connections;
Code to close connections:
lock_guard<mutex> guard(m_connection_lock);
websocketpp::lib::error_code ec;
std::cout << "Closing Server" << std::endl;
con_list::iterator it;
for (it = m_connections.begin(); it != m_connections.end(); ++it)
{
    m_server.close(it->first, websocketpp::close::status::normal, "", ec);
    if (ec)
    {
        std::cout << "> Error initiating client close: " << ec.message() << std::endl;
    }
}
// Erase only after the loop; erasing inside it invalidates the iterator being incremented.
m_connections.clear();
Also, in the destructor of the broadcast_server class I call m_server.stop().

Whenever there's a websocketpp::exception, I first check anywhere I'm explicitly using the endpoint, in your case m_server.
For instance, it could be somewhere where you are calling m_server.send(...). Since you're multithreading, it's very possible that one of the threads may be trying to utilize a connection_hdl while it has already been closed by a different thread.
In that case, the exception is usually a websocketpp::exception with the message "invalid state". I'm not sure about the Underlying Transport Error.
You can use breakpoints to spot the culprit (or put a bunch of cout sequences in different methods, and see which sequence is broken before the exception is thrown), or use a try/catch:
try {
    m_server.send(hdl, ...);
    // or
    m_server.close(hdl, ...);
    // or really anything you're trying to do using `m_server`.
} catch (const websocketpp::exception &e) {
    // To be safe, I usually catch `const std::exception &` instead so that any other exception is grabbed too.
    std::cout << "Exception in method foo() because: " << e.what() /* log the cause of the exception */ << std::endl;
}
Otherwise, I have noticed that it will sometimes throw an exception when you're trying to close a connection_hdl, even if no other thread is seemingly accessing it. If you wrap the call in a try/catch, the exception is still thrown, but since it no longer terminates the program, the connection eventually closes.
Also, maybe try m_server.pause_reading(it->first) before calling close() to freeze activity on that handle.
On a second look, I think the exception you're getting is thrown where you start listening with m_server.listen(...). Try surrounding that call with a try/catch and logging a custom message.
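As for what the error means: the 98 in "system:98" is a raw operating-system error number. On Linux, 98 is EADDRINUSE ("Address already in use"), which would fit a listen() call failing right after a restart, most likely because the port is still held by the previous run's connection. The sketch below uses only the standard library, not websocketpp, and shows one way to decode such a number; the function names describe_os_error and is_address_in_use are made up for illustration.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <system_error>

// Turn a raw errno-style number (the "98" in "system:98") into readable text.
std::string describe_os_error(int error_number) {
    return std::error_code(error_number, std::generic_category()).message();
}

// True if the errno-style number means "address already in use".
bool is_address_in_use(int error_number) {
    return std::error_code(error_number, std::generic_category()) ==
           std::errc::address_in_use;
}
```

If that turns out to be the cause, the usual remedy with websocketpp's Asio transport is to call m_server.set_reuse_addr(true) before m_server.listen(...); treat this as something to verify on your setup rather than a confirmed fix.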

Related

Confusion about boost::asio::io_context::run

I am currently working on a project where I use the MQTT protocol for communication.
There is a Session class in a dedicated file which basically just sets up the publish handler, i.e. the callback that is invoked when this client receives a message (the handler checks whether the topic matches "ZEUXX/var", then deserializes the binary content of the frame and subsequently unsubscribes from the topic):
session.hpp:
class Session
{
public:
Session()
{
comobj = MQTT_NS::make_sync_client(ioc, "localhost", "1883", MQTT_NS::protocol_version::v5);
using packet_id_t = typename std::remove_reference_t<decltype(*comobj)>::packet_id_t;
// Setup client
comobj->set_client_id(clientId);
comobj->set_clean_session(true);
/* If someone sends commands to this client */
comobj->set_v5_publish_handler( // use v5 handler
[&](MQTT_NS::optional<packet_id_t> /*packet_id*/,
MQTT_NS::publish_options pubopts,
MQTT_NS::buffer topic_name,
MQTT_NS::buffer contents,
MQTT_NS::v5::properties /*props*/) {
std::cout << "[client] publish received. "
<< " dup: " << pubopts.get_dup()
<< " qos: " << pubopts.get_qos()
<< " retain: " << pubopts.get_retain() << std::endl;
std::string_view topic = std::string_view(topic_name.data(), topic_name.size());
std::cout << " -> topic: " << topic << std::endl;
if (topic.substr(0, 9) == "ZEUXX/var")
{
std::cout << "[client] reading variable name: " << topic.substr(10, topic.size() - 9) << std::endl;
auto result = 99; // dummy variable, normally an std::variant of float, int32_t uint8_t
// obtained by deserialzing the binary content of the frame
std::cout << comobj->unsubscribe(std::string{topic});
}
return true;
});
}
void readvar(const std::string &varname)
{
comobj->publish(serialnumber + "/read", varname, MQTT_NS::qos::at_most_once);
comobj->subscribe(serialnumber + "/var/" + varname, MQTT_NS::qos::at_most_once);
}
void couple()
{
comobj->connect();
ioc.run();
}
void decouple()
{
comobj->disconnect();
std::cout << "[client] disconnected..." << std::endl;
}
private:
std::shared_ptr<
MQTT_NS::callable_overlay<
MQTT_NS::sync_client<MQTT_NS::tcp_endpoint<as::ip::tcp::socket, as::io_context::strand>>>>
comobj;
boost::asio::io_context ioc;
};
The client is based on a boost::asio::io_context object which happens to be the origin of my confusion. In my main file I have the following code.
main.cpp:
#include "session.hpp"
int main()
{
Session session;
session.couple();
session.readvar("speedcpu");
}
Essentially, this creates an instance of the class Session and the couple member invokes the boost::asio::io_context::run member. This runs the io_context object's event processing loop and blocks the main thread, i.e. the third line in the main function will never be reached.
I would like to initiate a connection (session.couple) and subsequently do my publish and subscribe commands (session.readvar). My question is: How do I do that correctly?
Conceptually, what I am aiming for is best expressed by the following Python code:
client.connect("localhost", 1883)
# client.loop_forever() is what happens at the moment: the program
# doesn't continue from here.
# With loop_start(), the processing loop gets started without blocking the
# program, so publish commands can be sent afterwards.
client.loop_start()
while True:
    client.publish("ZEUXX/read", "testread")
    time.sleep(20)
Running the io_context object in a separate thread does not seem to work the way I tried it. Any suggestions on how to tackle this problem? What I tried is the following:
Adaptation in session.hpp:
// Adapt the couple function to run io_context in a separate thread
void couple()
{
comobj->connect();
std::thread t(boost::bind(&boost::asio::io_context::run, &ioc));
t.detach();
}
Adaptations in main.cpp:
int main(int argc, char** argv)
{
Session session;
session.couple();
std::cout << "successfully started io context in separate thread" << std::endl;
session.readvar("speedcpu");
}
The std::cout line is now reached, i.e. the program does not get stuck in the couple member of the class by io_context.run(). However directly after this line I get an error: "The network connection was aborted by the local system".
The interesting thing about this is that when I use t.join() instead of t.detach() then there is no error, however I have the same behavior with t.join() as when I call io_context.run() directly, namely blocking the program.
Given your comment to the existing answer:
io_context.run() never return because it never runs out of work (it is being kept alive from the MQTT server). As a result, the thread gets blocked as soon as I enter the run() method and I cannot send any publish and subscribe frames anymore. That was when I thought it would be clever to run the io_context in a separate thread to not block the main thread. However, when I detach this separate thread, the connection runs into an error, if I use join however, it works fine but the main thread gets blocked again.
I'll assume you know how to get this running successfully in a separate thread. The "problem" you're facing is that since io_context doesn't run out of work, calling thread::join will block as well, since it will wait for the thread to stop executing. The simplest solution is to call io_context::stop before the thread::join. From the official docs:
This function does not block, but instead simply signals the io_context to stop. All invocations of its run() or run_one() member functions should return as soon as possible. Subsequent calls to run(), run_one(), poll() or poll_one() will return immediately until restart() is called.
That is, calling io_context::stop will cause the io_context::run call to return ("as soon as possible") and thus make the related thread joinable.
You will also want to save the reference to the thread somewhere (possibly as an attribute of the Session class) and only call thread::join after you've done the rest of the work (e.g. called the Session::readvar) and not from within the Session::couple.
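The stop-then-join ordering described above can be sketched without Boost. MiniLoop below is a hypothetical, much simplified stand-in for io_context (its run() blocks until stop() is called), and LoopSession, couple(), post(), decouple() and run_demo() are illustrative names, not the asker's actual class. The point is only the structure: the thread is a member, couple() returns immediately, and decouple() stops the loop before joining.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical, simplified stand-in for io_context: run() keeps
// processing posted tasks and only returns after stop() is called.
class MiniLoop {
public:
    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push_back(std::move(task));
        }
        wake_.notify_one();
    }

    // Does not block; just asks run() to return as soon as possible.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopped_ = true;
        }
        wake_.notify_all();
    }

    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            wake_.wait(lock, [this] { return stopped_ || !tasks_.empty(); });
            if (stopped_) return;  // tasks still queued are dropped
            std::function<void()> task = std::move(tasks_.front());
            tasks_.pop_front();
            lock.unlock();
            task();  // run the task without holding the lock
            lock.lock();
        }
    }

private:
    std::mutex mutex_;
    std::condition_variable wake_;
    std::deque<std::function<void()>> tasks_;
    bool stopped_ = false;
};

// Owns the loop and the thread that runs it.
class LoopSession {
public:
    ~LoopSession() { decouple(); }

    // Starts the loop on a background thread and returns immediately.
    void couple() { worker_ = std::thread([this] { loop_.run(); }); }

    void post(std::function<void()> task) { loop_.post(std::move(task)); }

    // Stop first, then join: joining alone would block forever.
    void decouple() {
        loop_.stop();
        if (worker_.joinable()) worker_.join();
    }

private:
    MiniLoop loop_;
    std::thread worker_;
};

// Posts three tasks, waits until they have run, then shuts down cleanly.
int run_demo() {
    std::atomic<int> done{0};
    LoopSession session;
    session.couple();
    for (int i = 0; i < 3; ++i) session.post([&done] { ++done; });
    while (done.load() < 3) std::this_thread::yield();
    session.decouple();
    return done.load();
}
```

With Boost.Asio the same shape would apply, with io_context::run on the member thread and io_context::stop called before join; the stand-in only keeps the example free of external dependencies.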
When io_context runs out of work, it returns from run().
If you don't post any work, run() will always return immediately. Any subsequent run() also returns immediately, even if new work was posted.
To re-use an io_context after it has completed, call io_context.restart() (named reset() in older Boost versions). In your case, it is better to:
- use a work guard (https://www.boost.org/doc/libs/1_73_0/doc/html/boost_asio/reference/executor_work_guard.html); many of the library examples do this
- not "run" the ioc in couple() at all if you already run it on a background thread
If you need synchronous behaviour, don't run it on a background thread.
Also keep in mind that you need to provide for graceful shutdown, which is strictly harder with a detached thread: after all, you can no longer join() it to know when it has exited.

Boost.Asio: Async operations timeout

My program acts as a server to which a client can connect. Once a client has connected, it receives updates from the server every ~5 seconds. This is the write function that is called every 5 seconds to send the new data to the client:
void NIUserSession::write(std::string &message_orig)
{
std::cout << "Writing message" << std::endl;
std::shared_ptr<std::string> message = std::make_shared<std::string>( message_orig );
message->append("<EOF>");
boost::system::error_code ec;
boost::asio::async_write(this->socket_, boost::asio::buffer(*message),
boost::asio::transfer_all(), boost::bind(&NIUserSession::writeHandler,
this, boost::asio::placeholders::error,
boost::asio::placeholders::bytes_transferred(),
message
));
}
void NIUserSession::writeHandler(const boost::system::error_code &error, std::size_t bytes_transferred, std::shared_ptr<std::string> message)
{
std::cout << "Write Handler" << std::endl;
if(error)
{
std::cout << "Write handler error: " << error.message() << std::endl;
this->disconnect();
}
}
void NIUserSession::disconnect()
{
std::cout << "Disconnecting client, cancling all write and read operations." << std::endl;
this->socket_.lowest_layer().cancel();
delete this;
}
If there is an error in the write operation, the connection between the server and the client gets closed and all async operations are canceled (this->socket_.lowest_layer().cancel();).
The problem is that if the connection times out, writeHandler will not be called immediately. Instead, the write operations "stack up" until the first one reaches writeHandler.
This should be the normal output of the program:
Writing message
Write Handler
... Other stuff ...
... Other stuff ...
Writing message
Write Handler
If the connection times out, this is what happens:
Writing message
Write Handler
Write handler error: Connection timed out
Disconnecting client, cancling all write and read operations.
Write Handler
Write Handler
Write Handler
Write Handler
Write Handler
Write Handler
Write Handler
Write Handler
Write Handler
Write Handler
Write Handler
Segmentation fault
At the end, a segmentation fault occurs. I think this is because disconnect is called while other async operations are still in flight.
I thought I could avoid it by using this->socket_.lowest_layer().cancel(); directly after the first async operation fails, but it doesn't work.
How can I avoid a segmentation fault?
Well, you should not delete this when cancelling the operations: the callbacks for the pending I/O operations will still be invoked, and accessing this from them then leads to undefined behavior. There are multiple ways to tackle this:
- Don't write data until you know that the previous data has been written. You could queue the std::string instances passed to NIUserSession::write while a write is still outstanding, and then write them from the handler when the outstanding write operation completes. That way you will never have multiple I/O operations in flight.
- Inherit from std::enable_shared_from_this and pass shared_from_this() instead of this to the async_write call (this is what the Boost asynchronous TCP daytime server example does). That way pending I/O operations keep a reference to your object, and the destructor is called only once all of them have completed.
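Both suggestions can be combined. The following is a minimal sketch that uses only the standard library: AsyncSend is a hypothetical stand-in for async_write, and QueuedWriter and FakeTransport are illustrative names that appear neither in the question nor in Boost. It keeps at most one write in flight and lets each pending handler hold a shared_ptr to the writer, so no handler can run on a destroyed object.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using Completion = std::function<void(bool ok)>;
// Hypothetical stand-in for async_write: starts a send, calls the completion later.
using AsyncSend = std::function<void(const std::string&, Completion)>;

class QueuedWriter : public std::enable_shared_from_this<QueuedWriter> {
public:
    explicit QueuedWriter(AsyncSend send) : send_(std::move(send)) {}

    // Queues the message; starts a send only if none is in flight.
    void write(std::string message) {
        if (closed_) return;
        queue_.push_back(std::move(message));
        if (!in_flight_) start_next();
    }

    bool closed() const { return closed_; }
    std::size_t queued() const { return queue_.size(); }

private:
    void start_next() {
        in_flight_ = true;
        auto self = shared_from_this();  // keeps *this alive until the handler has run
        send_(queue_.front(), [self](bool ok) { self->on_written(ok); });
    }

    void on_written(bool ok) {
        in_flight_ = false;
        if (!ok) {  // on error: stop writing, but never `delete this`
            closed_ = true;
            queue_.clear();
            return;
        }
        queue_.pop_front();
        if (!queue_.empty()) start_next();
    }

    AsyncSend send_;
    std::deque<std::string> queue_;  // the front element stays valid while it is being sent
    bool in_flight_ = false;
    bool closed_ = false;
};

// Fake transport for experiments: records sends and lets the caller
// fire the completions by hand.
struct FakeTransport {
    std::vector<std::string> sent;
    std::vector<Completion> pending;

    AsyncSend as_send() {
        return [this](const std::string& message, Completion done) {
            sent.push_back(message);
            pending.push_back(std::move(done));
        };
    }

    void complete_next(bool ok) {
        Completion done = std::move(pending.front());
        pending.erase(pending.begin());
        done(ok);
    }
};
```

The writer must be created with std::make_shared, since shared_from_this() requires that a shared_ptr already owns the object. Once the last pending handler has run and the owner has dropped its pointer, the writer is destroyed automatically.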

How to print something when runtime error occurs somewhere in my C++ code?

Normally, when code hits a runtime error, it simply terminates with an error status. What I want to do is print a custom message before termination and then 'return 0', i.e. terminate the program normally after printing the message, as if the runtime error never happened.
Any idea how to do it?
There are different reasons why a program might terminate.
First: an uncaught exception was thrown. If that is what you are looking for, then follow the advice Paul Evans has given. With C++11, you might want to call std::get_terminate() and call the returned terminate handler at the end of your new terminate handler:
#include <cstdlib>
#include <exception>
#include <iostream>

std::terminate_handler old_terminate_handler = nullptr;

void new_terminate_handler() {
    std::cerr << "terminate due to error" << std::endl;
    if (old_terminate_handler != nullptr) {
        old_terminate_handler(); // chain to the previously installed handler
    } else {
        std::abort();
    }
}

int main(int, char**) {
    old_terminate_handler = std::get_terminate();
    std::set_terminate(new_terminate_handler);
}
Second: a signal was received that would normally terminate the program. Install a signal handler to catch it:
void sig_handler(int signal) {
new_terminate_handler();
}
// ...
std::signal(SIGTERM, sig_handler);
std::signal(SIGSEGV, sig_handler);
std::signal(SIGINT, sig_handler);
// ...
Third: The operating system might simply decide to kill the process. That is done either by a normal signal (e.g. SIGTERM) or by a signal that cannot be handled (e.g. SIGKILL). In the second case, you have no chance to notice it inside the program. The first case is already covered.
First define your custom terminate handler, something like:
void f() {
    std::cout << "your custom message" << std::endl;
    std::abort(); // a terminate handler must not return
}
then you want to call:
std::set_terminate(f);
to set up your function f as the terminate handler.
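A terminate handler cannot make the program exit normally. If the goal is really to print a message and still return 0, one option for uncaught C++ exceptions is to catch them at the top level. The sketch below assumes the program's real work can be wrapped in a callable; run_guarded is an illustrative name, and the approach does nothing for signals such as SIGSEGV.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <iostream>
#include <stdexcept>

// Runs `body`. If it throws, prints a custom message instead of letting
// the exception reach std::terminate. Always reports exit code 0.
int run_guarded(const std::function<void()>& body) {
    try {
        body();
    } catch (const std::exception& e) {
        std::cerr << "custom message: " << e.what() << std::endl;
    } catch (...) {
        std::cerr << "custom message: unknown exception" << std::endl;
    }
    return 0;
}
```

In a real program, main would simply return run_guarded(...) with the actual work passed in as the callable.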

ellipsis try catch on c++

Can an ellipsis try-catch be used to catch all the errors that can lead to a crash? Are there any anomalies?
try
{
//some operation
}
catch(...)
{
}
No, it'll only catch C++ exceptions, not things like a segfault, SIGINT etc.
You need to read up about and understand the difference between C++ exceptions and for want of a better word, "C-style" signals (such as SIGINT).
If the code inside try/catch block crashed somehow, the program is anyway in a non-recoverable state. You shouldn't try to prevent the crash, the best that the program can do is just let the process crash.
The "anomaly" is in the fact that your code only catches the exceptions, and not the errors. Even if the code is exception-safe (which may be not the case, if you are trying to work-around its mistakes by a try/catch block), any other inner error may bring the program into irrecoverable state. There is simply no way to protect the program from it.
Addition: look at this article at "The Old New Thing" for some insights.
It is the Catch All handler.
It catches all the C++ exceptions thrown from the try block. It does not catch segfault and other signals that cause your program to crash.
When using it, you need to place this handler after all other, more specific catch handlers. The compiler rejects a catch(...) that is not the last handler of its try block, since it would otherwise swallow every exception before the specific handlers could see it.
It is a bad idea to use a catch-all handler because it just masks your problems and hides the program's inability to deal with them by catching all (even unrecognized) exceptions. If you face such a situation, you had better let the program crash and create a crash dump that you can analyze later to resolve the root of the problem.
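The handler ordering can be illustrated with a small sketch. The function name classify and the returned strings are made up for this example. Handlers are tried from top to bottom, so the most specific type comes first and catch(...) comes last.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Reports which handler ended up catching whatever `body` threw.
std::string classify(const std::function<void()>& body) {
    try {
        body();
        return "no exception";
    } catch (const std::out_of_range& e) {  // most specific handler first
        return std::string("out_of_range: ") + e.what();
    } catch (const std::exception& e) {     // any other standard exception
        return std::string("std::exception: ") + e.what();
    } catch (...) {                         // everything else; must be last
        return "unknown (non-std) exception";
    }
}
```

None of these handlers is reached for a segfault or another signal; they only see objects raised with throw.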
It catches everything that is thrown; it is not limited to types derived from std::exception. It doesn't handle things like Windows debug asserts, system signals, or segfaults.
TEST(throw_int) {
try {
throw -1;
} catch (std::exception &e) {
std::cerr << "caught " << e.what() << std::endl;
} catch (...) {
std::cerr << "caught ..." << std::endl;
}
}
Throwing an integer isn't really recommended though. It's better to throw something that inherits from std::exception.
You might expect to see something like this as a last ditch effort for documenting failure, though. Some applications aren't required to be very robust. Internal tools might cost more than they are worth if you went through the paces of making them better than hacked together crap.
int main(int argc, char ** argv) {
try {
// ...
} catch (const std::exception &e) {
std::cerr << "error occurred: " << e.what() << std::endl;
return 1;
}
return 0;
}

Inject runtime exception to pthread sometime fails. How to fix that?

I am trying to inject an exception into a thread using signals, but sometimes the exception does not get caught. For example, take the following code:
void _sigthrow(int sig)
{
throw runtime_error(strsignal(sig));
}
struct sigaction sigthrow = {{&_sigthrow}};
void* thread1(void*)
{
sigaction(SIGINT,&sigthrow,NULL);
try
{
while(1) usleep(1);
}
catch(exception &e)
{
cerr << "Thread1 catched " << e.what() << endl;
}
};
void* thread2(void*)
{
sigaction(SIGINT,&sigthrow,NULL);
try
{
while(1);
}
catch(exception &e)
{
cerr << "Thread2 catched " << e.what() << endl; //never goes here
}
};
If I run it like this:
int main()
{
pthread_t p1,p2;
pthread_create( &p1, NULL, &thread1, NULL );
pthread_create( &p2, NULL, &thread2, NULL );
sleep(1);
pthread_kill( p1, SIGINT);
pthread_kill( p2, SIGINT);
sleep(1);
return EXIT_SUCCESS;
}
I get the following output:
Thread1 catched Interrupt
terminate called after throwing an instance of 'std::runtime_error'
what(): Interrupt
Aborted
How can I make the second thread catch the exception?
Is there a better way to inject exceptions?
G++ assumes that exceptions can only be thrown from function calls. If you're going to violate this assumption (eg, by throwing them from signal handlers), you need to pass -fnon-call-exceptions to G++ when building your program.
Note, however that this causes G++ to:
Generate code that allows trapping instructions to throw
exceptions. Note that this requires platform-specific runtime
support that does not exist everywhere. Moreover, it only allows
_trapping_ instructions to throw exceptions, i.e. memory
references or floating point instructions. It does not allow
exceptions to be thrown from arbitrary signal handlers such as
`SIGALRM'.
This means that throwing out of the middle of arbitrary code is NEVER safe. You can only throw out of handlers for SIGSEGV, SIGBUS, and SIGFPE, and only if you pass -fnon-call-exceptions and the signal was triggered by a fault in the running code. The only reason this worked in thread 1 is that, because of the usleep() call, G++ was forced to assume that the loop body might throw. In thread 2, G++ can see that no trapping instruction is present and eliminates the try-catch block.
You may find the pthread cancellation support more akin to what you need, or otherwise just add a test like this somewhere:
if (*(volatile int *)terminate_flag) throw terminate_exception();
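A flag test like this can be sketched in portable C++ with std::atomic instead of a volatile cast. The names terminate_flag and terminate_exception follow the line above; worker and run_and_cancel are illustrative. The exception is thrown from ordinary code inside the try block, so the compiler has to keep the handler.

```cpp
#include <atomic>
#include <cassert>
#include <stdexcept>
#include <thread>

// Hypothetical exception type used to unwind the worker.
struct terminate_exception : std::runtime_error {
    terminate_exception() : std::runtime_error("termination requested") {}
};

// Busy worker that polls the flag between units of work.
// Returns true once it has caught its own terminate_exception.
bool worker(const std::atomic<bool>& terminate_flag) {
    try {
        for (;;) {
            // ... one unit of real work would go here ...
            if (terminate_flag.load()) throw terminate_exception();
        }
    } catch (const terminate_exception&) {
        return true;
    }
}

// Starts the worker, requests termination, and waits for it to finish.
bool run_and_cancel() {
    std::atomic<bool> terminate_flag{false};
    bool caught = false;
    std::thread t([&] { caught = worker(terminate_flag); });
    terminate_flag.store(true);  // request termination instead of sending a signal
    t.join();                    // after join(), reading `caught` is safe
    return caught;
}
```

The cost is that cancellation only takes effect at the points where the flag is tested, so a long-running loop needs a check in each iteration.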
In Boost.thread a thread can be interrupted by invoking the interrupt() member function of the corresponding boost::thread object. It uses pthread condition variables to communicate with the thread and allows you to define interruption points in the thread code. I would avoid use of pthread_kill in C++. The fact that boost thread doesn't use pthread_kill anywhere in their code confirms this I think.