How to shut down a gRPC server from the client (using an RPC function) - C++

I'm using gRPC for inter-process communication between a C++ app (the gRPC server) and a Java app (the gRPC client). Everything runs on one machine. I want to give the client the ability to shut down the server. My idea is to add an RPC function to the service in the proto file that does this.
The C++ implementation would be:
class Service : public grpcGeneratedService
{
public:
......
private:
grpc::Server* m_pServer;
};
grpc::Status Service::ShutDown(grpc::ServerContext* pContext, const ShutDownRequest* pRequest, ShutDownResponse* pResponse)
{
if (m_pServer)
m_pServer->Shutdown();
return grpc::Status(grpc::StatusCode::OK, "");
}
However, Shutdown() blocks until all pending RPC calls have been processed, including the ShutDown call that invoked it, which means a deadlock. Is there an elegant way to implement this?

I'm using a std::promise with a method almost exactly like yours.
// Somewhere in the global scope :/
std::promise<void> exit_requested;
// My method looks nearly identical to yours
Status CoreServiceImpl::shutdown(ServerContext *context, const SystemRequest *request, Empty*)
{
LOG(INFO) << context->peer() << " - Shutdown request acknowledged.";
exit_requested.set_value();
return Status::OK;
}
In order to make this work, I call server->Wait() in a second thread and wait on the future for the exit_requested promise to block a shutdown call:
auto serveFn = [&]() {
server->Wait();
};
std::thread serving_thread(serveFn);
auto f = exit_requested.get_future();
f.wait();
server->Shutdown();
serving_thread.join();
Once I had this I was also able to support a clean shutdown via signal handlers as well:
auto handler = [](int s) {
exit_requested.set_value();
};
std::signal(SIGINT, handler);
std::signal(SIGTERM, handler);
std::signal(SIGQUIT, handler);
I've been satisfied with this approach so far and it's kept me within the bounds of gRPC and the standard c++ libs. Rather than use some globally scoped promise (I have to declare it as an external in my service implementation source) I should probably think of something more elegant.
One thing to note here is that setting the value of the promise more than once throws an exception (std::future_error). This could happen if you somehow send the shutdown message and also run pkill -2 my_awesome_service at the same time. I actually ran into this when a deadlock in my persistence layer prevented shutdown from finishing: when I sent a SIGINT again, the service aborted instead! For my needs this is still an acceptable solution, but I'd love to hear about alternatives that work around or solve that little problem.
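One possible way around the double set_value problem is to guard the promise so that only the first request fulfils it. The sketch below uses only the standard library; ShutdownLatch and its member names are hypothetical and not part of gRPC. It only prevents the exception on a repeated request. It does not make the call async-signal-safe, so calling it from a signal handler carries the same caveat as the original code.

```cpp
#include <chrono>
#include <future>
#include <mutex>

// Hypothetical helper (not part of gRPC): wraps a std::promise so that
// request() may be called any number of times, from any thread.
class ShutdownLatch {
public:
    ShutdownLatch() : future_(promise_.get_future()) {}

    // Only the first call fulfils the promise; later calls are no-ops,
    // so a repeated request cannot throw std::future_error.
    void request() {
        std::call_once(once_, [this] { promise_.set_value(); });
    }

    // Blocks until request() has been called at least once.
    void wait() const { future_.wait(); }

    // Non-blocking check of whether shutdown has been requested.
    bool requested() const {
        return future_.wait_for(std::chrono::seconds(0)) ==
               std::future_status::ready;
    }

private:
    std::once_flag once_;
    std::promise<void> promise_;       // declared before future_ on purpose
    std::shared_future<void> future_;  // initialized from promise_
};
```

With something like this, the RPC method and the signal handler would both call request(), and the main thread would call wait() before server->Shutdown().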

You can create an std::function from the ShutDown() handler and run that function in a separate thread (or threadpool). This will allow decoupling the handling of the RPC from the execution of the shutdown logic and eliminate the deadlock.
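A minimal sketch of that idea, under stated assumptions: FakeServer is a hypothetical stand-in for grpc::Server whose Shutdown() blocks until no call is in flight (the behaviour described in the question), and BeginCall()/EndCall() simulate the bookkeeping the real framework does around a handler. Only the pattern is the point here: the handler hands the blocking call to another thread and returns immediately.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for grpc::Server: Shutdown() waits for all
// in-flight calls to finish before it returns.
class FakeServer {
public:
    void BeginCall() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++in_flight_;
    }
    void EndCall() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            --in_flight_;
        }
        idle_.notify_all();
    }
    void Shutdown() {
        std::unique_lock<std::mutex> lock(mutex_);
        idle_.wait(lock, [this] { return in_flight_ == 0; });
        stopped_ = true;
    }
    bool stopped() {
        std::lock_guard<std::mutex> lock(mutex_);
        return stopped_;
    }

private:
    std::mutex mutex_;
    std::condition_variable idle_;
    int in_flight_ = 0;
    bool stopped_ = false;
};

class Service {
public:
    explicit Service(FakeServer* server) : server_(server) {}
    ~Service() { Join(); }

    // RPC handler: starts the blocking Shutdown() on another thread and
    // returns, so the handler itself is no longer an in-flight call that
    // Shutdown() would be waiting for.
    bool ShutDown() {
        server_->BeginCall();  // simulated framework bookkeeping
        if (!shutdown_started_.exchange(true)) {
            shutdown_thread_ = std::thread([this] { server_->Shutdown(); });
        }
        server_->EndCall();
        return true;
    }

    // Call from outside any handler (for example from main) before the
    // server object is destroyed.
    void Join() {
        if (shutdown_thread_.joinable()) shutdown_thread_.join();
    }

private:
    FakeServer* server_;
    std::atomic<bool> shutdown_started_{false};
    std::thread shutdown_thread_;
};
```

Calling Shutdown() directly inside ShutDown() would wait forever in this model, because the handler's own call keeps in_flight_ above zero.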

Related

Destroying server instance : ASIO C++

Referring to the HTTP Server (single-threaded implementation) example,
I am trying to explicitly control the lifetime of a server instance.
My requirements are:
1) I should be able to explicitly destroy the server.
2) I need to keep multiple server instances alive, each listening on a different port.
3) A Manager class maintains a list of all active server instances; it should be able to create and destroy server instances through create and drop methods.
I am trying to implement requirement 1 and
have come up with this code:
void server::stop()
{
DEBUG_MSG("Stopped");
io_service_.post(boost::bind(&server::handle_stop, this));
}
where handle_stop() is
void server::handle_stop()
{
// The server is stopped by cancelling all outstanding asynchronous
// operations. Once all operations have finished the io_service::run() call
// will exit.
acceptor_.close();
connection_manager_.stop_all();
}
I try to call it from main() as:
try
{
http::server::server s("127.0.0.1","8973");
// Run the server until stopped.
s.run();
boost::this_thread::sleep_for(boost::chrono::seconds(3));
s.stop();
}
catch (std::exception& e)
{
std::cerr << "exception: " << e.what() << "\n";
}
Question 1)
I am not able to call server::handle_stop().
I suppose io_service_.run() is blocking my s.stop() call.
void server::run()
{
// The io_service::run() call will block until all asynchronous operations
// have finished. While the server is running, there is always at least one
// asynchronous operation outstanding: the asynchronous accept call waiting
// for new incoming connections.
io_service_.run();
}
How do I proceed?
Question 2:
For requirement 2), where I need multiple server instances, I think I will need to create an io_service instance in main and pass the same instance to all server instances. Am I right?
Is it mandatory to have only one io_service instance per process, or can I have more than one?
EDIT
My aim is to implement a class that can control multiple server instances.
Something like the sketch below is what I want to achieve (incorrect code, just to give an idea of what I am trying to implement).
How do I design this?
I am confused about io_service and about how to cleanly call mng.create() and mng.drop().
Class Manager{
public:
void createServer(ServerPtr)
{
list_.insert(make_shared<Server> (ip, port));
}
void drop()
{
list_.drop((ServerPtr));
}
private:
io_service iO_;
set<server> list_;
};
main()
{
io_service io;
Manager mng(io);
mng.createServer(ip1,port1);
mng.createServer(ip2,port2);
io.run();
mng.drop(ip1,port1);
}
I am not able to call server::handle_stop().
As you say, run() won't return until the service is stopped or runs out of work. There's no point calling stop() after that.
In a single-threaded program, you can call stop() from an I/O handler - for your example, you could use a deadline_timer to call it after three seconds. Or you could do something complicated with poll() rather than run(), but I wouldn't recommend that.
In a multi-threaded program, you could call it from another thread than the one calling run(), as long as you make sure it's thread-safe.
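To illustrate the multi-threaded case without pulling in Boost, here is a rough sketch using a hand-written stand-in for an io_service-style loop. MiniLoop and run_then_stop_from_other_thread are hypothetical names, and this is not Boost.Asio code. It only shows the shape of the approach: run() blocks on one thread while another thread posts a handler that stops it.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for an io_service-style event loop.
class MiniLoop {
public:
    // Queue a handler; safe to call from any thread.
    void post(std::function<void()> handler) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(handler));
        }
        ready_.notify_one();
    }

    // Ask run() to return; safe from any thread or from inside a handler.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopped_ = true;
        }
        ready_.notify_all();
    }

    // Blocks, running queued handlers one at a time, until stop() is called.
    void run() {
        for (;;) {
            std::function<void()> handler;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return stopped_ || !queue_.empty(); });
                if (stopped_) return;
                handler = std::move(queue_.front());
                queue_.pop_front();
            }
            handler();  // run outside the lock so handlers may post() or stop()
        }
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<std::function<void()>> queue_;
    bool stopped_ = false;
};

// run() blocks on a worker thread; the calling thread posts one ordinary
// handler and then a handler that stops the loop. Returns how many ordinary
// handlers ran before the loop stopped.
int run_then_stop_from_other_thread() {
    MiniLoop loop;
    int handled = 0;
    std::thread worker([&loop] { loop.run(); });
    loop.post([&handled] { ++handled; });
    loop.post([&loop] { loop.stop(); });
    worker.join();  // returns once run() has exited
    return handled;
}
```

The equivalent in the question's code would be to call s.run() on its own thread and s.stop() from main, since stop() already posts handle_stop to the io_service.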
For [multiple servers] I think I will need to create an io_service instance in main
Yes, that's probably the best thing to do.
Is it mandatory to have only one io_service instance per process or can I have more than one?
You can have as many as you like. But I think you can only run one at a time on a single thread, so it would be tricky to have more than one in a single-threaded program. I'd have a single instance that all the servers can use.
You are right: it's not working because you call stop() after the blocking run(), and run() blocks for as long as there are unhandled callbacks. There are multiple ways to solve this, and the choice depends on which part of the program calls stop():
If you can call it from another thread, then run each server instance in a separate thread.
If you need to stop the server after some I/O operation, for example, you can simply do what you tried, io_service_.post(boost::bind(&server::handle_stop, this));, but it must be called from another thread or from another callback in the current thread.
You can use io_service::poll(). It is a non-blocking version of run(), so you can write a loop that calls poll() until you need to stop the server.
You can do it both ways. Even with the link you provided you can take a look at:
HTTP Server 3 - An HTTP server using a single io_service and a thread pool
and HTTP Server 2 - An HTTP server using an io_service-per-CPU design

Boost HTTP server issue

I'm just starting to use Boost, so maybe I'm messing something up.
I'm trying to set up an HTTP server with Boost (Asio). I've taken the code from the docs: http://www.boost.org/doc/libs/1_54_0/doc/html/boost_asio/examples/cpp03_examples.html (HTTP Server, the first one)
The only difference from the example is that I run the server from my own method "run" and start the io_service in a background thread, as in the docs: http://www.boost.org/doc/libs/1_54_0/doc/html/boost_asio/reference/io_service.html
boost::asio::io_service::work work(io_service_);
(I'm also stopping the io_service from my run method.)
When I start this modified server, everything seems to be OK and the run method works fine. But when I try to get a document from the server, the request hangs and control flow never reaches the "request_handle" method.
Am I missing something?
UPD. Here is my code of run method:
void NetstreamServer::run()
{
LOG4CPLUS_DEBUG(logger, "NetstreamServer is running");
boost::asio::io_service::work work(io_service_);
try
{
while (true)
{
if (condition)
{
io_service_.stop();
break;
}
}
}
catch (std::exception const& e)
{
LOG4CPLUS_ERROR(logger, "NetstreamServer" << " caught exception: " << e.what());
}
}
You should call io_service_.run(); otherwise no one will dispatch the completion handlers of the Asio objects serviced by io_service_.
Without the code you changed, everyone here can only guess. Unfortunately, you also do not mention the compiler and the OS you are using. Even though Boost claims to be platform independent, you should always include this information, because in reality platforms differ even with Boost.
Let me guess. Are you using Microsoft Windows? How do you prevent the "main" function from exiting? You moved the blocking "run" function out of it into another thread, so the main function has no wait point anymore. Let me guess again: you used something like "getchar", so that you can exit your server just by hitting the Return key. If so, the problem is getchar, which unfortunately blocks all I/O of the Asio socket implementation, but only on Windows-based systems.
I would not need to guess if you included the information mentioned above in your post, in particular all(!) changes you made to the code sample.

Boost Asio callback doesn't get called

I'm using Boost.Asio for network operations. They have to remain pretty low level (and actually can, since there are no complex data structures or anything), because I can't afford the luxury of serialization overhead (and the libraries I found that did offer good enough performance seemed badly suited to my case).
The problem is with an async write I'm doing from the client (in Qt, but that should probably be irrelevant here). The callback passed to async_write never gets called, and I'm at a complete loss as to why. The code is:
void SpikingMatrixClient::addMatrix() {
std::cout << "entered add matrix" << std::endl;
int action = protocol::Actions::AddMatrix;
int matrixSize = this->ui->editNetworkSize->text().toInt();
std::ostream out(&buf);
out.write(reinterpret_cast<const char*>(&action), sizeof(action));
out.write(reinterpret_cast<const char*>(&matrixSize), sizeof(matrixSize));
boost::asio::async_write(*connection.socket(), buf.data(),
boost::bind(&SpikingMatrixClient::onAddMatrix, this, boost::asio::placeholders::error, boost::asio::placeholders::bytes_transferred));
}
which calls the first write. The callback is
void SpikingMatrixClient::onAddMatrix(const boost::system::error_code& error, size_t bytes_transferred) {
std::cout << "entered onAddMatrix" << std::endl;
if (!error) {
buf.consume(bytes_transferred);
requestMatrixList();
} else {
QString message = QString::fromStdString(error.message());
this->ui->statusBar->showMessage(message, 15000);
}
}
The callback never gets called, even though the server receives all the data. Can anyone think of any reason why it might be doing that?
P.S. There was a wrapper for that connection, and yes there will probably be one again. Ditched it a day or two ago because I couldn't find the problem with this callback.
As suggested, posting a solution I found to be the most suitable (at least for now).
The client application is [being] written in Qt, and I need the I/O to be async. For the most part, the client receives calculation data from the server application and has to render various graphical representations of it.
Now, there are some key aspects to consider:
1) The GUI has to be responsive; it should not be blocked by the I/O.
2) The client can be connected / disconnected.
3) The traffic is pretty intense: data gets sent / refreshed to the client every few seconds, and it has to remain responsive (as per item 1).
As per the Boost.Asio documentation,
Multiple threads may call io_service::run() to set up a pool of
threads from which completion handlers may be invoked.
Note that all threads that have joined an io_service's pool are considered equivalent, and the io_service may distribute work across them in an arbitrary fashion.
Note that io_service.run() blocks until the io_service runs out of work.
With this in mind, the clear solution is to run io_service.run() from another thread. The relevant code snippets are
void SpikingMatrixClient::connect() {
Ui::ConnectDialog ui;
QDialog *dialog = new QDialog;
ui.setupUi(dialog);
if (dialog->exec()) {
QString host = ui.lineEditHost->text();
QString port = ui.lineEditPort->text();
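// Create the io_service (and the work object that keeps run() alive) first:
// TcpConnection::create(io) presumably needs a valid io_service for its socket.
io = boost::shared_ptr<boost::asio::io_service>(new boost::asio::io_service);
work = boost::shared_ptr<boost::asio::io_service::work>(new boost::asio::io_service::work(*io));
connection = TcpConnection::create(io);
boost::system::error_code error = connection->connect(host, port);
if (!error) {
io_threads.create_thread(boost::bind(&SpikingMatrixClient::runIo, this, io));
} else {
work.reset(); // nothing will run the io_service, so release the work object
}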
QString message = QString::fromStdString(error.message());
this->ui->statusBar->showMessage(message, 15000);
}
}
for connecting & starting IO, where:
work is a private boost::shared_ptr to the boost::asio::io_service::work object it was passed,
io is a private boost::shared_ptr to a boost::asio::io_service,
connection is a boost::shared_ptr to my connection wrapper class, and the connect() call uses a resolver etc. to connect the socket, there's plenty examples of that around
and io_threads is a private boost::thread_group.
Surely it could be shortened with some typedefs if needed.
TcpConnection is my own connection wrapper implementation, which sort of lacks functionality for now, and I suppose I could move the whole thread handling into it when it gets reinstated. This snippet should be enough to get the idea anyway...
The disconnecting part goes like this:
void SpikingMatrixClient::disconnect() {
work.reset();
io_threads.join_all();
boost::system::error_code error = connection->disconnect();
if (!error) {
connection.reset();
}
QString message = QString::fromStdString(error.message());
this->ui->statusBar->showMessage(message, 15000);
}
the work object is destroyed, so that the io_service can run out of work eventually,
the threads are joined, meaning that all work gets finished before disconnecting, thus data shouldn't get corrupted,
the disconnect() calls shutdown() and close() on the socket behind the scenes, and if there's no error, destroys the connection pointer.
Note, that there's no error handling in case of an error while disconnecting in this snippet, but it could very well be done, either by checking the error code (which seems more C-like), or throwing from the disconnect() if the error code within it represents an error after trying to disconnect.
I encountered a similar problem (callbacks not fired) but the circumstances are different from this question (io_service had jobs but still would not fire the handlers ). I will post this anyway and maybe it will help someone.
In my program, I set up an async_connect() then followed by io_service.run(), which blocks as expected.
async_connect() goes to on_connect_handler() as expected, which in turn fires async_write().
on_write_complete_handler() does not fire, even though the other end of the connection has received all the data and has even sent back a response.
I discovered that it is caused by me placing program logic in on_connect_handler(). Specifically, after the connection was established and after I called async_write(), I entered an infinite loop to perform arbitrary logic, not allowing on_connect_handler() to exit. I assume this causes the io_service to not be able to execute other handlers, even if their conditions are met because it is stuck here. ( I had many misconceptions, and thought that io_service would automagically spawn threads for each async_x() call )
Hope that helps.

Poco HTTPServer connections still served after calling stop() and destructor

I am facing a problem using the Poco::HTTPServer. As described in the doc of TCPServer:
After calling stop(), no new connections will be accepted and all
queued connections will be discarded. Already served connections,
however, will continue being served.
Every connection is executed in its own thread.
Although it seems the destructor is successfully called, the connection thread still exists and serves connections, which leads to segmentation faults.
I want to cancel all connections. Therefore I use Poco::ThreadPool::defaultPool().stopAll(); in the destructor of my server class, which leads to the behaviour also described in the docs of ThreadPool (it takes 10 seconds and objects are not deleted):
If a thread fails to stop within 10 seconds (due to a programming
error, for example), the underlying thread object will not be deleted
and this method will return anyway. This allows for a more or less
graceful shutdown in case of a misbehaving thread.
My question is: How do I accomplish the more graceful way? Is the programming error within the Poco-library?
EDIT: I am using GNU/Linux (Ubuntu 10.04) with eclipse + cdt as IDE, target system is embedded Linux (Kernel 2.6.9). On both systems I experienced the described behaviour.
The application I am working on shall be configured via web-interface. So the server sends an event (on upload of new configuration) to main to restart.
Here's the outline:
main{
while (true){
server = new Server(...);
server->start();
// wait for termination request
server->stop();
delete server;
}
}
class Server{
Poco:HTTPServer m_Server;
Server(...):
m_Server(requestHandlerFactory, socket, params);
{
}
~Server(){
[...]
Poco::ThreadPool::defaultPool().stopAll(); // This takes 10 seconds!
// without the above line I get segmentation faults,
// because connections are still being served.
}
start() { m_Server.start(); }
stop() { m_Server.stop(); }
}
This is actually a bug in the implementation of the stopAll() method. The listening socket is being shut down after closing the currently active connections, which allows the server to accept new connections in between, which in turn will not be closed and keep running. A workaround is to call HTTPServer::stop() and then HTTPServer::stopAll(). I reported the bug upstream including a proposed fix:
https://github.com/pocoproject/poco/issues/436
You should avoid using Poco::ThreadPool::defaultPool().stopAll(); since it gives you no control over which threads are stopped.
I suggest creating a Poco::ThreadPool specifically for your Poco::Net::HTTPServer instance and stopping the threads of this pool when your server is stopped.
With this, your code should look like this:
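class Server{
    // Declared before m_Server: members are initialized in declaration order,
    // and m_Server's constructor takes a reference to the pool.
    Poco::ThreadPool m_threadPool;
    Poco::Net::HTTPServer m_Server;
    Server(...)
    : m_Server(requestHandlerFactory, m_threadPool, socket, params)
    {
    }
    ~Server(){
    }
    void start() { m_Server.start(); }
    void stop() {
        m_Server.stop();
        m_threadPool.stopAll(); // Stop the serving threads and wait for them
    }
};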
This answer may be too late for the poster, but since the question helped me to solve my issue, I think it is good to post a solution here !

Reference problem (I guess) when using boost::asio

I am building an HTTP client based on the example on HTTP server given at boost website. Now, the difference between that code and mine is that the example uses the server constructor to start the asynchronous operations. This makes sense since a server is supposed to listen all the time. In my client, on the other hand, I want to first construct the object and then have a send() function that starts off by connecting to the endpoint and later on sends a request and finally listens for the reply. This makes sense too, doesn't it?
When I create my object (client) I do it in the same manner as in the server example (winmain.cpp). It looks like this:
client c("www.boost.org");
c.start(); // starts the io_service in a thread
c.send(msg_);
The relevant parts of the code are these:
void enabler::send(common::geomessage& msg_)
{
new_connection_.reset(new connection(io_service_,
connection_manager_,
message_manager_, msg_
));
boost::asio::ip::tcp::resolver resolver(io_service_);
boost::asio::ip::tcp::resolver::query query(host_address, "http");
resolver.async_resolve(query, boost::bind(
&enabler::handle_resolve,
boost::ref(*this),
boost::asio::placeholders::error,
boost::asio::placeholders::iterator
));
}
void enabler::run()
{
io_service_.run();
}
The problem with this is that the program gets stuck somewhere here. The last thing it prints is "Resolving host"; after that the program ends. I don't know why, because the io_service should block until all async operations have returned to their callbacks. If, however, I change the order in which I call the functions, it works. If I call run() just after the call to async_resolve() and also omit calling start() in my main program, it works!
In this scenario, io_service blocks as it should and I can see that I get a response from the server.
Does it have something to do with the fact that I call run() from inside the same class as the one where I call async_resolve()? Could this be true? Then I suppose I need to pass a reference from the main program when I call run(); is that right?
I have struggled with getting io_service::work to work but the program just gets stuck and yeah, similar problems as the one above occur. So it does not really help.
So, what can I do to get this right? As I said earlier, what I want is to be able to create the client object and have the io_service running all the time in a separate thread inside the client class. Secondly to have a function, send(), that sends requests to the server.
You need to start at least some work before calling run(), as it returns when there is no more work to do.
If you call it before you start the async resolve, it won't have any work so it returns.
If you don't expect to have some work at all times, to keep the io_service busy, you should construct an io_service::work object in some scope which can be exited without io_service::run() having to return first. If you're running the io_service in a separate thread, I would imagine you wouldn't have a problem with that.
It's sort of hard to know what you're trying to do with those snippets of code. I imagine that you'd want to do something along these lines:
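#include <pthread.h>
#include <boost/asio.hpp>
using boost::asio::io_service;

struct client
{
    io_service io_service_;
    io_service::work* w_;
    pthread_t main_thread_;
    // The work object keeps run() from returning while there is nothing to do
    client(): w_(new io_service::work(io_service_)) { ... }
    void start() { pthread_create(&main_thread_, 0, main_thread, this); }
    // pthread entry point: must take and return void*
    static void* main_thread(void* arg) { static_cast<client*>(arg)->io_service_.run(); return 0; }
    // release the io_service and allow run() to return
    void stop() { delete w_; w_ = 0; pthread_join(main_thread_, 0); }
};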