Use asio for a concurrent server - C++

I'm using asio to run a TCP server.
For each message the server receives, one or more responses are returned.
Most of the messages are simple returns, but some are commands which trigger an action; an action can take up to 10 minutes before it returns a message, and only one action can run at a time.
I start my session function in a new thread, passing it a tcp::socket when a connection is made:
tcp::acceptor a(io_context, tcp::endpoint(tcp::v4(), port));
for (;;) {
    std::thread(session, a.accept()).detach();
}
But after that, the tcp::socket is "stuck" in the session function. I can't pass the socket anywhere else (every attempt so far has ended in compilation errors), and the session has to handle the whole cycle because it:
Receives the message using socket.read_some()
Processes the message (and triggers an action if required)
Transmits a response using asio::write()
I need to be able to interrupt step 2 if a new message is received, but without sharing the socket I don't know how.
Whichever way I look at it, the socket can only be used by one thread, so I'll either be waiting for a new message or waiting for a response to be generated - both of which would block each other.
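For reference, a minimal sketch of the session function as described above (an assumption about the asker's code, with a hypothetical process() standing in for step 2):

void session(tcp::socket socket) {
    for (;;) {
        char data[1024];
        asio::error_code ec;
        // Step 1: receive the message.
        std::size_t n = socket.read_some(asio::buffer(data), ec);
        if (ec) break;  // connection closed or error
        // Step 2: process it (this may trigger the long-running action).
        std::string response = process(std::string(data, n));  // process() is hypothetical
        // Step 3: transmit the response.
        asio::write(socket, asio::buffer(response), ec);
        if (ec) break;
    }
}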

Related

Difference between sync and async gRPC

I am working on a service based on gRPC which requires high throughput, but currently my program suffers from low throughput when using C++ synchronous gRPC.
I've read through the gRPC documentation, but can't find an explicit explanation of the difference between the sync and async APIs, other than that async gives control over the completion queue, which is transparent to the sync APIs.
I want to know: does synchronous gRPC send a message to the TCP layer and wait for its "ack", so that the next message is blocked?
Meanwhile, would the async APIs send them asynchronously, without later messages waiting?
TL;DR: Yes, the async APIs send messages asynchronously without later messages waiting, while the synchronous APIs block the whole thread while one message is being sent/received.
gRPC uses CompletionQueue for its asynchronous operations. You can find the official tutorial here: https://grpc.io/docs/languages/cpp/async/
CompletionQueue is an event queue. An "event" here can be the completion of a request data reception, the expiry of an alarm (timer), etc. (basically, the completion of any asynchronous operation).
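As a small aside (my own illustration, not part of the linked tutorial), a grpc::Alarm is one such asynchronous operation: when its deadline expires, the tag you gave it comes out of the completion queue just like any other event. The some_tag pointer and the 5-second deadline below are placeholders; cq_ is the queue from the example that follows.

grpc::Alarm alarm;
// When the deadline passes, cq_->Next() returns with tag == some_tag and ok
// telling us whether the alarm fired or was cancelled. The Alarm object must
// outlive the deadline, or its destructor cancels it.
alarm.Set(cq_.get(),
          std::chrono::system_clock::now() + std::chrono::seconds(5),
          some_tag);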
Using the official gRPC asynchronous API example as an example, focus on the CallData class and HandleRpcs():
void HandleRpcs() {
  // Spawn a new CallData instance to serve new clients.
  new CallData(&service_, cq_.get());
  void* tag;  // uniquely identifies a request.
  bool ok;
  while (true) {
    // Block waiting to read the next event from the completion queue. The
    // event is uniquely identified by its tag, which in this case is the
    // memory address of a CallData instance.
    // The return value of Next should always be checked. This return value
    // tells us whether there is any kind of event or cq_ is shutting down.
    GPR_ASSERT(cq_->Next(&tag, &ok));
    GPR_ASSERT(ok);
    static_cast<CallData*>(tag)->Proceed();
  }
}
HandleRpcs() is the main loop of the server. It is an infinite loop which continuously gets the next event from the completion queue using cq->Next(), and calls Proceed() on the CallData the tag points to (our custom method that processes a client request according to its state).
The CallData class (an instance of which represents the complete processing cycle of one client request):
class CallData {
 public:
  // Take in the "service" instance (in this case representing an asynchronous
  // server) and the completion queue "cq" used for asynchronous communication
  // with the gRPC runtime.
  CallData(Greeter::AsyncService* service, ServerCompletionQueue* cq)
      : service_(service), cq_(cq), responder_(&ctx_), status_(CREATE) {
    // Invoke the serving logic right away.
    Proceed();
  }

  void Proceed() {
    if (status_ == CREATE) {
      // Make this instance progress to the PROCESS state.
      status_ = PROCESS;
      // As part of the initial CREATE state, we *request* that the system
      // start processing SayHello requests. In this request, "this" acts as
      // the tag uniquely identifying the request (so that different CallData
      // instances can serve different requests concurrently), in this case
      // the memory address of this CallData instance.
      service_->RequestSayHello(&ctx_, &request_, &responder_, cq_, cq_,
                                this);
    } else if (status_ == PROCESS) {
      // Spawn a new CallData instance to serve new clients while we process
      // the one for this CallData. The instance will deallocate itself as
      // part of its FINISH state.
      new CallData(service_, cq_);

      // The actual processing.
      std::string prefix("Hello ");
      reply_.set_message(prefix + request_.name());

      // And we are done! Let the gRPC runtime know we've finished, using the
      // memory address of this instance as the uniquely identifying tag for
      // the event.
      status_ = FINISH;
      responder_.Finish(reply_, Status::OK, this);
    } else {
      GPR_ASSERT(status_ == FINISH);
      // Once in the FINISH state, deallocate ourselves (CallData).
      delete this;
    }
  }

 private:
  // The means of communication with the gRPC runtime for an asynchronous
  // server.
  Greeter::AsyncService* service_;
  // The producer-consumer queue for asynchronous server notifications.
  ServerCompletionQueue* cq_;
  // Context for the rpc, allowing to tweak aspects of it such as the use
  // of compression, authentication, as well as to send metadata back to the
  // client.
  ServerContext ctx_;
  // What we get from the client.
  HelloRequest request_;
  // What we send back to the client.
  HelloReply reply_;
  // The means to get back to the client.
  ServerAsyncResponseWriter<HelloReply> responder_;
  // Let's implement a tiny state machine with the following states.
  enum CallStatus { CREATE, PROCESS, FINISH };
  CallStatus status_;  // The current serving state.
};
As we can see, a CallData has three states: CREATE, PROCESS and FINISH.
A request routine looks like this:
At startup, the server preallocates one CallData for a future incoming client.
During the construction of that CallData object, service_->RequestSayHello(&ctx_, &request_, &responder_, cq_, cq_, this) gets called, which tells gRPC to prepare for the reception of exactly one SayHello request.
At this point we don't know where the request will come from or when it will arrive; we are just telling gRPC that we are ready to process one when it actually arrives, and letting gRPC notify us when that happens.
The arguments to RequestSayHello tell gRPC where to put the context, request body and responder after a request is received, as well as which completion queue to use for the notification and what tag should be attached to the notification event (in this case, this is used as the tag).
HandleRpcs() blocks on cq->Next(), waiting for an event to occur.
some time later....
The client makes a SayHello request to the server; gRPC starts receiving and decoding that request. (IO operation)
some time later....
gRPC has finished receiving the request. It puts the request body into the request_ field of the CallData object (via the pointer supplied earlier), then creates an event (with the pointer to the CallData object as the tag, as requested earlier by the last argument to RequestSayHello). gRPC then puts that event into the completion queue cq_.
The loop in HandleRpcs() receives the event (the previously blocked call to cq->Next() now returns) and calls CallData::Proceed() to process the request.
status_ of the CallData is PROCESS, so it does the following:
6.1. Creates a new CallData object, so that new client requests after this one can be processed.
6.2. Generates the reply for the request and tells gRPC that we have finished processing, asking it to send the reply back to the client.
6.3. gRPC starts transmission of the reply. (IO operation)
6.4. The loop in HandleRpcs() goes into the next iteration and blocks on cq->Next() again, waiting for a new event to occur.
some time later....
gRPC has finished transmitting the reply and tells us so by again putting an event into the completion queue, with the pointer to the CallData as the tag.
cq->Next() receives the event and returns; CallData::Proceed() deallocates the CallData object (via delete this;). HandleRpcs() loops and blocks on cq->Next() again, waiting for a new event.
It might look as if the process is largely the same as with the synchronous API, just with extra access to the completion queue. However, by doing it this way, during each and every some time later.... (usually spent waiting for an IO operation to complete or for a request to arrive), cq->Next() can receive completion events not only for this request, but for other requests as well.
So if a new request comes in while the first request is, let's say, waiting for the transmission of its reply data to finish, cq->Next() will get the event emitted by the new request and start processing it immediately and concurrently, instead of waiting for the first request to finish its transmission.
The synchronous API, on the other hand, always waits on a given thread for the full completion of one request (from the start of receiving to the end of replying) before even starting to receive another one. This means near 0% CPU utilization while receiving request body data and sending back reply data (IO operations). Precious CPU time that could have been used to process other requests is wasted on just waiting.
This is really bad: if a client with a bad internet connection (100 ms round-trip) sends a request to the server, that thread spends at least 200 ms per request just actively waiting for the TCP transmission to finish (roughly one round-trip spent receiving the request and another spent delivering the reply). That would bring throughput for that thread down to only ~5 requests per second.
Whereas with the asynchronous API, we just don't actively wait for anything. We tell gRPC: "please send this data to the client, but we will not wait for you to finish here. Instead, just drop a little note into the completion queue when you are done, and we'll check it later." and move on to process other requests.
Related information
You can see how a simple server is written with both the synchronous API and the asynchronous API in the official gRPC examples.
Best performance practices
The best-performance practice suggested by the gRPC C++ Performance Notes is to spawn a number of threads equal to your CPU core count, and use one CompletionQueue per thread.
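A rough sketch of that layout (my own, assuming the Greeter service from the example above and a hypothetical HandleRpcs(service, cq) variant of the loop shown earlier; the listening address is made up):

void RunServer(Greeter::AsyncService* service) {
  grpc::ServerBuilder builder;
  builder.AddListeningPort("0.0.0.0:50051", grpc::InsecureServerCredentials());
  builder.RegisterService(service);

  // One completion queue per hardware thread.
  const unsigned n = std::max(1u, std::thread::hardware_concurrency());
  std::vector<std::unique_ptr<grpc::ServerCompletionQueue>> cqs;
  for (unsigned i = 0; i < n; ++i)
    cqs.push_back(builder.AddCompletionQueue());

  std::unique_ptr<grpc::Server> server = builder.BuildAndStart();

  // Each thread drives its own queue with its own HandleRpcs-style loop,
  // seeding its own CallData instances against that queue.
  std::vector<std::thread> workers;
  for (auto& cq : cqs) {
    grpc::ServerCompletionQueue* q = cq.get();
    workers.emplace_back([service, q] { HandleRpcs(service, q); });
  }
  for (auto& t : workers) t.join();
}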

What notification is provided for a lost connection in a C++ gRPC async server

I have an async gRPC server for Windows written in C++. I’d like to detect the loss of connection to a client – whether a network connection is lost, or the client crashes, etc. I see references to the keepalive channel arguments, and I’ve tried various combinations of those settings, such as:
builder.AddChannelArgument(GRPC_ARG_KEEPALIVE_TIME_MS, 10000);
builder.AddChannelArgument(GRPC_ARG_KEEPALIVE_TIMEOUT_MS, 10000);
builder.AddChannelArgument(GRPC_ARG_KEEPALIVE_PERMIT_WITHOUT_CALLS, 1);
builder.AddChannelArgument(GRPC_ARG_HTTP2_MIN_RECV_PING_INTERVAL_WITHOUT_DATA_MS, 9000);
builder.AddChannelArgument(GRPC_ARG_HTTP2_BDP_PROBE, 1);
I've done some testing with a streaming RPC method. If I kill the client process and then try to send data to the client, the lost connection is detected. I don't actually even have to send data. I can set an Alarm object to trigger immediately and that causes the call handler to be cancelled. However, if I don't try to send data (or set an alarm) after killing the client process then there's no notification or callback that I've been able to find/enable. I must not have a complete understanding. So:
How does the detection of a lost connection manifest itself for the server? Is there a callback method, or notification of some type? My server doesn’t receive any errors; the completion queue’s ‘Next()’ method never returns, etc.
Does this detection work for both unary (call/response) and streaming methods?
Does the server detection of a lost connection work whether or not the client has implemented lost connection / keepalive logic?
Is there some method besides the keepalive channel arguments that is preferred?
Thanks - any help is appreciated.
You can use ServerContext::AsyncNotifyWhenDone() to get a notification when the request has been cancelled.
https://grpc.github.io/grpc/cpp/classgrpc__impl_1_1_server_context_base.html#a0f1289f31257e6dbef57bc901bd7b5f2
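A rough sketch of how that can be wired into the async CallData pattern from the answer above (the Tag wrapper, OnDone() and the two-tag dispatch are my own assumptions rather than gRPC API, and the lifetime bookkeeping needed before delete this is omitted):

// A tag now says both *which* call and *what kind* of event it is.
struct Tag {
  enum Kind { PROCEED, DONE } kind;
  class CallData* call;
};

class CallData {
 public:
  CallData(Greeter::AsyncService* service, grpc::ServerCompletionQueue* cq)
      : service_(service), cq_(cq), responder_(&ctx_),
        proceed_tag_{Tag::PROCEED, this}, done_tag_{Tag::DONE, this} {
    // Must be called before the RPC is requested; the tag is delivered when
    // the RPC ends for any reason, including the client going away.
    ctx_.AsyncNotifyWhenDone(&done_tag_);
    service_->RequestSayHello(&ctx_, &request_, &responder_, cq_, cq_,
                              &proceed_tag_);
  }

  void OnDone() {
    if (ctx_.IsCancelled()) {
      // Lost connection, client crash, deadline exceeded, ...
    }
  }

  // Proceed() and the remaining members as in the CallData example above.

 private:
  Greeter::AsyncService* service_;
  grpc::ServerCompletionQueue* cq_;
  grpc::ServerContext ctx_;
  HelloRequest request_;
  HelloReply reply_;
  grpc::ServerAsyncResponseWriter<HelloReply> responder_;
  Tag proceed_tag_, done_tag_;
};

// In the HandleRpcs() loop, dispatch on the tag kind instead of casting
// straight to CallData*:
//   auto* t = static_cast<Tag*>(tag);
//   if (t->kind == Tag::DONE) t->call->OnDone();
//   else                      t->call->Proceed();

The done tag is only delivered once gRPC itself decides the call is over, so the keepalive channel arguments from the question are still what make a silent disconnect get noticed in the first place.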

boost asio async_read header connection closes too early

Providing an MCVE is going to be hard; the scenario is the following:
a server written in c++ with boost asio offers some services
a client written in c++ with boost asio requests services
There are custom headers and most communication is done using multipart/form.
However, in the case where the server returns a 401 for unauthorized access,
the client receives a broken pipe (system error 32).
AFAIK this happens when the server connection closes too early.
So, running it under gdb, I can see that the problem is indeed in the transition from the async_write which sends the request to the async_read_until which reads the first line of the HTTP header:
The connect routine sends the request from the client to the server:
boost::asio::async_write(*socket_.get(),
                         request_,
                         boost::bind(&asio_handler<http_socket>::write_request,
                                     this,
                                     boost::asio::placeholders::error,
                                     boost::asio::placeholders::bytes_transferred));
And the write_request callback checks whether the request was sent OK, and then reads the first line (up to the first newline):
template <class T>
void asio_handler<T>::write_request(const boost::system::error_code & err,
                                    const std::size_t bytes)
{
    if (!err) {
        // read until first newline
        boost::asio::async_read_until(*socket_,
                                      buffer_,
                                      "\r\n",
                                      boost::bind(&asio_handler::read_status_line,
                                                  this,
                                                  boost::asio::placeholders::error,
                                                  boost::asio::placeholders::bytes_transferred));
    }
    else {
        end(err);
    }
}
The problem is that end(err) is always called with a broken pipe (error code 32), meaning, as far as I understand, that the server closed the connection. The server does indeed close the connection, but only after it has sent the message HTTP/1.1 401 Unauthorized.
Using curl with the appropriate request, we do get the actual message/error before the server closes the connection.
Using our client written in C++/Boost.Asio, we only get the broken pipe and no data.
Only when the server leaves the connection open do we get to the point of reading the error (401), but that defeats the purpose, since the connection is now left open.
I would really appreciate any hints or tips. I understand that without the code it's hard to help, so I can add more source at any time.
EDIT:
If I do not check for errors between writing the request and reading the server reply, then I do get the actual HTTP 401 error. However, this seems counter-intuitive, and I am not sure why it happens or whether it is supposed to happen.
The observed behavior is allowed per the HTTP specification.
A client or server may close the socket at any time. The server can provide a response and close the connection before the client has finished transmitting the request. When writing the body, it is recommended that clients monitor the socket for an error or close notification. From RFC 7230, HTTP/1.1: Message Syntax and Routing, Section 6.5. Failures and Timeouts:
6.5. Failures and Timeouts
A client, server, or proxy MAY close the transport connection at any time. [...]
A client sending a message body SHOULD monitor the network connection for an error response while it is transmitting the request. If the client sees a response that indicates the server does not wish to receive the message body and is closing the connection, the client SHOULD immediately cease transmitting the body and close its side of the connection.
On a graceful connection closure, the server will send a response to the client before closing the underlying socket:
6.6. Tear-down
A server that sends a "close" connection option MUST initiate a close of the connection [...] after it sends the response containing "close". [...]
Given the above behaviors, there are three possible scenarios. The async_write() operation completes with:
success, indicating the request was written in full. The client may or may not have received the HTTP Response yet
an error, indicating the request was not written in full. If there is data available to be read on the socket, then it may contain the HTTP Response sent by the server before the connection terminated. The HTTP connection may have terminated gracefully
an error, indicating the request was not written in full. If there is no data available to be read on the socket, then the HTTP connection was not terminated gracefully
Consider either:
initiating the async_read() operation if the async_write() is successful or there is data available to be read
void write_request(
    const boost::system::error_code& error,
    const std::size_t bytes_transferred)
{
    // The server may close the connection before the HTTP Request finished
    // writing. In that case, the HTTP Response will be available on the
    // socket. Only stop the call chain if an error occurred and no data is
    // available.
    if (error && !socket_->available())
    {
        return;
    }
    boost::asio::async_read_until(*socket_, buffer_, "\r\n", ...);
}
per the RFC recommendation, initiate the async_read() operation at the same time as the async_write() (see the sketch below). If the server indicates the HTTP connection is closing, then the client would shut down its send side of the socket. The additional state handling may not warrant the extra complexity.
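A rough sketch of that second option, reusing the asker's asio_handler members (the connect() member name and the shutdown handling are assumptions, and the error bookkeeping is left out):

template <class T>
void asio_handler<T>::connect()
{
    // Arm the read first: it completes as soon as the status line arrives,
    // even if the server responds and closes before the request is fully sent.
    boost::asio::async_read_until(*socket_, buffer_, "\r\n",
        boost::bind(&asio_handler::read_status_line, this,
                    boost::asio::placeholders::error,
                    boost::asio::placeholders::bytes_transferred));

    boost::asio::async_write(*socket_, request_,
        boost::bind(&asio_handler<T>::write_request, this,
                    boost::asio::placeholders::error,
                    boost::asio::placeholders::bytes_transferred));
}

// write_request() now only reports a send-side failure (e.g. broken pipe)
// and must not tear down the pending read; if the server asked to close,
// shut down just our sending side:
//   boost::system::error_code ignored;
//   socket_->shutdown(boost::asio::ip::tcp::socket::shutdown_send, ignored);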

boost::asio::async_write issue over serial channel

I have a client-server application; the flow is explained below:
The client is on the Windows side and does not use Boost.
The server is on the Linux side and uses Boost.
The client and server communicate over an RS485 serial channel, and the server uses boost::asio::async_write.
client --> calls command with specific command_id --> server
client <-- sends acknowledgement <-- server
{server process the command, meanwhile the client is blocked for response}
client <-- sends response <-- server
Sometimes the client receives the acknowledgement but does not receive the response, even though the response was sent by the server.
The pending response is later received by the client when the client sends another command.
If I use boost::asio::write for the serial communication there is no problem at all.
Below is the code snippet for async_write:
boost::asio::async_write(serial_port, boost::asio::buffer(&v_chunk[0], v_chunk.size()),
    boost::bind(&Serial_channel::async_write_callback, this, boost::asio::placeholders::error,
                boost::asio::placeholders::bytes_transferred));
io_serv->run();
io_serv->reset();
The way you use the io_service will not work. First of all, the run function doesn't return until the service event loop is stopped. Secondly, if you just want to use it as a "poller", then you should use poll or optionally poll_one (or perhaps run_one).
But if you do it like that, it's the same as doing a non-async write call, and you lose the benefits of the async functions.
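For reference, one common alternative (my own sketch; the answer does not spell this out) is to keep run() alive on a dedicated thread with a work object, so the async_write completion handler is dispatched as soon as the serial write actually finishes, without per-call run()/reset() or sleeps:

boost::asio::io_service io;
boost::asio::io_service::work keep_alive(io);   // stops run() from returning when idle
std::thread io_thread([&io] { io.run(); });

// Writes can now be queued from the application thread at any time, e.g.:
//   boost::asio::async_write(serial_port,
//       boost::asio::buffer(&v_chunk[0], v_chunk.size()),
//       boost::bind(&Serial_channel::async_write_callback, this,
//                   boost::asio::placeholders::error,
//                   boost::asio::placeholders::bytes_transferred));

// On shutdown:
//   io.stop();
//   io_thread.join();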
Referencing Joachim's comment above, I have changed my flow as below and it worked.
boost::asio::async_write(serial_port, boost::asio::buffer(&v_chunk[0], v_chunk.size()),
    boost::bind(&Serial_channel::async_write_callback, this, boost::asio::placeholders::error,
                boost::asio::placeholders::bytes_transferred));
io_serv->run();
usleep(1000);  // 1 millisecond delay
io_serv->reset();

How to increase the socket timeout on the server side using Restify?

I use restify to implement a Node.js server. Basically, the server runs a time-consuming process per HTTP POST request, but somehow the socket gets closed and the client receives an error message like this:
[Error: socket hang up] code: 'ECONNRESET'
According to the error type, the socket is definitely closed on the server side.
Is there any option that I can set in restify's createServer method to solve this problem?
Edit:
The long-running process uses Mongoose to run a MongoDB operation. Maybe the socket hang-up is caused by the connection to MongoDB? How do I increase the timeout for Mongoose? I found that the hang-up happens at exactly 120 seconds, so it might be because of some default timeout configuration?
Thanks in advance!
You can use the standard socket on the req object and manually call setTimeout to increase the time before Node hangs up the socket. By default, Node has a 2-minute inactivity timer on all sockets, which is why you are getting hang-ups at exactly 120 s (this has nothing to do with restify). As an example of increasing that, set up a handler to run before your long-running task like this:
server.use(function (req, res, next) {
    // This will set the idle timer to 10 minutes
    req.connection.setTimeout(600 * 1000);
    res.connection.setTimeout(600 * 1000); // **Edited**
    next();
});
This seems to not actually be implemented:
https://github.com/mcavage/node-restify/issues/288