My team is designing a scalable solution with a microservices architecture, and we plan to use gRPC as the transport between layers. We've decided to use the async gRPC model. The design that the example provides doesn't seem viable as the number of RPC methods grows, because I would have to create a new class for every RPC method and create their objects in HandleRpcs() like this.
Pastebin (Short example code).
void HandleRpcs() {
    new CallDataForRPC1(&service_, cq_.get());
    new CallDataForRPC2(&service_, cq_.get());
    new CallDataForRPC3(&service_, cq_.get());
    // so on...
}
It'll be hard-coded, and all flexibility will be lost.
I have around 300-400 RPC methods to implement, and having 300-400 classes will be cumbersome and inefficient when I have to handle more than 100K RPC requests/sec. I can't afford the overhead of creating objects this way on every single request. Can somebody suggest a workaround? Can't async gRPC in C++ be as simple as its sync counterpart?
Edit: To make the situation clearer, and for those who might be struggling to grasp the flow of this async example, here is what I've understood so far; please correct me if I'm wrong somewhere.
In async gRPC, we have to bind a unique tag to the completion queue every time, so that when we poll, the server can hand it back to us once the client hits that particular RPC, and we infer the type of the call from the returned tag.
service_->RequestRPC2(&ctx_, &request_, &responder_, cq_, cq_, this); Here we're using the address of the current object as the unique tag. This is like registering for our RPC call on the completion queue. Then we poll in HandleRpcs() to see whether the client has hit the RPC; if so, cq_->Next(&tag, &ok) will fill in the tag. The polling code snippet:
while (true) {
    GPR_ASSERT(cq_->Next(&tag, &ok));
    static_cast<CallData*>(tag)->Proceed();
}
Since the unique tag that we registered in the queue was the address of the CallData object, we're able to call Proceed(). This was fine for one RPC with its logic inside Proceed(). But with more RPCs, all of them end up inside the one CallData, so on polling we call the single Proceed(), which has to contain the logic for (say) RPC1 (Postgres calls), RPC2 (MongoDB calls), and so on. This is like writing my whole program inside one function. To avoid this, I used a GenericCallData class with a virtual void Proceed() and derived one class per RPC from it, each with its own logic inside its own Proceed(). This is a working solution, but I want to avoid writing many classes.
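The base-class approach described above can be sketched without gRPC at all. In the sketch below, FakeQueue is a hypothetical stand-in for the completion queue (it is not a gRPC type), and the two handler classes and their log strings are made up for illustration. The only point is that the polling loop never needs to know which RPC a tag belongs to.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Stand-in for a completion queue: it only stores and returns opaque tags.
// (Hypothetical type for illustration; not part of gRPC.)
class FakeQueue {
public:
    void Post(void* tag) { tags_.push(tag); }
    bool Next(void** tag) {
        if (tags_.empty()) return false;
        *tag = tags_.front();
        tags_.pop();
        return true;
    }
private:
    std::queue<void*> tags_;
};

// Common base: the polling loop only ever sees this interface.
class GenericCallData {
public:
    virtual ~GenericCallData() = default;
    virtual void Proceed() = 0;
};

// One small class per RPC, each with its own logic in Proceed().
class Rpc1CallData : public GenericCallData {
public:
    explicit Rpc1CallData(std::vector<std::string>& log) : log_(log) {}
    void Proceed() override { log_.push_back("RPC1 logic"); }
private:
    std::vector<std::string>& log_;
};

class Rpc2CallData : public GenericCallData {
public:
    explicit Rpc2CallData(std::vector<std::string>& log) : log_(log) {}
    void Proceed() override { log_.push_back("RPC2 logic"); }
private:
    std::vector<std::string>& log_;
};

// The polling loop: cast the tag back to the base class and dispatch.
// Tags must have been posted as GenericCallData* for this cast to be valid.
inline void HandleRpcs(FakeQueue& cq) {
    void* tag = nullptr;
    while (cq.Next(&tag)) {
        static_cast<GenericCallData*>(tag)->Proceed();
    }
}
```

To try it, post a few objects with cq.Post(static_cast<GenericCallData*>(&obj)) and call HandleRpcs(cq); each object's own Proceed() runs in posting order.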
Another solution I tried was keeping all the RPC logic out of Proceed() and in separate functions, and maintaining a global std::map<long, std::function</*some params*/>>. Whenever I register an RPC with a unique tag on the queue, I store its corresponding logic function (which I hard-code into the statement, binding all the required parameters) with the unique tag as the key. On polling, when I get the tag back, I look up this key in the map and call the saved function. Now there's one more hurdle; I have to do this inside the function logic:
// pseudo code
void function(reply, responder, context, service)
{
    // register this RPC with another unique tag so as to serve the next incoming request of the same type on the completion queue
    service_->RequestRPC1(/*params*/, new_unique_id);
    // now save this new_unique_id and the current function into the map again, so that we can look it up when the tag is returned
    map.emplace(new_unique_id, function);
    // now you're free to do your logic
    // do your logic
}
As you can see, the code has now spread into another module, and it is still per-RPC.
I hope that clarifies the situation.
I wondered whether somebody has implemented this type of server in an easier way.

This post is pretty old by now, but I have not seen any answer or example regarding this, so I will show other readers how I solved it. I have around 30 RPC calls and was looking for a way to reduce the footprint of adding and removing RPC calls. It took me some iterations to figure out a good way to solve it.
My interface for getting RPC requests from my (g)RPC library is a callback interface that the recipient needs to implement. The interface looks like this:
class IRpcRequestHandler
{
public:
    virtual ~IRpcRequestHandler() = default;
    virtual void onZigbeeOpenNetworkRequest(const smarthome::ZigbeeOpenNetworkRequest& req,
                                            smarthome::Response& res) = 0;
    virtual void onZigbeeTouchlinkDeviceRequest(const smarthome::ZigbeeTouchlinkDeviceRequest& req,
                                                smarthome::Response& res) = 0;
};
And some code for setting up/registering each RPC method after the gRPC server is started:
void ready()
{
    SETUP_SMARTHOME_CALL("ZigbeeOpenNetwork", // Alias that is used for debug messages
                         smarthome::Command::AsyncService::RequestZigbeeOpenNetwork, // Generated gRPC service method for async.
                         smarthome::ZigbeeOpenNetworkRequest, // Generated gRPC service request message
                         smarthome::Response, // Generated gRPC service response message
                         IRpcRequestHandler::onZigbeeOpenNetworkRequest); // The callback method to call when request has arrived.
}
This is all that you need to care about when adding and removing RPC methods.
The SETUP_SMARTHOME_CALL is a home-cooked macro which looks like this:
new ServerCallData<REQ, RES>( \
std::bind(&SERVICE, \
&mCommandService, \
std::placeholders::_1, \
std::placeholders::_2, \
std::placeholders::_3, \
std::placeholders::_4, \
std::placeholders::_5, \
std::placeholders::_6), \
mCompletionQueue.get(), \
std::bind(&CALLBACK_FUNC, requestHandler, std::placeholders::_1, std::placeholders::_2))
I think the ServerCallData class looks like the one from gRPC's examples with a few modifications. ServerCallData is derived from a non-template class with an abstract function void proceed(bool ok) for the CompletionQueue::Next() handling. When a ServerCallData is created, it calls the SERVICE method to register itself on the CompletionQueue, and on its first proceed(ok) call it clones itself, which registers another instance. I can post some sample code for that as well if someone is interested.
EDIT: Added some more sample code below.
class GrpcServer
explicit GrpcServer(std::vector<grpc::Service*> services);
virtual ~GrpcServer();
void run(const std::string& sslKey,
const std::string& sslCert,
const std::string& password,
const std::string& listenAddr,
uint32_t port,
uint32_t threads = 1);
virtual void ready(); // Called after gRPC server is created and before polling CQ.
void handleRpcs(); // Function that polls from CQ, can be run by multiple threads. Casts object to CallData and calls CallData::proceed().
std::unique_ptr<ServerCompletionQueue> mCompletionQueue;
std::unique_ptr<Server> mServer;
std::vector<grpc::Service*> mServices;
std::list<std::shared_ptr<std::thread>> mThreads;
And the main part of the CallData object:
template <typename TREQUEST, typename TREPLY>
class ServerCallData : public ServerCallMethod
explicit ServerCallData(const std::string& methodName,
void*)> serviceFunc,
grpc::ServerCompletionQueue* completionQueue,
std::function<void(const TREQUEST&, TREPLY&)> callback,
bool first = false)
: ServerCallMethod(methodName),
void proceed(bool ok) override
if (!ok)
delete this;
if (callStatus() == ServerCallMethod::PROCESS)
callStatus() = ServerCallMethod::FINISH;
new ServerCallData<TREQUEST, TREPLY>(callMethodName(), serviceFunc, completionQueue, callback);
callback(mRequest, mReply);
catch (const std::exception& e)
mResponder.Finish(mReply, Status::CANCELLED, this);
mResponder.Finish(mReply, Status::OK, this);
delete this;
void requestNewCall()
&mContext, &mRequest, &mResponder, completionQueue, completionQueue, this);
ServerContext mContext;
TREQUEST mRequest;
TREPLY mReply;
ServerAsyncResponseWriter<TREPLY> mResponder;
std::function<void(const TREQUEST&, TREPLY&)> callback;
grpc::ServerCompletionQueue* completionQueue;
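For readers who want to see the core idea in isolation, here is a minimal sketch with no gRPC dependency: one class template covers every unary call, and the only per-RPC part is a callback supplied at registration time. All names below (CallBase, Call, the Echo/Add message structs, run_demo) are hypothetical stand-ins, the Sink callback merely plays the role of Finish(), and the re-registration (cloning) step from the answer is left out to keep it short.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Non-template base so the polling loop can hold any call object.
struct CallBase {
    virtual ~CallBase() = default;
    virtual void proceed() = 0;
};

// One template covers every unary RPC: the per-RPC part is just a callback.
template <typename Req, typename Res>
class Call : public CallBase {
public:
    using Handler = std::function<void(const Req&, Res&)>;
    using Sink = std::function<void(const Res&)>;  // stands in for Finish()

    Call(Req req, Handler handler, Sink sink)
        : req_(std::move(req)),
          handler_(std::move(handler)),
          sink_(std::move(sink)) {}

    void proceed() override {
        Res res{};
        handler_(req_, res);  // user logic, supplied at registration time
        sink_(res);           // a real server would finish the RPC here
    }

private:
    Req req_;
    Handler handler_;
    Sink sink_;
};

// Hypothetical message types for the demo.
struct EchoRequest { std::string text; };
struct EchoReply { std::string text; };
struct AddRequest { int a; int b; };
struct AddReply { int sum; };

// Simulates two different RPCs being dispatched through the base class.
inline std::vector<std::string> run_demo() {
    std::vector<std::string> out;
    std::vector<std::unique_ptr<CallBase>> pending;
    pending.push_back(std::make_unique<Call<EchoRequest, EchoReply>>(
        EchoRequest{"hi"},
        [](const EchoRequest& q, EchoReply& r) { r.text = q.text + "!"; },
        [&out](const EchoReply& r) { out.push_back(r.text); }));
    pending.push_back(std::make_unique<Call<AddRequest, AddReply>>(
        AddRequest{2, 3},
        [](const AddRequest& q, AddReply& r) { r.sum = q.a + q.b; },
        [&out](const AddReply& r) { out.push_back(std::to_string(r.sum)); }));
    for (auto& call : pending) call->proceed();
    return out;
}
```

Adding a third RPC in this sketch means adding one more push_back with its own lambda, not one more class.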

Although the thread is old, I wanted to share a solution I am currently implementing. It mainly consists of templated classes inheriting from CallData, which keeps it scalable. This way, each new RPC only requires specializing the templates of the required CallData methods.
Calldata header:
class CallData {
public:
    virtual ~CallData() = default; // needed because Proceed() ends with "delete this"
    enum Status { CREATE, PROCESS, FINISH };
    Status status;
    virtual void treat_create() = 0;
    virtual void treat_process() = 0;
    void Proceed();
};
CallData Proceed implementation:
void CallData::Proceed() {
    switch (status) {
    case CREATE:
        status = PROCESS;
        treat_create();
        break;
    case PROCESS:
        status = FINISH;
        treat_process();
        break;
    case FINISH:
        delete this;
    }
}
Inheriting from CallData header (simplified):
template <typename Request, typename Reply>
class CallDataTemplated : public CallData {
    static_assert(std::is_base_of<google::protobuf::Message, Request>::value,
                  "Request and reply must be protobuf messages");
    static_assert(std::is_base_of<google::protobuf::Message, Reply>::value,
                  "Request and reply must be protobuf messages");
    Request request;
    Reply reply;
    void treat_create() override;
    void treat_process() override;
};
Then, for specific RPCs, in theory you should be able to do things like:
template <>
void CallDataTemplated<HelloRequest, HelloReply>::treat_process() { /* RPC-specific logic */ }
It's a lot of templated methods but preferable to creating a class per rpc from my point of view.
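As a self-contained illustration of that last step, the sketch below uses plain structs in place of protobuf messages (HelloRequest, HelloReply, SumRequest and SumReply here are hypothetical stand-ins, and the members are public only so the demo can inspect them). It shows two explicit specializations of the same member function, one per RPC.

```cpp
#include <cassert>
#include <string>

class CallData {
public:
    virtual ~CallData() = default;
    virtual void treat_process() = 0;
};

// Hypothetical message types standing in for generated protobuf classes.
struct HelloRequest { std::string name; };
struct HelloReply { std::string message; };
struct SumRequest { int a = 0; int b = 0; };
struct SumReply { int total = 0; };

template <typename Request, typename Reply>
class CallDataTemplated : public CallData {
public:
    Request request;
    Reply reply;
    void treat_process() override;  // no generic body; defined per RPC below
};

// Per-RPC logic: explicit specializations of the member function.
// In a header these would need "inline" to avoid multiple definitions.
template <>
void CallDataTemplated<HelloRequest, HelloReply>::treat_process() {
    reply.message = "Hello " + request.name;
}

template <>
void CallDataTemplated<SumRequest, SumReply>::treat_process() {
    reply.total = request.a + request.b;
}
```

Forgetting a specialization for a request/reply pair that is actually used shows up as a link error, which is a reasonably early place to catch a missing RPC implementation.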


C++ GRPC ClientAsyncReaderWriter: how to check if data is available for read?

I have a bidirectional streaming async gRPC client that uses ClientAsyncReaderWriter to communicate with the server. The RPC definition looks like:
rpc Process (stream Request) returns (stream Response)
For simplicity, Request and Response are byte arrays (byte[]). I send several chunks of data to the server, and when the server has accumulated enough data, it processes the data, sends back a response, and continues accumulating data for the next responses. After several responses, the server sends a final response and closes the connection.
For the async client I'm using a CompletionQueue. The code looks like:
CompletionQueue cq;
std::unique_ptr<Stub> stub;
grpc::ClientContext context;
std::unique_ptr<grpc::ClientAsyncReaderWriter<Request,Response>> responder = stub->AsyncProcess(&context, &cq, handler);
// thread for completion queue
std::thread t(
void *handler = nullptr;
bool ok = false;
while (cq_.Next(&handler, &ok)) {
if (can_read) {
// how do you know that it is read data available
// Do read
} else {
// do write
Request request = prepare_request();
responder_->Write(request, handler);
// wait
What is the proper way to do async reading? Can I try to read if no data is available? Is it a blocking call?
Sequencing Read() calls
Can I try to read if no data is available?
Yep, and that's going to be the case more often than not. Read() will do nothing until data is available, and only then put the tag you passed into the completion queue. (See below for details.)
Is it a blocking call?
Nope. Read() and Write() return immediately. However, you can only have one of each in flight at any given moment. If you try to start a second one before the previous one has completed, the second one will fail.
What is the proper way to do async reading?
Each time a Read() is done, start a new one. For that, you need to be able to tell when a Read() is done. This is where tags come in!
When you call Read(&msg, tag) or Write(request, tag), you are telling grpc to put tag in the completion queue associated with that responder once that operation has completed. grpc doesn't care what the tag is; it just hands it off.
So the general strategy you will want to go for is:
As soon as you are ready to start receiving messages:
call responder->Read() once with some tag that you will recognize as a "read done".
Whenever cq_.Next() gives you back that tag, and ok == true:
consume the message
Queue up a new responder->Read() with that same tag.
Obviously, you'll also want to do something similar for your calls to Write().
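The read strategy above can be tried out without a server. The sketch below is only a simulation: FakeStream is a hypothetical stand-in that mimics the "one Read() in flight, tag comes back when data arrives" behavior, and none of its methods are gRPC calls. The drain() function is the part that mirrors the steps listed above.

```cpp
#include <cassert>
#include <deque>
#include <queue>
#include <string>
#include <vector>

// Hypothetical stand-in for an async stream plus its completion queue.
// Read() only records where to put the next message and which tag to post.
class FakeStream {
public:
    void Read(std::string* out, void* tag) {
        assert(pending_out_ == nullptr && "only one Read() may be in flight");
        pending_out_ = out;
        pending_tag_ = tag;
    }
    void CancelRead() { pending_out_ = nullptr; }
    // Simulates the network delivering one message.
    void Deliver(const std::string& msg) { inbox_.push_back(msg); }
    // Returns the next completed tag, if any.
    bool Next(void** tag) {
        if (pending_out_ != nullptr && !inbox_.empty()) {
            *pending_out_ = inbox_.front();  // complete the pending read
            inbox_.pop_front();
            done_.push(pending_tag_);
            pending_out_ = nullptr;
        }
        if (done_.empty()) return false;
        *tag = done_.front();
        done_.pop();
        return true;
    }
private:
    std::deque<std::string> inbox_;
    std::queue<void*> done_;
    std::string* pending_out_ = nullptr;
    void* pending_tag_ = nullptr;
};

// Start one Read(); each time its tag comes back, consume and re-arm.
inline std::vector<std::string> drain(FakeStream& stream) {
    static int read_done_tag;  // its address serves as the "read done" tag
    std::vector<std::string> received;
    std::string incoming;
    stream.Read(&incoming, &read_done_tag);
    void* tag = nullptr;
    while (stream.Next(&tag)) {
        if (tag == &read_done_tag) {
            received.push_back(incoming);            // consume the message
            stream.Read(&incoming, &read_done_tag);  // queue the next read
        }
    }
    stream.CancelRead();  // "incoming" is about to go out of scope
    return received;
}
```

Note that the loop never asks whether data is available; it simply keeps exactly one read outstanding and reacts when the tag shows up.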
But since you still want to be able to lookup the handler instance from a given tag, you'll need a way to pack a reference to the handler as well as information about which operation is being finished in a single tag.
Completion queues
Lookup the handler instance from a given tag? Why?
The true raison d'être of completion queues is unfortunately not evident from the examples. They allow multiple asynchronous rpcs to share the same thread. Unless your application only ever makes a single rpc call, the handling thread should not be associated with a specific responder. Instead, that thread should be a general-purpose worker that dispatches events to the correct handler based on the content of the tag.
The official examples tend to do that by using pointer to the handler object as the tag. That works when there's a specific sequence of events to expect since you can easily predict what a handler is reacting to. You often can't do that with async bidirectional streams, since any given completion event could be a Read() or a Write() finishing.
Here's a general outline of what I personally consider to be a clean way to go about all that:
// Base class for async bidir RPCs handlers.
// This is so that the handling thread is not associated with a specific rpc method.
class RpcHandler {
  // This will be used as the "tag" argument to the various grpc calls.
  struct TagData {
    enum class Type {
      start_done,
      read_done,
      write_done,
      // add more as needed...
    };
    RpcHandler* handler;
    Type evt;
  };
  struct TagSet {
    TagSet(RpcHandler* self)
        : start_done{self, TagData::Type::start_done},
          read_done{self, TagData::Type::read_done},
          write_done{self, TagData::Type::write_done} {}
    TagData start_done;
    TagData read_done;
    TagData write_done;
  };
public:
  RpcHandler() : tags(this) {}
  virtual ~RpcHandler() = default;
  // The actual tag objects we'll be passing
  TagSet tags;
  virtual void on_ready() = 0;
  virtual void on_recv() = 0;
  virtual void on_write_done() = 0;
  static void handling_thread_main(grpc::CompletionQueue* cq) {
    void* raw_tag = nullptr;
    bool ok = false;
    while (cq->Next(&raw_tag, &ok)) {
      TagData* tag = reinterpret_cast<TagData*>(raw_tag);
      if (!ok) {
        // Handle error
      } else {
        switch (tag->evt) {
          case TagData::Type::start_done:
            tag->handler->on_ready();
            break;
          case TagData::Type::read_done:
            tag->handler->on_recv();
            break;
          case TagData::Type::write_done:
            tag->handler->on_write_done();
            break;
        }
      }
    }
  }
};
void do_something_with_response(Response const&);
class MyHandler final : public RpcHandler {
public:
  using responder_ptr =
      std::unique_ptr<grpc::ClientAsyncReaderWriter<Request, Response>>;
  MyHandler(responder_ptr responder) : responder_(std::move(responder)) {
    // This lock is needed because StartCall() can
    // cause the handler thread to access the object.
    std::lock_guard lock(mutex_);
    responder_->StartCall(&tags.start_done);
  }
  ~MyHandler() {
    // TODO: finish/abort the streaming rpc as appropriate.
  }
  void send(const Request& msg) {
    std::lock_guard lock(mutex_);
    if (!sending_) {
      sending_ = true;
      responder_->Write(msg, &tags.write_done);
    } else {
      // TODO: add some form of synchronous wait, or outright failure
      // if the queue starts to get too big.
      queued_msgs_.push(msg);
    }
  }
private:
  // When the rpc is ready, queue the first read
  void on_ready() override {
    std::lock_guard l(mutex_); // To synchronize with the constructor
    responder_->Read(&incoming_, &tags.read_done);
  }
  // When a message arrives, use it, and start reading the next one
  void on_recv() override {
    // incoming_ never leaves the handling thread, so no need to lock
    // ------ If handling is cheap and stays in the handling thread.
    do_something_with_response(incoming_);
    responder_->Read(&incoming_, &tags.read_done);
    // ------ If handling is expensive or involves another thread.
    // Response msg = std::move(incoming_);
    // responder_->Read(&incoming_, &tags.read_done);
    // do_something_with_response(msg);
  }
  // When a message has been sent, send the next one if there is any
  void on_write_done() override {
    std::lock_guard lock(mutex_);
    if (!queued_msgs_.empty()) {
      responder_->Write(queued_msgs_.front(), &tags.write_done);
      queued_msgs_.pop();
    } else {
      sending_ = false;
    }
  }
  responder_ptr responder_;
  // Only ever touched by the handler thread post-construction.
  Response incoming_;
  bool sending_ = false;
  std::queue<Request> queued_msgs_;
  std::mutex mutex_; // grpc might be thread-safe, MyHandler isn't...
};
int main() {
  // Start the thread as soon as you have a completion queue.
  auto cq = std::make_unique<grpc::CompletionQueue>();
  std::thread t(RpcHandler::handling_thread_main, cq.get());
  // Multiple concurrent RPCs sharing the same handling thread:
  MyHandler handler1(serviceA->MethodA(&context, cq.get()));
  MyHandler handler2(serviceA->MethodA(&context, cq.get()));
  MyHandlerB handler3(serviceA->MethodB(&context, cq.get()));
  MyHandlerC handler4(serviceB->MethodC(&context, cq.get()));
}
If you have a keen eye, you will notice that the code above stores a bunch (one per event type) of redundant this pointers in the handler. It's generally not a big deal, and it is possible to do without them via multiple inheritance and downcasting, but that's starting to be somewhat beyond the scope of this question.
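To experiment with the tag-dispatch idea without a gRPC build, the following sketch replaces the completion queue with a plain std::queue<void*>. It is a simplified simulation, not the code above: Handler, dispatch_all and Recorder are hypothetical names, and events are pushed by hand instead of being produced by Read() or Write().

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <utility>
#include <vector>

class Handler {
public:
    // One TagData per event type; its address is what travels through the queue.
    struct TagData {
        enum class Type { start_done, read_done, write_done };
        Handler* handler;
        Type evt;
    };

    Handler()
        : start_done{this, TagData::Type::start_done},
          read_done{this, TagData::Type::read_done},
          write_done{this, TagData::Type::write_done} {}
    virtual ~Handler() = default;
    Handler(const Handler&) = delete;  // the tags point back at *this
    Handler& operator=(const Handler&) = delete;

    TagData start_done, read_done, write_done;

    virtual void on_ready() = 0;
    virtual void on_recv() = 0;
    virtual void on_write_done() = 0;
};

// Generic worker: knows nothing about concrete handlers.
inline void dispatch_all(std::queue<void*>& cq) {
    while (!cq.empty()) {
        auto* tag = static_cast<Handler::TagData*>(cq.front());
        cq.pop();
        switch (tag->evt) {
            case Handler::TagData::Type::start_done: tag->handler->on_ready(); break;
            case Handler::TagData::Type::read_done: tag->handler->on_recv(); break;
            case Handler::TagData::Type::write_done: tag->handler->on_write_done(); break;
        }
    }
}

// Hypothetical handler that just records which events it saw.
class Recorder final : public Handler {
public:
    explicit Recorder(std::string name) : name_(std::move(name)) {}
    std::vector<std::string> events;
private:
    void on_ready() override { events.push_back(name_ + ":ready"); }
    void on_recv() override { events.push_back(name_ + ":recv"); }
    void on_write_done() override { events.push_back(name_ + ":write"); }
    std::string name_;
};
```

Pushing &a.start_done, &b.read_done and &a.write_done for two Recorder objects a and b and then calling dispatch_all() routes each event to the right object, even though the worker only ever sees void*.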

Pass data between threads using boost.signals2

I apologize for the ambiguous title, but I'll try to elaborate further here:
I have an application which includes (among others) a control, and TCP server classes.
Communication between the TCP and control class is done via this implementation:
#include <boost/signals2.hpp>
#include <functional>
// T - Observable object type
// S - Function signature
template <class T, typename S> class observer {
    using F = std::function<S>;
public:
    void register_notifier(T &obj, F f) {
        connection_ = obj.connect_notifier(std::forward<F>(f));
    }
protected:
    boost::signals2::scoped_connection connection_;
};
// S - Function signature
template <typename S> class observable {
public:
    boost::signals2::scoped_connection connect_notifier(std::function<S> f) {
        return notify.connect(std::move(f));
    }
protected:
    boost::signals2::signal<S> notify;
};
Where the TCP server class is the observable, and the control class is the observer.
The TCP server is running on a separate thread from the control class, and uses boost::asio::async_read. Whenever a message is received, the server object sends a notification via the 'notify' member, thus triggering the callback registered in the control class, and then waits to read the next message.
The problem is that I need to somehow safely and efficiently store the data currently held in the TCP server buffer and pass it to the control class before it's overwritten by the next message.
i.e. :
inline void ctl::tcp::server::handle_data_read(/* ... */)
{
    if (!error) {
        /* .... */
        notify(/* ? */); // Using a pointer to the buffer
                         // would obviously fail as it
                         // is overwritten in the next read
        /* .... */
    }
}
Those were my ideas for a solution so far:
1. Allocating heap memory and passing a pointer to it using unique_ptr, but I'm not sure if boost.signals2 is move-aware.
2. Using an unordered map (shared between the objects) that maps an integer index to a unique_ptr of the data type (std::unordered_map<int, std::unique_ptr<data_type>>), then passing only the index of the element and 'pop'-ing it in the control class callback, but that feels like overkill.
What I'm really looking for is an idea for a simple and efficient solution to pass the TCP buffer contents for each message between the threads.
Note I'm also open for suggestions to redesign my communication method between the objects if it's completely wrong.
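One possible variation on the first idea is to copy each message into a freshly allocated buffer owned by a std::shared_ptr and pass that pointer as the notification argument; a shared_ptr is copyable, so it does not depend on the signal being move-aware. The sketch below is an assumption-laden illustration, not a drop-in fix: it uses a plain std::function in place of the notify signal, and the Control class with its mutex-protected queue is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <mutex>
#include <queue>
#include <utility>
#include <vector>

using Message = std::shared_ptr<const std::vector<char>>;

// Control side: the callback only enqueues; processing can happen on the
// control thread whenever it calls pop().
class Control {
public:
    void on_message(Message msg) {
        std::lock_guard<std::mutex> lock(mutex_);
        inbox_.push(std::move(msg));
    }
    // Returns nullptr when nothing is pending.
    Message pop() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (inbox_.empty()) return nullptr;
        Message msg = std::move(inbox_.front());
        inbox_.pop();
        return msg;
    }
private:
    std::mutex mutex_;
    std::queue<Message> inbox_;
};

// Server side: copy the bytes out of the reusable read buffer before
// notifying, so the next async read can overwrite the buffer freely.
class Server {
public:
    std::function<void(Message)> notify;  // stand-in for the signals2 signal
    void handle_data_read(const char* buffer, std::size_t n) {
        Message copy = std::make_shared<std::vector<char>>(buffer, buffer + n);
        if (notify) notify(std::move(copy));
    }
};
```

The cost is one allocation and one copy per message; whether that is acceptable depends on message size and rate, which the question does not state.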

C++ Design: Multiple TCP clients, boost asio and observers

In my system, I have to juggle a bunch of TCP clients, and I am a bit confused about how to design it [most of my experience is in C, hence the insecurity]. I am using Boost.Asio for managing connections. These are the components I have:
A TCPStream class: a thin wrapper over Boost.Asio
An IPC protocol class, which implements a protocol over TCP:
basically, each message starts with a type and length field
so we can read the individual messages out of the stream.
Connection classes, which handle the messages
An Observer class, which monitors connections
I am writing pseudo C++ code to be concise. I think you will get the idea.
class TCPStream {
boost::asio::socket socket_;
template <typename F>
void connect (F f)
template <typename F>
void read (F f)
class IpcProtocol : public TCPStream {
template <typename F>
void read (F f)
[f] (buffer, err) {
while (msg = read_indvidual_message(buffer)) {
// **** this is a violation of how this pattern is
// supposed to work. Ideally there should a callback
// for individual message. Here the same callback
// is called for N no. of messages. But in our case
// its the same callback everytime so this should be
// fine - just avoids some function calls.
Let's say I have a bunch of TCP connections and there is a handler class
for each of them. Let's name them Connection1, Connection2, ...
class Connection {
virtual int type() = 0;
class Connection1 : public Connection {
shared_ptr<IpcProtocol> ipc_;
int type ()
return 1;
void start ()
ipc_->connect([self = shared_from_this()](){ self->connected(); });
[self = shared_from_this()](msg, err) {
if (!err)
} else {
void connected ()
void error ()
This pattern repeats for all connections in one way or another.
Messages are processed by the connection class itself, but it lets an observer know of
other events [connect, error]. The reasons:
Restart the connection every time it disconnects.
A bunch of components need to know when the connection is established so that they can
send an initial request/configuration to the server.
There are things that need to be done based on the connection status of multiple connections,
e.g. if connection1 and connection2 are established, then start connection3, etc.
I added a middle Observer class so that the observers do not have to connect directly to the connection every time it is restarted. Each time a connection breaks, the connection object is deleted and a new one is created.
class Listeners {
public:
    virtual void notify_error(shared_ptr<Connection>) = 0;
    virtual void notify_connect(shared_ptr<Connection>) = 0;
    virtual bool interested(int type) = 0;
};
class Observer {
    std::vector<Listeners *> listeners_;
public:
    void notify_connect(shared_ptr<Connection> connection) {
        for (auto *listener : listeners_) {
            if (listener->interested(connection->type())) {
                listener->notify_connect(connection);
            }
        }
    }
};
A rough prototype of this works, but I was wondering whether this class design
is any good. There are multiple streaming servers that will continuously produce state and send it to my module, which programs the state into hardware. This needs to be extensible, as more clients will be added in the future.
The legacy code had one thread per TCP connection, and that worked fine. Here I am trying to handle multiple connections on the same thread. There will still be multiple threads calling the io_service, so the observer will run on multiple threads. I am planning to have a mutex per Listener so that listeners won't get multiple events concurrently.
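The "mutex per Listener" plan could look roughly like the sketch below. It uses only the standard library; the split into a public deliver_connect() and a protected on_connect(), the snapshot of the listener list, and the SimpleConnection and CountingListener demo types are all assumptions made for illustration rather than part of the prototype described above.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <vector>

struct Connection {
    virtual ~Connection() = default;
    virtual int type() const = 0;
};

class Listener {
public:
    virtual ~Listener() = default;
    virtual bool interested(int type) const = 0;
    // Entry point used by the observer: at most one event at a time per listener.
    void deliver_connect(const std::shared_ptr<Connection>& c) {
        std::lock_guard<std::mutex> lock(mutex_);
        on_connect(c);
    }
protected:
    virtual void on_connect(const std::shared_ptr<Connection>& c) = 0;
private:
    std::mutex mutex_;
};

class Observer {
public:
    void add(Listener* listener) {
        std::lock_guard<std::mutex> lock(mutex_);
        listeners_.push_back(listener);
    }
    void notify_connect(const std::shared_ptr<Connection>& c) {
        std::vector<Listener*> snapshot;
        {   // copy the list so callbacks run without holding the list lock
            std::lock_guard<std::mutex> lock(mutex_);
            snapshot = listeners_;
        }
        for (Listener* l : snapshot) {
            if (l->interested(c->type())) l->deliver_connect(c);
        }
    }
private:
    std::mutex mutex_;
    std::vector<Listener*> listeners_;
};

// Hypothetical concrete types, only for the demo.
struct SimpleConnection : Connection {
    explicit SimpleConnection(int t) : type_(t) {}
    int type() const override { return type_; }
    int type_;
};

class CountingListener : public Listener {
public:
    explicit CountingListener(int wanted) : wanted_(wanted) {}
    bool interested(int type) const override { return type == wanted_; }
    int count = 0;
protected:
    void on_connect(const std::shared_ptr<Connection>&) override { ++count; }
private:
    int wanted_;
};
```

One thing to watch with this layout: a listener that calls back into the same listener from inside on_connect() would deadlock on its own mutex, so handlers should stay short or hand work off elsewhere.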
HTTP implements a protocol over TCP, so the HTTP server Asio examples are a good starting point for your design, especially HTTP Server 2, HTTP Server 3 and HTTP Server 4.
Note that connection lifetime is likely to be an issue, especially since you intend to use class member functions as handlers; see the question and answers here: How to design proper release of a boost::asio socket or wrapper thereof.

In what situation should we adopt state pattern?

In what situation should we adopt state pattern?
I've been assigned to maintain a project whose state machine is implemented as a switch-case that is 2000+ lines long. It will be hard to extend, so I would like to refactor it.
I'm looking into the state design pattern, but I'm confused about a few things.
A simple example:
1. Initial state "WAIT": wait for the user to send the download command
2. When the user sends the download command, move to the "CONNECT" state and connect to the server
3. After the connection is established, move to the "DOWNLOADING" state and keep receiving data from the server
4. When the download completes, move to "DISCONNECT" and disconnect from the server
5. After disconnecting, move back to the "WAIT" state and wait for the user to send the download command
A simple state machine pic
Method 1: Before looking into the state pattern, I thought of a trivial method: wrap each state's behavior in its own function, use an array of function pointers to point at each state function, and change state by calling a function.
typedef enum {
void (*statefunction[MAX_STATE])(void) =
void WAITState(void)
//do wait behavior
//while receive download command
void CONNECTState(void)
//do connect behavior
//while connect complete
void DOWNLOADINGState(void)
//do downloading behavior
//while download complete
void DISCONNECTState(void)
//do disconnect behavior
//while disconnect complete
Method 2: The state pattern encapsulates different state and its behavior in different class (object-oriented state machine), uses polymorphism to implement different state behavior, and defines a common interface for all concrete states.
class Context;
class State {
public:
    virtual void Handle(Context *pContext) = 0;
};
class Context {
public:
    Context(State *pState) : m_pState(pState){}
    void Request() {
        if (m_pState) m_pState->Handle(this);
    }
private:
    State *m_pState;
};
class WAIT : public State {
public:
    virtual void Handle(Context *pContext) {
        //do wait behavior
    }
};
class CONNECT : public State {
public:
    virtual void Handle(Context *pContext) {
        //do connect behavior
    }
};
class DOWNLOADING : public State {
public:
    virtual void Handle(Context *pContext) {
        //do downloading behavior
    }
};
class DISCONNECT : public State {
public:
    virtual void Handle(Context *pContext) {
        //do disconnect behavior
    }
};
I'm wondering whether the state pattern is better than function pointers in this case or not...
Using only function pointers can also improve readability (compared with switch-case), and is simpler.
The state pattern creates several classes and is more complex than using only function pointers.
What's the advantage of using the state pattern?
Thanks for your time!
What's the advantage of using the state pattern?
First, note that both of the methods you've provided are in fact examples of the very same pattern. One describes a function-based implementation, while the other takes a more object-oriented approach.
That being said, the pattern itself has a few advantages:
It limits the number of states a program can be in, and thus eliminates undefined states.
It allows for easier expansion of the application by adding new states instead of refactoring the whole code.
From a company perspective, it is safe even when multiple people work on the same class.
Since you tagged the question as related to C++, it is best to take into account what the language both gives and requires. While classes offer inheritance, a large number of classes can greatly increase compilation time. Hence, when it comes to implementations, if your state machine is large, static polymorphism may be the way to go.
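As one possible reading of "static polymorphism" for the download example, the sketch below models the four states as empty structs held in a std::variant (C++17) and expresses the transitions as overloads that the compiler selects per state type. The Transitions and Downloader names are hypothetical, and the single advance() step is a simplification of the real events (command received, connected, download complete, disconnected).

```cpp
#include <cassert>
#include <variant>

// One empty struct per state from the example.
struct Wait {};
struct Connect {};
struct Downloading {};
struct Disconnect {};
using State = std::variant<Wait, Connect, Downloading, Disconnect>;

// Transition table expressed as overloads; a missing overload for any
// state is a compile-time error rather than an unhandled case at runtime.
struct Transitions {
    State operator()(const Wait&) const { return Connect{}; }
    State operator()(const Connect&) const { return Downloading{}; }
    State operator()(const Downloading&) const { return Disconnect{}; }
    State operator()(const Disconnect&) const { return Wait{}; }
};

class Downloader {
public:
    // Moves to the next state once the current step has finished.
    void advance() { state_ = std::visit(Transitions{}, state_); }

    template <typename S>
    bool is() const { return std::holds_alternative<S>(state_); }

private:
    State state_ = Wait{};
};
```

Compared with the class-per-state version, there are no virtual calls or heap-allocated state objects here, at the price of every state type being visible wherever State is used.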

How to use callback results in asynchronous model C++

I have a C++ API which has certain defined functions and their related callbacks.
All these functions are asynchronous in nature.
Now, using this API, I want to construct an asynchronous system which sends
multiple requests to the server to collect different data items and then uses
these data items for further processing.
For example:
void functionA()
{
    requestDataForA(); //asynchronous request to the server
    //async wait for the callback
}
void functionB()
{
    requestDataForB(); //asynchronous request to the server
    //async wait for the callback
}
void functionC()
{
    requestDataForC(); //asynchronous request to the server
    //async wait for the callback
}
Now my question is: when the callback delivers the data item, how do I use it for subsequent processing? It cannot be done in the callback, as the callback doesn't know who will use the data.
You implicitly have this information; you just need to track it. Let's say that object A calls functionA. You should make A implement a particular interface that accepts the data that comes back in response to requestA. Let's say this response is DataA; then the interface would be
class InterfaceADataHandler
{
public:
    virtual void handle(DataA const&) = 0; // this is the method that will process the data..
};
class A : public InterfaceADataHandler
{
public:
    void handle(DataA const&) {} // do something with data
    // Now I want to be called back
    void foo()
    {
        functionA(this); // call function A with this instance
    }
};
void functionA(InterfaceADataHandler* pHandler)
{
    // store this handler against request (say some id)
    // wait for callback
    // when you have callback, lookup the handler that requested the data, and call that handler
}
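The "store this handler against request" step can be sketched as a small lookup table. Everything beyond InterfaceADataHandler, DataA, A and functionA is an assumption here: the integer request id, the RequestTracker class, and the on_callback() entry point are hypothetical, and the real API may identify requests differently.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

struct DataA { std::string payload; };

class InterfaceADataHandler {
public:
    virtual ~InterfaceADataHandler() = default;
    virtual void handle(const DataA&) = 0;
};

// Remembers which handler issued which request (the ids are a hypothetical detail).
class RequestTracker {
public:
    int functionA(InterfaceADataHandler* handler) {
        const int id = next_id_++;
        pending_[id] = handler;
        // a real implementation would issue the async request here, tagged with id
        return id;
    }
    // Called from the API callback once the response for `id` arrives.
    bool on_callback(int id, const DataA& data) {
        auto it = pending_.find(id);
        if (it == pending_.end()) return false;  // unknown or already handled
        InterfaceADataHandler* handler = it->second;
        pending_.erase(it);
        handler->handle(data);
        return true;
    }
private:
    int next_id_ = 1;
    std::unordered_map<int, InterfaceADataHandler*> pending_;
};

class A : public InterfaceADataHandler {
public:
    void handle(const DataA& d) override { last = d.payload; }
    std::string last;
};
```

If callbacks can arrive on a different thread than the one calling functionA(), the map would additionally need a mutex, and the handler must outlive its pending request.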
In most APIs, you, the developer, provide the callback, which the API invokes with the data that has been retrieved. You can then store the data and use it later, or use it within the callback (assuming that processing won't take very long and that you promise not to block on I/O).
The model would look more like:
void functionA()
{
    requestDataForA(processDataForA); //asynchronous request to the server
}
void processDataForA(void *someData)
{
    // process "someData"
}