Storm: how to integrate a Java callback into a Spout

I'm trying to integrate Storm (see here) into my project. I grok the concepts of topologies, spouts, and bolts. But now, I'm trying to figure out the actual implementation of a few things.
A) I have a polyglot environment with Java and Clojure. My Java code is a callback class whose methods fire with streaming data. The event data pushed to those methods is what I want to use as a spout.
So the first question is how to connect the data coming into those methods to a spout. I'm trying to i) pass a backtype.storm.topology.IRichSpout, then ii) pass a backtype.storm.spout.SpoutOutputCollector (see here) to that spout's open function (see here). But I can't see a way to actually pass in any kind of map or list.
B) The rest of my project is all Clojure. There will be a lot of data coming through those methods. Each event will have an ID of between 1 and 100. In Clojure, I'll want to split data coming from the spout, into different threads of execution. Those, I think, will be the bolts.
How can I set up a Clojure bolt to take event data from the spout, then break off a thread based on the ID of the incoming event?
Thanks in advance
Tim
[EDIT 1]
I've actually gotten past this problem. I ended up 1) implementing my own IRichSpout. I then 2) connected that spout's internal tuple to the incoming stream data in my java callback class. I'm not sure if this is idiomatic. But it compiles and runs without error. However, 3) I don't see the incoming stream data (definitely there), coming through the printstuff bolt.
In order to ensure that the event data gets propagated, is there something specific I have to do in the spout or bolt implementation or topology definition? Thanks.
;; tie Java callbacks to a Spout that I created
(.setSpout java-callback ibspout)

(storm/defbolt printstuff ["word"] [tuple collector]
  (println (str "printstuff --> tuple[" tuple "] > collector[" collector "]")))

(storm/topology
  {"1" (storm/spout-spec ibspout)}
  {"3" (storm/bolt-spec {"1" :shuffle}
                        printstuff)})
[EDIT 2]
On the advice of SO member Ankur, I'm rejigging my topology. After I've created my Java callback, I pass its tuple to the IBSpout below, using (.setTuple ibspout (.getTuple java-callback)). I don't pass the entire Java callback object, because I get a NotSerializable error. Everything compiles and runs without error. But again, there's no data coming to my printstuff bolt. Hmmm.
public class IBSpout implements IRichSpout {

    /**
     * Storm spout stuff
     */
    private SpoutOutputCollector _collector;
    private List _tuple = new ArrayList();

    public void setTuple(List tuple) { _tuple = tuple; }
    public List getTuple() { return _tuple; }

    /**
     * Storm ISpout interface functions
     */
    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        _collector = collector;
    }
    public void close() {}
    public void activate() {}
    public void deactivate() {}
    public void nextTuple() {
        _collector.emit(_tuple);
    }
    public void ack(Object msgId) {}
    public void fail(Object msgId) {}
    public void declareOutputFields(OutputFieldsDeclarer declarer) {}
    public java.util.Map getComponentConfiguration() { return new HashMap(); }
}

It seems that you are passing the spout to your callback class, which is a bit unusual. When a topology is executed, Storm periodically calls the spout's nextTuple method. What you need to do is pass the Java callback to your custom spout implementation, so that when Storm calls your spout, the spout calls the Java callback to get the next set of tuples to feed into the topology.
The key concept to understand is that spouts pull data when Storm requests it; you don't push data to spouts. Your callback cannot call the spout to push data to it. Rather, your spout should pull data (from some Java method or any memory buffer) when its nextTuple method is called.

Answer to part B:
The straightforward answer: it sounds like you are looking for a fields grouping, so you can control what work gets grouped together during execution by ID.
That said, I'm not confident that this is really a full answer, because I don't know why you are trying to do it this way. If you just want a balanced workload, a shuffle grouping is a better choice.

Related

C++ Filter Pipeline

I want to develop a Filter Pipeline for my Application.
The Pipeline should consist of any number of filters.
For the Filters i declare an abstract base class like this:
struct AbstractFilter {
    virtual void execute(const std::string& message) = 0;
    virtual ~AbstractFilter() = default;
};
Each Filter should inherit from this base class and implement the execute Method.
Like so:
struct PrintMessage : public AbstractFilter {
    void execute(const std::string& message) override {
        std::cout << "Filter A " << message << '\n';
        //hand over message to next Filter
    }
};
struct Upper : public AbstractFilter {
    void execute(const std::string& message) override {
        std::string new_line;
        for (char c : message)
            new_line.push_back(static_cast<char>(std::toupper(static_cast<unsigned char>(c))));
        //hand over new_line to next Filter
    }
};
struct WriteToFile : public AbstractFilter {
    void execute(const std::string& message) override {
        std::ofstream of{"test.txt"};
        of << message;
        of.close();
    }
};
EDIT 1:
The message should be sent from one filter to the next in the pipeline.
If the pipeline for example is like this:
Upper -- PrintMessage -- WriteToFile
The message should pass through all 3 filters. (For example, when Upper has finished its work, the message should be sent to PrintMessage, and so on.)
In the example above, if the message Hello World is sent to the pipeline, the output should be:
Console:
HELLO WORLD
test.txt:
HELLO WORLD
EDIT 2:
The Filter only changes the content of the given Message. The Type is not changed. Every Filter should work with for example strings or a given class.
The Message is only forwarded to one recipient.
My Question is now how to connect these Filters?
My first guess was to use queues, so every filter gets an input and an output queue. For this I think every filter should run inside its own thread and be notified when data is added to its input queue. (The output queue of, for example, FilterA is also the input queue of FilterB.)
My second guess was to use the Chain of Responsibility pattern and boost::signals2.
So FilterB, for example, connects to the signal of FilterA. FilterA calls these filters when it has finished its work.
Which of the two solutions is more flexible? Or is there an even better way to connect the filters?
An additional question: is it also possible to run the whole pipeline inside a thread, so that I can start multiple pipelines? (In the example, have 3 of the FilterA-FilterB-FilterD pipelines up and running?)
I would proceed this way:
Create a list with all the implemented versions of the abstract filter. So, following your example, after reading the input file I would get a list with:
[0]:Upper
[1]:PrintMessage
[2]:WriteToFile
Then have a single thread (or a thread pool if you need to process many strings at a time) waiting for a string on an input queue. When a new string appears in the queue, the thread loops over the filter list and at the end posts the result to an output queue.
If you want to run it in parallel, you need to find a way to preserve the order of the input strings in the output strings as well.
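A minimal sketch of that loop, assuming the filters return the (possibly changed) message instead of void so the worker can hand it from one filter to the next. Filter, Record and run_pipeline are illustrative names, not from the question, and the queue and thread around the loop are left out:

```cpp
#include <cassert>
#include <cctype>
#include <memory>
#include <string>
#include <vector>

// Hypothetical variant of the question's AbstractFilter: execute() returns the
// message it produced so the caller can pass it on to the next filter.
struct Filter {
    virtual ~Filter() = default;
    virtual std::string execute(const std::string& message) = 0;
};

struct Upper : Filter {
    std::string execute(const std::string& message) override {
        std::string out;
        for (unsigned char c : message)
            out.push_back(static_cast<char>(std::toupper(c)));
        return out;
    }
};

// Stand-in for PrintMessage/WriteToFile: records what it saw instead of doing I/O.
struct Record : Filter {
    std::vector<std::string> seen;
    std::string execute(const std::string& message) override {
        seen.push_back(message);
        return message;
    }
};

using Pipeline = std::vector<std::unique_ptr<Filter>>;

// Runs one message through every filter, in list order, and returns the result.
std::string run_pipeline(Pipeline& pipeline, std::string message) {
    for (auto& filter : pipeline) {
        assert(filter != nullptr);
        message = filter->execute(message);
    }
    return message;
}
```

A worker thread would pop a string from the input queue, call run_pipeline, and push the return value to the output queue. Running several such workers is what makes the output ordering an issue.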
I think AbstractFilter is not necessary and I'd suggest to use std::tuple to define a pipeline:
std::tuple<FilterA, FilterB> pipeline1;
std::tuple<FilterA, FilterB, FilterC ... > pipeline2;
To run a message through a pipeline do (using c++17):
template<typename Pipeline>
void run_in_pipeline(const std::string& message, Pipeline& pipeline){
std::apply([&message](auto&& ... filter) {
(filter.execute(message), ...);
}, pipeline);
}
If you care about performance and the filters must be executed sequentially, I wouldn't suggest using multithreading or signal-slot patterns within a single pipeline. Consider instead running different pipelines on different threads if you are dealing with a multithreaded application.
I believe the Chain of Responsibility pattern is simpler, allows for cleaner code and greater flexibility.
You do not need third-party libraries to implement it.
What you call filters are actually handlers. All handlers implement a common interface, defining a single method that could be named handle() and could even take an object as parameter to share state. Each handler stores a pointer to the next handler. It may or may not call that method on it; in the latter case processing is halted, and it acts as a filter.
Running pipeline stages in parallel is more involved if some of them require the output of others as input. For different pipelines to run in parallel, each one would run on its own thread, and you could use a queue to pass inputs to it.
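As a rough sketch of that handler chain (all class names here are made up for illustration, and the next pointer is non-owning, so the caller is assumed to keep the handlers alive):

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Common interface: each handler does its work and decides whether to forward.
class Handler {
public:
    virtual ~Handler() = default;
    virtual void handle(const std::string& message) = 0;
    void set_next(Handler* next) {
        assert(next != this);  // a handler must not forward to itself
        next_ = next;
    }
protected:
    // Passes the message on if there is a next handler; otherwise the chain ends here.
    void forward(const std::string& message) {
        if (next_ != nullptr) next_->handle(message);
    }
private:
    Handler* next_ = nullptr;  // non-owning pointer to the next handler
};

// Transforms the message, then forwards the transformed copy.
class UpperHandler : public Handler {
public:
    void handle(const std::string& message) override {
        std::string upper;
        for (unsigned char c : message)
            upper.push_back(static_cast<char>(std::toupper(c)));
        forward(upper);
    }
};

// Acts as a real filter: empty messages are not forwarded, which halts processing.
class DropEmpty : public Handler {
public:
    void handle(const std::string& message) override {
        if (message.empty()) return;
        forward(message);
    }
};

// Stand-in for a sink such as WriteToFile: stores what reaches the end of the chain.
class Collect : public Handler {
public:
    std::vector<std::string> received;
    void handle(const std::string& message) override {
        received.push_back(message);
        forward(message);
    }
};
```

Wiring is just upper.set_next(&drop); drop.set_next(&sink); followed by upper.handle("Hello World"). To run several pipelines in parallel, each thread would build its own chain.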

What should be the best way to filter the kafka message

I'm consuming data from a Kafka topic that includes the area code. I have to filter the data for certain area codes only. Can anyone suggest the best approach to solve this?
Here is what my listener code looks like. Is it best practice to parse the data into an object (I mapped the payload to a TEST object) and filter on the value I need, or does Kafka provide any other libraries which I can use for this filtering process?
Kafka Listener Method
@Service
public class Listener {
    @KafkaListener(topics = "#{@topicName}")
    public void listen(String payload) throws IOException {
        LOGGER.info("received payload from topic='{}'", payload);
        ObjectMapper objectMapper = new ObjectMapper();
        objectMapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);
        TEST test = objectMapper.readValue(payload, TEST.class);
    }
}
My Kafka Configuration class:
@Configuration
public class Config {
    @Bean
    public Map<String, Object> consumerConfigs() {
        Map<String, Object> properties = new HashMap<>();
        properties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, applicationConfiguration.getKafkaBootStrap());
        properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, applicationConfiguration.getKafkaKeyDeserializer());
        properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, applicationConfiguration.getKafkaValueDeserializer());
        properties.put(ConsumerConfig.GROUP_ID_CONFIG, applicationConfiguration.getKafkaGroupId());
        properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, applicationConfiguration.getKafkaAutoOffsetReset());
        return properties;
    }
    @Bean
    public ConsumerFactory<String, String> consumerFactory() {
        return new DefaultKafkaConsumerFactory<>(consumerConfigs());
    }
    @Bean
    public KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, String>> kafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        return factory;
    }
    @Bean
    public Listener receiver() {
        return new Listener();
    }
}
See Filtering Messages.
The Spring for Apache Kafka project also provides some assistance by means of the FilteringMessageListenerAdapter class, which can wrap your MessageListener. This class takes an implementation of RecordFilterStrategy in which you implement the filter method to signal that a message is a duplicate and should be discarded. This has an additional property called ackDiscarded, which indicates whether the adapter should acknowledge the discarded record. It is false by default.
When you use @KafkaListener, set the RecordFilterStrategy (and optionally ackDiscarded) on the container factory so that the listener is wrapped in the appropriate filtering adapter.
/**
 * Set the record filter strategy.
 * @param recordFilterStrategy the strategy.
 */
public void setRecordFilterStrategy(RecordFilterStrategy<? super K, ? super V> recordFilterStrategy) {
    this.recordFilterStrategy = recordFilterStrategy;
}

/**
 * Implementations of this interface can signal that a record about
 * to be delivered to a message listener should be discarded instead
 * of being delivered.
 *
 * @param <K> the key type.
 * @param <V> the value type.
 *
 * @author Gary Russell
 *
 */
public interface RecordFilterStrategy<K, V> {

    /**
     * Return true if the record should be discarded.
     * @param consumerRecord the record.
     * @return true to discard.
     */
    boolean filter(ConsumerRecord<K, V> consumerRecord);

}
What you did is alright.
If your payload has a lot of data besides the area code and you worry about long parsing, you can filter the messages before doing the whole parsing to TEST object by adding the area code as a header.
Later versions of Kafka (0.11 and later) offer custom headers (KIP-82).
If you want to implement it yourself (or if you use an older version of Kafka), you can add the header to your message payload, let's say as the first 4 bytes of the message. They will represent the area code and can be extracted very fast prior to the parsing process.
New message payload:
([header-4-bytes],[original-payload-n-bytes])
So make your filter based on the header, and if you find out this is an area code you need, create your TEST object based on the rest of the message (cut the first 4 bytes to remove the header).
Kafka itself does not provide any filtering options that could help you here. It does let your Producer send keyed messages, though, so if your key is the area code, Kafka guarantees that all messages with the same area code go to the same partition, which may help performance if used correctly.
The Producer can also send messages to specific partitions, so if you have a fixed set of area codes you could define the topic with as many partitions as there are unique area codes and send each area code to its own partition, then have your Consumer read only the partitions for the area codes you are looking for. This may be overkill for most cases, though.

POCO WebSocket - sending data from another class

I'm trying to create a WebSocket Server.
I can establish a connection and everything works fine so far.
In this GitHub example the data is sent within the handleRequest() method that is called when a client connects.
But can I send data to the client from another class using the established WebSocket connection?
How can I achieve this? Is this even possible?
Thank you.
It is, of course, possible. In the example you referred, you should have a member pointer to WebSocket in the RequestHandlerFactory, eg.:
class RequestHandlerFactory: public HTTPRequestHandlerFactory
{
//...
private:
shared_ptr<WebSocket> _pwebSocket;
};
pass it to the WebSocketRequestHandler constructor:
return new WebSocketRequestHandler(_pwebSocket);
and WebSocketRequestHandler should look like this:
class WebSocketRequestHandler: public HTTPRequestHandler
{
public:
    // Take the factory's pointer by reference so that the assignment in
    // handleRequest() is visible to the factory; a by-value copy would only
    // update the handler's own pointer.
    WebSocketRequestHandler(shared_ptr<WebSocket>& pWebSocket) :_pWebSocket(pWebSocket)
    {}
    void handleRequest(HTTPServerRequest& request, HTTPServerResponse& response)
    {
        // ...
        _pWebSocket = make_shared<WebSocket>(request, response);
        // ...
    }
private:
    shared_ptr<WebSocket>& _pWebSocket;
};
Now, after the request handler creates it, you will have a pointer to the WebSocket in the factory (which is long lived, unlike RequestHandler, which comes and goes away with every request). Keep in mind that handler executes in its own thread, so you should have some kind of locking or notification mechanism to signal when the WebSocket has actually been created by the handler (bool cast of _pWebSocket will be true after WebSocket was successfully created).
The above example only illustrates the case with a single WebSocket - if you want to have multiple ones, you should have an array or vector of pointers and add/remove them as needed. In any case, the WebSocket pointer(s) need not necessarily reside in the factory - you can either (a) put them elsewhere in your application and propagate them to the factory/handler or (b) have a global facility (with proper multi-thread-access mechanism) holding the WebSocket(s).
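For option (b), a shared facility could look roughly like the sketch below. ConnectionRegistry and FakeSocket are hypothetical names; FakeSocket only stands in for the real WebSocket type so that the example compiles without POCO. Real code would store shared_ptr<WebSocket> and call the socket's own send function inside the callback:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Thread-safe holder for the currently open connections.
template <typename Connection>
class ConnectionRegistry {
public:
    // Called by the request handler once the connection is established.
    void add(std::shared_ptr<Connection> connection) {
        assert(connection != nullptr);
        std::lock_guard<std::mutex> lock(mutex_);
        connections_.push_back(std::move(connection));
    }
    // Called by the request handler when the client disconnects.
    void remove(std::shared_ptr<Connection> connection) {
        std::lock_guard<std::mutex> lock(mutex_);
        connections_.erase(
            std::remove(connections_.begin(), connections_.end(), connection),
            connections_.end());
    }
    // Calls fn on a snapshot, so a slow send does not keep the registry locked.
    template <typename Fn>
    void for_each(Fn fn) const {
        std::vector<std::shared_ptr<Connection>> snapshot;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            snapshot = connections_;
        }
        for (const auto& connection : snapshot) fn(*connection);
    }
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return connections_.size();
    }
private:
    mutable std::mutex mutex_;
    std::vector<std::shared_ptr<Connection>> connections_;
};

// Stand-in connection type used only to exercise the registry.
struct FakeSocket {
    std::vector<std::string> sent;
    void send(const std::string& text) { sent.push_back(text); }
};
```

Any other class can then hold a reference to the registry and call for_each to send data. The registry only protects the list itself; if two threads may write to the same socket at once, that still needs its own synchronization.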

SFDC Apex Code: Access class level static variable from "Future" method

I need to do a callout to a web service from my ApexController class. To do this, I have an async method with the attribute @future (callout=true). The web service call needs to reference an object that gets populated in the save call from the VF page.
Since static (future) calls do not allow objects to be passed in as method arguments, I was planning to add the data to a static Map and access that in my static method to do the web service callout. However, the static Map object is getting re-initialized and is null in the static method.
I would really appreciate it if anyone could give me some pointers on how to address this issue.
Thanks!
Here is the code snippet:
private static Map<String, WidgetModels.LeadInformation> leadsMap;
....
......
public PageReference save() {
    if(leadsMap == null){
        leadsMap = new Map<String, WidgetModels.LeadInformation>();
    }
    leadsMap.put(guid,widgetLead);
    //make async call to Widget webservice
    saveWidgetCallInformation(guid);
}
//async call to Widget webservice
@future (callout=true)
public static void saveWidgetCallInformation(String guid) {
    WidgetModels.LeadInformation cachedLeadInfo =
        (WidgetModels.LeadInformation)leadsMap.get(guid);
    .....
    //call webservice
}
@future is a totally separate execution context. It won't have access to any history of how it was called (meaning all static variables are reset, you start with fresh governor limits, etc., like a new action initiated by the user).
The only thing it will "know" is the method parameters that were passed to it. And you can't pass whole objects; you need to pass primitives (Integer, String, DateTime etc.) or collections of primitives (List, Set, Map).
If you can access all the info you need from the database - just pass a List<Id> for example and query it.
If you can't - you can cheat by serializing your objects and passing them as List<String>. Check the documentation around JSON class or these 2 handy posts:
https://developer.salesforce.com/blogs/developer-relations/2013/06/passing-objects-to-future-annotated-methods.html
https://gist.github.com/kevinohara80/1790817
Side note: can you rethink your flow? If the starting point is Visualforce you can skip the @future step. Do the callout first and then the DML (if needed). That way the usual "you have uncommitted work pending" error won't be triggered. This thing is there not only to annoy developers ;) It's there to make you rethink your design. You're asking the application to hold an open transaction and a lock on the table(s) for up to 2 minutes. And you're giving yourself extra work: will you roll back your changes correctly when the insert went OK but the callout failed?
By reversing the order of operations (callout first, then the DML) you're making it simpler: if the callout fails, there was no save attempt to the DB, so there's nothing to roll back.

Better structure for request based protocol implementation

I am using a protocol, which is basically a request & response protocol over TCP, similar to other line-based protocols (SMTP, HTTP etc.).
The protocol has about 130 different request methods (e.g. login, user add, user update, log get, file info, files info, ...). These methods do not map well onto the broad methods used in HTTP (GET, POST, PUT, ...). Forcing them into such broad methods would distort their actual meaning in inconsistent ways.
But the protocol methods can be grouped by type (e.g. user management, file management, session management, ...).
Current server-side implementation uses a class Worker with methods ReadRequest() (reads request, consisting of method plus parameter list), HandleRequest() (see below) and WriteResponse() (writes response code & actual response data).
HandleRequest() will call a function for the actual request method - using a hash map of method name to member function pointer to the actual handler.
The actual handler is a plain member function, and there is one per protocol method: each one validates its input parameters, does whatever it has to do, and sets the response code (success yes/no) and response data.
Example code:
class Worker {
typedef bool (Worker::*CommandHandler)();
typedef std::map<UTF8String,CommandHandler> CommandHandlerMap;
// handlers will be initialized once
// e.g. m_CommandHandlers["login"] = &Worker::Handle_LOGIN;
static CommandHandlerMap m_CommandHandlers;
bool HandleRequest() {
CommandHandlerMap::const_iterator ihandler;
if( (ihandler=m_CommandHandlers.find(m_CurRequest.instruction)) != m_CommandHandlers.end() ) {
// call actual handler
return (this->*(ihandler->second))();
}
// error case:
m_CurResponse.success = false;
m_CurResponse.info = "unknown or invalid instruction";
return true;
}
//...
bool Handle_LOGIN() {
const UTF8String username = m_CurRequest.parameters["username"];
const UTF8String password = m_CurRequest.parameters["password"];
// ....
if( success ) {
// initialize some state...
m_Session.Init(...);
m_LogHandle.Init(...);
m_AuthHandle.Init(...);
// set response data
m_CurResponse.success = true;
m_CurResponse.Write( "last_login", ... );
m_CurResponse.Write( "whatever", ... );
} else {
m_CurResponse.Write( "error", "failed, because ..." );
}
return true;
}
};
So. The problem is: My worker class now has about 130 "command handler methods". And each one needs access to:
request parameters
response object (to write response data)
different other session-local objects (like a database handle, a handle for authorization/permission queries, logging, handles to various sub-systems of the server etc.)
What is a good strategy for a better structuring of those command handler methods?
One idea was to have one class per command handler, initialized with references to the request and response objects etc. But the overhead is IMHO not acceptable (it would add an indirection for every single access to everything the handler needs: request, response, session objects, ...). It could be acceptable if it provided an actual advantage. However, that doesn't seem very reasonable:
class HandlerBase {
protected:
Request &request;
Response &response;
Session &session;
DBHandle &db;
FooHandle &foo;
// ...
public:
HandlerBase( Request &req, Response &rsp, Session &s, ... )
: request(req), response(rsp), session(s), ...
{}
//...
virtual bool Handle() = 0;
};
class LoginHandler : public HandlerBase {
public:
LoginHandler( Request &req, Response &rsp, Session &s, ... )
: HandlerBase(req,rsp,s,..)
{}
//...
virtual bool Handle() {
// actual code for handling "login" request ...
}
};
Okay, the HandlerBase could just take a reference (or pointer) to the worker object itself (instead of refs to request, response etc.). But that would also add another indirection (this->worker->session instead of this->session). That indirection would be ok, if it would buy some advantage after all.
Some info about the overall architecture
The worker object represents a single worker thread for an actual TCP connection to some client. Each thread (so, each worker) needs its own database handle, authorization handle etc. These "handles" are per-thread-objects that allow access to some sub-system of the server.
This whole architecture is based on some kind of dependency injection: e.g. to create a session object, one has to provide a "database handle" to the session constructor. The session object then uses this database handle to access the database. It will never call global code or use singletons. So, each thread can run undisturbed on its own.
But the cost is, that - instead of just calling out to singleton objects - the worker and its command handlers must access any data or other code of the system through such thread-specific handles. Those handles define its execution context.
Summary & Clarification: My actual question
I am searching for an elegant alternative to the current ("worker object with a huge list of handler methods") solution: It should be maintainable, have low-overhead & should not require writing too much glue-code. Additionally, it MUST still allow each single method control over very different aspects of its execution (that means: if a method "super flurry foo" wants to fail whenever full moon is on, then it must be possible for that implementation to do so). It also means, that I do not want any kind of entity abstraction (create/read/update/delete XFoo-type) at this architectural layer of my code (it exists at different layers in my code). This architectural layer is pure protocol, nothing else.
In the end, it will surely be a compromise, but I am interested in any ideas!
The AAA bonus: a solution with interchangeable protocol implementations (instead of just that current class Worker, which is responsible for parsing requests and writing responses). There maybe could be an interchangeable class ProtocolSyntax, that handles those protocol syntax details, but still uses our new shiny structured command handlers.
You've already got most of the right ideas. Here's how I would proceed.
Let's start with your second question: interchangeable protocols. If you have generic request and response objects, you can have an interface that reads requests and writes responses:
class Protocol {
public:
    virtual ~Protocol() = default;
    virtual Request *readRequest() = 0;
    virtual void writeResponse(Response *response) = 0;
};
and you could have an implementation called HttpProtocol for example.
As for your command handlers, "one class per command handler" is the right approach:
class Command {
public:
    virtual ~Command() = default;
    virtual void execute(Request *request, Response *response, Session *session) = 0;
};
Note that I rolled up all the common session handles (DB, Foo etc.) into a single object instead of passing around a whole bunch of parameters. Also making these method parameters instead of constructor arguments means you only need one instance of each command.
Next, you would have a CommandFactory which contains the map of command names to command objects:
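class CommandFactory {
    std::map<UTF8String, Command *> handlers;
public:
    // Returns nullptr for an unknown name; handlers[name] would insert an empty entry.
    Command *getCommand(const UTF8String &name) const {
        std::map<UTF8String, Command *>::const_iterator it = handlers.find(name);
        return it != handlers.end() ? it->second : nullptr;
    }
};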
If you've done all this, the Worker becomes extremely thin and simply coordinates everything:
class Worker {
    Protocol *protocol;
    CommandFactory *commandFactory;
    Session *session;
public:
    void handleRequest() {
        Request *request = protocol->readRequest();
        Response response;
        Command *command = commandFactory->getCommand(request->getCommandName());
        if (command) {
            command->execute(request, &response, session);
        } else {
            // no command registered under that name
            response.success = false;
            response.info = "unknown or invalid instruction";
        }
        protocol->writeResponse(&response);
    }
};
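To make the pieces above concrete, here is a small self-contained sketch of the same idea. Request, Response and Session are simplified stand-ins with assumed fields, std::string replaces UTF8String, and CommandRegistry, LoginCommand and dispatch are illustrative names. The protocol I/O is left out:

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Simplified stand-ins for the question's types (field names are assumptions).
struct Request {
    std::string instruction;
    std::map<std::string, std::string> parameters;
};
struct Response { bool success = false; std::string info; };
struct Session { bool logged_in = false; };

// One class per protocol method; everything it needs arrives as a parameter.
class Command {
public:
    virtual ~Command() = default;
    virtual void execute(const Request& request, Response& response, Session& session) = 0;
};

class LoginCommand : public Command {
public:
    void execute(const Request& request, Response& response, Session& session) override {
        auto user = request.parameters.find("username");
        if (user == request.parameters.end() || user->second.empty()) {
            response.success = false;
            response.info = "missing username";
            return;
        }
        session.logged_in = true;
        response.success = true;
        response.info = "welcome " + user->second;
    }
};

// Owns one stateless instance per command name.
class CommandRegistry {
public:
    void add(const std::string& name, std::unique_ptr<Command> command) {
        assert(command != nullptr);
        commands_[name] = std::move(command);
    }
    // Returns nullptr when no command is registered under that name.
    Command* find(const std::string& name) const {
        auto it = commands_.find(name);
        return it == commands_.end() ? nullptr : it->second.get();
    }
private:
    std::map<std::string, std::unique_ptr<Command>> commands_;
};

// What the thin worker does for each request, minus reading and writing.
Response dispatch(const CommandRegistry& registry, const Request& request, Session& session) {
    Response response;
    if (Command* command = registry.find(request.instruction)) {
        command->execute(request, response, session);
    } else {
        response.info = "unknown or invalid instruction";
    }
    return response;
}
```

Because the commands keep no per-request state, a single registry could be shared by all worker threads while each worker passes in its own Session.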
If it were me I would probably use a hybrid solution of the two in your question.
Have a worker base class that can handle multiple related commands, and can allow your main "dispatch" class to probe for supported commands. For the glue, you would simply need to tell the dispatch class about each worker class.
class HandlerBase
{
public:
HandlerBase(HandlerDispatch & dispatch) : m_dispatch(dispatch) {
PopulateCommands();
}
virtual ~HandlerBase();
bool CommandSupported(UTF8String & cmdName);
virtual bool HandleCommand(UTF8String & cmdName, Request & req, Response & res);
virtual void PopulateCommands();
protected:
CommandHandlerMap m_CommandHandlers;
HandlerDispatch & m_dispatch;
};
class AuthenticationHandler : public HandlerBase
{
public:
    AuthenticationHandler(HandlerDispatch & dispatch) : HandlerBase(dispatch) {
        // A virtual call made from the HandlerBase constructor does not reach
        // this class's override, so register the commands here as well.
        PopulateCommands();
    }
    bool HandleCommand(UTF8String & cmdName, Request & req, Response & res) {
        CommandHandlerMap::const_iterator ihandler;
        if( (ihandler=m_CommandHandlers.find(req.instruction)) != m_CommandHandlers.end() ) {
            // call actual handler
            return (this->*(ihandler->second))(req,res);
        }
        // error case:
        res.success = false;
        res.info = "unknown or invalid instruction";
        return true;
    }
    void PopulateCommands() {
        // CommandHandlerMap is assumed to map names to
        // bool (AuthenticationHandler::*)(Request &, Response &)
        m_CommandHandlers["login"]=&AuthenticationHandler::Handle_LOGIN;
        m_CommandHandlers["logout"]=&AuthenticationHandler::Handle_LOGOUT;
    }
    bool Handle_LOGIN(Request & req, Response & res) {
        Session & session = m_dispatch.GetSessionForRequest(req);
        // ...
        return true;
    }
};
class HandlerDispatch
{
public:
HandlerDispatch();
virtual ~HandlerDispatch() {
// delete all handlers
}
void AddHandler(HandlerBase * pHandler);
bool HandleRequest() {
vector<HandlerBase *>::iterator i;
for ( i=m_handlers.begin() ; i < m_handlers.end(); i++ ) {
if ((*i)->CommandSupported(m_CurRequest.instruction)) {
return (*i)->HandleCommand(m_CurRequest.instruction,m_CurRequest,m_CurResponse);
}
}
// error case:
m_CurResponse.success = false;
m_CurResponse.info = "unknown or invalid instruction";
return true;
}
protected:
std::vector<HandlerBase*> m_handlers;
};
And then to glue it all together you would do something like this:
// Init
m_handlerDispatch.AddHandler(new AuthenticationHandler(m_handlerDispatch));
As for the transport (TCP) specific part, did you have a look at the ZMQ library that supports various distributed computing patterns via messaging sockets/queues? IMHO you should find an appropriate pattern that serves your needs in their Guide document.
For the implementation of the protocol messages I would personally favor Google Protocol Buffers, which work very well with C++; we are using them for a couple of projects now.
In the end you'll boil it down to dispatcher and handler implementations for specific requests and their parameters, plus the necessary return parameters. Google protobuf message extensions allow you to do this in a generic way.
EDIT:
To get a bit more concrete, using protobuf messages the main difference of the dispatcher model vs yours will be that you don't need to do the complete message parsing before dispatch, but you can register handlers that tell themselves if they can handle a particular message or not by the message's extensions. The (main) dispatcher class doesn't need to know about the concrete extensions to handle, but just ask the registered handler classes. You can easily extend this mechanism to have certain sub-dispatchers to cover deeper message category hierarchies.
Because the protobuf compiler can already see your messaging data model completely, you don't need any kind of reflection or dynamic class polymorphism tests to figure out the concrete message content. Your C++ code can statically ask for possible extensions of a message and won't compile if such doesn't exist.
I don't know how to explain this in a better way, or to show a concrete example how to improve your existing code with this approach. I'm afraid you already spent some efforts on the de-/serialization code of your message formats, that could have been avoided using google protobuf messages (or what kind of classes are Request and Response?).
The ZMQ library might help to implement your Session context to dispatch requests through the infrastructure.
Certainly you shouldn't end up in a single interface that handles all kinds of possible requests, but a number of interfaces that specialize on message categories (extension points).
I think this is an ideal case for a REST-like implementation. One other way could also be grouping the handler methods based on category/any-other-criteria to several worker classes.
If the protocol methods can only be grouped by type but methods of the same group do not have anything common in their implementation, possibly the only thing you can do to improve maintainability is distributing methods between different files, one file for a group.
But it is very likely that methods of the same group have some of the following common features:
There may be some data fields in the Worker class that are used by only one group of methods or by several (but not every) group. For example, if m_AuthHandle may be used only by user management and session management methods.
There may be some groups of input parameters, used by every method of some group.
There may be some common data, written to the response by every method of some group.
There may be some common methods, called by several methods of some group.
If any of these is true, there is a good reason to group these features into different classes. Not one class per command handler, but one class per event group. Or, if there are features common to several groups, a hierarchy of classes.
It may be convenient to group instances of all these group classes in one place:
class UserManagement: public IManagement {...};
class FileManagement: public IManagement {...};
class SessionManagement: public IManagement {...};
struct Handlers {
smartptr<IManagement> userManagement;
smartptr<IManagement> fileManagement;
smartptr<IManagement> sessionManagement;
...
Handlers():
userManagement(new UserManagement),
fileManagement(new FileManagement),
sessionManagement(new SessionManagement),
...
{}
};
Instead of new SomeClass, some template like make_unique may be used. Or, if "interchangeable protocol implementations" are needed, one of the possibilities is to use factories instead of some (or all) new SomeClass operators.
m_CommandHandlers.find() should be split into two map searches: one - to find appropriate handler in this structure, other (in the appropriate implementation of IManagement) - to find a member function pointer to the actual handler.
In addition to finding a member function pointer, HandleRequest method of any IManagement implementation may extract common parameters for its event group and pass them to event handlers (one by one if there are just several of them, or grouped in a structure if there are many).
An IManagement implementation may also contain a WriteCommonResponse method to simplify writing the response fields common to all of its event handlers.
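A sketch of the two lookups, under some assumptions: the first map goes from instruction name to the owning group object, the second map (inside the group) goes from instruction name to a member function pointer. Request and Response are reduced to the fields needed here, and Dispatch, HandleAdd and HandleUpdate are illustrative names:

```cpp
#include <cassert>
#include <map>
#include <string>

// Reduced stand-ins for the question's types.
struct Request { std::string instruction; };
struct Response { bool success = false; std::string info; };

class IManagement {
public:
    virtual ~IManagement() = default;
    // Returns false if this group has no handler for the instruction.
    virtual bool HandleRequest(const Request& req, Response& res) = 0;
};

class UserManagement : public IManagement {
public:
    UserManagement() {
        handlers_["user add"] = &UserManagement::HandleAdd;
        handlers_["user update"] = &UserManagement::HandleUpdate;
    }
    // Second lookup: instruction name -> member function of this group.
    bool HandleRequest(const Request& req, Response& res) override {
        auto it = handlers_.find(req.instruction);
        if (it == handlers_.end()) return false;
        (this->*(it->second))(req, res);
        WriteCommonResponse(res);  // fields shared by every handler of the group
        return true;
    }
private:
    typedef void (UserManagement::*Handler)(const Request&, Response&);
    void HandleAdd(const Request&, Response& res) { res.info = "added"; }
    void HandleUpdate(const Request&, Response& res) { res.info = "updated"; }
    void WriteCommonResponse(Response& res) { res.success = true; }
    std::map<std::string, Handler> handlers_;
};

// First lookup: instruction name -> group object that owns it.
bool Dispatch(const std::map<std::string, IManagement*>& owners,
              const Request& req, Response& res) {
    auto it = owners.find(req.instruction);
    if (it == owners.end()) return false;
    assert(it->second != nullptr);
    return it->second->HandleRequest(req, res);
}
```

The owners map would be filled once at startup from the Handlers structure, with one entry per instruction pointing at its group.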
The Command Pattern is your solution to both aspects of this problem.
Use it to implement your protocol handler with a generalised IProtocol interface (and/or abstract base class), with a different class specialised for each protocol.
Then implement your commands the same way, with an ICommand interface and each command method implemented in a separate class. You are nearly there with this. Split your existing methods into new specialised classes.
Wrap your requests and responses as Memento objects.