Object-oriented networking - C++

I've written a number of networking systems and have a good idea of how networking works. However, I always end up with a packet receive function which is a giant switch statement. This is beginning to get to me. I'd far rather have a nice, elegant object-oriented way to handle receiving packets, but every time I try to come up with a good solution I always end up coming up short.
For example, let's say you have a network server. It is simply waiting there for responses. A packet comes in and the server needs to validate the packet and then it needs to decide how to handle it.
At the moment I have been doing this by switching on the packet id in the header and then having a huge bunch of function calls that handle each packet type. With complicated networking systems this results in a monolithic switch statement and I really don't like handling it this way. One way I've considered is to use a map of handler classes. I can then pass the packet to the relevant class and handle the incoming data. The problem I have with this is that I need some way to "register" each packet handler with the map. This means, generally, I need to create a static copy of the class and then in the constructor register it with the central packet handler. While this works it really seems like an inelegant and fiddly way of handling it.
Edit: Equally, it would be ideal to have a nice system that works both ways, i.e. a class structure that easily handles sending the same packet types as well as receiving them (through different functions, obviously).
Can anyone point me towards a better way to handle incoming packets? Links and useful information are much appreciated!
Apologies if I haven't described my problem well as my inability to describe it well is also the reason I've never managed to come up with a solution.

About the way to handle the packet type: for me, a map is the right idea. However, I'd use a plain array (or a vector) instead of a map; it makes access time constant if you enumerate your packet types sequentially from 0.
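A minimal sketch of that idea (every name here is illustrative, including the id bound):

#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Illustrative packet view; adapt to your actual framing code.
struct Packet { uint16_t id; const uint8_t* body; std::size_t len; };

using Handler = std::function<void(const Packet&)>;

constexpr std::size_t kMaxPacketId = 64;            // assumed upper bound
std::vector<Handler> g_handlers(kMaxPacketId + 1);  // index == packet id

void dispatch(const Packet& p) {
    if (p.id <= kMaxPacketId && g_handlers[p.id])
        g_handlers[p.id](p);  // O(1) lookup, no switch statement
    // else: unknown id -- log and drop
}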
As to the class structure: there are libraries that already do this job; see Available Game network protocol definition languages and code generation. E.g. Google's Protocol Buffers looks promising. It generates a storage class with getters, setters, and serialization/deserialization routines for every message in the protocol description. The protocol description language is reasonably expressive.

A map of handler instances is pretty much the best way to handle it. Nothing inelegant about it.
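To take the sting out of the registration step the question complains about, a small self-registration helper reduces the boilerplate to one static object per handler. A sketch, with all names being mine rather than from any particular library:

#include <cstdint>
#include <map>
#include <memory>

struct Packet;  // your packet type

struct IPacketHandler {
    virtual ~IPacketHandler() = default;
    virtual void handle(const Packet& p) = 0;
};

// Central registry: packet id -> handler instance.
std::map<uint16_t, std::unique_ptr<IPacketHandler>>& registry() {
    static std::map<uint16_t, std::unique_ptr<IPacketHandler>> r;
    return r;
}

// One static Registrar per handler type performs the registration at start-up.
template <class H>
struct Registrar {
    explicit Registrar(uint16_t id) { registry()[id] = std::make_unique<H>(); }
};

// In the handler's own .cpp file:
// struct LoginHandler : IPacketHandler { void handle(const Packet&) override { /*...*/ } };
// static Registrar<LoginHandler> reg{0x01};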

In my experience, table-driven parsing is the most efficient method.
Although std::map is nice, I end up using static tables. A std::map cannot be statically initialized as a constant table; it must be populated at run-time. Tables (arrays of structures) can be declared as data and initialized at compile time. I have not encountered tables big enough for a linear search to be a bottleneck. Usually the table is small enough that the overhead of a binary search makes it slower than a linear search.
For high performance, I'll use the message data as an index into the table.
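A sketch of such a statically initialized table (handler names invented for illustration):

#include <cstdint>

struct Packet;  // your packet type
typedef void (*HandlerFn)(const Packet&);

void OnLogin(const Packet&)  { /* ... */ }
void OnLogout(const Packet&) { /* ... */ }

// Plain data, fully initialized at compile time -- no run-time registration.
struct Entry { uint16_t id; HandlerFn fn; };
static const Entry kTable[] = {
    { 0x01, OnLogin  },
    { 0x02, OnLogout },
};

void dispatch(uint16_t id, const Packet& p) {
    for (const Entry& e : kTable)  // linear search; fine for small tables
        if (e.id == id) { e.fn(p); return; }
    // unknown id: log / drop
}

When the ids are dense and range-checked, the loop collapses to kTable[id].fn(p); that is the "message data as an index" trick.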

When you are doing OOP, you try to represent everything as an object, right? So your protocol messages become objects too; you'll probably have a base class YourProtocolMessageBase which will encapsulate any message's behavior and from which you will inherit your polymorphically specialized messages. Then you just need a way to turn every message (i.e. every YourProtocolMessageBase instance) into a string of bytes, and a way to do the reverse. Such methods are called serialization techniques; some metaprogramming-based implementations exist.
Quick example in Python:
from socket import *
sock = socket(AF_INET6, SOCK_STREAM)
sock.bind(("localhost", 1234))
sock.listen(1)  # needed before accept()
rsock, addr = sock.accept()
Server blocks, fire up another instance for a client:
from socket import *
clientsock = socket(AF_INET6, SOCK_STREAM)
clientsock.connect(("localhost", 1234))
Now use Python's built-in serialization module, pickle; client:
import pickle
obj = {1: "test", 2: 138, 3: ("foo", "bar")}
clientsock.send(pickle.dumps(obj))
Server:
>>> import pickle
>>> r = pickle.loads(rsock.recv(1000))
>>> r
{1: 'test', 2: 138, 3: ('foo', 'bar')}
So, as you can see, I just sent a Python object over the local loopback. Isn't this OOP?
I think the only possible alternative to serializing is maintaining a bimap of IDs ⇔ classes. That really does look inevitable.

You want to keep using the same packet network protocol, but translate it into objects in your program, right?
There are several protocols that let you treat data as programming objects, but it seems you don't want to change the protocol, just the way it's treated in your application.
Do the packets come with something like a tag, metadata, an id, or a data type that lets you map them to a specific object class? If they do, you can create an array that stores the id and the matching class, and generate an object from that.

A more OO way to handle this is to build a state machine using the state pattern.
Handling incoming raw data is parsing, and state machines provide an elegant solution for parsing (you will have to choose between elegance and performance).
You have a data buffer to process; each state has a handle-buffer method that parses and processes its part of the buffer (where already possible) and sets the next state based on the content.
If you want to go for performance, you still can use a state machine, but leave out the OO part.
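A bare-bones sketch of the OO variant; the framing details are invented, only the shape matters:

#include <cstddef>
#include <cstdint>

class Parser;

// Each state consumes what it can, then tells the parser what comes next.
struct State {
    virtual ~State() = default;
    // Returns bytes consumed; 0 means "need more data".
    virtual std::size_t feed(Parser& p, const uint8_t* buf, std::size_t len) = 0;
};

class Parser {
public:
    explicit Parser(State& initial) : state_(&initial) {}
    void setState(State& s) { state_ = &s; }
    void feed(const uint8_t* buf, std::size_t len) {
        while (len > 0) {
            std::size_t used = state_->feed(*this, buf, len);
            if (used == 0) break;  // wait for the next read
            buf += used;
            len -= used;
        }
    }
private:
    State* state_;
};

// Concrete states would be e.g. a ReadHeader state (parses the id, picks the
// next state) and one ReadBody state per packet family.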

I would use the FlatBuffers and/or Cap'n Proto code generators.

I solved this problem as part of my B.Tech in network security and network programming, and I can assure you it's not one giant packet switch statement. The library is called cross platform networking; I modeled it around the OSI model and on emitting output as simple object serialization. The repository is here: https://bitbucket.org/ptroen/crossplatformnetwork/src/master/
There are countless protocols like NACK, HTTP, TCP, UDP, RTP, and multicast, and they are all invoked via C++ metatemplates. OK, that is the summarized answer; now let me dive a bit deeper and explain how you can solve this problem and why this library can help you, whether you design it yourself or use the library.
First, let's talk about design patterns in general. To organize this nicely, you first need some design patterns around it as a way to frame your problem. For my C++ templates I framed it initially around the OSI model (https://en.wikipedia.org/wiki/OSI_model#Layer_7:_Application_layer), down to the transport level (which becomes sockets at that point). To recap OSI:
Application layer: what it means to the end user, i.e. signals getting deserialized or serialized and passed down or up the networking stack
Presentation: data independence from the application and network stack
Session: dialogues between endpoints
Transport: transporting the packets
But here's the kicker: when you look at these closely, they aren't design patterns but more like namespaces around transporting data from A to B. So, for the end user, I designed cross platform network around the following standardized C++ metatemplate:
template <class TFlyWeightServerIncoming,  // a class representing the server's incoming payload. A flyweight is a design pattern that is a union of types, i.e. it puts things together; this is where you pack your incoming objects
          class TFlyWeightServerOutgoing,  // a class representing the server's outgoing payload of different types
          class TServerSession,            // a hook class representing how to translate the payload, in the form of a session-layer translation; the key is to stay true to separation of concerns (https://en.wikipedia.org/wiki/Separation_of_concerns)
          class TInitializationParameters> // a class representing initialization of the server (ports, etc.)
two examples: https://bitbucket.org/ptroen/crossplatformnetwork/src/master/OSI/Transport/TCP/TCPTransport.h
https://bitbucket.org/ptroen/crossplatformnetwork/src/master/OSI/Transport/HTTP/HTTPTransport.h
And each protocol can be invoked like this:
OSI::Transport::Interface::ITransportInitializationParameters init_parameters;
const size_t defaultTCPPort = 80;
init_parameters.ParseServerArgs(&(*argv), argc, defaultTCPPort, defaultTCPPort);
OSI::Transport::TCP::TCP_ServerTransport<
    SampleProtocol::IncomingPayload<OSI::Transport::Interface::ITransportInitializationParameters>,
    SampleProtocol::OutgoingPayload<OSI::Transport::Interface::ITransportInitializationParameters>,
    SampleProtocol::SampleProtocolServerSession<OSI::Transport::Interface::ITransportInitializationParameters>,
    OSI::Transport::Interface::ITransportInitializationParameters> tcpTransport(init_parameters);
tcpTransport.RunServer();
citation:
https://bitbucket.org/ptroen/crossplatformnetwork/src/master/OSI/Application/Stub/TCPServer/main.cc
I also have, in the code base under MVC, a full MVC implementation that builds on top of this, but let's get back to your question. You mentioned:
"At the moment I have been doing this by switching on the packet id in the header and then having a huge bunch of function calls that handle each packet type."
" With complicated networking systems this results in a monolithic switch statement and I really don't like handling it this way. One way I've considered is to use a map of handler classes. I can then pass the packet to the relevant class and handle the incoming data. The problem I have with this is that I need some way to "register" each packet handler with the map. This means, generally, I need to create a static copy of the class and then in the constructor register it with the central packet handler. While this works it really seems like an inelegant and fiddly way of handling it."
In cross platform network the approach to adding new types is as follows:
After you have defined the server type, you just need to write the incoming and outgoing types. The actual mechanism for handling them is embedded within the incoming object type. The methods within it are ToString(), FromString(), size() and max_size(). These deal with the security concerns of keeping the layers below the application layer secure. But since you're defining object handlers, you now need to write the translation code for the different object types. You'll need, at minimum, within this object:
1. A list of enumerated object types for the application layer. This could be as simple as numbering them. But for things like the session layer, have a look at session-layer concerns (for instance, RTP has things like jitter and dealing with imperfect connections, i.e. session concerns). You could also switch from an enumeration to a hash/map, but that's just another way of dealing with the problem of looking up the variable.
2. Serialize and deserialize methods for the object (for both incoming and outgoing types).
3. After you serialize or deserialize, the logic to dispatch it to the appropriate internal design pattern to handle the application layer. This could be a Builder, a Command, or a Strategy; it really depends on the use case. In cross platform network some concerns are delegated to the TServerSession layer and others to the incoming and outgoing classes; it just depends on the separation of concerns.
4. Dealing with performance concerns, i.e. not blocking (which becomes a bigger concern when you scale up concurrent users).
5. Dealing with security concerns (pen testing).
If you're curious you can review my API implementation; it's a single-threaded async Boost reactor implementation, and when you combine it with something like mimalloc (to override new/delete) you can get very good performance. I measured around 50k connections on a single thread easily.
But yeah, it's all about framing your server in good design patterns, separating the concerns and selecting a good model to represent the server design. I believe the OSI model is appropriate for that, which is why I built cross platform network to provide superior object-oriented networking.

Related

Don't break backwards compatibility of messages in a DDS system

I'm involved in a project which uses DDS as a protocol to communicate, and C++ as the language. As you may know, the exchanged messages are called Topics. Well, sometimes a team must change a topic definition; as a consequence, other software which depends on this topic stops working, and it is necessary to update the topic everywhere and recompile. So, my question is: do you know how not to break backwards compatibility? I have been searching and I have found Google Protocol Buffers, where they say this:
"You can add new fields to your message formats without breaking
backwards-compatibility; old binaries simply ignore the new field when
parsing. So if you have a communications protocol that uses protocol
buffers as its data format, you can extend your protocol without
having to worry about breaking existing code."
Any other idea?
Thanks in advance.
The OMG specification Extensible and Dynamic Topic Types for DDS (or DDS-XTypes for short) addresses your problem. Quoted from that specification:
The Type System supports type evolution so that it is possible to
“evolve the type” as described above and retain interoperability
between components that use different versions of the type
Not all DDS implementations currently support XTypes, so you might have to resort to a different solution. For example, you could include a version numbering scheme in your Topic name to avoid typing conflicts between different components. In order to make sure every component receives the right data it needs, you can create a service that has the responsibility to forward between different versions of the Topics as needed. This service has to be aware of the different versions of the Topics and should take care of filling default values and/or transform between their different types. Whether or not this is a viable solution depends, among other things, on your system requirements.
The use of a different type system like Protocol Buffers inside DDS is not recommended if it is just for the purpose of solving the type evolution problem. You would essentially be transporting your PB messages as data opaque to the DDS middleware. This means that you would also lose some of the nice tooling features like dynamic discovery and display of types, because the PB messages are not understood by the DDS middleware. Also, your applications would become more complex because they would be responsible for invoking the right PB de/serialization methods. It is easier to let DDS take care of all that.
Whichever route you take, it is recommended that you manage your data model evolution tightly. If you let just anybody add or remove some attributes to their liking, the situation will become difficult to maintain quickly.
The level of support for protocol buffers with DDS depends on how it's implemented. With PrismTech's Vortex, for instance, you don't lose content-awareness, dynamic discovery, or display of types, as both the middleware and its tools DO understand the PB messages. W.r.t. 'populating' your PB-based topic: that conforms to the PB standard and is generated transparently by the protoc compiler. So one could state that if you're familiar with protobuf (and perhaps not with OMG's IDL alternative), then you can really benefit from a proper integration of DDS and GPB that retains the global-dataspace advantages, in combination with a well-known/popular type system (GPB).
You could try wrapping the DDS layer in your own communication layer.
Define a set of DDS Types and DDS Profiles that covers all your needs.
Then each topic will be defined as one of those types, associated with one of those profiles.
For example you could have a type for strings, and a binary type.
If you use the same compiler for all your software, you can even safely memcopy a C struct to the binary type.
module atsevents {
    valuetype DataType {
        public string<128> mDataId;
        public boolean mCustomField1;
        public boolean mCustomField2;
    };
    valuetype StringDataType : DataType {
        public string<128> mDataKey; //#key
        public string<1024> mData;
    };
    valuetype BinaryDataType : DataType {
        public string<128> mDataKey; //#key
        public sequence<octet, 1024> mData;
        public unsigned long mTypeHash;
    };
};

Message receive for a C++ actor system

I am trying to implement message handling for actors in C++. The following Scala code is what I am trying to reproduce in C++:
def receive = {
  case Message1 => { /* logic code */ }
  case Message2 => { /* logic code */ }
}
Thus the idea is to create a set of handler functions for the various message types and a dispatch method to route each message to its appropriate handler. All messages will extend the base message type.
What would be the best approach to solving this problem:
Maintain a Map(Message_type, function_pointer); the dispatch method will check the map and call the appropriate method. This mapping, however, needs to be done manually in the Actor class.
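A rough sketch of what I mean by option 1, assuming all messages extend a common Message base (names are mine):

#include <functional>
#include <typeindex>
#include <unordered_map>

struct Message  { virtual ~Message() = default; };
struct Message1 : Message {};
struct Message2 : Message {};

class Actor {
public:
    template <class M>
    void on(std::function<void(const M&)> f) {
        handlers_[typeid(M)] = [f](const Message& m) {
            f(static_cast<const M&>(m));  // safe: key matched the dynamic type
        };
    }
    void dispatch(const Message& m) {
        auto it = handlers_.find(typeid(m));  // typeid of the dynamic type
        if (it != handlers_.end()) it->second(m);
        // else: unhandled message
    }
private:
    std::unordered_map<std::type_index,
                       std::function<void(const Message&)>> handlers_;
};

// actor.on<Message1>([](const Message1&) { /* logic code */ });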
I have read this library; it handles messages exactly as I want, but I can't understand how it does pattern matching on the lambda functions created on line 56.
I would appreciate any suggestion or reading links that could get me closer to the solution of this problem.
Since you've already mentioned CAF: why do you want to implement your own actor library instead of using CAF? If you are writing the lib as an exercise, I suggest starting with libcaf_core/caf/match_case.hpp, libcaf_core/caf/on.hpp and libcaf_core/caf/detail/try_match.hpp. This is the "core" of CAF's pattern matching facility. Be warned, you will be looking at a lot of metaprogramming code. The code is meant to be read by C++ experts. It's definitely not a good place to learn the techniques.
I can outline what's going on, though.
CAF stores patterns as a list of match_case objects in detail::behavior_impl
You never get a pointer to either one as a user
message_handler and behavior store a pointer to behavior_impl
Match cases can be generated differently:
Directly from callbacks/lambdas (trivial case)
Using a catch-all rule (via others >> ...)
Using the advanced on(...) >> ... notation
CAF can only match against tuples stored in message objects
"Emulates" (a subset of) reflections
Values and meta information (i.e. type information) are needed
For the matching itself, CAF simply iterates over the list of match_case objects
Try match the input with each case
Stop on first match (just like functional languages implement this)
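In heavily simplified form (this is not CAF's actual code, just the shape of it), that first-match loop looks like:

#include <functional>
#include <vector>

struct Message;  // opaque message type

// Each case tries to match the message and returns true on success.
using MatchCase = std::function<bool(const Message&)>;

struct Behavior {
    std::vector<MatchCase> cases;
    bool operator()(const Message& msg) const {
        for (const auto& c : cases)  // try each case in order...
            if (c(msg)) return true; // ...and stop on the first match
        return false;                // no case matched
    }
};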
We put a lot of effort into the pattern matching implementation to get a high-level and clean interface on the user's end. It's not easy, though. So, if you're doing this as an exercise, be warned that you need a lot of metaprogramming experience to understand the code.
In case you don't do this as an exercise, I would be interested why you think CAF does not cover your use case and maybe we can convince you to participate in its development rather than developing something else from scratch. ;)
Try sobjectizer (batteries included) or rotor (still experimental, but a quite lightweight actor-like solution).

Best way of serializing/deserializing a simple protocol in C++

I want to build a simple application protocol using Berkeley sockets in C++ on Linux. The transport layer should be UDP, and the protocol will contain the following two parts:
The first part:
A fixed part, the protocol header, with the following fields:
1. int HeaderType
2. int TransactionID
3. unsigned char Source[4]
4. unsigned char Destination[4]
5. int numberoftlvs
The second part
It will contain a variable number of TLVs; each TLV will contain the following fields:
1. int type
2. int length
3. unsigned char *data "Variable length"
My question is: when preparing the message to be sent over the wire, what's the best way to do serialization and deserialization so that it is portable across systems, both little-endian and big-endian?
Should I prepare a big buffer of unsigned char and start copying the fields into it one by one, and after that just call send?
If I follow that approach, how can I keep track of where in the buffer to copy each field? My guess would be to build, for each data type, a function that knows how many bytes to advance the pointer. Correct?
If someone can provide a well-explained example, it will be much appreciated.
Some ideas... in no particular order... and probably not all making sense together.
You can have a Buffer class. This class contains the raw memory pointer where you are composing your message, and it can contain counters or pointers to keep track of how much you have written, where you are writing, and how far you can go.
You would probably want one instance of the Buffer class per thread reading/writing. No more, because you don't want expensive buffers like this lying around; and bound to a specific thread because you don't want to share them without locking (and locking is expensive).
Probably you would like to reuse a Buffer from one message to the next, avoiding the cost of creating and destroying it.
You might want to explore the idea of a Decorator, a class that inherits from or contains each of your data classes. The idea of this decorator is to contain the methods to serialize and deserialize each of your data types.
One option for this is to make the Decorator a template and use class template specialization to provide the different formats.
Combining the Decorator methods and Buffer methods you should have all the control you need.
You can give the Buffer class magical templated methods that take an arbitrary object as parameter, automatically create a Decorator for it, and serialize it.
Conversely, deserializing should give you a Decorator that is convertible to the decorated type.
I'm sorry I don't have the time right now to give you a full blown example, but I hope the above ideas can get you started.
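To make the Buffer idea slightly more concrete, here is a minimal write-side sketch that always emits network byte order, so little- and big-endian hosts interoperate. Only uint32 is shown; other integer widths are analogous:

#include <arpa/inet.h>  // htonl (POSIX; use the Winsock equivalent on Windows)
#include <cstddef>
#include <cstdint>
#include <vector>

class Buffer {
public:
    void putU32(uint32_t v) {
        v = htonl(v);  // host order -> big-endian wire order
        const uint8_t* b = reinterpret_cast<const uint8_t*>(&v);
        data_.insert(data_.end(), b, b + sizeof v);
    }
    void putBytes(const uint8_t* p, std::size_t n) {
        data_.insert(data_.end(), p, p + n);
    }
    const uint8_t* data() const { return data_.data(); }
    std::size_t size() const { return data_.size(); }
private:
    std::vector<uint8_t> data_;  // grows as you write; tracks the position
};

// The header from the question then serializes as:
// buf.putU32(headerType); buf.putU32(transactionId);
// buf.putBytes(source, 4); buf.putBytes(destination, 4);
// buf.putU32(numberOfTlvs);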
As an example I shamelessly plug my own (de)serialization library, which packs into the msgpack v5 format:
flurry

Design Pattern for an EEPROM burner

I've built myself a basic EEPROM burner using a Teensy++ 2.0 as my PC bridge, and it's working great, but as I look to expand its compatibility, my code is getting rather hacky. I'm looking for some advice on a proper design for making this code expandable. I've taken a class in software design patterns, but it was a while ago, and I'm currently drawing a blank. Basically, here's the use case:
I have several methods, such as ReadByte(), WriteByte(), ProgramByte() (for FlashROMs that require a multi-byte write sequence in order to program), EraseChip(), etc. so basically I have an EEPROM pure virtual base class that gets implemented by concrete classes for each chip type that I want to support. The tricky part is determining which chip type object to generate. I'm currently using a pseudo-terminal front-end on the Teensy++ serial input, a basic command-line type interface with parameters, to send options like the chip type to the Teensy++. The question is, is there a design pattern (in C/C++), something like a Factory Pattern, that would take a string input of the chip type (because that's what I'm getting from the user), and return an EEPROM object of the correct derived type, without having to manually create some big switch statement or something ugly like that where I'd have to add the new chip to a list any time I create a new chip derived class? So something like:
const EEPROM& GetEEPROM(const std::string& id)
and if I pass it the string "am29f032b" it returns a reference to an AM29F032B object, or if I pass it the string "sst39sf040" it returns a reference to an SST39SF040 object, which I could then call the previously mentioned functions on, and it would work for the specified chip.
This code will be run on an AVR microcontroller, so I can't have anything with a huge OOP overhead, but the particular microcontroller I'm using does have a relatively large amount of program flash and work RAM, so it's not like I'm trying to operate in 2kb, but I do have to keep in mind limited resources.
What you're looking for is a Pluggable Factory. There's a good description here. It's attributed to John Vlissides (1 of the Gang of Four) and takes the Abstract factory pattern a few steps further. This pattern happens to also be the architectural foundation of COM.
The usual way of implementing one in C++ is to maintain a static registry of abstract factories. With judicious use of a few templates and static initialisers, you can wrap the whole lot up in a few lines of boiler-plate which you include in each concrete product (e.g. chip type).
The use of static initialisers allows a complete decoupling of concrete products from both the registry and the code wanting to create products, and has the possibility of implementing each as a plug-in.
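A condensed sketch of that boiler-plate (desktop-flavoured; on an AVR you would likely swap std::map and std::unique_ptr for a fixed-size static array):

#include <map>
#include <memory>
#include <string>

struct EEPROM {                    // the pure virtual base from the question
    virtual ~EEPROM() = default;
    virtual void EraseChip() = 0;  // ReadByte(), WriteByte(), ... likewise
};

using Maker = std::unique_ptr<EEPROM> (*)();

std::map<std::string, Maker>& chipRegistry() {
    static std::map<std::string, Maker> r;  // filled by static initialisers
    return r;
}

template <class Chip>
struct RegisterChip {
    explicit RegisterChip(const char* id) {
        chipRegistry()[id] = [] { return std::unique_ptr<EEPROM>(new Chip); };
    }
};

// In am29f032b.cpp -- the only per-chip boiler-plate:
// struct AM29F032B : EEPROM { void EraseChip() override { /* ... */ } };
// static RegisterChip<AM29F032B> reg("am29f032b");

std::unique_ptr<EEPROM> GetEEPROM(const std::string& id) {
    auto it = chipRegistry().find(id);
    if (it == chipRegistry().end()) return nullptr;
    return it->second();
}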
You could have a singleton factory manager that keeps a map of string->factory object. Then each factory class would have a global instance that registers itself with the manager on startup.
I try to avoid globals in general, and singletons in particular, but any other approach would require some form of explicit list which you're trying to avoid. You will have to be careful with timing issues (you can't really assume anything about the order in which the various factories will be created).

Design methods for multiple serialization targets/formats (not versions)

Whether as members, perhaps static ones, in separate namespaces, via friends, even via overloads, or any other C++ language feature...
When facing the problem of supporting multiple/varying formats, maybe protocols or any other kind of targets for your types, what was the most flexible and maintainable approach?
Were there any conventions or clear cut winners?
A brief note why a particular approach helped would be great.
Thanks.
[ ProtoBufs like suggestions should not cut it for an upvote, no matter how flexible that particular impl might be :) ]
Reading through the already posted responses, I can only agree with a middle-tier approach.
Basically, in your original problem you have 2 distinct hierarchies:
n classes
m protocols
The naive use of a Visitor pattern (as much as I like it) will only lead to n*m methods... which is really gross and a gateway to a maintenance nightmare. I suppose you already noticed this, otherwise you would not be asking!
The "obvious" target approach is to go for a n+m solution, where the 2 hierarchies are clearly separated. This of course introduces a middle-tier.
The idea is thus ObjectA -> MiddleTier -> Protocol1.
Basically, that's what Protocol Buffers does, though its problem is different (going from one language to another via a protocol).
It may be quite difficult to work out the middle-tier:
Performance issues: a "translation" phase adds some overhead; instead of one conversion you now have two. This can be mitigated, but you will have to work on it.
Compatibility issues: some protocols do not support recursion, for example (XML and JSON do, EDIFACT does not), so you may have to settle for a least-common-denominator approach or work out ways of emulating such behaviours.
Personally, I would go for "reimplementing" the JSON language (which is extremely simple) into a C++ hierarchy:
int
strings
lists
dictionaries
Applying the Composite pattern to combine them.
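A skeletal version of that hierarchy, with the Composite pattern doing the combining:

#include <map>
#include <memory>
#include <string>
#include <vector>

// Composite value hierarchy mirroring JSON's primitives.
struct Value {
    virtual ~Value() = default;
};
struct Int    : Value { int v = 0; };
struct String : Value { std::string v; };
struct List   : Value { std::vector<std::unique_ptr<Value>> items; };
struct Dict   : Value { std::map<std::string, std::unique_ptr<Value>> items; };

// A message for any protocol is then a tree of Values, and each protocol
// supplies one writer that walks the tree.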
Of course, that is the first step only. Now you have a framework, but you don't have your messages.
You should be able to specify a message in terms of primitives (and really think about versioning right now; it's too late once you need another version). Note that both approaches are valid:
In-code specification: your message is composed of primitives / other messages
Using a code generation script: this seems overkill here, but... for the sake of completeness I thought I would mention it, as I don't know how many messages you really need :)
On to the implementation:
Herb Sutter and Andrei Alexandrescu say in their C++ Coding Standards:
Prefer non-member non-friend functions
This applies really well to the MiddleTier -> Protocol step: create a Protocol1 class and then you can have:
Protocol1 myProtocol;
myProtocol << myMiddleTierMessage;
The use of operator<< for this kind of operation is well-known and very common. Furthermore, it gives you a very flexible approach: not all messages are required to implement all protocols.
The drawback is that it won't work for a dynamic choice of the output protocol. In that case, you might want a more flexible approach. After having tried various solutions, I settled on a Strategy pattern with compile-time registration.
The idea is that I use a Singleton which holds a number of Functor objects. Each object is registered (in this case) for a particular Message - Protocol combination. This works pretty well in this situation.
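One possible shape for that registry (the names are mine, not a definitive implementation):

#include <functional>
#include <map>
#include <string>
#include <typeindex>
#include <utility>

struct MiddleTierMessage { virtual ~MiddleTierMessage() = default; };

// Singleton holding one serialization functor per (message type, protocol).
class SerializerRegistry {
public:
    using Fn = std::function<std::string(const MiddleTierMessage&)>;
    static SerializerRegistry& instance() {
        static SerializerRegistry r;
        return r;
    }
    void add(std::type_index msg, const std::string& protocol, Fn fn) {
        table_[{msg, protocol}] = std::move(fn);
    }
    std::string serialize(const MiddleTierMessage& m, const std::string& protocol) {
        // at() throws when the combination was never registered, which makes
        // "message X does not support protocol Y" an explicit error.
        return table_.at({std::type_index(typeid(m)), protocol})(m);
    }
private:
    std::map<std::pair<std::type_index, std::string>, Fn> table_;
};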
Finally, for the BOM -> MiddleTier step, I would say that a particular instance of a Message should know how to build itself and should require the necessary objects as part of its constructor.
That of course only works if your messages are quite simple and can be built from a few combinations of objects. If not, you might want a relatively empty constructor and various setters, but the first approach is usually sufficient.
Putting it all together.
// 1 - Your BOM
class Foo {};
class Bar {};

// 2 - Message class: GetUp
class GetUp
{
    typedef enum {} State;
    State m_state;
};

// 3 - Protocol class: SuperProt
class SuperProt: public Protocol
{
};

// 4 - GetUp to SuperProt serializer
class GetUp2SuperProt: public Serializer
{
};

// 5 - Let's use it
Foo foo;
Bar bar;
SuperProt sp;
GetUp getUp = GetUp(foo, bar);
MyMessage2ProtBase.serialize(sp, getUp); // uses GetUp2SuperProt inside
If you need many output formats for many classes, I would try to make it an n + m problem instead of an n * m problem. The first way that comes to mind is to make the classes reducible to some kind of dictionary, and then have one method per output format to serialize those dictionaries.
Assuming you have full access to the classes that must be serialized, you need to add some form of reflection to the classes (probably including an abstract factory). There are two ways to do this: 1) a common base class or 2) a "traits" struct. Then you can write your encoders/decoders against the base class/traits struct.
Alternatively, you could require that the class provide a function to export itself to a container of boost::any and provide a constructor that takes a boost::any container as its only parameter. It should be simple to write a serialization function to many different formats if your source data is stored in a map of boost::any objects.
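For illustration, with std::any (the standardized descendant of boost::any, C++17) that could look like:

#include <any>
#include <map>
#include <string>

using AnyMap = std::map<std::string, std::any>;

struct Point {
    int x = 0, y = 0;
    Point() = default;
    // Export to / import from the format-neutral container.
    AnyMap toAny() const { return {{"x", x}, {"y", y}}; }
    explicit Point(const AnyMap& m)
        : x(std::any_cast<int>(m.at("x"))),
          y(std::any_cast<int>(m.at("y"))) {}
};

// Each target format then needs only one writer that walks an AnyMap,
// instead of one writer per class.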
That's two ways I might approach this. It would depend highly on the similarity of the classes to be serialized and on the diversity of target formats which of the above methods I would choose.
I used the OpenH323 library (famous enough among VoIP developers) for a long time, building a number of VoIP applications ranging from a low-density answering machine up to a 32xE1 border controller. Of course it went through major rework, so in those days I knew almost everything about this library.
Inside this library was a tool (ASNparser) which converted ASN.1 definitions into container classes. There was also a framework which allowed serialization/deserialization of these containers using higher-layer abstractions. Note that they are auto-generated. They supported several encoding protocols (BER, PER, XER) for ASN.1, with a very complex ASN syntax, and had good-enough performance.
What was nice?
Auto-generated container classes which were suitable enough for clean logic implementation.
I managed to rework the whole container layer under the ASN objects hierarchy almost without any modification to the upper layers.
It was relatively easy to refactor (for performance) the debug features of those ASN classes (I understand the authors didn't expect 20xE1 calls' signalling to be logged online).
What was not suitable?
A non-STL library with lazy copy underneath. Refactored for speed, but I would have liked STL compatibility there (at least at that time).
You can find the wiki page for the whole project here. You should focus only on the PTLib component, the ASN parser sources, and the ASN classes hierarchy / encoding / decoding policies hierarchy.
By the way, look at the "Bridge" design pattern; it might be useful.
Feel free to comment if something seems strange, insufficient, or not what you actually requested.