C++ design - Network packets and serialization - c++

I have, for my game, a Packet class, which represents a network packet and consists basically of an array of data and some pure virtual functions.
I would then like to have classes deriving from Packet, for example StatePacket, PauseRequestPacket, etc. Each of these subclasses would implement the virtual functions: Handle(), which would be called by the networking engine when one of these packets is received so that it can do its job, and several get/set functions that read and set fields in the array of data.
So I have two problems:
The (abstract) Packet class would need to be copyable and assignable, but without slicing, keeping all the fields of the derived class. It may even be possible that the derived class will have no extra fields, only function, which would work with the array on the base class. How can I achieve that?
When serializing, I would give each subclass a unique numeric ID, and then write it to the stream before the subclass's own serialization. But for deserialization, how would I map the ID that was read to the appropriate subclass in order to instantiate it?
If anyone wants any clarifications, just ask.
-- Thank you
Edit: I'm not quite happy with it, but that's what I managed:
Packet.h: http://pastebin.com/f512e52f1
Packet.cpp: http://pastebin.com/f5d535d19
PacketFactory.h: http://pastebin.com/f29b7d637
PacketFactory.cpp: http://pastebin.com/f689edd9b
PacketAcknowledge.h: http://pastebin.com/f50f13d6f
PacketAcknowledge.cpp: http://pastebin.com/f62d34eef
If someone has the time to look at it and suggest any improvements, I'd be thankful.
Yes, I'm aware of the factory pattern, but how would I code it to construct each class? A giant switch statement? That would also duplicate the ID for each class (once in the factory and once in the serializer), which I'd like to avoid.

For copying you need to write a clone function, since a constructor cannot be virtual:
virtual Packet * clone() const = 0;
which each Packet implementation implements like this:
virtual Packet * clone() const {
return new StatePacket(*this);
}
for example for StatePacket. Packet classes should be immutable. Once a packet is received, its data can either be copied out or thrown away, so an assignment operator is not required. Make the assignment operator private and don't define it, which will effectively forbid assigning packets.
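As a minimal sketch of the clone idea above, assuming C++14 and a hypothetical two-class hierarchy (the `data_` member, `id()` and `state()` are illustrative names, not taken from the question's code): the base class keeps its copy constructor protected so that only derived classes can copy the base part, and deletes assignment, which is the C++11 spelling of "private and not defined".

```cpp
#include <memory>
#include <vector>

// Hypothetical minimal hierarchy, for illustration only.
class Packet {
public:
    virtual ~Packet() = default;
    virtual std::unique_ptr<Packet> clone() const = 0;  // polymorphic copy
    virtual int id() const = 0;

    Packet& operator=(const Packet&) = delete;  // assignment is forbidden

protected:
    Packet() = default;
    Packet(const Packet&) = default;  // only derived classes copy the base part

    std::vector<unsigned char> data_;  // raw payload shared by all packet types
};

class StatePacket : public Packet {
public:
    explicit StatePacket(unsigned char state) { data_.push_back(state); }

    std::unique_ptr<Packet> clone() const override {
        // Copies the complete derived object, so nothing is sliced.
        return std::make_unique<StatePacket>(*this);
    }
    int id() const override { return 1; }
    unsigned char state() const { return data_.at(0); }
};
```

Because the copy constructor of `Packet` is protected, code outside the hierarchy cannot copy a packet through the base type by accident; it has to go through `clone()`.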
For de-serialization, you use the factory pattern: create a class which creates the right message type given the message id. For this, you can either use a switch statement over the known message IDs, or a map like this:
struct MessageFactory {
std::map<Packet::IdType, Packet *(*)()> map;
MessageFactory() {
map[StatePacket::Id] = &StatePacket::createInstance;
// ... all other
}
Packet * createInstance(Packet::IdType id) {
return map[id]();
}
} globalMessageFactory;
Of course, you should add checks, such as whether the id is really known. That's only the rough idea.
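A sketch of the same map-based factory with the unknown-id check filled in might look like this. It assumes C++14; `PacketFactory`, `create` and the concrete id value are hypothetical, and the lookup uses `find` because `operator[]` would silently insert a null entry for an id that was never registered.

```cpp
#include <cstdint>
#include <map>
#include <memory>

struct Packet {
    using IdType = std::uint8_t;
    virtual ~Packet() = default;
    virtual IdType id() const = 0;
};

struct StatePacket : Packet {
    enum : IdType { Id = 1 };  // the id lives in exactly one place
    static std::unique_ptr<Packet> createInstance() {
        return std::make_unique<StatePacket>();
    }
    IdType id() const override { return Id; }
};

class PacketFactory {
public:
    using Creator = std::unique_ptr<Packet> (*)();

    PacketFactory() {
        creators_[StatePacket::Id] = &StatePacket::createInstance;
        // ... register all other packet types here
    }

    // Returns a null pointer for an unknown id instead of calling through
    // a missing map entry.
    std::unique_ptr<Packet> create(Packet::IdType id) const {
        auto it = creators_.find(id);
        if (it == creators_.end()) return nullptr;
        return it->second();
    }

private:
    std::map<Packet::IdType, Creator> creators_;
};
```

The serializer can write `packet.id()` and the factory reads `StatePacket::Id`, so the number itself is only written once, in the class.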

You need to look up the Factory Pattern.
The factory looks at the incoming data and creates an object of the correct class for you.

To have a Factory class that does not know about all the types ahead of time you need to provide a singleton where each class registers itself. I always get the syntax for defining static members of a template class wrong, so do not just cut&paste this:
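class Packet { ... };
typedef Packet* (*packet_creator)();
class Factory {
public:
// Static, so that it can be called without a Factory instance.
static bool add_type(int id, packet_creator creator) {
creators()[id] = creator; return true;
}
private:
// Function-local static: constructed before the first registration uses it.
static std::map<int, packet_creator>& creators() {
static std::map<int, packet_creator> map_;
return map_;
}
};
template<typename T>
class register_with_factory {
public:
static Packet * create() { return new T; }
static bool registered;
};
// Note: this member is only instantiated (and the type only registered)
// if `registered` is actually used somewhere in the program.
template<typename T>
bool register_with_factory<T>::registered = Factory::add_type(T::id(), create);
class MyPacket : private register_with_factory<MyPacket>, public Packet {
//... your stuff here...
public:
static int id() { return /* some number that you decide */; }
};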
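For comparison, here is a sketch of self-registration that does not depend on a template static member being instantiated. It registers each type through an ordinary namespace-scope variable placed next to the class. All names (`type_id`, `make_packet`, `pause_packet_registered`, the id 7) are hypothetical, and it assumes C++14 and that static initialization has run before the factory is used, as it has once `main` starts on mainstream compilers.

```cpp
#include <map>
#include <memory>

struct Packet {
    virtual ~Packet() = default;
    virtual int type_id() const = 0;
};

class Factory {
public:
    using Creator = std::unique_ptr<Packet> (*)();

    // Returns false if the id was already registered.
    static bool add_type(int id, Creator creator) {
        return registry().emplace(id, creator).second;
    }

    // Returns a null pointer for an unknown id.
    static std::unique_ptr<Packet> create(int id) {
        auto it = registry().find(id);
        if (it == registry().end()) return nullptr;
        return it->second();
    }

private:
    // Function-local static: constructed on first use, so registrations made
    // from other static initializers always find a live map.
    static std::map<int, Creator>& registry() {
        static std::map<int, Creator> instance;
        return instance;
    }
};

template <typename T>
std::unique_ptr<Packet> make_packet() { return std::make_unique<T>(); }

struct PausePacket : Packet {
    static int id() { return 7; }
    int type_id() const override { return id(); }
};

// One line per packet type, next to the class; the factory never changes.
static const bool pause_packet_registered =
    Factory::add_type(PausePacket::id(), &make_packet<PausePacket>);
```

The trade-off is one extra line per packet type, in exchange for registration that does not depend on template instantiation rules.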

Why do we, myself included, always make such simple problems so complicated?
Perhaps I'm off base here. But I have to wonder: Is this really the best design for your needs?
By and large, function-only inheritance can be better achieved through function/method pointers, or aggregation/delegation and the passing around of data objects, than through polymorphism.
Polymorphism is a very powerful and useful tool. But it's only one of many tools available to us.
It looks like each subclass of Packet will need its own Marshalling and Unmarshalling code. Perhaps inheriting Packet's Marshalling/Unmarshalling code? Perhaps extending it? All on top of handle() and whatever else is required.
That's a lot of code.
While substantially more kludgey, it might be shorter & faster to implement Packet's data as a struct/union attribute of the Packet class.
Marshalling and Unmarshalling would then be centralized.
Depending on your architecture, it could be as simple as write(&data). Assuming there are no big/little-endian issues between your client/server systems, and no padding issues. (E.g. sizeof(data) is the same on both systems.)
Write(&data)/read(&data) is a bug-prone technique. But it's often a very fast way to write the first draft. Later on, when time permits, you can replace it with individual per-attribute type-based Marshalling/Unmarshalling code.
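The per-attribute marshalling mentioned above could be sketched as follows. `StateData` and its fields are hypothetical; the point is only that each field is written byte by byte in a fixed (big-endian) order, so the wire format no longer depends on the host's endianness or struct padding.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical packet payload; field names are illustrative only.
struct StateData {
    std::uint32_t number;
    std::uint16_t type;
    std::int32_t  foo;
};

// Append values in big-endian (network) byte order.
inline void put_u32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<std::uint8_t>(v >> shift));
}
inline void put_u16(std::vector<std::uint8_t>& out, std::uint16_t v) {
    out.push_back(static_cast<std::uint8_t>(v >> 8));
    out.push_back(static_cast<std::uint8_t>(v));
}

inline std::uint32_t get_u32(const std::uint8_t* p) {
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
}
inline std::uint16_t get_u16(const std::uint8_t* p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

std::vector<std::uint8_t> marshal(const StateData& d) {
    std::vector<std::uint8_t> out;
    put_u32(out, d.number);
    put_u16(out, d.type);
    put_u32(out, static_cast<std::uint32_t>(d.foo));
    return out;  // always 10 bytes, whatever sizeof(StateData) is
}

bool unmarshal(const std::vector<std::uint8_t>& in, StateData& d) {
    if (in.size() != 10) return false;  // reject truncated or oversized input
    d.number = get_u32(&in[0]);
    d.type = get_u16(&in[4]);
    // Assumes two's complement for the signed round trip.
    d.foo = static_cast<std::int32_t>(get_u32(&in[6]));
    return true;
}
```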
Also: I've taken to storing data that's being sent/received as a struct. You can bitwise copy a struct with operator=(), which at times has been VERY helpful! Though perhaps not so much in this case.
Ultimately, you are going to have a switch statement somewhere on that subclass-id type. The factory technique (which is quite powerful and useful in its own right) does this switch for you, looking up the necessary clone() or copy() method/object.
OR you could do it yourself in Packet. You could just use something as simple as:
( getHandlerPointer( id ) ) ( this )
Another advantage to an approach this kludgey (function pointers), aside from the rapid development time, is that you don't need to constantly allocate and delete a new object for each packet. You can re-use a single packet object over and over again. Or a vector of packets if you wanted to queue them. (Mind you, I'd clear the Packet object before invoking read() again! Just to be safe...)
Depending on your game's network traffic density, allocation/deallocation could get expensive. Then again, premature optimization is the root of all evil. And you could always just roll your own new/delete operators. (Yet more coding overhead...)
What you lose (with function pointers) is the clean segregation of each packet type. Specifically the ability to add new packet types without altering pre-existing code/files.
Example code:
class Packet
{
public:
enum PACKET_TYPES
{
STATE_PACKET = 0,
PAUSE_REQUEST_PACKET,
MAXIMUM_PACKET_TYPES,
FIRST_PACKET_TYPE = STATE_PACKET
};
typedef bool ( * HandlerType ) ( const Packet & );
protected:
/* Note: Initialize handlers to NULL when declared! */
static HandlerType handlers [ MAXIMUM_PACKET_TYPES ];
static HandlerType getHandler( int thePacketType )
{ // My own assert macro...
UASSERT( thePacketType, >=, FIRST_PACKET_TYPE );
UASSERT( thePacketType, <, MAXIMUM_PACKET_TYPES );
UASSERT( handlers [ thePacketType ], !=, HandlerType(NULL) );
return handlers [ thePacketType ];
}
protected:
struct Data
{
// Common data to all packets.
int number;
int type;
union
{
struct
{
int foo;
} statePacket;
struct
{
int bar;
} pauseRequestPacket;
} u;
} data;
public:
//...
bool readFromSocket() { /* read(&data); */ return true; } // Unmarshal
bool writeToSocket() { /* write(&data); */ return true; } // Marshal
bool handle() { return ( getHandler( data.type ) ) ( * this ); }
}; /* class Packet */
PS: You might dig around with google and grab down cdecl/c++decl. They are very useful programs. Especially when playing around with function pointers.
E.g.:
c++decl> declare foo as function(int) returning pointer to function returning void
void (*foo(int ))()
c++decl> explain void (* getHandler( int ))( const int & );
declare getHandler as function (int) returning pointer to function (reference to const int) returning void

Related

Calling different template function specialisations based on a run-time value

This is related to a previous question in that it's part of the same system, but it's a different problem.
I'm working on an in-house messaging system, which is designed to send messages (structs) to consumers.
When a project wants to use the messaging system, it will define a set of messages (enum class), the data types (struct), and the relationship between these entities:
template <MessageType E> struct expected_type;
template <> struct expected_type<MessageType::TypeA> { using type = Foo; };
template <> struct expected_type<MessageType::TypeB> { using type = Bar; };
template <> struct expected_type<MessageType::TypeM> { using type = Foo; };
Note that different types of message may use the same data type.
The code for sending these messages is discussed in my previous question. There's a single templated method that can send any message, and maintains type safety using the template definitions above. It works quite nicely.
My question regards the message receiver class. There is a base class, which implements methods like these:
ReceiveMessageTypeA(const Foo & data) { /* Some default action */ };
ReceiveMessageTypeB(const Bar & data) { /* Some default action */ };
ReceiveMessageTypeM(const Foo & data) { /* Some default action */ };
It then implements a single message processing function, like this:
bool ProcessMessage(MessageType msgType, void * data) {
switch (msgType) {
case TypeA:
ReceiveMessageTypeA(data);
break;
case TypeB:
ReceiveMessageTypeB(data);
break;
// Repeat for all supported message types
default:
// error handling
break;
}
}
When a message receiver is required, this base class is extended, and the desired ReceiveMessageTypeX methods are implemented. If that particular receiver doesn't care about a message type, the corresponding function is left unimplemented, and the default from the base class is used instead.
Side note: ignore the fact that I'm passing a void * rather than the specific type. There's some more code in between to handle all that, but it's not a relevant detail.
The problem with the approach is the addition of a new message type. As well as having to define the enum, struct, and expected_type<> specialisation, the base class has to be modified to add a new ReceiveMessageTypeX default method, and the switch statement in the ProcessMessage function must be updated.
I'd like to avoid manually modifying the base class. Specifically, I'd like to use the information stored in expected_type to do the heavy lifting, and to avoid repetition.
Here's my attempted solution:
In the base class, define a method:
template <MessageType msgType>
bool Receive(expected_type<msgType>::type data) {
// Default implementation. Print "Message not supported", or something
}
Then, the subclasses can just implement the specialisations they care about:
template<> Receive<MessageType::TypeA>(const Foo & data) { /* Some processing */ }
// Don't care about TypeB
template<> Receive<MessageType::TypeM>(const Foo & data) { /* Some processing */ }
I think that solves part of the problem; I don't need to define new methods in the base class.
But I can't figure out how to get rid of the switch statement. I'd like to be able to do this:
bool ProcessMessage(MessageType msgType, void * data) {
Receive<msgType>(data);
}
This won't do, of course, because templates don't work like that.
Things I've thought of:
Generating the switch statement from the expected_type structure. I have no idea how to do this.
Maintaining some sort of map of function pointers, and calling the desired one. The problem is that I don't know how to initialise the map without repeating the data from expected_type, which I don't want to do.
Defining expected_type using a macro, and then playing preprocessor games to massage that data into a switch statement as well. This may be viable, but I try to avoid macros if possible.
So, in summary, I'd like to be able to call a different template specialisation based on a run-time value. This seems like a contradiction to me, but I'm hoping someone can point me in a useful direction. Even if that is informing me that this is not a good idea.
I can change expected_type if needed, as long as it doesn't break my Send method (see my other question).
You had the right idea with the expected_type and Receive templates; there's just one step left to get it all working.
First, we need some means of enumerating over MessageType:
enum class MessageType {
_FIRST = 0,
TypeA = _FIRST,
TypeB,
TypeM = 100,
_LAST
};
And then we can enumerate over MessageType at compile time and generate dispatch functions (using SFINAE to skip values not defined in expected_type):
// this overload works when expected_type has a specialization for this value of E
template<MessageType E> void processMessageHelper(MessageType msgType, void * data, typename expected_type<E>::type*) {
if (msgType == E) Receive<E>(*(typename expected_type<E>::type*)data);
else processMessageHelper<(MessageType)((int)E + 1)>(msgType, data, nullptr);
}
template<MessageType E> void processMessageHelper(MessageType msgType, void * data, bool) {
processMessageHelper<(MessageType)((int)E + 1)>(msgType, data, nullptr);
}
template<> void processMessageHelper<MessageType::_LAST>(MessageType msgType, void * data, bool) {
std::cout << "Unexpected message type\n";
}
void ProcessMessage(MessageType msgType, void * data) {
processMessageHelper<MessageType::_FIRST>(msgType, data, nullptr);
}
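A different way to get from a run-time MessageType to a compile-time one is a table of function pointers generated from the enum range. The sketch below is not the recursion shown above; it assumes C++14, a contiguous zero-based enum ending in a hypothetical `Count` sentinel, and an `expected_type` specialisation for every value. `Receiver`, `log`, `detail::call` and `detail::make_table` are illustrative names.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <utility>

enum class MessageType { TypeA, TypeB, TypeM, Count };  // contiguous (assumption)

struct Foo { int value; };
struct Bar { std::string text; };

template <MessageType E> struct expected_type;
template <> struct expected_type<MessageType::TypeA> { using type = Foo; };
template <> struct expected_type<MessageType::TypeB> { using type = Bar; };
template <> struct expected_type<MessageType::TypeM> { using type = Foo; };

struct Receiver {
    std::string log;  // records which handler ran, for demonstration only

    // Default handler, used for every message type without a specialisation.
    template <MessageType E>
    void Receive(const typename expected_type<E>::type&) { log += "default;"; }

    bool ProcessMessage(MessageType msgType, const void* data);
};

// Specialisation for the one message this receiver cares about.
template <>
inline void Receiver::Receive<MessageType::TypeA>(const Foo& f) {
    log += "A:" + std::to_string(f.value) + ";";
}

namespace detail {
using Thunk = void (*)(Receiver&, const void*);

// Casts the untyped pointer back to the type named by expected_type<E>.
template <MessageType E>
void call(Receiver& r, const void* data) {
    r.Receive<E>(*static_cast<const typename expected_type<E>::type*>(data));
}

// One table entry per enum value, generated by the compiler.
template <std::size_t... I>
constexpr std::array<Thunk, sizeof...(I)> make_table(std::index_sequence<I...>) {
    return {{ &call<static_cast<MessageType>(I)>... }};
}
}  // namespace detail

inline bool Receiver::ProcessMessage(MessageType msgType, const void* data) {
    constexpr auto count = static_cast<std::size_t>(MessageType::Count);
    static constexpr auto table =
        detail::make_table(std::make_index_sequence<count>{});
    const auto index = static_cast<std::size_t>(msgType);
    if (index >= count) return false;  // unknown message type
    table[index](*this, data);
    return true;
}
```

Adding a message then means adding an enumerator before `Count` and an `expected_type` specialisation; the table grows on its own.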
Your title says: "Calling different template function specialisations based on a run-time value"
That can only be done with some sort of manual switch statement, or with virtual functions.
On the one hand, it looks on the surface like you are doing object-oriented programming, but you don't yet have any virtual methods. If you find you are writing pseudo-objects everywhere, but you don't have any virtual functions, then it means you are not doing OOP. This is not a bad thing though. If you overuse OOP, then you might fail to appreciate the particular cases where it is useful and therefore it will just cause more confusion.
Simplify your code, and don't get distracted by OOP
You want the message type object to have some 'magic' associated with it, where its MessageType controls how it is dispatched. This means you need a virtual function.
struct message {
virtual void Receive() = 0;
};
struct message_type_A : public message {
virtual void Receive() {
....
}
};
This allows you, where appropriate, to pass these objects as message&, and to call msg.Receive()

Best practices to implement a Payload-containing class in C++?

I have a question about hierarchy, references and pointers... The question came to mind when I tried to do the following:
class packet {
public:
int address;
int command; /**< Command select the type of Payload that I must decode */
Payload p; /**< Generic payload, first question:
Payload p or Payload * p or Payload &p ?
I have a background in C, for this reason I prefer
Payload p but I know that this is not recommended for C++ */
private:
/** All getter and setter for attributes */
/** Second question: What is the best way to implement a getter
and setter for Payload?... I prefer something
similar to Java if this is possible */
}
Now imagine that I have a lot of types of Payload, all these payloads are children of the super class (generic) Payload.
I want to read the header and switch on the command. For example, if command is 1 I create a PayloadReset : Payload and fill in all of its attributes, then I want to set this payload on my packet (up-casting). In another part of the program I want to read my current packet, read the command field, and down-cast to the appropriate type depending on the command field.
When I tried to do this, I could do the up-casting without problems but the problem comes when I tried to do the downcasting to the specific Payload, in our example PayloadReset.
To answer the first question (which was buried inside the comments in your first code example):
Payload *p;
The first thing you need to learn as part of your transition from Java to C++ is what pointers are and how they work. What will be confusing to you, for some time, is the fact that all objects in Java are really pointers. You never needed to know that, when working with Java. But you must know that now, in order to understand C++. So, declaring a C++ class as
Payload p;
Is not the same thing as making a similar declaration in Java. There is no equivalent to this declaration in Java. In Java you really have a pointer here, and you have to instantiate it using the new keyword. That part Java originally aped from C++. This is the same process as C++, except that you have to explicitly declare it as a pointer.
Payload *p;
Then, somewhere else, using your example of a PayloadReset subclass:
class PayloadReset : public Payload { /* Class declaration */ };
PayloadReset *r = new PayloadReset( /* Constructor argument */ );
p=r;
And the second thing you need to learn as part of your transition from Java to C++ is when, and how, to delete all instantiated objects. You don't have Java's garbage collector here. This becomes your job, now.
Tagging onto Sam's answer.
Before you go any further, learn the difference between stack and heap allocation. In the example you posted, you're storing your Payload p; object by value, which implies that the size of the object is known at this point and exactly that much space is reserved for it. If you assigned a derived object to p, it would not work as intended, because the derived object will likely be of a different size and only its Payload part would be copied (slicing). This is why you instead declare a pointer to the object (8 bytes on a 64-bit architecture, 4 bytes on 32-bit), and then, when you know which type of derived object you want to allocate, you do it using the new operator, as such:
Payload *p;
p = new PayloadReset(...);
The above method requires manually managing memory, i.e. calling delete on the newly allocated pointer. As of C++11, the recommendation is to use smart pointers from the <memory> header. std::shared_ptr is a reference-counted pointer that automatically calls delete for you when the last owner goes away.
std::shared_ptr<Payload> p;
p = std::make_shared<PayloadReset>(...);
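To tie this back to the down-casting part of the question, here is a sketch that uses std::unique_ptr instead of shared_ptr (on the assumption that the packet is the only owner of its payload) and a checked downcast via dynamic_cast. The member names, `payloadAs`, and the `PayloadEcho` type are hypothetical; C++14 is assumed for std::make_unique.

```cpp
#include <memory>
#include <utility>

struct Payload {
    // A virtual destructor is needed both for deleting through the base
    // pointer and for dynamic_cast to work on this hierarchy.
    virtual ~Payload() = default;
};

struct PayloadReset : Payload {
    explicit PayloadReset(int delay) : delayMs(delay) {}
    int delayMs;
};

struct PayloadEcho : Payload {};

class Packet {
public:
    Packet(int address, int command, std::unique_ptr<Payload> payload)
        : address_(address), command_(command), payload_(std::move(payload)) {}

    int address() const { return address_; }
    int command() const { return command_; }

    // Checked downcast: returns nullptr when the payload has another type.
    template <typename T>
    const T* payloadAs() const {
        return dynamic_cast<const T*>(payload_.get());
    }

private:
    int address_;
    int command_;
    std::unique_ptr<Payload> payload_;  // deleted automatically with the packet
};
```

A caller would typically switch on `command()` and then ask for the matching type, treating a null result as a malformed packet.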
Your question is somewhat related to Java syntax, but mostly about Object Oriented Programming.
First of all, you should take a moment to get familiar with Java naming conventions. There are commonly used recommendations that you can find all over the web. Here is one example of Java Naming Conventions. I brought this up because single-letter variable names are generally not a good idea, and having descriptive variable names pays dividends as the program grows in size, especially if there is more than one person on a team. So, instead of Payload p use Payload payload.
Secondly, in OO (Object Oriented), it is best to always keep your Class instance variables private, not public. Give access to these variables only if necessary and shield access to them by providing public methods. So, in your example of class Packet, your public/private is backwards. Your class should look more like:
public class Packet{
//private fields
private int address;
private int command;
private Payload payload;
//Maybe provide a nice constructor to take in expected
//properties on instantiation
public Packet(Payload pay){
this.payload = pay;
}
//public methods - as needed
public Payload getPayload(){
return this.payload;
}
public void setAddress(int addy){
this.address = addy;
}
public int getCommand(){
return this.command;
}
}
Also, to answer more of your question about the naming of Payload: as I said earlier, use descriptive names. Java does not have pointer or reference declarators like C and C++, and it generally handles memory management for you, so the & is not required or supported.
Your last question/topic is really again about OO and class hierarchy.
It seems that Payload would be a generic base class and you may have multiple, specific 'Payload types', like ResetPayload. If that is the case, you would then define Payload and create the ResetPayload class that extends Payload. I'm not sure exactly what you are trying to do, but think of classes/objects as nouns and methods as verbs. Also think about the 'is-a' and 'has-a' concepts. From what I see, maybe every Payload 'has-a' command and an address. Also, maybe each Payload also has multiple Packets, whatever. Just as an example, you would then define your Payload class like this:
public class Payload{
private int address;
private int command;
private List<Packet> packets = new ArrayList<>();
public Payload(int addy, int comm){
this.address = addy;
this.command = comm;
}
public void addPacket(Packet p){
packets.add(p);
}
public List<Packet> getPackets(){
return this.packets;
}
public int getCommand(){
return this.command;
}
public int getAddress(){
return this.address;
}
}
Then if you had a type of Payload that is more specific, like Reset, you would create the class, extend Payload, and provide the additional properties/operations specific to this type, something like this:
public class ResetPayload extends Payload{
public ResetPayload(int addy, int comm){
super(addy, comm);
}
public void reset(){
//Do stuff here to reset the payload
}
}
Hopefully, that answers your questions and moves you along further. Good luck.
Here is my take on the general problem, it extends the tagged union idea. Advantages are 1.) no inheritance/dynamic_cast 2.) no shared ptr 3.) POD 4.) rtti is used to generate unique tags:
class msg; // forward declaration, needed by the alias below
using cleanup_fun_t = void(*)(msg*);
class msg
{
public:
template<typename T, typename... Args>
static msg make(Args&&... args);
private:
std::type_index tag_;
mutable std::atomic<cleanup_fun_t> del_fn_; // hell is waiting for me,
uint64_t meta_;
uint64_t data_;
};
Please fill in all the nice member functions. This class is move only. You are creating messages with payload by the static member function make:
template<typename T, typename... Args>
msg msg::make(Args&&... args)
{
msg m;
m.tag_ = typeid(T);
m.del_fn_ = nullptr;
if (!(std::is_empty<T>::value))
{
auto ptr = std::make_unique<T>(std::forward<Args>(args)...);
m.data_ = (uint64_t)ptr.release();
m.del_fn_ = &details::cleanup_t<T>::fun; // deleter template not shown
}
return m;
}
// creation:
msg m = msg::make<Payload>(params passed to payload constructor);
// using
if (m.tag() == typeid(Payload))
{
Payload* ptr = (Payload*)m.data;
ptr-> ...
}
Just check the tag if it contains your expected data (type) and cast the data to a pointer type.
Disclaimer: It is not the complete class. Some access member function are missing here.
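If C++17 is available, the standard library's std::variant covers the plain tagged-union case without any hand-written tag or deleter. This is a different technique from the class above (the set of payload types must be listed up front), shown only as a point of comparison; `PayloadReset`, `PayloadText` and `describe` are hypothetical names.

```cpp
#include <string>
#include <type_traits>
#include <variant>

struct PayloadReset { int delayMs; };
struct PayloadText  { std::string text; };

// std::monostate stands for "no payload".
using Msg = std::variant<std::monostate, PayloadReset, PayloadText>;

// Dispatches on the alternative currently stored in the variant.
std::string describe(const Msg& m) {
    return std::visit([](const auto& payload) -> std::string {
        using T = std::decay_t<decltype(payload)>;
        if constexpr (std::is_same_v<T, PayloadReset>)
            return "reset after " + std::to_string(payload.delayMs) + " ms";
        else if constexpr (std::is_same_v<T, PayloadText>)
            return "text: " + payload.text;
        else
            return "empty";
    }, m);
}
```

Checking the tag and casting becomes `std::get_if<PayloadReset>(&m)`, which returns a null pointer when the message holds something else.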

Flexible Data Messaging in a component oriented system

I'm creating a Component orientated system for a small game I'm developing. The basic structure is as follows: Every object in the game is composed of a "GameEntity"; a container holding a vector of pointers to items in the "Component" class.
Components and entities communicate with one another by calling the send method in a component's parent GameEntity class. The send method is a template which has two parameters, a Command (which is an enum which includes instructions such as STEP_TIME and the like), and a data parameter of generic type 'T'. The send function loops through the Component* vector and calls each's component's receive message, which due to the template use conveniently calls the overloaded receive method which corresponds to data type T.
Where the problem comes in however (or rather the inconvenience), is that the Component class is abstract (it has pure virtual functions) and will always be extended. Because of the practical limitation of not allowing template functions to be virtual, I would have to declare a virtual receive function in the header for each and every data type which could conceivably be used by a component. This is not very flexible nor extensible, and moreover, at least to me, seems to be a violation of the OO programming ethos of not duplicating code unnecessarily.
So my question is: how can I modify the code stubs provided below to make my component-oriented object structure as flexible as possible without using a method which violates best coding practices?
Here are the pertinent header stubs of each class and an example of how an extended component class might be used, to provide some context for my problem:
Game Entity class:
class Component;
class GameEntity
{
public:
GameEntity(string entityName, int entityID, int layer);
~GameEntity(void){};
//Adds a pointer to a component to the components vector.
void addComponent (Component* component);
void removeComponent(Component*);
//A template to allow values of any type to be passed to components
template<typename T>
void send(Component::Command command,T value){
//Iterates through the vector, calling the receive method for each component
for(std::vector<Component*>::iterator it =components.begin(); it!=components.end();it++){
(*it)->receive(command,value);
}
}
private:
vector <Component*> components;
};
Component Class:
#include "GameEntity.h"
class Component
{
public:
static enum Command{STEP_TIME, TOGGLE_ANTI_ALIAS, REPLACE_SPRITE};
Component(GameEntity* parent)
{this->compParent=parent;};
virtual ~Component (void){};
GameEntity* parent(){
return compParent;
}
void setParent(GameEntity* parent){
this->compParent=parent;
}
virtual void receive(Command command,int value)=0;
virtual void receive(Command command,string value)=0;
virtual void receive(Command command,double value)=0;
virtual void receive(Command command,Sprite value)=0;
//ETC. For each and every data type
private:
GameEntity* compParent;
};
A possible extension of the Component class:
#include "Sprite.h"
#include "Component.h"
class GraphicsComponent: public Component{
public:
GraphicsComponent(Sprite sprite, string name, GameEntity* parent);
virtual void receive(Command command, Sprite value){
switch(command){
case REPLACE_SPRITE: this->sprite=value; break;
}
}
private:
Sprite sprite;
};
Should I use a void pointer and cast it to the appropriate type? This might be feasible, as in most cases the type will be known from the command, but again it is not very flexible.
This is a perfect case for type erasure!
When template-based generic programming and object-oriented programming collide, you are left with a simple but hard-to-solve problem: how do I store, in a safe way, a variable where I don't care about the type but instead care about how I can use it? Generic programming tends to lead to an explosion of type information, whereas object-oriented programming depends on very specific types. What is a programmer to do?
In this case, the simplest solution is some sort of container which has a fixed size, can store any variable, and can SAFELY retrieve it / query its type. Luckily, boost has such a type: boost::any.
Now you only need one virtual function:
virtual void receive(Command command,boost::any val)=0;
Each component "knows" what it was sent, and can thus pull out the value, like so:
virtual void receive(Command command, boost::any val)
{
// I take an int!
int foo = any_cast<int>(val);
}
This will either successfully convert the int, or throw an exception. Don't like exceptions? Do a test first:
virtual void receive(Command command, boost::any val)
{
// Am I an int?
if( val.type() == typeid(int) )
{
int foo = any_cast<int>(val);
}
}
"But oh!" you might say, eager to dislike this solution, "I want to send more than one parameter!"
virtual void receive(Command command, boost::any val)
{
if( val.type() == typeid(std::tuple<double, char, std::string>) )
{
auto foo = any_cast< std::tuple<double, char, std::string> >(val);
}
}
"Well", you might ponder, "How do I allow arbitrary types to be passed, like if I want float one time and int another?" And to that, sir, you would be beaten, because that is a Bad Idea. Instead, bundle two entry points to the same internal object:
// Inside Object A
virtual void receive(Command command, boost::any val)
{
if( val.type() == typeid(std::tuple<double, char, std::string>) )
{
auto foo = any_cast< std::tuple<double, char, std::string> >(val);
this->internalObject->CallWithDoubleCharString(foo);
}
}
// Inside Object B
virtual void receive(Command command, boost::any val)
{
if( val.type() == typeid(std::tuple<float, customtype, std::string>) )
{
auto foo = any_cast< std::tuple<float, customtype, std::string> >(val);
this->internalObject->CallWithFloatAndStuff(foo);
}
}
And there you have it. By removing the pesky "interesting" part of the type using boost::any, we can now pass arguments safely and securely.
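Since C++17 the standard library ships std::any, which is modelled on boost::any, so the same single-virtual-function design can be sketched without Boost. The sketch below assumes C++17; `TimerComponent`, `elapsed` and `ignored` are hypothetical names. It uses the pointer form of any_cast, which returns a null pointer on a type mismatch instead of throwing.

```cpp
#include <any>
#include <string>

enum class Command { STEP_TIME, REPLACE_SPRITE };

struct Component {
    virtual ~Component() = default;
    // One virtual function covers every payload type.
    virtual void receive(Command command, const std::any& value) = 0;
};

struct TimerComponent : Component {
    double elapsed = 0.0;
    int ignored = 0;  // messages this component did not understand

    void receive(Command command, const std::any& value) override {
        if (command == Command::STEP_TIME) {
            // Null when `value` does not hold exactly a double.
            if (const double* dt = std::any_cast<double>(&value)) {
                elapsed += *dt;
                return;
            }
        }
        ++ignored;
    }
};
```

Note that the match is on the exact stored type: sending an `int` or a `float` where the component expects a `double` is treated as a mismatch.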
For more information on type erasure, and on the benefits to erasing the parts of the type on objects you don't need so they mesh better with generic programming, see this article
Another idea, if you love string manipulations, is this:
// Inside Object A
virtual void receive(Command command, unsigned int argc, std::string argv)
{
// Use boost::program_options or similar to break argv into argc arguments
// Left as exercise for the reader
}
This has a curious elegance to it; programs parse their parameters in the same way, so you could conceptualize the data messaging as running "sub-programs", which then opens up a whole host of metaphors and such that might lead to interesting optimizations, such as threading off parts of the data messaging, etc etc.
However, the cost is high: string operations can be quite expensive compared to a simple cast. Also note that boost::any does not come at zero cost; each any_cast requires RTTI lookups, compared to the zero lookups needed for just passing fixed amounts of parameters. Flexibility and indirection require costs; in this case, it is more than worth it however.
If you wish to avoid any such costs at all, there IS one possibility that gets the necessary flexibility as well as no dependencies, and perhaps even a more palatable syntax. But while it is a standard feature, it can be quite unsafe:
// Inside Object A
virtual void receive(Command command, unsigned int argc, ...)
{
va_list args;
va_start ( args, argc );
your_type var = va_arg ( args, your_type );
// etc
va_end( args );
}
The variable argument feature, used in printf for example, allows you to pass arbitrarily many arguments; obviously, you will need to tell the callee how many arguments were passed, so that's provided via argc. Keep in mind, however, that the callee has no way to tell if the correct parameters were passed; it will happily take whatever you give it and interpret it as if it were correct. So, if you accidentally pass the wrong information, there will be no compile-time support to help you figure out what went wrong. Garbage in, garbage out.
Also, there are a host of things to remember regarding va_list, such as that all floats are promoted to double, and that class types with non-trivial constructors or destructors are only conditionally supported; but if your code is correct and precise, there will be no problems and you will have efficiency, lack of dependencies, and ease of use. I would recommend, for most uses, wrapping the va_list and such into a macro:
#define GET_DATAMESSAGE_ONE(ret, type) \
do { va_list args; va_start(args,argc); ret = va_arg(args,type); va_end(args); } \
while(0)
And then a version for two args, then one for three. Sadly, a template or inline solution can't be used here, but most data packets will not have more than 1-5 parameters, and most of them will be primitives (almost certainly, though your use case may be different), so designing a few ugly macros to help your users out will deal largely with the unsafety aspect.
I do not recommend this tactic, but it may very well be the fastest and easiest tactic on some platforms, such as ones that do not allow even compile-time dependencies, or embedded systems where virtual calls may be disallowed.
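A small free-standing sketch of the mechanism (outside any class, with hypothetical function names) shows the two rules that matter most in practice: the callee must be told the count, and a float has to be read back as a double.

```cpp
#include <cstdarg>

// Sums `argc` int arguments. The callee trusts the caller completely:
// passing fewer arguments, or arguments of another type, is undefined
// behaviour that the compiler will not diagnose.
int sum_ints(unsigned int argc, ...) {
    va_list args;
    va_start(args, argc);
    int total = 0;
    for (unsigned int i = 0; i < argc; ++i) {
        total += va_arg(args, int);
    }
    va_end(args);  // every va_start needs a matching va_end
    return total;
}

// float arguments are promoted to double when passed through "...",
// so they must be read back as double.
double first_double(unsigned int argc, ...) {
    va_list args;
    va_start(args, argc);
    double value = argc > 0 ? va_arg(args, double) : 0.0;
    va_end(args);
    return value;
}
```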

Subdata (substring-like?) of a shared_ptr

I have a data buffer stored in a shared_ptr<void>.
This buffer is organized in several encapsulated layers so that I end up with:
-----------------------------------...
- Header 1 | Header 2 | Data
-----------------------------------...
(Actually it's an Ethernet packet where I decapsulate the layers one after the other).
Once I read Header 1, I would like to pass the rest of the packet to the next layer for reading, so I would like to create a pointer to :
-----------------------...
- Header 2 | Data
-----------------------...
It would be very easy with a raw pointer, as it would just be a matter of pointer arithmetic. But how can I achieve that with a shared_ptr (I use boost::shared_ptr)? My constraints:
I cannot create a new shared_ptr to "first shared_ptr.get() + offset" because it makes no sense to get the ownership to just Header 2 + Data (and delete would crash eventually)
I do not want to copy the data because it would be silly
I want the ownership on the whole buffer to be shared between the two objects (ie. as long as the parent object or the one which requires only Header 2 needs the data, the data should not be deleted).
I could wrap that up in a structure like boost::tuple<shared_ptr<void>, int /*offset*/, int /*length*/> but I wonder if there is a more convenient / elegant way to achieve that result.
Thanks,
I would recommend encapsulating each layer in a class that knows how to deal with the data as though it were that layer. Think of each one as a view into your buffer. Here is a starting point to get you thinking.
class Layer1{
public:
Layer1(shared_ptr<void> buffer) : buffer_(buffer) { }
/* All the functions you need for treating your buffer as a Layer 1 type */
void DoSomething() {}
private:
shared_ptr<void> buffer_;
};
class Layer2{
public:
Layer2(shared_ptr<void> buffer) : buffer_(buffer) { }
/* All the functions you need for treating your buffer as a Layer 2 type */
void DoSomethingElse() {}
private:
shared_ptr<void> buffer_;
};
And how to use it:
shared_ptr<void> buff = getBuff(); //< Do what you need to get the raw buffer.
// I show these together, but chances are, sections of your code will only need
// to think about the data as though it belongs to one layer or the other.
Layer1 l1(buff);
Layer2 l2(buff);
l1.DoSomething();
l2.DoSomethingElse();
Laying things out this way allows you to write functions that operate solely on that layer even though they internally represent the same data.
But, this is by no means perfect.
Perhaps Layer2 should be able to call Layer1's methods. For that you would want inheritance as well. I don't know enough about your design to say whether that would be helpful. Other room for improvement is replacing the shared_ptr<void> with a class that has helpful methods for dealing with the buffer.
Can you just use a simple wrapper?
Something like this, maybe?
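class HeaderHolder : protected shared_ptr<void> {
public:
// Constructor and blah blah
void* operator* () const {
// shared_ptr<void> has no operator*, and void* arithmetic is not allowed,
// so go through get() and a byte pointer instead.
return static_cast<unsigned char*>(get()) + offset;
}
private:
size_t offset; // bytes to skip (a_certain_length, e.g. the size of Header 1); set by the constructor
};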
By the way, I just used a simple wrapper that I reproduce here if someone ever stumbles on the question.
class DataWrapper {
public:
DataWrapper (shared_ptr<void> pData, size_t offset, size_t length) : mpData(pData), mOffset(offset), mLength(length) {}
void* GetData() {return (unsigned char*)mpData.get() + mOffset;}
// same with const...
void SkipData (size_t skipSize) { mOffset += skipSize; mLength -= skipSize; }
size_t GetLength() const {return mLength;}
// Then you can add operator+, +=, (void*), -, -=
// if you need pointer-like semantics.
// Also a "memcpy" member function to copy just this buffer may be useful
// and other helper functions if you need
private:
shared_ptr<void> mpData;
size_t mOffset, mLength;
};
Just be careful when you use GetData: be sure that the buffer will not be freed while you use the unsafe void*. It is safe to use the void* as long as you know the DataWrapper object is alive (because it holds a shared_ptr to the buffer, so it prevents it from being freed).
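An alternative that avoids handing out an unsafe void* is the aliasing constructor of shared_ptr, which creates a pointer to an interior address while sharing ownership of the whole buffer. The sketch below assumes std::shared_ptr from C++11 rather than the boost::shared_ptr used in the question (check whether your Boost version offers a constructor of the same form); sub_buffer is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Returns a pointer `offset` bytes into `whole`. The result shares ownership
// of the entire buffer, so the buffer is freed only when the last of the two
// pointers goes away. Bounds are the caller's responsibility.
std::shared_ptr<void> sub_buffer(const std::shared_ptr<void>& whole,
                                 std::size_t offset)
{
    assert(whole.get() != nullptr);  // an empty buffer has no interior
    unsigned char* inner = static_cast<unsigned char*>(whole.get()) + offset;
    // Aliasing constructor: ownership comes from `whole`, the address is `inner`.
    return std::shared_ptr<void>(whole, inner);
}
```

With this, the next layer can be given sub_buffer(packet, header1Size) and keep it for as long as it likes. The length is still not tracked, so a wrapper such as DataWrapper remains useful when the remaining size matters.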

Design pattern to refactor switch statement

I have something like the following in the header
class MsgBase
{
public:
unsigned int getMsgType() const { return type_; }
...
private:
enum Types { MSG_DERIVED_1, MSG_DERIVED_2, ... MSG_DERIVED_N };
unsigned int type_;
...
};
class MsgDerived1 : public MsgBase { ... };
class MsgDerived2 : public MsgBase { ... };
...
class MsgDerivedN : public MsgBase { ... };
and is used as
MsgBase msgHeader;
// peeks into the input stream to grab the
// base class that has the derived message type
// non-destructively
inputStream.deserializePeek( msgHeader );
unsigned int msgType = msgHeader.getMsgType();
MsgDerived1 msgDerived1;
MsgDerived2 msgDerived2;
...
MsgDerivedN msgDerivedN;
switch( msgType )
{
case MSG_DERIVED_1:
// fills out msgDerived1 from the inputStream
// destructively
inputStream.deserialize( msgDerived1 );
/* do MsgDerived1 processing */
break;
case MSG_DERIVED_2:
inputStream.deserialize( msgDerived2 );
/* do MsgDerived2 processing */
break;
...
case MSG_DERIVED_N:
inputStream.deserialize( msgDerivedN );
/* do MsgDerivedN processing */
break;
}
This seems like the type of situation which would be fairly common and well suited to refactoring. What would be the best way to apply design patterns (or basic C++ language feature redesign) to refactor this code?
I have read that the Command pattern is commonly used to refactor switch statements but that seems only applicable when choosing between algorithms to do a task. Is this a place where the factory or abstract factory pattern is applicable (I am not very familiar with either)? Double dispatch?
I've tried to leave out as much inconsequential context as possible but if I missed something important just let me know and I'll edit to include it. Also, I could not find anything similar but if this is a duplicate just redirect me to the appropriate SO question.
You could use a Factory Method pattern that creates the correct implementation of the base class (derived class) based on the value you peek from the stream.
The switch isn't all bad. It's one way to implement the factory pattern. It's easily testable, it makes it easy to understand the entire range of available objects, and it's good for coverage testing.
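As a rough sketch of the switch used as a factory: the switch is confined to one function, and the per-type processing moves into a virtual function. The text-based stream format, the process() method and the readMessage name are assumptions made for the example, not part of the original code.

```cpp
#include <cassert>
#include <istream>
#include <memory>
#include <sstream>

enum MsgType { MSG_DERIVED_1 = 1, MSG_DERIVED_2 = 2 };

struct MsgBase {
    virtual ~MsgBase() {}
    virtual void deserialize(std::istream& in) = 0;
    virtual int process() const = 0;  // stands in for the per-type processing
};

struct MsgDerived1 : MsgBase {
    int value = 0;
    void deserialize(std::istream& in) override { in >> value; }
    int process() const override { return value; }
};

struct MsgDerived2 : MsgBase {
    int a = 0, b = 0;
    void deserialize(std::istream& in) override { in >> a >> b; }
    int process() const override { return a + b; }
};

// The switch lives in exactly one place: it reads the type tag, then lets the
// chosen object read its own fields. Returns null for an unknown tag.
std::unique_ptr<MsgBase> readMessage(std::istream& in)
{
    int tag = 0;
    in >> tag;
    std::unique_ptr<MsgBase> msg;
    switch (tag) {
    case MSG_DERIVED_1: msg.reset(new MsgDerived1); break;
    case MSG_DERIVED_2: msg.reset(new MsgDerived2); break;
    default:            return msg;  // unknown tag: the caller decides what to do
    }
    msg->deserialize(in);
    return msg;
}

// Usage: callers never switch on the type themselves.
void example()
{
    std::istringstream in("2 10 32");
    std::unique_ptr<MsgBase> msg = readMessage(in);
    assert(msg && msg->process() == 42);
}
```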
Another technique is to build a mapping between your enum types and factories to make the specific objects from the data stream. This turns the compile-time switch into a run-time lookup. The mapping can be built at run-time, making it possible to add new types without recompiling everything.
// You'll have multiple Factories, all using this signature.
typedef MsgBase *(*Factory)(StreamType &);
// For example:
MsgBase *CreateDerived1(StreamType &inputStream) {
MsgDerived1 *ptr = new MsgDerived1;
inputStream.deserialize(*ptr);
return ptr;
}
std::map<Types, Factory> knownTypes;
knownTypes[MSG_DERIVED_1] = CreateDerived1;
// Then, given the type, you can instantiate the correct object:
MsgBase *object = (*knownTypes[type])(inputStream);
...
delete object;
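One caveat with the lookup above: knownTypes[type] inserts a null entry when the type is unknown, and calling through that null pointer is undefined behavior. A hedged sketch of the same idea using find() and unique_ptr follows; std::istream stands in for StreamType, and Create, instantiate and process are hypothetical names.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <memory>
#include <sstream>

struct MsgBase {
    virtual ~MsgBase() {}
    virtual void deserialize(std::istream& in) = 0;
    virtual int process() const = 0;
};

struct MsgDerived1 : MsgBase {
    int value = 0;
    void deserialize(std::istream& in) override { in >> value; }
    int process() const override { return value; }
};

// All factories share this signature.
typedef std::unique_ptr<MsgBase> (*Factory)(std::istream&);

// One template replaces the hand-written CreateDerivedN functions.
template <class T>
std::unique_ptr<MsgBase> Create(std::istream& in)
{
    std::unique_ptr<MsgBase> msg(new T);
    msg->deserialize(in);
    return msg;
}

// Looks the type up without modifying the map; returns null if it is unknown.
std::unique_ptr<MsgBase> instantiate(const std::map<int, Factory>& knownTypes,
                                     int type, std::istream& in)
{
    std::map<int, Factory>::const_iterator it = knownTypes.find(type);
    if (it == knownTypes.end())
        return std::unique_ptr<MsgBase>();
    return it->second(in);
}

void example()
{
    std::map<int, Factory> knownTypes;
    knownTypes[1] = &Create<MsgDerived1>;

    std::istringstream in("5");
    std::unique_ptr<MsgBase> msg = instantiate(knownTypes, 1, in);
    assert(msg && msg->process() == 5);
    assert(!instantiate(knownTypes, 7, in));  // unknown type yields null
    assert(knownTypes.size() == 1);           // and the map is left unchanged
}
```

Returning unique_ptr also removes the need for the manual delete.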
Pull Types and type_ out of MsgBase; they don't belong there.
If you want to get totally fancy, register all of your derived types with the factory along with the token (e.g. 'type') that the factory will use to know what to make. Then, the factory looks up that token on deserialize in its table, and creates the right message.
class DerivedMessage : public Message
{
public:
static Message* Create(Stream&);
bool Serialize(Stream&);
private:
static bool isRegistered;
};
// sure, turn this into a macro, use a singleton, whatever you like
bool DerivedMessage::isRegistered =
g_messageFactory.Register(Hash("DerivedMessage"), DerivedMessage::Create);
etc. The Create static method allocates a new DerivedMessage and deserializes it, the Serialize method writes the token (in this case, Hash("DerivedMessage")) and then serializes itself. One of them should probably test isRegistered so that it doesn't get dead stripped by the linker.
(Notably, this method doesn't require an enum or other "static list of everything that can ever exist". At this time I can't think of another method that doesn't require circular references to some degree.)
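A self-contained sketch of this self-registration scheme, under a few assumptions: the token is a plain string instead of Hash("DerivedMessage"), the stream is a std::istream carrying whitespace-separated text, and the factory is reached through a function-local static (MessageFactory::instance()) rather than the global g_messageFactory, so that registration does not depend on static initialization order. The process() method is hypothetical.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <utility>

struct Message {
    virtual ~Message() {}
    virtual int process() const = 0;
};

class MessageFactory {
public:
    typedef std::unique_ptr<Message> (*Creator)(std::istream&);

    // Constructed on first use, so it exists before any registration runs.
    static MessageFactory& instance()
    {
        static MessageFactory factory;
        return factory;
    }

    // Returns false if the token was already registered.
    bool Register(const std::string& token, Creator creator)
    {
        return creators_.insert(std::make_pair(token, creator)).second;
    }

    // Reads the token, then hands the stream to the registered creator.
    std::unique_ptr<Message> Create(std::istream& in) const
    {
        std::string token;
        in >> token;
        std::map<std::string, Creator>::const_iterator it = creators_.find(token);
        if (it == creators_.end())
            return std::unique_ptr<Message>();  // unknown token
        return it->second(in);
    }

private:
    std::map<std::string, Creator> creators_;
};

class DerivedMessage : public Message {
public:
    static std::unique_ptr<Message> Create(std::istream& in)
    {
        std::unique_ptr<DerivedMessage> msg(new DerivedMessage);
        in >> msg->value_;
        return std::unique_ptr<Message>(msg.release());
    }
    int process() const override { return value_; }
    static const bool isRegistered;

private:
    int value_ = 0;
};

// Runs during static initialization; no central list of types is needed.
const bool DerivedMessage::isRegistered =
    MessageFactory::instance().Register("DerivedMessage", &DerivedMessage::Create);

void example()
{
    assert(DerivedMessage::isRegistered);  // also keeps the registration referenced
    std::istringstream in("DerivedMessage 9");
    std::unique_ptr<Message> msg = MessageFactory::instance().Create(in);
    assert(msg && msg->process() == 9);
}
```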
It's generally a bad idea for a base class to have knowledge about derived classes, so a redesign is definitely in order. A factory pattern is probably what you want here as you already noted.