Extending a class and maintaining binary backward compatibility - c++

I'm trying to add new functionality to an existing library. I would need to add new data to a class hierarchy so that the root class would have accessors for it. Anyone should be able to get this data only sub-classes could set it (i.e. public getter and protected setter).
To maintain backward compatibility, I know I must not do any of the following (list only includes actions relevant to my problem):
Add or remove virtual functions
Add or remove member variables
Change type of existing member variable
Change signature of existing function
I can think of two ways to add this data to hierarchy: adding a new member variable to root class or adding pure virtual accessor functions (so that data could be stored in sub-classes). However, to maintain backward compatilibity I can not do either of these.
The library is using extensively pimpl idiom but unfortunately the root class I have to modify does not use this idiom. Sub-classes, however, use this idiom.
Now only solution that I can think of is simulating member variable with static hash-map. So I could create a static hash-map, store this new member to it, and implement static accessors for it. Something like this (in pseudo c++):
class NewData {...};
class BaseClass
{
protected:
static setNewData(BaseClass* instance, NewData* data)
{
m_mapNewData[instance] = data;
}
static NewData* getNewData(BaseClass* instance)
{
return m_mapNewData[instance];
}
private:
static HashMap<BaseClass*, NewData*> m_mapNewData;
};
class DerivedClass : public BaseClass
{
void doSomething()
{
BaseClass::setNewData(this, new NewData());
}
};
class Outside
{
void doActions(BaseClass* action)
{
NewData* data = BaseClass::getNewData(action);
...
}
};
Now, while this solution might work, I find it very ugly (of course I could also add non-static accessor functions but this wouldn't remove the ugliness).
Are there any other solutions?
Thank you.

You could use the decorator pattern. The decorator could expose the new data-elements, and no change to the existing classes would be needed. This works best if clients obtain their objects through factories, because then you can transparently add the decorators.

Finally, check binary compatibility using automated tools like abi-compliance-checker.

You can add exported functions (declspec import/export) without affecting binary compatibility (ensuring you do not remove any current functions and add your new functions at the end), but you cannot increase the size of the class by adding new data members.
The reason you cannot increase the size of the class is that for someone that compiled using the old size but uses the newly extended class would mean that the data member stored after your class in their object (and more if you add more than 1 word) would get trashed by the end of the new class.
e.g.
Old:
class CounterEngine {
public:
__declspec(dllexport) int getTotal();
private:
int iTotal; //4 bytes
};
New:
class CounterEngine {
public:
__declspec(dllexport) int getTotal();
__declspec(dllexport) int getMean();
private:
int iTotal; //4 bytes
int iMean; //4 bytes
};
A client then may have:
class ClientOfCounter {
public:
...
private:
CounterEngine iCounter;
int iBlah;
};
In memory, ClientOfCounter in the old framework will look something like this:
ClientOfCounter: iCounter[offset 0],
iBlah[offset 4 bytes]
That same code (not recompiled but using your new version would look like this)
ClientOfCounter: iCounter[offset 0],
iBlah[offset 4 bytes]
i.e. it doesn't know that iCounter is now 8 bytes rather than 4 bytes, so iBlah is actually trashed by the last 4 bytes of iCounter.
If you have a spare private data member, you can add a Body class to store any future data members.
class CounterEngine {
public:
__declspec(dllexport) int getTotal();
private:
int iTotal; //4 bytes
void* iSpare; //future
};
class CounterEngineBody {
private:
int iMean; //4 bytes
void* iSpare[4]; //save space for future
};
class CounterEngine {
public:
__declspec(dllexport) int getTotal();
__declspec(dllexport) int getMean() { return iBody->iMean; }
private:
int iTotal; //4 bytes
CounterEngineBody* iBody; //now used to extend class with 'body' object
};

If your library is open-source then you can request to add it to the upstream-tracker. It will automatically check all library releases for backward compatibility. So you can easily maintain your API.
EDIT: reports for qt4 library are here.

It is hard to maintain binary compatibility - it is much easier to maintain only interface compatibility.
I think that the only reasonable solution is to break supporting current library and redesign it to only export pure virtual interfaces for classes.
That interfaces could never be modified in the future, but you can add new interfaces.
In that interfaces you could only use primitive types like pointers and specified size integers or floats. You should not have interfaces with for example std::strings or other non-primitive types.
When returning pointers to data allocated in DLL, you need to provide a virtual method for deallocation, so that the application deallocates the data using DLL's delete.

Adding data members to the root will break binary compatibility (and force a rebuild, if that is your concern), but it won't break backward compatibility and neither will adding member functions (virtual or not). Adding new member functions is the obvious way to go.

Related

How to declare a class member that may be one of two classes

I am working with a project that is largely not of my creation, but am tasked with adding in some functionality to it. Currently, there is a device class that has a member variable that is responsible for storing information about a storage location, setup like this:
device.hpp
class device {
public:
// Stuff
private:
// Stuff
StorageInfo storage_info_;
// Even more stuff
}
StorageInfo.hpp
class StorageInfo {
public:
void initializeStorage();
void updateStorageInfo();
int popLocation();
int peakLocation();
uint16_t totalSize();
uint16_t remainingSize();
// More declarations here
private:
//Even more stuff here
}
I am tasked with implementing a different storage option so that the two can be switched between. The information functions that this new storage option has would be the same as the initial storage option, but the implementation in retrieving that information is vastly different. In order to keep things clean and make it easier to maintain this application for years to come, they really need to be defined in two different files. However, this creates an issue inside of device.cpp, and in every single other file that calls the StorageInfo class. If I create two separate member variables, one for each type of storage, then not only will I need to insert a million different ifelse statements, but I have the potential to run into initialization issues in the constructors. What I would instead like to do is have one member variable that has the potential to hold either storage option class. Something like this:
StorageInfoA.hpp
class StorageInfoA: StorageInfo {
public:
void initializeStorage();
void updateStorageInfo();
int popLocation();
int peakLocation();
uint16_t totalSize();
uint16_t remainingSize();
// More declarations here
private:
//Even more stuff here
}
StorageInfoB.hpp
class StorageInfoB: StorageInfo {
public:
void initializeStorage();
void updateStorageInfo();
int popLocation();
int peakLocation();
uint16_t totalSize();
uint16_t remainingSize();
// More declarations here
private:
//Even more stuff here
}
device.hpp
class device {
public:
// Stuff
private:
// Stuff
StorageInfo storage_info_;
// Even more stuff
}
device.cpp
//Somewhere in the constructor of device.cpp
if(save_to_cache){
storage_info_ = StorageInfoA();
} else {
storage_info_ = StorageInfoB();
}
// Then, these types of calls would return the correct implementation without further ifelse calls
storage_info_.updateStorageInfo();
However, I know that cpp absolutely hates anything with dynamic typing, so I don't really know how to implement this. Is this kind of thing even possible? If not, does anyone know of a similar way to implement this that does work with cpp's typing rules?
You are on the right track, but you have to learn how to use polymorphism. In your example, you need the following fixes:
In the base class, make all functions virtual, and add a virtual
destructor:
class StorageInfo {
public:
virtual ~StorageInfo(){}
virtual void initializeStorage();
//...
};
Make your inheritance public:
class StorageInfoA: public StorageInfo {
Instead of holding StorageInfo by value, hold it in a smart pointer:
class device {
private:
std::unique_ptr<StorageInfo> storage_info_;
};
device constructor will look like
//Somewhere in the constructor of device.cpp
if(save_to_cache){
storage_info_ = std::make_unique<StorageInfoA>();
} else {
storage_info_ = std::make_unique<StorageInfoB>();
}
Finally, you will use it like an ordinary pointer:
storage_info_->updateStorageInfo();

variable global const "macros" in C++ and optimal design patterns

I inherited some 10 year old code I have to complete. The code is in MFC (C++).
There's a .h file where the custom data structures are written and some const variables are in there as Globals. Some of these are used for MS Office file extensions, of type CString, and are declared as _T(".doc"), _T(".xls"), etc.
Obviously these are dated and need to be updated to recognize the Office 2007 and later extensions. My first brilliant idea was to use const_cast to change the constant if needed, but found out later that's a no-no and resulted in undefined behavior (sometimes it would switch back to .doc).
I then decided to create a struct and have two structs inherit from it. I created a void method in the base struct to make it abstract but otherwise it does nothing. Here's the code:
struct eOfficeExtensions{
const CString WORD_EXTENSION;
const CString EXCEL_EXTENSION;
const CString WORDPAD_EXTENSION;
const INT EXTENSION2007;
eOfficeExtensions(CString word, CString excel, CString wordpad, INT ver) :
WORD_EXTENSION(word), EXCEL_EXTENSION(excel), WORDPAD_EXTENSION(wordpad), EXTENSION2007(ver){}
//method to ensure base class is abstract
virtual void Interface() = 0;
};
struct eOfficeExtensions2003 : public eOfficeExtensions{
public:
eOfficeExtensions2003() : eOfficeExtensions(_T(".doc"), _T(".xls"), _T(".rtf"), 0){}
private:
virtual void Interface(){}
};
struct eOfficeExtensions2007OrLater : public eOfficeExtensions{
eOfficeExtensions2007OrLater() : eOfficeExtensions(_T(".docx"), _T(".xlsx"), _T(".rtf"), 1){}
private:
virtual void Interface(){}
};
This feels like a ridiculous amount of code for what should be a simple conditional definition. What would an experienced programmer do?
EDIT
These constants should only be set once and never changed. The version of MS Office installed is determined by scanning registry subkeys in a class that deals with memory management.
The constants are mainly used to create new files or search a directory for files with that extension, not for resolving conditional statements. The struct should also be instantiated once as a eOfficeExtensions* pointer to the relevant child struct.
Your inheritance tree essentially defines two different values for the base struct.
You don't need inheritance just to define those values, you only need two variables:
struct eOfficeExtensions{
const CString WORD_EXTENSION;
const CString EXCEL_EXTENSION;
const CString WORDPAD_EXTENSION;
const INT EXTENSION2007;
};
const eOfficeExtensions extensions2003{_T(".doc"), _T(".xls"), _T(".rtf"), 0};
const eOfficeExtensions extensions2007{_T(".docx"), _T(".xlsx"), _T(".rtf"), 1};
const eOfficeExtensions* extensions = 0;
// ... Later ...
if (office2007Installed)
extensions = &extensions2007;
else
extensions = &extensions2003;

What is the proper way to handle a large number of interface implementations?

For one of my current projects I have an interface defined for which I have a large number of implementations. You could think of it as a plugin interface with many plugins.
These "plugins" each handle a different message type in a network protocol.
So when I get a new message, I loop through a list of my plugins, see who can handle it, and call into them via the interface.
The issue I am struggling with is how to allocate, initialize, and "load" all the implementations into my array/vector/whatever.
Currently I am declaring all of the "plugins" in main(), then calling an "plugin_manager.add_plugin(&plugin);" for each one. This seems less than ideal.
So, the actual questions:
1. Is there a standardized approach to this sort of thing?
2. Is there any way to define an array (global?) pre-loaded with the plugins?
3. Am I going about this the wrong way entirely? Are there other (better?) architecture options for this sort of problem?
Thanks.
EDIT:
This compiles (please excuse the ugly code)... but it kind of seems like a hack.
On the other hand, it solves the issue of allocation, and cleans up main()... Is this a valid solution?
class intf
{
public:
virtual void t() = 0;
};
class test : public intf
{
public:
test(){}
static test* inst(){ if(!_inst) _inst = new test; return _inst; }
static test* _inst;
void t(){}
};
test* test::_inst = NULL;
intf* ints[] =
{
test::inst(),
NULL
};
Store some form of smart pointer in a container. Dynamically allocate the plugins and register them in the container so that they can be used later.
One possible approach for your solution would be, if you have some form of message id that the plugin can decode, to use a map from that id to the plugin that handles that. This approach allows you to have fast lookup of the plugin given the input message.
One way of writing less code would be to use templates for the instantiation function. Then you only need to write one and put it in the interface, instead of having one function per implementation class.
class intf
{
public:
virtual void t() = 0;
template<class T>
static T* inst()
{
static T instance;
return &instance;
}
};
class test : public intf { ... };
intf* ints[] =
{
intf::inst<test>(),
NULL
};
The above code also works around two bugs you have in your code: One is a memory leak, in your old inst() function you allocate but you never free; The other is that the constructor sets the static member to NULL.
Other tips is to read more about the "singleton" pattern, which is what you have. It can be useful in some situations, but is generally advised against.

C++ Having trouble with syntax, so a class can pass data to other classes

I'm not having a lot of luck in C++ getting one of my classes to see/reference/copy data from one of my other classes so it can use it.
Basically I get the error 'Core' does not name a type or when I try to forward declare (http://stackoverflow.com/questions/2133250/does-not-name-a-type-error-in-c) I get field 'core' has incomplete type
I'm guessing the second error is due to the class not really being initialized possibly, so it has nothing to get? I dunno :( (see code at the bottom)
In my C# games I would normally create a "core" class, and then within that I would start other classes such as 'entities', 'player', 'weapons', etc. When I start these other classes I would pass "this"
public WeaponManager c_WeaponManager;
...
c_WeaponManager = new WeaponManager(this);
so I could always access public values of any class from anywhere as long as it passed through core.
Eg:
So when I do my update through the 'weapon' class, and it detects its hit the player, I'd simply get a function within that class to...
core.playerClass.deductHealth(damageAmmount);
..or something like that.
It allowed me to keep lots of variables I wanted to access globally neatly tucked away in areas that I felt were appropriate.
I know this isn't a good method of programming, but its what I'm fairly comfortable with and I mainly do hobby programming so I like being able to access my data quickly without bureaucratic Get() and Set() functions handing data from one class to another and another. Also I'm still fumbling my way through header files as they seem to be a pain in the ass
//-----------[In vid_glinit.h]-------------------
include "core.h"
class Vid_glInit
{
public:
RECT glWindowRect;
Core core;
Vid_glInit();
~Vid_glInit();
void StartGl(HWND _hGameWindow, int resolutionX, int resolutionY);
private:
};
//------------[in core.h]----------
include "vid_glinit.h"
class Core
{
public:
Vid_glInit vid_glinit(this);
enum GAME_MODE
{
INIT,
MENUS,
GAMEPLAY
};
GAME_MODE gameMode;
HWND hGameWindow;
HGLRC hGameRenderContext; // Permanent Rendering Context
HDC hGameDeviceContext; // Private GDI Device Context
//functions go here
Core();
~Core();
void testFunc();
void Run();
void Update();
void Render();
void StartGl(int resoultionX, int resolutionY);
private:
};
The goal is that when I start OpenGl, instead of having lots of little functions to pass data around I simply tell the glFunctions who need the Device or Rendering context to use core.hGameDeviceContext , if that makes sense
The problem is that you've got
class Vid_glInit
{
public:
Core core;
which means allocate a full copy of the Core object inline inside this class, and also
class Core
{
public:
Vid_glInit vid_glinit(this);
which means allocate a full copy of the Vid_glInit object inline inside the class - and this is now circular, and neither structure's size can be computed.
You probably actually want to allocate at least one of them by reference or pointer, i.e.
class Core
{
public:
Vid_glInit* vid_glinit; // pointer: access properties as core.vid_glinit->foo
Vid_glInit& vid_glinit; // reference: properties are core.vid_glinit.foo
In that case you can use the class Vid_glInit; simple forward declaration because these are just pointers internally and the size of a pointer is fixed regardless of the structure behind it, i.e. C++ can lay out the Core class in memory without full knowledge of the Vid_glInit structure.

Is there any way to prepare a struct for future additions?

I have the following struct which will be used to hold plugin information. I am very sure this will change (added to most probably) over time. Is there anything better to do here than what I have done assuming that this file is going to be fixed?
struct PluginInfo
{
public:
std::string s_Author;
std::string s_Process;
std::string s_ReleaseDate;
//And so on...
struct PluginVersion
{
public:
std::string s_MajorVersion;
std::string s_MinorVersion;
//And so on...
};
PluginVersion o_Version;
//For things we aren't prepared for yet.
void* p_Future;
};
Further, is there any precautions I should take when building shared objects for this system. My hunch is I'll run into lots of library incompatibilities. Please help. Thanks
What about this, or am I thinking too simple?
struct PluginInfo2: public PluginInfo
{
public:
std::string s_License;
};
In your application you are probably passing around only pointers to PluginInfos, so version 2 is compatible to version 1. When you need access to the version 2 members, you can test the version with either dynamic_cast<PluginInfo2 *> or with an explicit pluginAPIVersion member.
Either your plugin is compiled with the same version of C++ compiler and std library source (or its std::string implementation may not be compatible, and all your string fields will break), in which case you have to recompile the plugins anyway, and adding fields to the struct won't matter
Or you want binary compatibility with previous plugins, in which case stick to plain data and fixed size char arrays ( or provide an API to allocate the memory for the strings based on size or passing in a const char* ), in which case it's not unheard of to have a few unused fields in the struct, and then change these to be usefully named items when the need arises. In such cases, it's also common to have a field in the struct to say which version it represents.
But it's very rare to expect binary compatibility and make use of std::string. You'll never be able to upgrade or change your compiler.
As was said by someone else, for binary compatibility you will most likely restrict yourself to a C API.
The Windows API in many places maintains binary compatibility by putting a size member into the struct:
struct PluginInfo
{
std::size_t size; // should be sizeof(PluginInfo)
const char* s_Author;
const char* s_Process;
const char* s_ReleaseDate;
//And so on...
struct PluginVersion
{
const char* s_MajorVersion;
const char* s_MinorVersion;
//And so on...
};
PluginVersion o_Version;
};
When you create such a beast, you need to set the size member accordingly:
PluginInfo pluginInfo;
pluginInfo.size = sizeof(pluginInfo);
// set other members
When you compile your code against a newer version of the API, where the struct has additional members, its size changes, and that is noted in its size member. The API functions, when being passed such a struct presumably will first read its size member and branch into different ways to handle the struct, depending on its size.
Of course, this assumes that evolution is linear and new data is always only added at the end of the struct. That is, you will never have different versions of such a type that have the same size.
However, using such a beast is a nice way of ensuring that user introduce errors into their code. When they re-compile their code against a new API, sizeof(pluginInfo) will automatically adapt, but the additional members won't be set automatically. A reasonably safety would be gained by "initializing" the struct the C way:
PluginInfo pluginInfo;
std::memset( &pluginInfo, 0, sizeof(pluginInfo) );
pluginInfo.size = sizeof(pluginInfo);
However, even putting aside the fact that, technically, zeroing memory might not put a reasonable value into each member (for example, there could be architectures where all bits set to zero is not a valid value for floating point types), this is annoying and error-prone because it requires three-step construction.
A way out would be to design a small and inlined C++ wrapper around that C API. Something like:
class CPPPluginInfo : PluginInfo {
public:
CPPPluginInfo()
: PluginInfo() // initializes all values to 0
{
size = sizeof(PluginInfo);
}
CPPPluginInfo(const char* author /* other data */)
: PluginInfo() // initializes all values to 0
{
size = sizeof(PluginInfo);
s_Author = author;
// set other data
}
};
The class could even take care of storing the strings pointed to by the C struct's members in a buffer, so that users of the class wouldn't even have to worry about that.
Edit: Since it seems this isn't as clear-cut as I thought it is, here's an example.
Suppose that very same struct will in a later version of the API get some additional member:
struct PluginInfo
{
std::size_t size; // should be sizeof(PluginInfo)
const char* s_Author;
const char* s_Process;
const char* s_ReleaseDate;
//And so on...
struct PluginVersion
{
const char* s_MajorVersion;
const char* s_MinorVersion;
//And so on...
};
PluginVersion o_Version;
int fancy_API_version2_member;
};
When a plugin linked to the old version of the API now initializes its struct like this
PluginInfo pluginInfo;
pluginInfo.size = sizeof(pluginInfo);
// set other members
its struct will be the old version, missing the new and shiny data member from version 2 of the API. If it now calls a function of the second API accepting a pointer to PluginInfo, it will pass the address of an old PluginInfo, short one data member, to the new API's function. However, for the version 2 API function, pluginInfo->size will be smaller than sizeof(PluginInfo), so it will be able catch that, and treat the pointer as pointing to an object that doesn't have the fancy_API_version2_member. (Presumably, internal of the host app's API, PluginInfo is the new and shiny one with the fancy_API_version2_member, and PluginInfoVersion1 is the new name of the old type. So all the new API needs to do is to cast the PluginInfo* it got handed be the plugin into a PluginInfoVersion1* and branch off to code that can deal with that dusty old thing.)
The other way around would be a plugin compiled against the new version of the API, where PluginInfo contains the fancy_API_version2_member, plugged into an older version of the host app that knows nothing about it. Again, the host app's API functions can catch that by checking whether pluginInfo->size is greater than the sizeof their own PluginInfo. If so, the plugin presumably was compiled against a newer version of the API than the host app knows about. (Or the plugin write failed to properly initialize the size member. See below for how to simplify dealing with this somewhat brittle scheme.)
There's two ways to deal with that: The simplest is to just refuse to load the plugin. Or, if possible, the host app could work with this anyhow, simply ignoring the binary stuff at the end of the PluginInfo object it was passed which it doesn't know how to interpret.
However, the latter is tricky, since you need to decide this when you implement the old API, without knowing exactly what the new API will look like.
what rwong suggest (std::map<std::string, std::string>) is a good direction. This is makes it possible to add deliberate string fields. If you want to have more flexibility you might declare an abstract base class
class AbstractPluginInfoElement { public: virtual std::string toString() = 0;};
and
class StringPluginInfoElement : public AbstractPluginInfoElement
{
std::string m_value;
public:
StringPluginInfoElement (std::string value) { m_value = value; }
virtual std::string toString() { return m_value;}
};
You might then derive more complex classes like PluginVersion etc. and store a map<std::string, AbstractPluginInfoElement*>.
One hideous idea:
A std::map<std::string, std::string> m_otherKeyValuePairs; would be enough for the next 500 years.
Edit:
On the other hand, this suggestion is so prone to misuse that it may qualify for a TDWTF.
Another equally hideous idea:
a std::string m_everythingInAnXmlBlob;, as seen in real software.
(hideous == not recommended)
Edit 3:
Advantage: The std::map member is not subject to object slicing. When older source code copies an PluginInfo object that contains new keys in the property bag, the entire property bag is copied.
Disadvantage: many programmers will start adding unrelated things to the property bag, and even starts writing code that processes the values in the property bag, leading to maintenance nightmare.
Here's an idea, not sure whether it works with classes, it for sure works with structs: You can make the struct "reserve" some space to be used in the future like this:
struct Foo
{
// Instance variables here.
int bar;
char _reserved[128]; // Make the class 128 bytes bigger.
}
An initializer would zero out whole struct before filling it, so newer versions of the class which would access fields that would now be within the "reserved" area are of sane default values.
If you only add fields in front of _reserved, reducing its size accordingly, and not modify/rearrange other fields you should be OK. No need for any magic. Older software will not touch the new fields as they don't know about them, and the memory footprint will remain the same.