Dynamically storing an internal configuration - c++

I've been thinking of The Right Way (R) to store my program's internal configuration.
Here are the details:
The configuration is runtime only, so it is generated on each run.
It can (and should) be adapted through directives in a "project" file (reading that file is outside the scope of this question).
It needs to be extensible, i.e. there should be a way to add new "variables" with assigned values.
My questions about this:
How should I begin with this? Is a class with accessors and setters with an internal std::map for custom variables a good option?
Are there any known and "good" ways of doing this?
Should there be a difference between integer, boolean and string configuration variables?
Should there be a difference at all between user and built-in (pre-existing as in I already thought of them) variables?
Thanks!
PS: If the question isn't clear, feel free to ask for more info.
UPDATE: Wow, every answer seems to have implicitly or explicitly used Boost. I should have mentioned I'd like to avoid Boost (I want to explore the standard library's capabilities as they are for now).

You could use Boost.PropertyTree for this.
Property trees are versatile data structures, but are particularly suited for holding configuration data. The tree provides its own, tree-specific interface, and each node is also an STL-compatible Sequence for its child nodes.

You could do worse than some kind of a property map (StringMap is just a typedef'd std::map)
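#include <map>
#include <string>
#include <boost/lexical_cast.hpp>

// Typedefs assumed by the class (not shown in the original answer)
typedef std::string String;
typedef std::map<String, String> StringMap;
typedef StringMap::const_iterator StringMap_cit;

class PropertyMap
{
private:
    StringMap m_Map;

public:
    // properties
    // Returns the value stored under _key converted to T, or _default if the
    // key is absent. boost::lexical_cast throws boost::bad_lexical_cast if
    // the stored text cannot be converted to T.
    template<class T>
    T get(const String& _key, const T& _default = T()) const
    {
        StringMap_cit cit(m_Map.find(_key));
        return (cit != m_Map.end()) ? boost::lexical_cast<T>(cit->second) : _default;
    } // eo get

    // methods
    void set(const String& _cap, const String& _value)
    {
        m_Map[_cap] = _value;
    } // eo set

    // Converts any streamable value to text before storing it
    template<class T>
    void set(const String& _key, const T& _val)
    {
        set(_key, boost::lexical_cast<String>(_val));
    } // eo set
};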
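Since the question's update asks to avoid Boost, here is a minimal sketch of the same idea with string streams standing in for boost::lexical_cast. The names SimplePropertyMap, has and parse are made up for illustration, and returning the fallback on a failed conversion (instead of throwing) is an assumption, not something the answer above prescribes.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical Boost-free counterpart of PropertyMap: values are stored as
// text and converted with string streams on access.
class SimplePropertyMap
{
public:
    // Stores any streamable value as text (bools as "true"/"false").
    template <class T>
    void set(const std::string& key, const T& value)
    {
        std::ostringstream out;
        out << std::boolalpha << value;
        values_[key] = out.str();
    }

    // Returns the value converted to T, or fallback if the key is missing
    // or the stored text is not entirely a valid T.
    template <class T>
    T get(const std::string& key, const T& fallback = T()) const
    {
        std::map<std::string, std::string>::const_iterator it = values_.find(key);
        T result = T();
        if (it != values_.end() && parse(it->second, result))
            return result;
        return fallback;
    }

    bool has(const std::string& key) const { return values_.count(key) != 0; }

private:
    // Generic conversion through operator>>.
    template <class T>
    static bool parse(const std::string& text, T& out)
    {
        std::istringstream in(text);
        in >> std::boolalpha >> out;
        if (in.fail())
            return false;
        char leftover;
        return !(in >> leftover);   // reject trailing non-whitespace
    }

    // Strings are returned verbatim so that embedded spaces survive.
    static bool parse(const std::string& text, std::string& out)
    {
        out = text;
        return true;
    }

    std::map<std::string, std::string> values_;
};
```

Built-in settings could be given typed accessors on top of this, while user-defined variables go through set and get directly.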

It is very useful to support nesting in configuration files. Something like JSON.
As parameter values can be scalars, arrays and nested groups of parameters, the configuration could be stored in a std::map of boost::variant values, each of which can be a scalar, an array or another std::map, recursively. Note that std::map sorts by key, so if the original order of parameters in the config file matters, there should be a sequential index of parameters as well. This can be achieved with boost::multi_index, using an ordered or hashed index for fast lookup and a sequenced index for traversing the parameters in their original order.
I haven't checked, but from what I've heard the Boost property map could do that.
It is possible to store all values as strings (or arrays of strings for array values) converting them to the destination type only when accessing it.
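For readers who want the "lookup index plus sequential index" combination without Boost, a rough sketch follows. It stores every value as a string, as suggested above. OrderedSettings and its member names are hypothetical, and erasing entries is left out because it would require re-indexing.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical container: keeps entries in the order they were first set
// and offers hashed lookup by name, using only the standard library.
class OrderedSettings
{
public:
    typedef std::pair<std::string, std::string> Entry;

    // Adds a new entry or overwrites an existing one; overwriting keeps
    // the entry's original position.
    void set(const std::string& name, const std::string& value)
    {
        std::unordered_map<std::string, std::size_t>::const_iterator it = index_.find(name);
        if (it != index_.end()) {
            entries_[it->second].second = value;
            return;
        }
        index_[name] = entries_.size();
        entries_.push_back(Entry(name, value));
    }

    // Returns a pointer to the value, or a null pointer if name is unknown.
    const std::string* find(const std::string& name) const
    {
        std::unordered_map<std::string, std::size_t>::const_iterator it = index_.find(name);
        return it == index_.end() ? nullptr : &entries_[it->second].second;
    }

    // Entries in the order they were first set.
    const std::vector<Entry>& entries() const { return entries_; }

private:
    std::vector<Entry> entries_;                           // sequential index
    std::unordered_map<std::string, std::size_t> index_;   // name -> position
};
```

The name is stored twice here (once per index), which is the price of not using a multi-index container.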

Related

What's the best way to store a maya api object inside an std::map that will never become stale? c++

I have a std::map that I'm using to cache some object info for later use. I'd like to use the object itself as the key so that lookups take as little time as possible.
I was thinking of converting the UUID to a string, but I realised you can end up with duplicate UUIDs if the object is brought into the scene as a reference multiple times.
I've also tried using an MDagPath as the key, but I can't store that in a std::map. I imagine I'd have to make it hashable, but I can't think of a way to do that safely. Using a name also seems like a big no-no, since the object can be renamed.
Thanks for the help. I hope I was clear enough with my problem.
std::map isn't a hash map (you'd want unordered_map for that). As for adding an MDagPath to a std::map, you could do it via something along these lines:
// I don't think this is defined for dagpath?
// So you should be able to define it for MDagPath.
static inline bool operator < (const MDagPath& a, const MDagPath& b) {
return a.fullPathName() < b.fullPathName();
}
std::map<MDagPath, MyObjectInfo> myMap;
Alternatively, if you want to generate a lookup based on the object, then you can use MObjectHandle. It may still go stale, but at least it has the isAlive() and isValid() methods to tell you when it is stale.
You could insert that into an unordered_map:
namespace std {
    // specialise std::hash for MObjectHandle
    template <>
    struct hash< MObjectHandle >
    {
        std::size_t operator()(const MObjectHandle& k) const
        {
            return k.hashCode();
        }
    };
}
std::unordered_map<MObjectHandle, MyObjectInfo> myMap;
(code is untested - I don't have access to Maya at the moment)
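As a variation that can be compiled without Maya, the sketch below passes the hasher as a template argument instead of specialising std::hash, and shows one way to drop stale entries. FakeHandle is a stand-in and NOT the Maya API: its id and alive members only play the roles of hashCode() and isAlive(), and a real handle would report liveness dynamically rather than from a stored flag.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Stand-in for a handle type such as MObjectHandle (hypothetical).
struct FakeHandle
{
    unsigned int id;   // plays the role of hashCode()
    bool alive;        // plays the role of isAlive()
    bool operator==(const FakeHandle& other) const { return id == other.id; }
};

// Hasher handed to the container explicitly, so nothing is added to
// namespace std.
struct FakeHandleHash
{
    std::size_t operator()(const FakeHandle& h) const { return h.id; }
};

struct ObjectInfo { std::string note; };

typedef std::unordered_map<FakeHandle, ObjectInfo, FakeHandleHash> HandleCache;

// Erases every entry whose handle reports that its object is gone.
// Returns the number of entries removed.
inline std::size_t pruneStale(HandleCache& cache)
{
    std::size_t removed = 0;
    for (HandleCache::iterator it = cache.begin(); it != cache.end(); ) {
        if (!it->first.alive) {
            it = cache.erase(it);   // erase returns the next valid iterator
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}
```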

Creating debugger-friendly enums for property table

I have a situation where I have a hierarchy of classes: Widget and Doobry are object types that inherit from Base (in reality there's a lot more than 2 types). Each instance of an object has a list of properties. There are some properties that are common to all objects, and some properties that are specific to each item type. A simple implementation might look like this:
enum PropertyType {
    COMMON_SIZE=0, // first section: properties common to all
    COMMON_POSITION,
    ...
    WIDGET_PROPERTY_START=100, // ensure enough space for expansion
    WIDGET_DONGLE_SIZE,
    WIDGET_TEXT,
    ...
    DOOBRY_PROPERTY_START=200,
    DOOBRY_COLOUR,
    ...
};
class Base {
public:
    Base();
    std::vector<std::pair<PropertyType, std::string>> properties;
};
This achieves one objective in that in the debugger I can see a list of properties mapped onto meaningful names. However it suffers from some drawbacks:
Properties for all items have to be defined in one header (not good for encapsulation).
We have to pick some arbitrary numbers for each of the class start positions to allow enough space for future expansion if we add more properties to one of the classes.
My question is whether there's another way of achieving this objective. One thought is that I could use string constants which would be larger to store and slower to lookup, but have the advantage that it's easier to make the names unique and each item type can define its own properties.
EDIT:
The properties must be serialised, so their identifiers must be stable over time (i.e. the enum values don't change). There may be up to 1M objects, but the vast majority will have empty property tables (as they use default values). Lookup performance is more important than insertion performance, and the cost of hashing strings is probably negligible (we can't measure it yet as we haven't written it!).
#include <algorithm>
#include <functional>
#include <map> // or unordered_map for performance
#include <memory>
#include <string>
#include <utility>
#include <vector>

class TagManager; // forward declaration so UniqueTag can befriend it

struct UniqueTag {
    friend class TagManager;
    char const* tag;
    UniqueTag( UniqueTag const& other ):tag(other.tag) {}
    UniqueTag():tag(nullptr) {} // being able to create the null tag is useful
    // std::less gives a total order even for pointers into different arrays
    bool operator<( UniqueTag const& other )const{ return std::less<char const*>()(tag, other.tag); }
    bool operator==( UniqueTag const& other )const{ return tag == other.tag; }
    // do other operators
private:
    UniqueTag( char const* t ):tag(t) {}
};

class TagManager {
    std::map<std::string, UniqueTag> cache;
    std::vector< std::unique_ptr< char[] > > data;
public:
    TagManager() {}
    UniqueTag GetTag( std::string s ) {
        // return the cached tag if this string has been seen before
        auto range = cache.equal_range(s);
        if (range.first != range.second) {
            return range.first->second;
        }
        // otherwise store a private copy of the string; its address is the tag
        std::unique_ptr< char[] > str( new char[ s.size()+1 ] );
        std::copy( s.begin(), s.end(), &str[0] );
        str[s.length()] = '\0';
        UniqueTag retval( str.get() );
        data.push_back( std::move(str) );
        if(s.length()==0) {
            retval = UniqueTag(); // empty string is null tag, so we don't have both!
        }
        cache.insert( range.first, std::make_pair( std::move(s), retval ) );
        return retval;
    }
};
A single TagManager maintains a bunch of unique pointers to strings. We can do fast comparison because we compare on the pointer value. Converting from a string to one of the unique tags is slow, and it has the anti-pattern of a single tag manager implied, but...
Alternative versions include having your UniqueTag store a hash next to itself and look things up by the hash (with some kind of assert in debug that no two strings hash to the same value; the birthday paradox makes collisions far more likely than one would naively expect). This gets rid of the single manager class, at least in release; in debug, you'd have a way to detect collisions. If your hash is deterministic, the lack of collisions in debug could imply no collisions in release.
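A rough sketch of the hash-based variant, assuming 32-bit FNV-1a as the hash (the answer does not name one). HashedTag and collides are hypothetical names. Because the hash is deterministic, the same name yields the same value on every run, which matters if the values are serialised.

```cpp
#include <cstdint>
#include <cstring>

// 32-bit FNV-1a over a NUL-terminated string, written recursively so that
// it is a valid C++11 constexpr function.
constexpr std::uint32_t fnv1a(const char* s, std::uint32_t h = 2166136261u)
{
    return *s == '\0'
        ? h
        : fnv1a(s + 1,
                (h ^ static_cast<std::uint32_t>(static_cast<unsigned char>(*s))) * 16777619u);
}

// Hypothetical tag type: the hash drives comparisons, the name is kept only
// for the debugger and for collision checks.
struct HashedTag
{
    std::uint32_t hash;
    const char*   name;

    constexpr explicit HashedTag(const char* n) : hash(fnv1a(n)), name(n) {}

    bool operator==(const HashedTag& other) const { return hash == other.hash; }
    bool operator<(const HashedTag& other) const  { return hash < other.hash; }
};

// Debug-time check: true if two tags have equal hashes but different names.
inline bool collides(const HashedTag& a, const HashedTag& b)
{
    return a.hash == b.hash && std::strcmp(a.name, b.name) != 0;
}
```

Each class can declare its own tags in its own header, and a debug build can run collides over every registered pair at start-up.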
A boost::variant<enum1, enum2, enum3> with an appropriate visualizer and some operator overloads would let you have multiple independent enums. Alternatively, a home-brew union over enums, with a primary enum that says which member is valid and a visualizer on top of it, would let you split up management across classes. In both cases, you export the index of the "type" of enum, then the enum value, so the order of the enums has to be stable, and the values within each enum have to be stable, but no magic integers are needed. To check for equality, a two-integer chained comparison is needed instead of one (which you could hack into a single 64-bit comparison, if that is faster).
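The "index of the enum type plus enum value" idea might look roughly like this without boost::variant. All names here (PropFamily, CommonProp, WidgetProp, PropertyKey, makeKey) are invented for the sketch, and the debugger visualizer is not shown.

```cpp
#include <cstdint>

// Hypothetical per-class enums; each class owns its own list and can
// number its values from zero.
enum class CommonProp : std::uint32_t { Size = 0, Position = 1 };
enum class WidgetProp : std::uint32_t { DongleSize = 0, Text = 1 };

// Index of the enum "family"; these numbers must stay stable once
// properties have been serialised.
enum class PropFamily : std::uint32_t { Common = 0, Widget = 1 };

struct PropertyKey
{
    PropFamily    family;
    std::uint32_t value;

    // Packs both integers into one 64-bit value so that a single integer
    // comparison is enough.
    std::uint64_t packed() const
    {
        return (static_cast<std::uint64_t>(family) << 32) | value;
    }
    bool operator==(const PropertyKey& o) const { return packed() == o.packed(); }
    bool operator<(const PropertyKey& o) const  { return packed() < o.packed(); }
};

// One overload per enum keeps the family index out of calling code.
inline PropertyKey makeKey(CommonProp p)
{
    return PropertyKey{PropFamily::Common, static_cast<std::uint32_t>(p)};
}
inline PropertyKey makeKey(WidgetProp p)
{
    return PropertyKey{PropFamily::Widget, static_cast<std::uint32_t>(p)};
}
```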

In a hashmap/unordered_map, is it possible to avoid data duplication when the value already contains the key

Given the following code:
struct Item
{
    std::string name;
    int someInt;
    std::string someString;
    Item(const std::string& aName):name(aName){}
};
std::unordered_map<std::string, Item*> items;
Item* item = new Item("testitem");
items.insert(std::make_pair(item->name, item));
The item name will be stored in memory two times - once as part of the Item struct and once as the key of the map entry. Is it possible to avoid the duplication? With some 100M records this overhead becomes huge.
Note:
I need to have the name inside the Item structure because I use the hashmap as index to another container of Item-s, and there I don't have access to the map's key values.
OK, since you say you are using pointers as values, I hereby bring my answer back to life.
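A bit hacky, but it should work. Basically, you use a pointer to the name as the key, together with a custom hash function and a custom equality predicate (without the latter, keys would be compared by address rather than by content):
struct Item
{
    std::string name;
    int someInt;
    std::string someString;
    Item(const std::string& aName):name(aName){}
    // hashes the string that the key points to
    struct name_hash
    {
        size_t operator() (const std::string* name) const
        {
            std::hash<std::string> h;
            return h(*name);
        }
    };
    // compares the pointed-to strings, not the pointers
    struct name_equal
    {
        bool operator() (const std::string* a, const std::string* b) const
        {
            return *a == *b;
        }
    };
};
std::unordered_map<const std::string*, Item*, Item::name_hash, Item::name_equal> items;
Item* item = new Item ("testitem");
items.insert(std::make_pair(&(item->name), item));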
Assuming the structure you use to store your items in the first place is a simple list, you could replace it with a multi-indexed container.
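Something along these lines (untested) should fulfill your requirements:
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/sequenced_index.hpp>
#include <boost/multi_index/hashed_index.hpp>
#include <boost/multi_index/member.hpp>
using namespace boost::multi_index;

typedef multi_index_container<
    Item,
    indexed_by<
        sequenced<>,                                              // index 0: insertion order
        hashed_unique<member<Item, std::string, &Item::name> >   // index 1: lookup by name
    >
> itemContainer;
itemContainer items;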
Now you can access items either in their order of insertion, or look them up by name:
itemContainer::nth_index<0>::type & sequentialItems = items.get<0>();
// use sequentialItems like a regular std::list
itemContainer::nth_index<1>::type & associativeItems = items.get<1>();
// use associativeItems like a regular std::unordered_set
Depending on your needs, you can use other indexings as well.
Don't store the std::string name field in your struct. When you perform a lookup, you already know the name anyway.
TL;DR If you are using libstdc++ (shipped with gcc) with its reference-counted std::string, you are already fine. This applies to libstdc++ before GCC 5; the requirements of C++11 effectively rule out copy-on-write strings, and GCC 5 and later default to a std::string that is not reference counted.
There are 3 ways, 2 are "simple":
split your object in two, Key and Value, and stop duplicating the Key in the Value
store your object in an unordered_set instead
The 3rd one is more complicated, unless provided by your compiler:
use an implementation of std::string that is reference counted (such as the older libstdc++ one)
In this case, when you copy a std::string into another, the reference counter of the internal buffer is incremented... and that's all. The copy is deferred until one of the owners requests a modification: Copy On Write.
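The unordered_set option can be sketched as follows. Item is trimmed to two fields, and ItemNameHash, ItemNameEqual and findByName are names made up for the example. Note that elements of a set are const, so fields other than the name cannot be modified in place without extra measures.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>

struct Item
{
    std::string name;
    int someInt;
    explicit Item(const std::string& aName, int value = 0)
        : name(aName), someInt(value) {}
};

// Hash and equality look only at the name, so the name acts as the key
// and is stored exactly once per item.
struct ItemNameHash
{
    std::size_t operator()(const Item& item) const
    {
        return std::hash<std::string>()(item.name);
    }
};
struct ItemNameEqual
{
    bool operator()(const Item& a, const Item& b) const { return a.name == b.name; }
};

typedef std::unordered_set<Item, ItemNameHash, ItemNameEqual> ItemSet;

// Lookup has to build a temporary Item that carries the name.
inline const Item* findByName(const ItemSet& items, const std::string& name)
{
    ItemSet::const_iterator it = items.find(Item(name));
    return it == items.end() ? nullptr : &*it;
}
```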
No, there isn't. You can:
Not store name in Item and pass it around separately.
Create ItemData that has the same fields as Item except the name, and either
derive Item from std::pair<std::string, ItemData> (essentially the map's value_type) or
make it convertible to and from that type.
Use a reference to the string as the key. You should be able to use std::reference_wrapper<const std::string> as the key type, passing std::cref(value.name) when inserting and std::cref of a named std::string when searching (std::cref does not accept temporaries). You will need to supply a hash and an equality predicate for std::reference_wrapper<const std::string>, but that should be easy.
Use std::unordered_set, but it has the disadvantage that lookup has to create a dummy Item.
When you actually have Item * as the value type, you can move the name to a base class and use polymorphism to avoid that disadvantage.
Create custom hash map, e.g. with Boost.Intrusive.
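The reference-as-key option from the list above might look like this. It is only a sketch: ItemIndex, NameRef, NameRefHash and NameRefEqual are hypothetical names, and the vector of unique_ptr is just one way to keep each Item at a stable address so that the references stay valid.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct Item
{
    std::string name;
    int someInt;
    explicit Item(const std::string& aName, int value = 0)
        : name(aName), someInt(value) {}
};

// The key refers to the name inside the Item instead of copying it.
typedef std::reference_wrapper<const std::string> NameRef;

struct NameRefHash
{
    std::size_t operator()(NameRef r) const { return std::hash<std::string>()(r.get()); }
};
struct NameRefEqual
{
    bool operator()(NameRef a, NameRef b) const { return a.get() == b.get(); }
};

class ItemIndex
{
public:
    // Takes ownership; returns false (and discards the item) if an item
    // with the same name is already present.
    bool add(std::unique_ptr<Item> item)
    {
        if (byName_.count(std::cref(item->name)) != 0)
            return false;
        Item* raw = item.get();
        storage_.push_back(std::move(item));   // the Item itself does not move
        byName_.emplace(std::cref(raw->name), raw);
        return true;
    }

    // The argument is a named lvalue inside this function, so std::cref
    // can be applied to it.
    Item* find(const std::string& name) const
    {
        auto it = byName_.find(std::cref(name));
        return it == byName_.end() ? nullptr : it->second;
    }

private:
    std::vector<std::unique_ptr<Item>> storage_;
    std::unordered_map<NameRef, Item*, NameRefHash, NameRefEqual> byName_;
};
```

An Item's name must not be changed while it is in the index, since the key refers to it directly.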

C++ store multiple data types in one collection

Problem: I want disparate sections of my code to be able to access a common collection that stores objects of different types in such a way that the type of each object is known and, crucially, retrieval from the collection should be type checked at compile time. (I realise this is close to questions asked before, but please read on, this is somewhat more specific.)
To give a concrete example, I would like something that does the following:
// Stuff that can go in the collection:
enum Key { NUM_APPLES /* (unsigned int) */, APPLE_SIZE /* (double) */ };
map<Key, Something> collection;
unsigned int * nApples = collection.find(NUM_APPLES);
int * appleSize = collection.find(APPLE_SIZE); // COMPILATION ERROR - WRONG TYPE
My solution: So far I have devised the following solution using boost::any:
The key:
#include <map>
#include <boost/any.hpp>
using namespace std;
using boost::any;       // boost::any is a class, not a namespace
using boost::any_cast;
struct KeySupertype
{
protected:
    // Can't create an instance of this type
    KeySupertype() {}
private:
    // Can't copy (declared but not defined)
    KeySupertype& operator = (const KeySupertype& other);
    KeySupertype(const KeySupertype& other);
};
template <typename Type>
struct Key : public KeySupertype
{
public:
    Key() {}
};
The collection:
class PropertiesMap
{
public:
    // Takes the key by const reference so that const Key objects can be used
    template<typename T>
    T * find(const Key<T> & key);
    /* Skipping erase, insert and other methods for brevity. */
private:
    map<const KeySupertype *, any> myAnyMap;
};
template <typename T>
T * PropertiesMap::find(const Key<T> & key)
{
    // The address of the key object identifies the entry
    const map<const KeySupertype *, any>::iterator it = myAnyMap.find(&key);
    if(it == myAnyMap.end())
        return NULL;
    return any_cast<T>(&it->second);
}
Usage:
static const Key<unsigned int> NUM_APPLES;
static const Key<double> APPLE_SIZE;
PropertiesMap collection;
/* ...insert num apples and apple size into collection ...*/
unsigned int * const nApples = collection.find(NUM_APPLES);
int * const wrongType = collection.find(NUM_APPLES); // COMPILATION ERROR - WRONG TYPE
This way type information is encoded with each Key according to its template parameter so the type will be enforced when interacting with the collection.
Questions:
1) Is this a reasonable way to achieve my goal?
2) A point of nastiness is that the collection uses the address of Key objects as the internal std::map key. Is there a way around this? Or at least a way to mitigate misuse? I've tried using a unique int in each Key that was generated from a static int (and making the std::map key type an int), but I'd like to avoid statics if possible for threading reasons.
3) To avoid using boost::any would it be reasonable to have the std::map be of type <const KeySupertype *, void *> and use a static_cast<T> instead of any_cast?
1) Looks nice to me, a clever solution
2) I guess you're afraid that someone will copy the key and its address will change. If that's your concern, keep an "original address" field in your KeySupertype. During construction, set the original address to this; during copying, set it to the original address of the right-hand side (the source). Use this original address to access the map contents. I really couldn't think of a compile-time solution to this, since at compile time, compilation units do not know about each other. You could assign unique IDs to the keys at link time at the earliest, and taking the address of global variables is more or less equivalent to that. The only weak point I can see with this solution is that if you define the same key in two dynamic shared libraries without extern, they'll silently have their own versions of the key with different addresses. Of course, if everything goes into the same binary, you won't have that problem, since two declarations without extern will cause a "Multiple Declaration" linker error.
3) If your problem with boost::any is the dependency on Boost (which is more OK than you think), then implement any yourself; it's surprisingly simple. If the problem is performance, then static_cast<> also seems OK to me, as long as you keep the internals of your PropertiesMap away from those who don't know what they're doing...
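To give an idea of what "implement any yourself" could involve, here is a minimal sketch. SimpleAny is a hypothetical name and this is not Boost's implementation: it only supports copyable values and an exact-type pointer lookup, which is what PropertiesMap::find needs.

```cpp
#include <typeinfo>
#include <utility>

// Hypothetical minimal stand-in for boost::any: holds one copyable value
// of any type and remembers its dynamic type.
class SimpleAny
{
public:
    SimpleAny() : held_(nullptr) {}

    template <typename T>
    SimpleAny(const T& value) : held_(new Holder<T>(value)) {}

    SimpleAny(const SimpleAny& other)
        : held_(other.held_ ? other.held_->clone() : nullptr) {}

    // Copy-and-swap: the by-value parameter already made the copy.
    SimpleAny& operator=(SimpleAny other)
    {
        std::swap(held_, other.held_);
        return *this;
    }

    ~SimpleAny() { delete held_; }

    // Returns a pointer to the stored value if it is exactly a T,
    // otherwise a null pointer.
    template <typename T>
    T* get()
    {
        if (held_ != nullptr && held_->type() == typeid(T))
            return &static_cast<Holder<T>*>(held_)->value;
        return nullptr;
    }

private:
    struct HolderBase
    {
        virtual ~HolderBase() {}
        virtual const std::type_info& type() const = 0;
        virtual HolderBase* clone() const = 0;
    };

    template <typename T>
    struct Holder : HolderBase
    {
        explicit Holder(const T& v) : value(v) {}
        const std::type_info& type() const { return typeid(T); }
        HolderBase* clone() const { return new Holder<T>(value); }
        T value;
    };

    HolderBase* held_;
};
```

Compared with a void * and static_cast, the typeid check costs a comparison per access but turns a mismatched type into a null pointer instead of undefined behaviour.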

Extending a thrift generated object in C++

Using the following .thrift file
struct myElement {
1: required i32 num,
}
struct stuff {
1: optional map<i32,myElement> mymap,
}
I get a thrift-generated class with an STL map. The instance of this class is long-lived
(I append to and remove from it, and write it to disk using TSimpleFileTransport).
I would like to extend myElement in C++. The extensions should not affect
the serialized version of this object (and this object is not used in any
other language). What's a clean way to accomplish that?
I contemplated the following, but none of them seemed clean:
1) Make a second, non-thrift map that is indexed with the same key; keeping both in sync could prove to be a pain.
2) Modify the generated code by post-processing the generated header (including preprocessor hackery).
3) Similar to #2, but modify the generation side to include the following in the generated struct, and then define NAME_CXX_EXT in a force-included header:
#ifdef NAME_CXX_EXT
NAME_CXX_EXT ...
#endif
All of the above seem rather nasty.
The solution I am going to go with for now:
[This is all pseudo code, didn't check this copy for compilation]
The following generated code, which I cannot modify
(though I can change the map to a set)
class GeneratedElement {
public:
// ...
int32_t num;
// ...
};
class GeneratedMap {
public:
// ...
std::map<int32_t, GeneratedElement> myGeneratedMap;
// ...
};
// End of generated code
Elsewhere in the app:
class Element {
public:
GeneratedElement* pGenerated; // <<== ptr into element of another std::map!
time_t lastAccessTime;
};
class MapWrapper {
private:
GeneratedMap theGenerated;
public:
// ...
std::map<int32_t, Element> myMap;
// ...
void doStuffWIthBoth(int32_t key)
{
// instead of
// theGenerated.myGeneratedMap[key].num++; [lookup in map #1]
// time(&myMap[key].lastAccessTime); [lookup in map #2]
Element& el=myMap[key];
el.pGenerated->num++;
time(&el.lastAccessTime);
}
};
I wanted to avoid the double map lookup for every access (though I know that the complexity remains the same, it is still two lookups).
I figured that if I can guarantee that all insertions into and removals from theGenerated are done in a single spot, and that same spot is where I populate or remove the corresponding entry in myMap, I would then be able to initialize Element::pGenerated to point to its corresponding element in theGenerated.myGeneratedMap.
Not only will this let me save half of the lookup time, I may even change myMap to a better container type for my key type (say a hash_map or even a Boost multi-index container).
At first this sounded to me like a bad idea. With std::vector and std::deque I can see how this can be a problem, as the values will be moved around, invalidating the pointers. Given that std::map is implemented with a tree structure, is there really a time when a map element will be relocated? (My assumptions above were confirmed by a discussion on the subject.)
While I probably won't provide an access method for each member of myElement or any syntactic sugar (like overloading [], () etc.), this lets me treat these elements in an almost consistent manner. The only key is that (aside from insertion) I never look up members of mymap directly.
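The pointer-stability assumption can be checked with a small stand-alone sketch. GeneratedElement here is a stand-in for the Thrift-generated struct, and the insert, erase, touch and generated members are hypothetical. The standard guarantees that inserting into a std::map does not invalidate pointers to existing elements, and that erasing invalidates only pointers to the erased element.

```cpp
#include <cstdint>
#include <ctime>
#include <map>

struct GeneratedElement { std::int32_t num; };   // stand-in for generated code

struct Element
{
    GeneratedElement* pGenerated;   // points into the generated map
    std::time_t lastAccessTime;
};

class MapWrapper
{
public:
    // The single place where both maps gain an entry.
    void insert(std::int32_t key, std::int32_t num)
    {
        GeneratedElement& g = generated_[key];
        g.num = num;
        Element e = { &g, 0 };
        extras_[key] = e;
    }

    // The single place where both maps lose an entry.
    void erase(std::int32_t key)
    {
        extras_.erase(key);
        generated_.erase(key);
    }

    // One lookup reaches both the generated data and the extension.
    bool touch(std::int32_t key)
    {
        std::map<std::int32_t, Element>::iterator it = extras_.find(key);
        if (it == extras_.end())
            return false;
        ++it->second.pGenerated->num;
        it->second.lastAccessTime = std::time(nullptr);
        return true;
    }

    const GeneratedElement* generated(std::int32_t key) const
    {
        std::map<std::int32_t, GeneratedElement>::const_iterator it = generated_.find(key);
        return it == generated_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::int32_t, GeneratedElement> generated_;
    std::map<std::int32_t, Element> extras_;
};
```

The pointers stay valid only while the generated map object itself stays in place. Copying it, or replacing its contents wholesale (for example when reading it back from disk), means every pGenerated has to be rebuilt.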
Have you considered just using simple containership?
You're using C++, so you can just wrap the struct(s) in some class or other struct, and provide wrapper methods to do whatever you want.
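A sketch of what such containment might look like, under the assumption that the application keeps its own map of wrappers and copies the generated parts out only when it is time to serialise. GeneratedElement is a stand-in for the Thrift-generated struct, and ElementWrapper and toGeneratedMap are invented names.

```cpp
#include <cstdint>
#include <ctime>
#include <map>

struct GeneratedElement { std::int32_t num; };   // stand-in for generated code

// Contains the generated struct and adds state that is never serialised.
class ElementWrapper
{
public:
    ElementWrapper() : generated_(), lastAccessTime_(0) {}
    explicit ElementWrapper(const GeneratedElement& g)
        : generated_(g), lastAccessTime_(0) {}

    // Wrapper method: updates the generated field and the extension together.
    void increment()
    {
        ++generated_.num;
        lastAccessTime_ = std::time(nullptr);
    }

    std::int32_t num() const { return generated_.num; }
    std::time_t lastAccessTime() const { return lastAccessTime_; }
    const GeneratedElement& generated() const { return generated_; }

private:
    GeneratedElement generated_;
    std::time_t lastAccessTime_;
};

// Copies only the generated parts into the map type that the generated
// container expects, ready for serialisation.
inline std::map<std::int32_t, GeneratedElement>
toGeneratedMap(const std::map<std::int32_t, ElementWrapper>& wrapped)
{
    std::map<std::int32_t, GeneratedElement> out;
    for (std::map<std::int32_t, ElementWrapper>::const_iterator it = wrapped.begin();
         it != wrapped.end(); ++it)
        out[it->first] = it->second.generated();
    return out;
}
```

This trades the pointer bookkeeping for a copy at save time, which may or may not be acceptable depending on how large the map is and how often it is written.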