Creating debugger-friendly enums for property table - c++

I have a situation where I have a hierarchy of classes: Widget and Doobry are object types that inherit from Base (in reality there's a lot more than 2 types). Each instance of an object has a list of properties. There are some properties that are common to all objects, and some properties that are specific to each item type. A simple implementation might look like this:
enum PropertyType {
COMMON_SIZE=0, // first section: properties common to all
COMMON_POSITION,
...
WIDGET_PROPERTY_START=100, // ensure enough space for expansion
WIDGET_DONGLE_SIZE,
WIDGET_TEXT,
...
DOOBRY_PROPERTY_START=200,
DOOBRY_COLOUR,
....
};
class Base {
public:
Base();
std::vector<std::pair<PropertyType, std::string>> properties;
};
This achieves one objective, in that in the debugger I can see a list of properties mapped onto meaningful names. However, it suffers from some drawbacks:
Properties for all items have to be defined in one header (not good for encapsulation).
We have to pick arbitrary numbers for each class's start position, leaving enough space for future expansion in case we add more properties to one of the classes.
My question is whether there's another way of achieving this objective. One thought is that I could use string constants which would be larger to store and slower to lookup, but have the advantage that it's easier to make the names unique and each item type can define its own properties.
EDIT:
The properties will be serialised, so their identifiers must be stable over time (i.e. the enum values don't change). There may be up to 1M objects, but the vast majority will have empty property tables (as they use default values). Lookup performance is more important than insertion, and the performance hit of string hashing is probably negligible (we can't measure that yet as we haven't written it!).

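#include <functional> // std::less gives a total order over unrelated pointers
class TagManager; // forward declaration so UniqueTag can befriend it
struct UniqueTag {
    friend class TagManager;
    char const* tag;
    UniqueTag( UniqueTag const& other ):tag(other.tag) {}
    UniqueTag():tag(nullptr) {} // being able to create the null tag is useful
    bool operator<( UniqueTag const& other )const{ return std::less<char const*>()(tag, other.tag); }
    bool operator==( UniqueTag const& other )const{ return tag == other.tag; }
    // do other operators
private:
    explicit UniqueTag( char const* t ):tag(t) {} // only TagManager may mint non-null tags
};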
#include <algorithm>
#include <map> // or unordered_map for performance
#include <memory>
#include <string>
#include <utility>
#include <vector>
class TagManager {
    std::map<std::string, UniqueTag> cache;
    std::vector< std::unique_ptr< char[] > > data; // owns the storage the tags point at
public:
    TagManager() {}
    UniqueTag GetTag( std::string s ) {
        if (s.empty()) {
            return UniqueTag(); // empty string is the null tag, so we don't have both!
        }
        auto range = cache.equal_range(s);
        if (range.first != range.second) {
            return range.first->second; // already interned
        }
        std::unique_ptr< char[] > str( new char[ s.size()+1 ] );
        std::copy( s.begin(), s.end(), str.get() );
        str[s.size()] = '\0';
        UniqueTag retval( str.get() );
        data.push_back( std::move(str) );
        cache.insert( range.first, std::make_pair( std::move(s), retval ) ); // hint avoids a second search
        return retval;
    }
};
A single TagManager maintains a bunch of unique pointers to strings. We can do fast comparison because we compare on the pointer value. Converting from a string to one of the unique tags is slow, and it has the anti-pattern of a single tag manager implied, but...
Alternative versions include having your UniqueTag store a hash next to the string and looking things up by that hash (with some kind of assert, in debug builds, that no two strings hash to the same value; the birthday paradox makes collisions far more likely than one would naively expect). This gets rid of the single manager class, at least in release; in debug you would still need a way to detect collisions. If your hash is deterministic, the lack of collisions in debug implies no collisions in release for the same set of strings.
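A minimal sketch of that hash-based variant, assuming FNV-1a as the hash (any deterministic hash would do). The names fnv1a, HashTag and make_tag are made up for illustration, and the tag keeps the name pointer only so the debugger has something readable to show, so it assumes the name outlives the tag (string literals do):

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// FNV-1a, 64-bit: deterministic across runs, so the value can be serialised.
inline std::uint64_t fnv1a(const std::string& s) {
    std::uint64_t h = 14695981039346656037ULL;   // FNV offset basis
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ULL;                   // FNV prime
    }
    return h;
}

struct HashTag {
    std::uint64_t hash;   // what gets compared and serialised
    const char*   name;   // kept only so a debugger can show the text

    bool operator==(const HashTag& o) const { return hash == o.hash; }
    bool operator<(const HashTag& o) const  { return hash < o.hash; }
};

// Hypothetical helper: builds a tag and, in debug builds only, checks that
// no two different names seen so far have produced the same hash.
inline HashTag make_tag(const char* name) {
    HashTag t{fnv1a(name), name};
#ifndef NDEBUG
    static std::map<std::uint64_t, std::string> seen;
    auto it = seen.emplace(t.hash, name).first;
    assert(it->second == name && "hash collision between property names");
#endif
    return t;
}
```

In release builds the registry disappears entirely, so there is no manager object left; the price is that a collision introduced by a name that was never exercised in debug goes unnoticed.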
A boost::variant<enum1, enum2, enum3> with an appropriate visualizer and some operator overloads would let you have multiple independent enums. Alternatively, a home-brew union over enums, with a primary enum that says which one is valid and a visualizer on top of it, would let you split up management all over the place. In both cases you export the index of the "type" of enum, then the enum value, so the order of the enums has to be stable and the values within each enum have to be stable, but no magic integers are needed. To check for equality, a two-integer chained comparison is needed instead of one (which you could hack into a single 64-bit comparison, if that is faster).
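The "index of the enum type plus value within the enum" idea can be sketched without Boost. Everything named here (CommonProp, WidgetProp, PropFamily, PropertyKey, key) is hypothetical; only the two-integer layout and the 64-bit packing come from the description above:

```cpp
#include <cstdint>

// Hypothetical per-class enums; each class can own its own header.
enum class CommonProp : std::uint32_t { Size = 0, Position = 1 };
enum class WidgetProp : std::uint32_t { DongleSize = 0, Text = 1 };

// Index of the enum "family". These numbers are serialised, so they must
// never be reordered or reused.
enum class PropFamily : std::uint32_t { Common = 0, Widget = 1 };

struct PropertyKey {
    PropFamily    family;
    std::uint32_t value;

    // Pack both halves into one 64-bit integer for a single comparison.
    std::uint64_t packed() const {
        return (static_cast<std::uint64_t>(family) << 32) | value;
    }
    bool operator==(const PropertyKey& o) const { return packed() == o.packed(); }
    bool operator<(const PropertyKey& o) const  { return packed() < o.packed(); }
};

// One overload per enum keeps the family index out of calling code.
inline PropertyKey key(CommonProp p) {
    return {PropFamily::Common, static_cast<std::uint32_t>(p)};
}
inline PropertyKey key(WidgetProp p) {
    return {PropFamily::Widget, static_cast<std::uint32_t>(p)};
}
```

The only shared header is the small PropFamily list; a debugger visualizer would still be needed to print the value half under the right enum's names.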

Related

Boost Variant : How can I do a visitor that returns the type that was set?

I'm trying to write a generic map that uses a boost::variant as the value.
I'm stuck on trying to write the get(std::string key) function that will return the appropriate type.
Here is what I came up with so far:
class GenericHashMap {
private:
std::map< std::string, boost::variant<int, bool, double, std::string> > genericMap;
public:
template<typename T>
bool getValue(const std::string & key, T & value) {
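if ( genericMap.find(key) == genericMap.end() ) {
return false;
}
// the pointer form of boost::get returns NULL on a type mismatch instead of throwing
T * valuePtr = boost::get<T>(&genericMap[key]);
if (valuePtr == NULL) {
return false;
}
value = *valuePtr;
return true;
}
};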
I'm also curious how I should handle iterators. Is it worth making my own nested iterators, or should I just return the nested std::map?
Edit
I added the class design I was hoping to achieve (i.e. a generic hashmap). The problem I had was that I wanted a way for the user to query whether the value for a specific key is stored as a specific type.
If you have such an issue, it probably means you should use a visitor instead of wanting to get the value out of your variant. It is usually the way to go with boost::variant.
If you think about it: you do not want to hardwire a specific type for a specific key value. Otherwise, it means you lose all the power of boost::variant. And it means you should have different maps for each key sets (as you know them statically, you should not put everything in the same map).
boost::variant is really here to help you with dynamic dispatch, not static branching.
Note: in your example you look the item up twice when it is found. Store the iterator returned by find instead of discarding it, and you save the second lookup.
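Both points can be sketched together. This uses std::variant and std::visit from C++17 as a stand-in for boost::variant and boost::apply_visitor, which follow the same pattern; Value, Describe and describe are names invented for the example:

```cpp
#include <map>
#include <string>
#include <variant>

using Value = std::variant<int, bool, double, std::string>;

// A visitor handles every alternative, so the caller never has to guess
// which type is stored under a key.
struct Describe {
    std::string operator()(int v) const                { return "int:" + std::to_string(v); }
    std::string operator()(bool v) const               { return v ? "bool:true" : "bool:false"; }
    std::string operator()(double) const               { return "double"; }
    std::string operator()(const std::string& v) const { return "string:" + v; }
};

// Looks the key up once and reuses the iterator for the visit.
inline bool describe(const std::map<std::string, Value>& m,
                     const std::string& key, std::string& out) {
    auto it = m.find(key);
    if (it == m.end()) return false;
    out = std::visit(Describe{}, it->second);
    return true;
}
```

Adding a new alternative to Value without a matching operator() in the visitor is a compile error, which is the safety net you give up when you pull values out by a hard-wired type.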

How can I reduce allocation for lookup in C++ map/unordered_map containers?

Suppose I am using std::unordered_map<std::string, Foo> in my code. It's nice and convenient, but unfortunately every time I want to do a lookup (find()) in this map I have to come up with an instance of std::string.
For instance, let's say I'm tokenizing some other string and want to call find() on every token. This forces me to construct an std::string around every token before looking it up, which requires an allocator (std::allocator, which amounts to a CRT malloc()). This can easily be slower than the actual lookup itself. It also contends with other threads since heap management requires some form of synchronization.
A few years ago I found the Boost.Intrusive library; it was just a beta version back then. The interesting thing was that it had a container called boost::intrusive::iunordered_set which allowed code to perform lookups with any user-supplied type.
Here is how I'd like it to work:
struct immutable_string
{
const char *pf, *pl;
struct equals
{
bool operator()(const std::string& left, const immutable_string& right) const
{
if (left.length() != static_cast<size_t>(right.pl - right.pf))
return false;
return std::equal(right.pf, right.pl, left.begin());
}
};
struct hasher
{
size_t operator()(const immutable_string& s) const
{
return boost::hash_range(s.pf, s.pl);
}
};
};
struct string_hasher
{
size_t operator()(const std::string& s) const
{
return boost::hash_range(s.begin(), s.end());
}
};
std::unordered_map<std::string, Foo, string_hasher> m;
m["abc"] = Foo(123);
immutable_string token; // token refers to a substring inside some other string
auto it = m.find(token, immutable_string::equals(), immutable_string::hasher());
Another thing would be to speed up the "find and insert if not found" use case: the trick with lower_bound() only works for ordered containers. The intrusive container has methods called insert_check() and insert_commit(), but that's for a separate topic I guess.
Turns out boost::unordered_map (as of 1.42) has a find overload that takes CompatibleKey, CompatibleHash, CompatiblePredicate types, so it can do exactly what I asked for here.
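For comparison, here is a rough standard-library-only sketch of the same allocation-free lookup. It swaps in an ordered std::map with a transparent comparator (C++14) rather than the Boost overload above, so it is a different container with different complexity; TokenRef, TokenLess and count_known_tokens are hypothetical names:

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>

// Non-owning view of a token inside a larger buffer.
struct TokenRef {
    const char* first;
    const char* last;
};

// Transparent comparator: the is_transparent typedef enables the templated
// find() overloads of std::map. All overloads share one comparison routine
// so the ordering is consistent for every key type.
struct TokenLess {
    using is_transparent = void;

    static bool less(const char* a, std::size_t an, const char* b, std::size_t bn) {
        return std::lexicographical_compare(a, a + an, b, b + bn);
    }
    bool operator()(const std::string& a, const std::string& b) const {
        return less(a.data(), a.size(), b.data(), b.size());
    }
    bool operator()(const std::string& a, const TokenRef& b) const {
        return less(a.data(), a.size(), b.first, static_cast<std::size_t>(b.last - b.first));
    }
    bool operator()(const TokenRef& a, const std::string& b) const {
        return less(a.first, static_cast<std::size_t>(a.last - a.first), b.data(), b.size());
    }
};

// Counts how many space-separated tokens of `text` are keys of `m`,
// without building a std::string per token.
inline std::size_t count_known_tokens(const std::map<std::string, int, TokenLess>& m,
                                      const std::string& text) {
    std::size_t hits = 0;
    const char* p = text.data();
    const char* end = p + text.size();
    while (p != end) {
        while (p != end && *p == ' ') ++p;   // skip separators
        const char* q = p;
        while (q != end && *q != ' ') ++q;   // find the end of the token
        if (p != q && m.find(TokenRef{p, q}) != m.end()) ++hits;
        p = q;
    }
    return hits;
}
```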
When it comes to lexing, I personally use two simple tricks:
I use StringRef (similar to LLVM's) which just wraps a char const* and a size_t and provides string-like operations (only const operations, obviously)
I pool the encountered strings using a bump allocator (using lumps of say 4K)
The two combined are quite efficient, though one needs to understand that every StringRef pointing into the pool is invalidated as soon as the pool is destroyed.
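A minimal sketch of those two tricks, assuming a fixed chunk size and no null terminators. This StringRef is a bare struct rather than LLVM's class, and StringPool, intern and chunk_count are made-up names:

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <vector>

// Non-owning (pointer, length) pair; only valid while the pool is alive.
struct StringRef {
    const char* data;
    std::size_t size;
};

// Bump allocator: hands out consecutive bytes from the current chunk and
// starts a new chunk when the next string does not fit.
class StringPool {
    std::size_t chunk_size_;
    std::size_t cap_ = 0;    // capacity of the current chunk
    std::size_t used_ = 0;   // bytes already handed out from it
    std::vector<std::unique_ptr<char[]>> chunks_;

public:
    explicit StringPool(std::size_t chunk_size = 4096) : chunk_size_(chunk_size) {}

    StringRef intern(const char* s, std::size_t n) {
        if (chunks_.empty() || n > cap_ - used_) {
            cap_ = n > chunk_size_ ? n : chunk_size_;   // oversized strings get their own chunk
            chunks_.push_back(std::unique_ptr<char[]>(new char[cap_]));
            used_ = 0;
        }
        char* dst = chunks_.back().get() + used_;
        std::memcpy(dst, s, n);
        used_ += n;
        return StringRef{dst, n};
    }

    std::size_t chunk_count() const { return chunks_.size(); }
};
```

Earlier chunks are never reallocated, so a StringRef stays valid across later intern() calls; nothing is ever freed individually, which is what makes the allocation so cheap.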

Extending a thrift generated object in C++

Using the following .thrift file
struct myElement {
1: required i32 num,
}
struct stuff {
1: optional map<i32,myElement> mymap,
}
I get a Thrift-generated class with an STL map. The instance of this class is long-lived
(I append to and remove from it, and write it to disk using TSimpleFileTransport).
I would like to extend myElement in C++; the extensions should not affect
the serialized version of this object (and this object is not used in any
other language). What's a clean way to accomplish that?
I contemplated the following, but none of these seemed clean:
1. Make a second, non-Thrift map that is indexed with the same key. Keeping both in sync could prove to be a pain.
2. Modify the generated code by post-processing the generated header (incl. preprocessor hackery).
3. Similar to #2, but modify the generation side to include the following in the generated struct, and then define NAME_CXX_EXT in a force-included header:
#ifdef NAME_CXX_EXT
NAME_CXX_EXT ...
#endif
All of the above seem rather nasty.
The solution I am going to go with for now:
[This is all pseudo code, didn't check this copy for compilation]
The following generated code, which I cannot modify
(though I can change the map to a set)
class GeneratedElement {
public:
// ...
int32_t num;
// ...
};
class GeneratedMap {
public:
// ...
std::map<int32_t, GeneratedElement> myGeneratedMap;
// ...
};
// End of generated code
Elsewhere in the app:
class Element {
public:
GeneratedElement* pGenerated; // <<== ptr into element of another std::map!
time_t lastAccessTime;
};
class MapWrapper {
private:
GeneratedMap theGenerated;
public:
// ...
std::map<int32_t, Element> myMap;
// ...
void doStuffWIthBoth(int32_t key)
{
// instead of
// theGenerated.myGeneratedMap[key].num++; [lookup in map #1]
// time(&myMap[key].lastAccessTime); [lookup in map #2]
Element& el=myMap[key];
el.pGenerated->num++;
time(&el.lastAccessTime);
}
};
I wanted to avoid the double map lookup for every access
(though I know that the complexity remains the same, it is still two lookups).
I figured that if I can guarantee that all insertions and removals to/from theGenerated
are done in a single spot, and in that same spot I populate/remove
the corresponding entry in myMap, I would then be able to initialize
Element::pGenerated to its corresponding element in theGenerated.myGeneratedMap.
Not only will this let me save half of the lookup time, I may even change
myMap to a better container type for my key type (say a hash_map or even a Boost
multi-index map).
At first this sounded to me like a bad idea. With std::vector and std::deque I can
see how this can be a problem, as the values will be moved around,
invalidating the pointers. Given that std::map is implemented with a tree
structure, is there really a time when a map element will be relocated?
(My assumptions above were confirmed by a linked discussion.)
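The standard guarantees that inserting into a std::map, or erasing other elements from it, does not invalidate references or pointers to an existing element. A small check of that property (GeneratedElement is reduced to one field here, and address_survives_churn is a made-up helper):

```cpp
#include <cstdint>
#include <map>

struct GeneratedElement { std::int32_t num; };

// Returns true if the address of the element stored under `key` is unchanged
// after inserting and erasing many other keys.
inline bool address_survives_churn(std::map<std::int32_t, GeneratedElement>& m,
                                   std::int32_t key) {
    GeneratedElement* before = &m[key];
    for (std::int32_t i = 1; i <= 10000; ++i) m[key + i].num = i;    // inserts
    for (std::int32_t i = 1; i <= 10000; i += 2) m.erase(key + i);   // erases other keys
    return before == &m[key];
}
```

The pointer only dangles once the element itself is erased, which is why funnelling every insertion and removal through one spot matters.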
While I probably won't provide an access method to each member of myElement or any syntactic sugar (like overloading [] () etc.), this lets me treat these elements in an almost consistent manner. The only key is that (aside from insertion) I never look for members of mymap directly.
Have you considered just using simple containership?
You're using C++, so you can just wrap the struct(s) in some class or other struct, and provide wrapper methods to do whatever you want.
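One way to read that suggestion, as a sketch only: keep a single map of wrapper objects in memory and rebuild the generated map just before writing it out. The rebuild step is an assumption of this example, not something stated above, and GeneratedElement here is a simplified stand-in for the Thrift-generated struct; Element::touch and to_generated are hypothetical names:

```cpp
#include <cstdint>
#include <ctime>
#include <map>

// Stand-in for the Thrift-generated struct (simplified).
struct GeneratedElement { std::int32_t num = 0; };

// Wraps the generated value and adds in-memory-only state.
class Element {
public:
    GeneratedElement generated;        // the only part that gets serialised
    std::time_t lastAccessTime = 0;    // C++-side extension, never written out

    void touch() { ++generated.num; lastAccessTime = std::time(nullptr); }
};

// One map, one lookup per access; the generated map is rebuilt only when
// the data has to be handed to the serialiser.
inline std::map<std::int32_t, GeneratedElement>
to_generated(const std::map<std::int32_t, Element>& live) {
    std::map<std::int32_t, GeneratedElement> out;
    for (const auto& kv : live) out[kv.first] = kv.second.generated;
    return out;
}
```

This trades the pointer bookkeeping for a copy at save time, which may or may not be acceptable depending on how often the object is written to disk.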

Dynamically storing an internal configuration

I've been thinking of The Right Way (R) to store my program's internal configuration.
Here are the details:
The configuration is runtime only, so it is generated on each run.
It can (and should) be adapted through directives in a "project" file (the reading of that file is not in the scope of this question).
It needs to be extensible, i.e. there should be a way to add new "variables" with assigned values.
My questions about this:
How should I begin with this? Is a class with accessors and setters with an internal std::map for custom variables a good option?
Are there any known and "good" ways of doing this?
Should there be a difference between integer, boolean and string configuration variables?
Should there be a difference at all between user and built-in (pre-existing as in I already thought of them) variables?
Thanks!
PS: If the question isn't clear, feel free to ask for more info.
UPDATE: Wow, every answer seems to have implicitly or explicitly used Boost. I should have mentioned I'd like to avoid Boost (I want to explore the standard library's capabilities as is for now).
You could use Boost.PropertyTree for this.
"Property trees are versatile data structures, but are particularly suited for holding configuration data. The tree provides its own, tree-specific interface, and each node is also an STL-compatible Sequence for its child nodes."
You could do worse than some kind of a property map (StringMap is just a typedef'd std::map)
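#include <map>
#include <string>
#include <boost/lexical_cast.hpp>
// Assumed typedefs (not shown in the original answer)
typedef std::string String;
typedef std::map<String, String> StringMap;
typedef StringMap::const_iterator StringMap_cit;
class PropertyMap
{
private:
StringMap m_Map;
public:
PropertyMap() { }
~PropertyMap() { }
// properties: returns _default when the key is absent
template<class T>
T get(const String& _key, const T& _default = T()) const
{
StringMap_cit cit(m_Map.find(_key));
return (cit != m_Map.end()) ? boost::lexical_cast<T>(cit->second) : _default;
} // eo get
// methods
void set(const String& _cap, const String& _value)
{
m_Map[_cap] = _value;
} // eo set
// converts any streamable value to its string form before storing it
template<class T>
void set(const String& _key, const T& _val)
{
set(_key, boost::lexical_cast<String>(_val));
} // eo set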
};
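Since the question asks to avoid Boost, the same idea can be sketched with string streams doing the conversion instead of boost::lexical_cast. Config is a made-up name, and the behaviour on a failed parse (returning the fallback) is a choice made for this example, whereas lexical_cast would throw:

```cpp
#include <map>
#include <sstream>
#include <string>

class Config {
    std::map<std::string, std::string> values_;

public:
    // Stores any streamable value as text.
    template <class T>
    void set(const std::string& key, const T& value) {
        std::ostringstream os;
        os << value;
        values_[key] = os.str();
    }

    // Returns `fallback` when the key is missing or the text does not parse as T.
    template <class T>
    T get(const std::string& key, const T& fallback = T()) const {
        auto it = values_.find(key);
        if (it == values_.end()) return fallback;
        std::istringstream is(it->second);
        T out;
        return (is >> out) ? out : fallback;
    }
};
```

Note that reading a std::string this way stops at the first whitespace, so values containing spaces would need a dedicated overload.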
It is very useful to support nesting in configuration files. Something like JSON.
As parameter values can be scalars, arrays and nested groups of parameters, it could be stored in a std::map of boost::variant's, whose value can be a scalar, array or other std::map recursively. Note that std::map sorts by name, so if the original config file order of parameters is important there should be a sequential index of parameters as well. This can be achieved by using boost::multi_index with an ordered or hashed index for fast lookup and a sequential index for traversing the parameters in the original config file order.
I haven't checked, but from what I've heard the Boost property map could do that.
It is possible to store all values as strings (or arrays of strings for array values) converting them to the destination type only when accessing it.
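The ordering concern raised above (std::map sorts by name and loses the file order) can also be handled without boost::multi_index by pairing a vector with an index map. A rough sketch with values kept as strings; OrderedConfig and its members are hypothetical names:

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

class OrderedConfig {
    std::vector<std::pair<std::string, std::string>> entries_;  // file order
    std::map<std::string, std::size_t> index_;                  // name -> position in entries_

public:
    void set(const std::string& key, const std::string& value) {
        auto it = index_.find(key);
        if (it != index_.end()) {
            entries_[it->second].second = value;   // overwriting keeps the original position
        } else {
            index_[key] = entries_.size();
            entries_.emplace_back(key, value);
        }
    }

    // Returns nullptr when the key is unknown.
    const std::string* find(const std::string& key) const {
        auto it = index_.find(key);
        return it == index_.end() ? nullptr : &entries_[it->second].second;
    }

    const std::vector<std::pair<std::string, std::string>>& in_order() const {
        return entries_;
    }
};
```

Removal is deliberately left out: erasing from the middle of the vector would shift every later position stored in the index.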

C++ dynamic class ( dynamic hack )

Is there any way to add a field to a class at runtime (a field that didn't exist before)? Something like this snippet:
Myobject *ob; // create an object
ob->addField("newField",44); // we add the field to the class and we assign an initial value to it
printf("%d",ob->newField); // now we can access that field
I don't really care how it would be done, and I don't care if it's an ugly hack or not; I would like to know if it could be done, with a small example if possible.
Another example: say I have an XML file describing this class:
<class name="MyClass">
<member name="field1" />
<member name="field2" />
</class>
and I want to "add" the fields "field1" and "field2" to the class (assuming the class already exists). Let's say this is the code for the class:
class MyClass {
};
I don't want to create a class at runtime, I just want to add members/fields to an existing one.
Thank you!
Use a map and a variant.
For example, using boost::variant. See http://www.boost.org/doc/libs/1_36_0/doc/html/variant.html
(But of course, you can create your own, to suit the types of your XML attributes.)
#include <map>
#include <boost/variant.hpp>
typedef boost::variant< int, std::string > MyValue ;
typedef std::map<std::string, MyValue> MyValueMap ;
By adding MyValueMap as a member of your class, you can add properties according to their names. Which means the code:
oMyValueMap.insert(std::make_pair("newField", 44)) ;
oMyValueMap.insert(std::make_pair("newField2", "Hello World")) ;
std::cout << oMyValueMap["newField"] ;
std::cout << oMyValueMap["newField2"] ;
By encapsulating it in a MyObject class, and adding the right overloaded accessors in this MyObject class, the code above becomes somewhat clearer:
oMyObject.addField("newField", 44) ;
oMyObject.addField("newField2", "Hello World") ;
std::cout << oMyObject["newField"] ;
std::cout << oMyObject["newField2"] ;
You do lose some of C++'s type safety by doing so, but for XML that is unavoidable, I guess.
There's no way to do it in the way you've described, since the compiler needs to resolve the reference at compile time - it will generate an error.
But see The Universal Design Pattern.
You can't make that syntax work (because of static checking at compile time), but if you're willing to modify the syntax, you can achieve the same effect pretty easily. It would be fairly easy to have a dictionary member with a string->blob mapping, and have member functions like:
template< typename T > T get_member( string name );
template< typename T > void set_member( string name, T value );
You could make the syntax more compact/tricky if you want (eg: using a '->' operator override). There are also some compiler-specific tricks you could possibly leverage (MSVC supports __declspec(property), for example, which allows you to map references to a member variable to methods of a specific format). At the end of the day, though, you're not going to be able to do something the compiler doesn't accept in the language and get it to compile.
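A minimal sketch of the get_member/set_member pair, assuming a small type-erased holder in place of a raw blob so that a wrong type is detected at runtime. DynamicObject, Slot, TypedSlot and has_member are names invented here, and std::string is passed by const reference rather than by value:

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <typeinfo>
#include <utility>

class DynamicObject {
    // Type-erased storage: the base remembers the dynamic type of the value.
    struct Slot {
        virtual ~Slot() {}
        virtual const std::type_info& type() const = 0;
    };
    template <typename T>
    struct TypedSlot : Slot {
        T value;
        explicit TypedSlot(T v) : value(std::move(v)) {}
        const std::type_info& type() const override { return typeid(T); }
    };

    std::map<std::string, std::unique_ptr<Slot>> members_;

public:
    // Adds the member, or replaces it (possibly with a different type).
    template <typename T>
    void set_member(const std::string& name, T value) {
        members_[name].reset(new TypedSlot<T>(std::move(value)));
    }

    // Throws std::out_of_range for an unknown name and std::runtime_error
    // when the stored type is not exactly T.
    template <typename T>
    T get_member(const std::string& name) const {
        auto it = members_.find(name);
        if (it == members_.end()) throw std::out_of_range("no such member: " + name);
        if (it->second->type() != typeid(T)) throw std::runtime_error("wrong type for member: " + name);
        return static_cast<const TypedSlot<T>&>(*it->second).value;
    }

    bool has_member(const std::string& name) const { return members_.count(name) != 0; }
};
```

With this, ob.set_member("newField", 44) followed by ob.get_member<int>("newField") plays the role of the ob->newField syntax from the question.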
Short version: Can't do it. There is no native support for this; C++ is statically typed and the compiler has to know the structure of each object to be manipulated.
Recommendation: Use an embedded interpreter. And don't write your own (see below); get one that is already working and debugged.
What you can do: Implement just enough of an interpreter for your needs.
It would be simple enough to set up the class with a data member like
std::vector<void*> extra_data;
to which you could attach arbitrary data at run-time. The cost of this is that you will have to manage that data by hand with methods like:
size_t add_data_link(void *p); // points to existing data, returns index
size_t add_data_copy(void *p, size_t s); // copies data (dispose at destruction time!), returns index
void* get_data(size_t i); //...
But that is not the limit: with a little more care you could associate the arbitrary data with a name, and you can continue to elaborate this scheme as far as you wish (add type info, etc.). What this comes down to, though, is implementing an interpreter to take care of your run-time flexibility.
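The three methods above could be filled in roughly like this. It is a sketch under assumptions: ExtraData is a made-up wrapper class, copies are raw byte copies (so only suitable for trivially copyable data), and add_data_copy takes a const pointer, unlike the signature above:

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <utility>
#include <vector>

class ExtraData {
    struct Entry {
        void* ptr;                               // what get_data() returns
        std::unique_ptr<unsigned char[]> owned;  // non-null only for copies
    };
    std::vector<Entry> entries_;

public:
    // Points to existing data owned by someone else; returns its index.
    std::size_t add_data_link(void* p) {
        entries_.push_back(Entry{p, nullptr});
        return entries_.size() - 1;
    }

    // Copies `s` bytes; the copy is released when this object is destroyed.
    std::size_t add_data_copy(const void* p, std::size_t s) {
        std::unique_ptr<unsigned char[]> buf(new unsigned char[s]);
        std::memcpy(buf.get(), p, s);
        void* raw = buf.get();
        entries_.push_back(Entry{raw, std::move(buf)});
        return entries_.size() - 1;
    }

    // Throws std::out_of_range for an invalid index.
    void* get_data(std::size_t i) const { return entries_.at(i).ptr; }
};
```

The caller still has to remember what type lives at each index, which is exactly the bookkeeping an interpreter would take over.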
No -- C++ does not support any manipulation of the type system like this. Even languages with some degree of runtime reflection (e.g. .NET) would not support exactly this paradigm. You would need a much more dynamic language to be able to do it.
I was looking at this and did a little searching around. This code snippet, obtained from Michael Hammer's blog, seems to be a good way to do this, using boost::any.
First you define a structure that holds a std::map from a key (i.e. the variable name) to the value. One function is defined to add a pair, and another to get the value back. Pretty simple if you ask me, but it seems a good way to start before doing more complex things.
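#include <map>
#include <string>
#include <utility>
#include <boost/any.hpp>
struct AnyMap {
    void addAnyPair( const std::string& key, const boost::any& value );
    template<typename T>
    T& get( const std::string& key ) {
        // operator[] creates an empty any for an unknown key; any_cast then throws boost::bad_any_cast
        return boost::any_cast<T&>(map_[key]);
    }
    std::map<std::string, boost::any> map_;
};
void AnyMap::addAnyPair( const std::string& key, const boost::any& value ) {
    map_.insert( std::make_pair( key, value ) ); // does not overwrite an existing key
}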
Bottom line, this is a hack, since C++ is a strictly type-checked language, and thus monsters lie within for those who bend the rules.