Using std::unique_ptr of a polymorphic class as key in std::unordered_map - c++

My problem comes from a project that I'm supposed to finish. I have to create an std::unordered_map<T, unsigned int> where T is a pointer to a base, polymorphic class. After a while, I figured that it will also be a good practice to use an std::unique_ptr<T> as a key, since my map is meant to own the objects. Let me introduce some backstory:
Consider class hierarchy with polymorphic sell_obj as a base class. book and table inheriting from that class. We now know that we need to create a std::unordered_map<std::unique_ptr<sell_obj*>, unsigned int>. Therefore, erasing a pair from that map will automatically free the memory pointed by key. The whole idea is to have keys pointing to books/tables and value of those keys will represent the amount of that product that our shop contains.
As we are dealing with std::unordered_map, we should specify hashes for all three classes. To simplify things, I specified them in main like this:
namespace std{
template <> struct hash<book>{
size_t operator()(const book& b) const
{
return 1; // simplified
}
};
template <> struct hash<table>{
size_t operator()(const table& b) const
{
return 2; // simplified
}
};
// The standard provides a specialization so that std::hash<unique_ptr<T>> is the same as std::hash<T*>.
template <> struct hash<sell_obj*>{
size_t operator()(const sell_obj *s) const
{
const book *b_p = dynamic_cast<const book*>(s);
if(b_p != nullptr) return std::hash<book>()(*b_p);
else{
const table *t_p = static_cast<const table*>(s);
return std::hash<table>()(*t_p);
}
}
};
}
Now let's look at implementation of the map. We have a class called Shop which looks like this:
#include "sell_obj.h"
#include "book.h"
#include "table.h"
#include <unordered_map>
#include <memory>
class Shop
{
public:
Shop();
void add_sell_obj(sell_obj&);
void remove_sell_obj(sell_obj&);
private:
std::unordered_map<std::unique_ptr<sell_obj>, unsigned int> storeroom;
};
and implementation of two, crucial functions:
void Shop::add_sell_obj(sell_obj& s_o)
{
std::unique_ptr<sell_obj> n_ptr(&s_o);
storeroom[std::move(n_ptr)]++;
}
void Shop::remove_sell_obj(sell_obj& s_o)
{
std::unique_ptr<sell_obj> n_ptr(&s_o);
auto target = storeroom.find(std::move(n_ptr));
if(target != storeroom.end() && target->second > 0) target->second--;
}
in my main I try to run the following code:
int main()
{
book *b1 = new book("foo", "bar", 10);
sell_obj *ptr = b1;
Shop S_H;
S_H.add_sell_obj(*ptr); // works fine I guess
S_H.remove_sell_obj(*ptr); // usually (not always) crashes [SIGSEGV]
return 0;
}
My question is: where does my logic fail? I heard that it has been fine to use std::unique_ptr in STL containers since C++11. What is causing the crash? The debugger does not provide any information besides the crash occurrence.
If more information about the project is needed, please point it out. Thank you for reading.

There are quite a few problems with logic in the question. First of all:
Consider class hierarchy with polymorphic sell_obj as base class. book and table inheriting from that class. We now know that we need to create a std::unordered_map<std::unique_ptr<sell_obj*>, unsigned int>.
In such cases std::unique_ptr<sell_obj*> is not what we would want. We would want std::unique_ptr<sell_obj>. Without the *. std::unique_ptr is already "a pointer".
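A quick way to see the difference is to inspect the pointer type that each specialization manages. The sketch below is illustrative only: the empty sell_obj stand-in and the function name manages_expected_pointer_types are assumptions, not part of the question's code.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical stand-in for the question's polymorphic base class.
struct sell_obj {
    virtual ~sell_obj() = default;
};

// std::unique_ptr<T> manages a T*, so adding the extra * yields a
// smart pointer to a raw pointer (sell_obj**), not to an object.
static_assert(std::is_same<std::unique_ptr<sell_obj>::pointer, sell_obj*>::value,
              "unique_ptr<sell_obj> manages sell_obj*");
static_assert(std::is_same<std::unique_ptr<sell_obj*>::pointer, sell_obj**>::value,
              "unique_ptr<sell_obj*> manages sell_obj**");

// Run-time mirror of the checks above, convenient for testing.
bool manages_expected_pointer_types() {
    return std::is_same<std::unique_ptr<sell_obj>::pointer, sell_obj*>::value &&
           std::is_same<std::unique_ptr<sell_obj*>::pointer, sell_obj**>::value;
}
```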
As we are dealing with std::unordered_map, we should specify hashes for all three classes. To simplify things, I specified them in main like this: [...]
This is also an undesirable approach. It would require changing that part of the code every time we add another subclass to the hierarchy. It would be best to delegate the hashing (and comparing) polymorphically to avoid such problems, exactly as @1201ProgramAlarm suggested.
[...] implementation of two, crucial functions:
void Shop::add_sell_obj(sell_obj& s_o)
{
std::unique_ptr<sell_obj> n_ptr(&s_o);
storeroom[std::move(n_ptr)]++;
}
void Shop::remove_sell_obj(sell_obj& s_o)
{
std::unique_ptr<sell_obj> n_ptr(&s_o);
auto target = storeroom.find(std::move(n_ptr));
if(target != storeroom.end() && target->second > 0) target->second--;
}
This is wrong for a couple of reasons. First of all, taking an argument by non-const reference suggests modification of the object. Second, creating n_ptr from a pointer obtained by applying & to an argument is incredibly risky. It assumes that the object is allocated on the heap and is unowned, a situation that generally should not arise. If the passed object is on the stack or is already managed by some other owner, this is a recipe for disaster (like a segfault).
What's more, it is more or less guaranteed to end up in a disaster, since both add_sell_obj() and remove_sell_obj() create std::unique_ptrs to potentially the same object. This is exactly the case from the original question's main(). Two std::unique_ptrs pointing to the same object result in double delete.
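One way to avoid the double delete is to make the ownership transfer explicit in the signature, so that exactly one std::unique_ptr ever owns each object. The following is a minimal sketch under that assumption; product, owner and deletions_after_scope are hypothetical names, and the counter exists only to make the number of deletions observable.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical product type that records each destruction in an external counter.
struct product {
    int* destroyed;
    explicit product(int* counter) : destroyed(counter) {}
    virtual ~product() { ++*destroyed; }
};

// The container takes a unique_ptr by value: the caller must hand over
// ownership explicitly, so no second owner of the same object can exist.
class owner {
    std::vector<std::unique_ptr<product>> items;
public:
    void add(std::unique_ptr<product> p) { items.push_back(std::move(p)); }
    std::size_t size() const { return items.size(); }
};

// Returns how many times the product was destroyed once the owner went away.
int deletions_after_scope() {
    int destroyed = 0;
    {
        owner o;
        o.add(std::make_unique<product>(&destroyed));
    }
    return destroyed;
}
```

With this shape, the question's pattern of wrapping &s_o in a fresh std::unique_ptr has no place to occur, because the function never sees a raw reference to take the address of.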
While it's not necessarily the best approach for this problem if one uses C++ (as compared to Java), there are a couple of interesting tools that can be used for this task. The code below assumes C++20.
The class hierarchy
First of all, we need a base class that will be used when referring to all the objects stored in the shop:
struct sell_object { };
And then we need to introduce classes that will represent concrete objects:
class book : public sell_object {
std::string title;
public:
book(std::string title) : title(std::move(title)) { }
};
class table : public sell_object {
int number_of_legs = 0;
public:
table(int number_of_legs) : number_of_legs(number_of_legs) { }
};
For simplicity (but to still have some distinctions) I chose for them to have just one, distinct field (title and number_of_legs).
The storage
The shop class that will represent storage for any sell_object needs to somehow store, well, any sell_object. For that we either need to use pointers or references to the base class. You can't have a container of references, so it's best to use pointers. Smart pointers.
Originally the question suggested the usage of std::unordered_map. Let us stick with it:
class shop {
std::unordered_map<
std::unique_ptr<sell_object>, int
> storage;
public:
auto add(...) -> void {
...
}
auto remove(...) -> void {
...
}
};
It is worth mentioning that we chose std::unique_ptr as key for our map. That means that the storage is going to copy the passed objects and use the copies it owns to compare with elements we query (add or remove). No more than one equal object will be copied, though.
The fixed version of storage
There is a problem, however. std::unordered_map uses hashing and we need to provide a hash strategy for std::unique_ptr<sell_object>. Well, there already is one and it uses the hash strategy for T*. The problem is that we want to have custom hashing. Those particular std::unique_ptr<sell_object>s should be hashed according to the associated sell_objects.
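To make the problem concrete, here is a small sketch of what the default behaviour does. It uses std::string in place of sell_object purely for brevity, and the function names are made up for illustration: two keys that point to equal values still count as different keys, because both hashing and equality look at the address.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>

// With the default hasher and equality, two separately allocated "foo"
// strings are distinct keys, so the map ends up with two entries.
std::size_t count_keys_with_default_hash() {
    std::unordered_map<std::unique_ptr<std::string>, int> m;
    m[std::make_unique<std::string>("foo")]++;
    m[std::make_unique<std::string>("foo")]++;
    return m.size();
}

// The standard specifies that hashing a unique_ptr gives the same result
// as hashing the raw pointer it holds.
bool hash_matches_raw_pointer_hash() {
    auto p = std::make_unique<std::string>("foo");
    return std::hash<std::unique_ptr<std::string>>()(p) ==
           std::hash<std::string*>()(p.get());
}
```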
Because of this, I opt to choose a different approach than the one proposed in the question. Instead of providing a global specialization in the std namespace, I will choose a custom hashing object and a custom comparator:
class shop {
struct sell_object_hash {
auto operator()(std::unique_ptr<sell_object> const& object) const -> std::size_t {
return object->hash();
}
};
struct sell_object_equal {
auto operator()(
std::unique_ptr<sell_object> const& lhs,
std::unique_ptr<sell_object> const& rhs
) const -> bool {
return (*lhs <=> *rhs) == 0;
}
};
std::unordered_map<
std::unique_ptr<sell_object>, int,
sell_object_hash, sell_object_equal
> storage;
public:
auto add(...) -> void {
...
}
auto remove(...) -> void {
...
}
};
Notice a few things. First of all, the type of storage has changed. It is no longer an std::unordered_map<std::unique_ptr<T>, int>, but an std::unordered_map<std::unique_ptr<T>, int, sell_object_hash, sell_object_equal>. This is to indicate that we are using a custom hasher (sell_object_hash) and a custom comparator (sell_object_equal).
The lines we need to pay extra attention to are:
return object->hash();
return (*lhs <=> *rhs) == 0;
Onto them:
return object->hash();
This is a delegation of hashing. Instead of being an observer and trying to have a type that for each and every possible type derived from sell_object implements a different hashing, we require that those objects supply the sufficient hashing themselves. In the original question, the std::hash specialization was the said "observer". It certainly did not scale as a solution.
In order to achieve the aforementioned, we modify the base class to impose the listed requirement:
struct sell_object {
virtual auto hash() const -> std::size_t = 0;
};
Thus we also need to change our book and table classes:
class book : public sell_object {
std::string title;
public:
book(std::string title) : title(std::move(title)) { }
auto hash() const -> std::size_t override {
return std::hash<std::string>()(title);
}
};
class table : public sell_object {
int number_of_legs = 0;
public:
table(int number_of_legs) : number_of_legs(number_of_legs) { }
auto hash() const -> std::size_t override {
return std::hash<int>()(number_of_legs);
}
};
return (*lhs <=> *rhs) == 0;
This is a C++20 feature called the three-way comparison operator, sometimes called the spaceship operator. I opted into using it, since starting with C++20, most types that desire to be comparable will be using this operator. That means we also need our concrete classes to implement it. What's more, we need to be able to call it with base references (sell_object&). Yet another virtual function (operator, actually) needs to be added to the base class:
struct sell_object {
virtual auto hash() const -> std::size_t = 0;
virtual auto operator<=>(sell_object const&) const -> std::partial_ordering = 0;
};
Every subclass of sell_object is going to be required to be comparable with other sell_objects. The main reason is that we need to compare sell_objects in our storage map. For completeness, I used std::partial_ordering, since we require every sell_object to be comparable with every other sell_object. While comparing two books or two tables yields strong ordering (total ordering where two equivalent objects are indistinguishable), we also - by design - need to support comparing a book to a table. This is somewhat meaningless (always returns false). Fortunately, C++20 helps us here with std::partial_ordering::unordered. Those elements are not equal and neither of them is greater or less than the other. Perfect for such scenarios.
Our concrete classes need to change accordingly:
class book : public sell_object {
std::string title;
public:
book(std::string title) : title(std::move(title)) { }
auto hash() const -> std::size_t override {
return std::hash<std::string>()(title);
}
auto operator<=>(book const& other) const {
return title <=> other.title;
};
auto operator<=>(sell_object const& other) const -> std::partial_ordering override {
if (auto book_ptr = dynamic_cast<book const*>(&other)) {
return *this <=> *book_ptr;
} else {
return std::partial_ordering::unordered;
}
}
};
class table : public sell_object {
int number_of_legs = 0;
public:
table(int number_of_legs) : number_of_legs(number_of_legs) { }
auto hash() const -> std::size_t override {
return std::hash<int>()(number_of_legs);
}
auto operator<=>(table const& other) const {
return number_of_legs <=> other.number_of_legs;
};
auto operator<=>(sell_object const& other) const -> std::partial_ordering override {
if (auto table_ptr = dynamic_cast<table const*>(&other)) {
return *this <=> *table_ptr;
} else {
return std::partial_ordering::unordered;
}
}
};
The overridden operator<=>s are required due to the base class' requirements. They are quite simple - if the other object (the one we are comparing this object to) is of the same type, we delegate to the <=> version that uses the concrete type. If not, we have a type mismatch and we report the unordered ordering.
For those of you who are curious why the <=> implementation that compares two, identical types is not = defaulted: it would use the base-class comparison first, which would delegate to the sell_object version. That would dynamic_cast again and delegate to the defaulted implementation. Which would compare the base class and... result in an infinite recursion.
add() and remove() implementation
Everything seems great, so we can move on to adding and removing items to and from our shop. However, we immediately arrive at a hard design decision. What arguments should add() and remove() accept?
std::unique_ptr<sell_object>? That would make their implementation trivial, but it would require the user to construct a potentially useless, dynamically allocated object just to call a function.
sell_object const&? That seems correct, but there are two problems with it: 1) we would still need to construct an std::unique_ptr with a copy of passed argument to find the appropriate element to remove; 2) we wouldn't be able to correctly implement add(), since we need the concrete type to construct an actual std::unique_ptr to put into our map.
Let us go with the second option and fix the first problem. We certainly do not want to construct a useless and expensive object just to look for it in the storage map. Ideally we would like to find a key (std::unique_ptr<sell_object>) that matches the passed object. Fortunately, transparent hashers and comparators come to the rescue.
By supplying additional overloads for hasher and comparator (and providing a public is_transparent alias), we allow for looking for a key that is equivalent, without needing the types to match:
struct sell_object_hash {
auto operator()(std::unique_ptr<sell_object> const& object) const -> std::size_t {
return object->hash();
}
auto operator()(sell_object const& object) const -> std::size_t {
return object.hash();
}
using is_transparent = void;
};
struct sell_object_equal {
auto operator()(
std::unique_ptr<sell_object> const& lhs,
std::unique_ptr<sell_object> const& rhs
) const -> bool {
return (*lhs <=> *rhs) == 0;
}
auto operator()(
sell_object const& lhs,
std::unique_ptr<sell_object> const& rhs
) const -> bool {
return (lhs <=> *rhs) == 0;
}
auto operator()(
std::unique_ptr<sell_object> const& lhs,
sell_object const& rhs
) const -> bool {
return (*lhs <=> rhs) == 0;
}
using is_transparent = void;
};
Thanks to that, we can now implement shop::remove() like so:
auto remove(sell_object const& to_remove) -> void {
if (auto it = storage.find(to_remove); it != storage.end()) {
it->second--;
if (it->second == 0) {
storage.erase(it);
}
}
}
Since our comparator and hasher are transparent, we can find() an element that is equivalent to the argument. If we find it, we decrement the corresponding count. If it reaches 0, we remove the entry completely.
Great, onto the second problem. Let us list the requirements for the shop::add():
we need the concrete type of the object (merely a reference to the base class is not enough, since we need to create matching std::unique_ptr).
we need that type to be derived from sell_object.
We can achieve both with a constrained template:
template <std::derived_from<sell_object> T>
auto add(T const& to_add) -> void {
if (auto it = storage.find(to_add); it != storage.end()) {
it->second++;
} else {
storage[std::make_unique<T>(to_add)] = 1;
}
}
This is, again, quite simple.
Correct destruction semantics
There is only one more thing that separates us from the correct implementation. It's the fact that if we have a pointer (either smart or not) to a base class that is used to deallocate it, the destructor needs to be virtual.
This leads us to the final version of the sell_object class:
struct sell_object {
virtual auto hash() const -> std::size_t = 0;
virtual auto operator<=>(sell_object const&) const -> std::partial_ordering = 0;
virtual ~sell_object() = default;
};
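The effect of the virtual destructor can be observed with a small experiment. This is only a sketch; base, derived and destroy_through_base are hypothetical names, and the counter is there just to record that the derived destructor actually ran.

```cpp
#include <cassert>
#include <memory>

struct base {
    virtual ~base() = default;   // makes deletion through base* well-defined
};

struct derived : base {
    int* destroyed;
    explicit derived(int* counter) : destroyed(counter) {}
    ~derived() override { ++*destroyed; }
};

// Destroys a derived object through a unique_ptr<base> and reports how
// many times the derived destructor ran.
int destroy_through_base() {
    int count = 0;
    {
        std::unique_ptr<base> p = std::make_unique<derived>(&count);
    }
    return count;
}
```

Without the virtual destructor in base, deleting the object through the base pointer would be undefined behaviour.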
See full implementation with example and additional printing utilities.

Related

Can static polymorphism (templates) be used despite type erasure?

Having returned relatively recently to C++ after decades of Java, I am currently struggling with a template-based approach to data conversion for instances where type erasure has been applied. Please bear with me, my nomenclature may still be off for C++-natives.
This is what I am trying to achieve:
Implement dynamic variables which are able to hold essentially any value type
Access the content of those variables using various other representations (string, ints, binary, ...)
Be able to hold variable instances in containers, independent of their value type
Convert between variable value and representation using conversion functions
Be able to introduce new representations just by providing new conversion functions
Constraints: use only C++-11 features if possible, no use of libraries like boost::any etc.
A rough sketch of this might look like this:
#include <iostream>
#include <string>
#include <vector>
void convert(const std::string &f, std::string &t) { t = f; }
void convert(const int &f, std::string &t) { t = std::to_string(f); }
void convert(const std::string &f, int &t) { t = std::stoi(f); }
void convert(const int &f, int &t) { t = f; }
struct Variable {
virtual void get(int &i) = 0;
virtual void get(std::string &s) = 0;
};
template <typename T> struct VariableImpl : Variable {
T value;
VariableImpl(const T &v) : value{v} {};
void get(int &i) { convert(value, i); };
void get(std::string &s) { convert(value, s); };
};
int main() {
VariableImpl<int> v1{42};
VariableImpl<std::string> v2{"1234"};
std::vector<Variable *> vars{&v1, &v2};
for (auto &v : vars) {
int i;
v->get(i);
std::string s;
v->get(s);
std::cout << "int representation: " << i <<
", string representation: " << s << std::endl;
}
return 0;
}
The code does what it is supposed to do, but obviously I would like to get rid of Variable::get(int/std::string/...) and instead template them, because otherwise every new representation requires a definition and an implementation with the latter being exactly the same as all the others.
I've played with various approaches so far, like virtual templated methods, applying the CRTP with an intermediate type, various forms of wrappers, yet in all of them I get bitten by the erased value type of VariableImpl. On one hand, I think there might not be a solution, because after type erasure, the compiler cannot possibly know what templated getters and converter calls it must generate. On the other hand I think I might be missing something really essential here and there should be a solution despite the constraints mentioned above.
This is a classical double dispatch problem. The usual solution to this problem is to have some kind of dispatcher class with multiple implementations of the function you want to dispatch (get in your case). This is called the visitor pattern. The well-known drawback of it is the dependency cycle it creates (each class in the hierarchy depends on all other classes in the hierarchy). Thus there's a need to revisit it each time a new type is added. No amount of template wizardry eliminates it.
You don't have a specialised Visitor class, your Variable serves as a Visitor of itself, but this is a minor detail.
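For reference, a conventional visitor with a separate visitor class might look roughly like the sketch below. All names (Var, IntVar, StringVar, Visitor, ToString, to_string_repr) are hypothetical and are not taken from the question; the point is only to show where the dependency cycle comes from.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct IntVar;
struct StringVar;

// The visitor interface lists every concrete variable type. This is the
// dependency cycle: adding a new type means touching this interface.
struct Visitor {
    virtual ~Visitor() = default;
    virtual void visit(const IntVar&) = 0;
    virtual void visit(const StringVar&) = 0;
};

struct Var {
    virtual ~Var() = default;
    virtual void accept(Visitor& v) const = 0;   // first dispatch: on the variable
};

struct IntVar : Var {
    int value;
    explicit IntVar(int v) : value(v) {}
    void accept(Visitor& v) const override { v.visit(*this); }   // second dispatch: on the visitor
};

struct StringVar : Var {
    std::string value;
    explicit StringVar(std::string v) : value(std::move(v)) {}
    void accept(Visitor& v) const override { v.visit(*this); }
};

// One visitor per target representation.
struct ToString : Visitor {
    std::string result;
    void visit(const IntVar& v) override { result = std::to_string(v.value); }
    void visit(const StringVar& v) override { result = v.value; }
};

std::string to_string_repr(const Var& v) {
    ToString t;
    v.accept(t);
    return t.result;
}
```

A new representation is a new Visitor subclass and needs no change to the variable classes, while a new variable type requires a new visit overload in every visitor.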
Since you don't like this solution, there is another one. It uses a registry of functions populated at run time and keyed on type identification of their arguments. This is sometimes called "Acyclic Visitor".
Here's a half-baked C++11-friendly implementation for your case.
#include <map>
#include <vector>
#include <typeinfo>
#include <typeindex>
#include <utility>
#include <functional>
#include <string>
#include <stdexcept>
struct Variable
{
virtual void convertValue(Variable& to) const = 0;
virtual ~Variable() {};
virtual std::type_index getTypeIdx() const = 0;
template <typename K> K get() const;
static std::map<std::pair<std::type_index, std::type_index>,
std::function<void(const Variable&, Variable&)>>
conversionMap;
template <typename T, typename K>
static void registerConversion(K (*fn)(const T&));
};
template <typename T>
struct VariableImpl : Variable
{
T value;
VariableImpl(const T &v) : value{v} {};
VariableImpl() : value{} {}; // this is needed for a declaration of
// `VariableImpl<K>` below
// It can be avoided but it is
// a story for another day
void convertValue(Variable& to) const override
{
auto typeIdxFrom = getTypeIdx();
auto typeIdxTo = to.getTypeIdx();
if (typeIdxFrom == typeIdxTo) // no conversion needed
{
dynamic_cast<VariableImpl<T>&>(to).value = value;
}
else
{
auto fcnIter = conversionMap.find({getTypeIdx(), to.getTypeIdx()});
if (fcnIter != conversionMap.end())
{
fcnIter->second(*this, to);
}
else
throw std::logic_error("no conversion");
}
}
std::type_index getTypeIdx() const override
{
return std::type_index(typeid(T));
}
};
template <typename K> K Variable::get() const
{
VariableImpl<K> vk;
convertValue(vk);
return vk.value;
}
template <typename T, typename K>
void Variable::registerConversion(K (*fn)(const T&))
{
// add a mutex if you ever spread this over multiple threads
conversionMap[{std::type_index(typeid(T)), std::type_index(typeid(K))}] =
[fn](const Variable& from, Variable& to) {
dynamic_cast<VariableImpl<K>&>(to).value =
fn(dynamic_cast<const VariableImpl<T>&>(from).value);
};
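}
// Out-of-class definition of the static registry declared in Variable
// (needed so the program links; place it in exactly one translation unit).
std::map<std::pair<std::type_index, std::type_index>,
std::function<void(const Variable&, Variable&)>>
Variable::conversionMap;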
Now of course you need to call registerConversion e.g. at the beginning of main and pass it each conversion function.
Variable::registerConversion(int_to_string);
Variable::registerConversion(string_to_int);
This is not ideal, but hardly anything is ever ideal.
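The two functions passed to registerConversion are not shown in the answer. A plausible pair, written here as an assumption, only needs to match the K (*)(const T&) shape so that T and K can be deduced from the function pointer:

```cpp
#include <cassert>
#include <string>

// Hypothetical converters. Each takes its source by const reference and
// returns the target by value, matching the K (*)(const T&) parameter.
std::string int_to_string(const int& i) { return std::to_string(i); }
int string_to_int(const std::string& s) { return std::stoi(s); }
```

Note that std::stoi throws std::invalid_argument for non-numeric input, so a real converter may want to decide how to report that.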
Having said all that, I would recommend you revisit your design. Do you really need all these conversions? Why not pick one representation and stick with it?
Implement dynamic variables which are able to hold essentially any value type
Be able to hold variable instances in containers, independent of their value type
These two requirements are quite challenging on their own. Class templates don't really encourage inheritance, and you already did the right thing to hold what you asked for: you introduced a common base class for the class template, which you can later refer to in order to store pointers of that type in a collection.
Access the content of those variables using various other representations (string, ints, binary, ...)
Be able to introduce new representations just by providing new conversion functions
This is where it breaks. Function templates assume a common implementation for different types, while inheritance assumes different implementations for the same types.
Your goal is to introduce different implementations for different types, and in order to make your requirements viable you have to switch to one of those two options instead (or put up with a number of functions for each case, which you have already introduced yourself).
Edit:
One of the strategies you may employ to enforce inheritance approach is generalisation of the arguments to the extent where they can be used interchangeably by the abstract interface. E.g. you may wrap the converting arguments inside of a union like this:
struct Variable {
struct converter_type {
enum { INT, STRING } type;
union {
int* m_int;
std::string* m_string;
};
};
virtual void get(converter_type& var) = 0;
virtual ~Variable() = default;
};
And then take whatever part of it inside of the implementation:
void get(converter_type& var) override {
switch (var.type) {
case converter_type::INT:
convert(value, *var.m_int); // convert() takes a reference, so dereference the pointer
break;
case converter_type::STRING:
convert(value, *var.m_string);
break;
}
}
To be honest I don't think this is a less verbose approach compared to just having a number of functions for each type combination, but I think you got the idea that you can just wrap your arguments somehow to cement the abstract class interface.
Implement std::any. It is similar to boost::any.
Create a conversion dispatcher based off typeids. Store your any alongside the conversion dispatcher.
"new conversion functions" have to be passed to the dispatcher.
When asked to convert to a type, pass that typeid to the dispatcher.
So we start with these type aliases:
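#include <any> // C++17; under a C++11 constraint, substitute your own implementation
#include <functional>
#include <iostream>
#include <map>
#include <string>
#include <type_traits>
#include <typeindex>
#include <utility>
#include <vector>
using any = std::any; // implement this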
using converter = std::function<any(any const&)>;
using convert_table = std::map<std::type_index, converter>;
using convert_lookup = convert_table(*)();
template<class T>
convert_table& lookup_convert_table() {
static convert_table t;
return t;
}
struct converter_any: any {
template<class T,
typename std::enable_if<
!std::is_same<typename std::decay<T>::type, converter_any>::value, bool
>::type = true
>
converter_any( T&& t ):
any(std::forward<T>(t)),
table(&lookup_convert_table<typename std::decay<T>::type>())
{}
converter_any(converter_any const&)=default;
converter_any(converter_any &&)=default;
converter_any& operator=(converter_any const&)=default;
converter_any& operator=(converter_any&&)=default;
~converter_any()=default;
converter_any()=default;
convert_table const* table = nullptr;
template<class U>
U convert_to() const {
if (!table)
throw 1; // make a better exception than int
auto it = table->find(typeid(U));
if (it == table->end())
throw 2; // make a better exception than int
any const& self = *this;
return std::any_cast<U>((it->second)(self));
}
};
template<class Dest, class Src>
bool add_converter_to_table( Dest(*f)(Src const&) ) {
lookup_convert_table<Src>()[typeid(Dest)] = [f](any const& s)->any {
Src src = std::any_cast<Src>(s);
auto r = f(src);
return r;
};
return true;
}
now your code looks like:
const bool bStringRegistered =
add_converter_to_table(+[](std::string const& f)->std::string{ return f; })
&& add_converter_to_table(+[](std::string const& f)->int{ return std::stoi(f); });
const bool bIntRegistered =
add_converter_to_table(+[](int const& i)->int{ return i; })
&& add_converter_to_table(+[](int const& i)->std::string{ return std::to_string(i); });
int main() {
converter_any v1{42};
converter_any v2{std::string("1234")};
std::vector<converter_any> vars{v1, v2}; // copies!
for (auto &v : vars) {
int i = v.convert_to<int>();
std::string s = v.convert_to<std::string>();
std::cout << "int representation: " << i <<
", string representation: " << s << std::endl;
}
}
live example.
...
Ok, what did I do?
I used any to be a smart void* that can store anything. Rewriting this is a bad idea, use someone else's implementation.
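As a reminder of what that "smart void*" gives you, here is a short sketch using C++17's std::any (an assumption, since the question asks for C++11; a hand-written or third-party any would behave similarly). The function names are hypothetical.

```cpp
#include <any>
#include <cassert>
#include <string>
#include <typeinfo>

// The stored type is remembered at run time and can be read back.
bool any_round_trip() {
    std::any a = 42;
    return a.type() == typeid(int) && std::any_cast<int>(a) == 42;
}

// Asking for the wrong type is detected instead of silently misreading memory.
bool wrong_type_is_rejected() {
    std::any a = 42;
    try {
        (void)std::any_cast<std::string>(a);
    } catch (const std::bad_any_cast&) {
        return true;
    }
    return false;
}
```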
Then, I augmented it with a manually written virtual function table. Which table I add is determined by the constructor of my converter_any; here, I know the type stored, so I can store the right table.
Typically when using this technique, I'd know what functions are in there. For your implementation we do not; so the table is a map from the type id of the destination, to a conversion function.
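The per-type table relies on a property of function templates with static locals: each instantiation owns its own object. A minimal sketch of just that property, with the hypothetical names table_for and tables_are_per_type, and a simplified table type:

```cpp
#include <cassert>
#include <map>
#include <string>

// Each instantiation of this function template has its own static table,
// so table_for<int>() and table_for<double>() are unrelated objects.
template <class T>
std::map<std::string, int>& table_for() {
    static std::map<std::string, int> table;
    return table;
}

bool tables_are_per_type() {
    table_for<int>()["to_string"] = 1;
    // Only the int table was modified; repeated calls return the same object.
    return table_for<int>().size() == 1 &&
           table_for<double>().empty() &&
           &table_for<int>() == &table_for<int>();
}
```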
The conversion function takes anys and returns anys -- again, don't repeat this work. And now it has a fixed signature.
To add support for a type, you independently register conversion functions. Here, my conversion function registration helper deduces the from type (to determine which table to register it in) and the destination type (to determine which entry in the table), and then automatically writes the any boxing/unboxing code for you.
...
At a higher level, what I'm doing is writing my own type erasure and object model. C++ has enough power that you can write your own object models, and when you want features that the default object model doesn't solve, well, roll a new object model.
Second, I'm using value types. A Java programmer isn't used to value types having polymorphic behavior, but much of C++ works much better if you write your code using value types.
So my converter_any is a polymorphic value type. You can store copies of them in vectors etc, and it just works.

Stop an increasing infinite recursive template instantiation, that is not needed

I'm implementing a graph class, with each vertex having a Label of not necessarily the same type. I want the user to be able to provide any Labels (at compile time), without the Graph or the Vertex to know what the type is. For this, I used templated polymorphism, which I've hidden inside a Label class, in order for the Labels to have value semantics. It works like a charm and the relevant code is this (ignore the commented parts for now):
//Label.hpp:
#include <memory>
class Label {
public:
template<class T> Label(const T& name) : m_pName(new Name<T>(name)) {}
Label(const Label& other) : m_pName(other.m_pName->copy()) {}
// Label(const Label& other, size_t extraInfo) : m_pName(other.m_pName->copyAndAddInfo(extraInfo)) {}
bool operator==(const Label& other) const { return *m_pName == *other.m_pName; }
private:
struct NameBase {
public:
virtual ~NameBase() = default;
virtual NameBase* copy() const = 0;
// virtual NameBase* copyAndAddInfo(size_t info) const = 0;
virtual bool operator==(const NameBase& other) const = 0;
};
template<class T> struct Name : NameBase {
public:
Name(T name) : m_name(std::move(name)) {}
NameBase* copy() const override { return new Name<T>(m_name); }
// NameBase* copyAndAddInfo(size_t info) const override {
// return new Name<std::pair<T, size_t>>(std::make_pair(m_name, info));
// }
bool operator==(const NameBase& other) const override {
const auto pOtherCasted = dynamic_cast<const Name<T>*>(&other);
if(pOtherCasted == nullptr) return false;
return m_name == pOtherCasted->m_name;
}
private:
T m_name;
};
std::unique_ptr<NameBase> m_pName;
};
One requirement of the user (aka me) is to be able to create disjoint unions of Graphs (he is already able to create dual Graphs, unions of Graphs (where vertices having the same Label, are mapped to the same vertex), etc.). The wish is that the labels of the new Graph are pairs of the old label and some integer, denoting from which graph the label came (this also ensures that the new labels are all different). For this, I thought that I could use the commented parts of the Label class, but the problem that my g++17 compiler has, is that the moment I define the first Label with some type T, it tries to instantiate everything that could be used:
Name<T>, Name<std::pair<T, size_t>>, Name<std::pair<std::pair<T, size_t>, size_t>>, ...
Try for example to compile this (just an example, that otherwise works):
// testLabel.cpp:
#include "Label.hpp"
#include <vector>
#include <iostream>
int main() {
std::vector<Label> labels;
labels.emplace_back(5);
labels.emplace_back(2.1);
labels.emplace_back(std::make_pair(true, 2));
Label testLabel(std::make_pair(true, 2));
for(const auto& label : labels)
std::cout<<(label == testLabel)<<std::endl;
return 0;
}
The compilation just freezes. (I do not get the message "maximum template recursion capacity exceeded", that I saw others get, but it obviously tries to instantiate everything). I've tried to separate the function in another class and explicitly initialize only the needed templates, in order to trick the compiler, but with no effect.
The desired behaviour (I do not know if possible), is to instantiate the used template classes (together with the member function declarations), but define the member functions lazily, i.e. only if they really get called. For example, if I call Label(3), there should be a class Name<int>, but the function
NameBase* Name<int>::copyAndAddInfo(size_t info) const;
shall only be defined if I call it, at some point. (thus, the Name<std::pair<int, size_t>> is only going to be instantiated on demand)
It feels like something which should be doable, since the compiler already defines templated functions on demand.
An idea would be to completely change the implementation and use variants, but
I do not want to keep track of the types the user needs manually, and
I quite like this implementation approach and want to see its limits, before changing it.
Does anyone have any hints on how I could solve this problem?
To directly answer your question, the virtual and template combination makes it impossible for the compiler to lazily instantiate the body of copyAndAddInfo. The base-class pointer hides the type information, so when the compiler sees other.m_pName->copyAndAddInfo, it cannot know which type's function it would need to instantiate.
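For contrast, lazy instantiation does work for non-virtual members, which is probably the behaviour the question has in mind. The sketch below uses hypothetical names (holder, length, read_value): the body of length() would not compile for int, yet the program is fine because that member is never used.

```cpp
#include <cassert>
#include <cstddef>

template <class T>
struct holder {
    T value;

    // For T = int this body is ill-formed (int has no size()), but a
    // non-virtual member of a class template is only instantiated when used.
    std::size_t length() const { return value.size(); }
};

int read_value() {
    holder<int> h{7};   // instantiates holder<int>, but not holder<int>::length
    return h.value;
}
```

Marking length() as virtual would typically break this. The standard leaves it unspecified whether unused virtual members are instantiated, but compilers generally instantiate them together with the class because the vtable needs their addresses.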
EDIT:
Ok, so based on your rationale for using templates, it seems like you only want to accept labels of different types, and might not actually care if the disjoint union information is part of the type. If that's the case, you could move it from the name to the label, and make it run-time information:
class Label {
public:
template<class T> Label(const T& name) : m_pName(new Name<T>(name)) {}
Label(const Label& other) : m_pName(other.m_pName->copy()), m_extraInfo(other.m_extraInfo) { }
Label(const Label& other, size_t extraInfo) : m_pName(other.m_pName->copy()), m_extraInfo(other.m_extraInfo) {
m_extraInfo.push_back(extraInfo);
}
bool operator==(const Label& other) const {
return *m_pName == *other.m_pName && std::equal(
m_extraInfo.begin(), m_extraInfo.end(),
other.m_extraInfo.begin(), other.m_extraInfo.end()); }
private:
struct NameBase { /* same as before */ };
std::vector<size_t> m_extraInfo;
std::unique_ptr<NameBase> m_pName;
};
If the disjoint union info being part of the type is important, then please enjoy my original sarcastic answer below.
ORIGINAL ANSWER:
That said, if you're willing to put a cap on the recursion, I have an evil solution for you that works for up to N levels of nesting: use template tricks to count the level of nesting. Then use SFINAE to throw an error after N levels, instead of recursing forever.
First, to count the levels of nesting:
template <typename T, size_t Level>
struct CountNestedPairsImpl
{
static constexpr size_t value = Level;
};
template <typename T, size_t Level>
struct CountNestedPairsImpl<std::pair<T, size_t>, Level> : CountNestedPairsImpl<T, Level + 1>
{
using CountNestedPairsImpl<T, Level + 1>::value;
};
template <typename T>
using CountNestedPairs = CountNestedPairsImpl<T, 0>;
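As a quick sanity check, the trait can be exercised with a few static_asserts. The sketch below restates the trait so that it compiles on its own; the aliases One and Two and the helper checkOtherPair are hypothetical names used only for this check:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

template <typename T, std::size_t Level>
struct CountNestedPairsImpl {
    static constexpr std::size_t value = Level;
};

// Each std::pair<T, std::size_t> layer adds one level and recurses into T.
template <typename T, std::size_t Level>
struct CountNestedPairsImpl<std::pair<T, std::size_t>, Level>
    : CountNestedPairsImpl<T, Level + 1> {};

template <typename T>
using CountNestedPairs = CountNestedPairsImpl<T, 0>;

using One = std::pair<int, std::size_t>;
using Two = std::pair<One, std::size_t>;

static_assert(CountNestedPairs<int>::value == 0, "plain type: no nesting");
static_assert(CountNestedPairs<One>::value == 1, "one level of nesting");
static_assert(CountNestedPairs<Two>::value == 2, "two levels of nesting");

// Only pairs whose second type is std::size_t count as a nesting level.
inline void checkOtherPair() {
    assert((CountNestedPairs<std::pair<int, bool>>::value == 0));
}
```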
Then, use std::enable_if<> to generate different bodies based on the nesting level:
constexpr size_t NESTING_LIMIT = 4;
NameBase* copyAndAddInfo(size_t info) const override {
return copyAndAddInfoImpl(info);
}
template <typename U = T, typename std::enable_if<CountNestedPairs<U>::value < NESTING_LIMIT, nullptr_t>::type = nullptr>
NameBase* copyAndAddInfoImpl(size_t info) const {
return new Name<std::pair<T, size_t>>(std::make_pair(m_name, info));
}
template <typename U = T, typename std::enable_if<CountNestedPairs<U>::value >= NESTING_LIMIT, nullptr_t>::type = nullptr>
NameBase* copyAndAddInfoImpl(size_t info) const {
throw std::runtime_error("too much disjoint union nesting");
}
Why did I call this evil? It's going to generate every possible level of nesting allowed, so if you use NESTING_LIMIT=20 it will generate 20 classes per label type. But hey, at least it compiles!
https://godbolt.org/z/eaQTzB

What is the preferred way to store one or no object in c++?

In the spirit of "choose your containers wisely", I am interested in what is the best way to store either exactly one or no object, for example as a member in a class. This could be the case, e.g., if the object being held is expensive to calculate and should be cached in some way (or any other type of "late" creation).
The obvious candidates are std::vector and std::unique_ptr, for example:
class object_t;
class foo_t {
std::unique_ptr<object_t> m_cache;
public:
object_t getObject() {
if( not m_cache ) {
m_cache.reset(new object_t()); // object creation is expensive
}
return object_t(*m_cache);
}
};
and similarly with vector (or almost any other container):
class object_t;
class foo_t {
std::vector<object_t> m_cache;
public:
object_t getObject() {
if( m_cache.empty() ) {
m_cache.push_back(object_t()); // object creation is expensive
}
return m_cache.front();
}
};
Of course, there is still the possibility to have some boolean variable, which holds the state of the object:
class object_t;
class foo_t {
bool cache_healthy;
object_t m_cache;
public:
foo_t() : cache_healthy(false), m_cache() {}
object_t getObject() {
if( not cache_healthy ) {
m_cache = object_t();
cache_healthy = true;
}
return m_cache;
}
/* do other things that might set cache_healthy to false. */
};
Of the three examples, I like the last one the least, because it either creates the object twice or, if I change object_t to have a "cheap" / incomplete constructor, might return an invalid object.
The solution with the vector I dislike more semantically, because a vector (or any other container type) might give the impression that there might be more than just one object.
Now thinking of it again, I think I like the pointer solution most, however, still am not entirely happy with it and would like to hear if you know of any solution that is the most elegant in this case.
The "obvious" solution is using boost::optional or (in C++17) std::optional.
An implementation of something like this could look like the following:
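#include <new> // placement new
template <typename T>
class optional
{
public:
optional() : m_isset(false) {}
template <typename ...Args>
optional(Args... args) {
m_isset = true;
new (&m_data[0]) T { args... }; // construct a T (not an optional) in the buffer
}
// overload operator-> and operator* by reinterpret_casting m_data, throwing exceptions if m_isset == false
// a complete version also needs a destructor that calls ~T() when m_isset is true
private:
bool m_isset;
alignas(T) char m_data[sizeof(T)]; // raw storage, suitably aligned for T
};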
The disadvantages of your solutions are unneeded heap allocation in 1 and 2 and reliance on a copy in 3.
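With C++17, the cache from the question could look roughly like the following sketch. The object_t defined here is only a stand-in for the real (expensive) type, and the creations counter and invalidate() are hypothetical additions that make the caching observable:

```cpp
#include <cassert>
#include <optional>

// Stand-in for the expensive-to-create object_t of the question.
struct object_t {
    int payload = 42;
};

class foo_t {
    std::optional<object_t> m_cache;   // empty until first use, no heap allocation
    int m_creations = 0;               // only here to make the caching observable
public:
    const object_t& getObject() {
        if (!m_cache) {
            m_cache.emplace();         // constructs object_t in place, no copy
            ++m_creations;
        }
        assert(m_cache.has_value());
        return *m_cache;
    }
    void invalidate() { m_cache.reset(); }   // destroys the cached object
    int creations() const { return m_creations; }
};
```

The optional stores the object inside foo_t itself, and emplace() constructs it directly in that storage, so neither the heap allocation nor the extra copy is needed.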

shared_ptr<T> to shared_ptr<T const> and vector<T> to vector<T const>

I'm trying to define a good design for my software which implies being careful about read/write access to some variables. Here I simplified the program for the discussion. Hopefully this will be also helpful to others. :-)
Let's say we have a class X as follows:
class X {
int x;
public:
X(int y) : x(y) { }
void print() const { std::cout << "X::" << x << std::endl; }
void foo() { ++x; }
};
Let's also say that in the future this class will be subclassed with X1, X2, ... which can reimplement print() and foo(). (I omitted the required virtual keywords for simplicity here since it's not the actual issue I'm facing.)
Since we will use polymorphism, let's use (smart) pointers and define a simple factory:
using XPtr = std::shared_ptr<X>;
using ConstXPtr = std::shared_ptr<X const>;
XPtr createX(int x) { return std::make_shared<X>(x); }
Until now, everything is fine: I can define goo(p) which can read and write p and hoo(p) which can only read p.
void goo(XPtr p) {
p->print();
p->foo();
p->print();
}
void hoo(ConstXPtr p) {
p->print();
// p->foo(); // ERROR :-)
}
And the call site looks like this:
XPtr p = createX(42);
goo(p);
hoo(p);
The shared pointer to X (XPtr) is automatically converted to its const version (ConstXPtr). Nice, it's exactly what I want!
Now come the troubles: I need a heterogeneous collection of X. My choice is a std::vector<XPtr>. (It could also be a list, why not.)
The design I have in mind is the following. I have two versions of the container: one with read/write access to its elements, one with read-only access to its elements.
using XsPtr = std::vector<XPtr>;
using ConstXsPtr = std::vector<ConstXPtr>;
I've got a class that handles this data:
class E {
XsPtr xs;
public:
E() {
for (auto i : { 2, 3, 5, 7, 11, 13 }) {
xs.emplace_back(createX(std::move(i)));
}
}
void loo() {
std::cout << "\n\nloo()" << std::endl;
ioo(toConst(xs));
joo(xs);
ioo(toConst(xs));
}
void moo() const {
std::cout << "\n\nmoo()" << std::endl;
ioo(toConst(xs));
joo(xs); // Should not be allowed
ioo(toConst(xs));
}
};
The ioo() and joo() functions are as follows:
void ioo(ConstXsPtr xs) {
for (auto p : xs) {
p->print();
// p->foo(); // ERROR :-)
}
}
void joo(XsPtr xs) {
for (auto p: xs) {
p->foo();
}
}
As you can see, in E::loo() and E::moo() I have to do some conversion with toConst():
ConstXsPtr toConst(XsPtr xs) {
ConstXsPtr cxs(xs.size());
std::copy(std::begin(xs), std::end(xs), std::begin(cxs));
return cxs;
}
But that means copying everything over and over.... :-/
Also, in moo(), which is const, I can call joo() which will modify xs's data. Not what I wanted. Here I would prefer a compilation error.
The full code is available at ideone.com.
The question is: is it possible to do the same but without copying the vector to its const version? Or, more generally, is there a good technique/pattern which is both efficient and easy to understand?
Thank you. :-)
I think the usual answer is that for a class template X<T>, any X<const T> could be specialized, and therefore the compiler is not allowed to simply assume it can convert a pointer or reference to X<T> into one to X<const T>, and that there is no general way to express that those two actually are convertible. But then I thought: wait, there is a way to say X<T> IS A X<const T>. IS A is expressed via inheritance.
While this will not help you for std::shared_ptr or standard containers, it is a technique that you might want to use when you implement your own classes. In fact, I wonder if std::shared_ptr and the containers could/should be improved to support this. Can anyone see any problem with this?
The technique I have in mind would work like this:
template< typename T > struct my_ptr : my_ptr< const T >
{
using my_ptr< const T >::my_ptr;
T& operator*() const { return *this->p_; }
};
template< typename T > struct my_ptr< const T >
{
protected:
T* p_;
public:
explicit my_ptr( T* p )
: p_(p)
{
}
// just to test nothing is copied
my_ptr( const my_ptr& p ) = delete;
~my_ptr()
{
delete p_;
}
const T& operator*() const { return *p_; }
};
Live example
There is a fundamental issue with what you want to do.
A std::vector<T const*> is not a restriction of a std::vector<T*>, and the same is true of vectors containing smart pointers and their const versions.
Concretely, I can store a pointer to const int foo = 7; in the first container, but not the second. std::vector is both a range and a container. It is similar to the T** vs T const** problem.
Now, technically std::vector<T const*> const is a restriction of std::vector<T*>, but that is not supported.
A way around this is to start working with range views: non-owning views into other containers. A non-owning T const* iterator view into a std::vector<T*> is possible, and can give you the interface you want.
boost::range can do the boilerplate for you, but writing your own contiguous_range_view<T> or random_range_view<RandomAccessIterator> is not hard. It gets fancy when you want to auto-detect the iterator category and enable capabilities based on that, which is why boost::range contains much more code.
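A hand-written view of this kind could be sketched as follows. ConstPtrView and sum are hypothetical names invented for this sketch (they are not boost::range APIs), and X is a cut-down version of the class from the question. The view only holds a pointer to the existing vector and hands out const references to the pointees, so nothing is copied:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical minimal read-only view over a vector of shared_ptr<T>.
template <typename T>
class ConstPtrView {
    using base_iterator = typename std::vector<std::shared_ptr<T>>::const_iterator;
    const std::vector<std::shared_ptr<T>>* m_vec;   // not owned
public:
    class iterator {
        base_iterator m_it;
    public:
        explicit iterator(base_iterator it) : m_it(it) {}
        const T& operator*() const {
            assert(*m_it != nullptr);
            return **m_it;               // the element is exposed as read-only
        }
        const T* operator->() const { return m_it->get(); }
        iterator& operator++() { ++m_it; return *this; }
        bool operator==(const iterator& o) const { return m_it == o.m_it; }
        bool operator!=(const iterator& o) const { return m_it != o.m_it; }
    };
    explicit ConstPtrView(const std::vector<std::shared_ptr<T>>& v) : m_vec(&v) {}
    iterator begin() const { return iterator(m_vec->begin()); }
    iterator end() const { return iterator(m_vec->end()); }
    std::size_t size() const { return m_vec->size(); }
};

// Cut-down version of the question's class X.
struct X {
    int x;
    explicit X(int y) : x(y) {}
    int get() const { return x; }
    void foo() { ++x; }
};

// Reads every element through the view without copying the vector.
inline int sum(ConstPtrView<X> xs) {
    int total = 0;
    for (const X& e : xs) {
        total += e.get();
        // e.foo();   // would not compile: e is const
    }
    return total;
}
```

The view is cheap to pass by value, but it must not outlive the vector it refers to.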
Hiura,
I've tried to compile your code from repo and g++4.8 returned some errors.
changes in main.cpp:97 and the remaining lines calling view::create() with lambda function as the second argument.
+add+
auto f_lambda([](view::ConstRef_t<view::ElementType_t<Element>> const& e) { return ((e.getX() % 2) == 0); });
std::function<bool(view::ConstRef_t<view::ElementType_t<Element>>)> f(std::cref(f_lambda));
+mod+
printDocument(view::create(xs, f));
also View.hpp:185 required additional operator, namely:
+add+
bool operator==(IteratorBase const& a, IteratorBase const& b)
{
return a.self == b.self;
}
BR,
Marek Szews
Based on the comments and answers, I ended up creating views for containers.
Basically I defined new iterators. I created a project on github here: mantognini/ContainerView.
The code can probably be improved, but the main idea is to have two template classes, View and ConstView, on an existing container (e.g. std::vector<T>), each with begin() and end() methods for iterating over the underlying container.
With a little bit of inheritance (View is a ConstView) it is possible to convert a read-write view to a read-only view when needed without extra code.
Since I don't like pointers, I used template specialization to hide std::shared_ptr: a view on a container of std::shared_ptr<T> won't require extra dereferencing. (I haven't implemented it yet for raw pointers since I don't use them.)
Here is a basic example of my views in action.

How to create operator-> in iterator without a container?

template <class Enum>
class EnumIterator {
public:
const Enum* operator-> () const {
return &(Enum::OfInt(i)); // warning: taking address of temporary
}
const Enum operator* () const {
return Enum::OfInt(i); // There is no problem with this one!
}
private:
int i;
};
I get this warning above. Currently I'm using this hack:
template <class Enum>
class EnumIterator {
public:
const Enum* operator-> () {
tmp = Enum::OfInt(i);
return &tmp;
}
private:
int i;
Enum tmp;
};
But this is ugly because iterator serves as a missing container.
What is the proper way to iterate over range of values?
Update:
The iterator is specialized to a particular set of objects which support a named static constructor OfInt (code snippet updated).
Please do not nit-pick about the code I pasted, but just ask for clarification. I tried to extract a simple piece.
In case you want to know, Enum will be a strong enum type (essentially an int packed into a class). There will be a typedef EnumIterator < EnumX > Iterator; inside class EnumX.
Update 2:
consts added to indicate that members of strong enum class that will be accessed through -> do not change the returned temporary enum.
Updated the code with operator* which gives no problem.
Enum* operator-> () {
tmp = Enum::OfInt(i);
return &tmp;
}
The problem with this isn't that it's ugly, but that it's not safe. What happens, for example, in code like the following?
void f(EnumIterator it)
{
g(*it, *it);
}
Now g() ends up with two pointers, both of which point to the same internal temporary that was supposed to be an implementation detail of your iterator. If g() writes through one pointer, the other value changes, too. Ouch.
Your problem is that this function is supposed to return a pointer, but you have no object to point to. No matter what, you will have to fix this.
I see two possibilities:
Since this thing seems to wrap an enum, and enumeration types have no members, that operator-> is useless anyway (it won't be instantiated unless called, and it cannot be called as this would result in a compile-time error) and can safely be omitted.
Store an object of the right type (something like Enum::enum_type) inside the iterator, and cast it to/from int only if you want to perform integer-like operations (e.g., increment) on it.
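The second possibility could be sketched like this. Color, its ToInt and IsEven members, and the helper countEven are assumptions made up for the example; only OfInt comes from the question. The iterator keeps the current value as a member, so operator-> has a real object to point to:

```cpp
#include <cassert>

// Hypothetical strong enum: an int wrapped in a class, with the OfInt
// factory from the question and an assumed ToInt accessor.
class Color {
    int m_raw;
    explicit Color(int raw) : m_raw(raw) {}
public:
    static Color OfInt(int i) { return Color(i); }
    int ToInt() const { return m_raw; }
    bool IsEven() const { return m_raw % 2 == 0; }
};

template <class Enum>
class EnumIterator {
    Enum m_value;   // the iterator holds the current value itself
public:
    explicit EnumIterator(int i) : m_value(Enum::OfInt(i)) {}
    const Enum& operator*() const { return m_value; }
    const Enum* operator->() const { return &m_value; }   // points at a real object
    EnumIterator& operator++() {
        // Convert to int only for the arithmetic step.
        m_value = Enum::OfInt(m_value.ToInt() + 1);
        return *this;
    }
    bool operator!=(const EnumIterator& o) const {
        return m_value.ToInt() != o.m_value.ToInt();
    }
};

// Counts the even values in the half-open range [first, last).
inline int countEven(int first, int last) {
    assert(first <= last);
    int n = 0;
    for (EnumIterator<Color> it(first), end(last); it != end; ++it) {
        if (it->IsEven()) ++n;
    }
    return n;
}
```

The returned pointer and reference refer to the iterator's own member, so they are only meaningful until the iterator is incremented or destroyed.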
There are many kinds of iterators.
On a vector for example, iterators are usually plain pointers:
template <class T>
class Iterator
{
public:
T* operator->() { return m_pointer; }
private:
T* m_pointer;
};
But this works because a vector is, in fact, just an array.
On a doubly-linked list it would be different: the list would be composed of nodes.
template <class T>
struct Node
{
Node* m_prev;
Node* m_next;
T m_value;
};
template <class T>
class Iterator
{
public:
T* operator->() { return &m_node->m_value; } // return the address of the stored value
private:
Node<T>* m_node;
};
Usually, you want your iterators to be as light as possible, because they are passed around by value, so a pointer into the underlying container makes sense.
You might want to add extra debugging capabilities:
possibility to invalidate the iterator
range checking possibility
container checking (ie, checking when comparing 2 iterators that they refer to the same container to begin with)
But those are niceties, and to begin with, this is a bit more complicated.
Note also Boost.Iterator which helps with the boiler-plate code.
EDIT: (update 1 and 2 grouped)
In your case, it's fine if your iterator is just an int, you don't need more. In fact, for your strong enum you don't even need an iterator, you just need operator++ and operator-- :)
The point of having a reference to the container is usually to implement those ++ and -- operators. But from your element, just having an int (assuming it's large enough), and a way to get to the previous and next values is sufficient.
It would be easier though, if you had a static vector then you could simply reuse a vector iterator.
An iterator iterates on a specific container. The implementation depends on what kind of container it is. The pointer you return should point to a member of that container. You don't need to copy it, but you do need to keep track of what container you're iterating on, and where you're at (e.g. index for a vector) presumably initialized in the iterator's constructor. Or just use the STL.
What does OfInt return? It appears to be returning the wrong type in this case. It should be returning a T*; instead it seems to be returning a T by value, which you are then taking the address of. This may produce incorrect behavior, since it will lose any update made through ->.
As there is no container I settled on merging iterator into my strong Enum.
I init raw int to -1 to support empty enums (limit == 0) and be able to use regular for loop with TryInc.
Here is the code:
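#include <iostream>
typedef unsigned int uint; // uint is not a standard type, so define it explicitly
template <uint limit>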
class Enum {
public:
static const uint kLimit = limit;
Enum () : raw (-1) {
}
bool TryInc () {
if (raw+1 < kLimit) {
raw += 1;
return true;
}
return false;
}
uint GetRaw() const {
return raw;
}
void SetRaw (uint raw) {
this->raw = raw;
}
static Enum OfRaw (uint raw) {
return Enum (raw);
}
bool operator == (const Enum& other) const {
return this->raw == other.raw;
}
bool operator != (const Enum& other) const {
return this->raw != other.raw;
}
protected:
explicit Enum (uint raw) : raw (raw) {
}
private:
uint raw;
};
The usage:
class Color : public Enum <10> {
public:
static const Color red;
// constructors should be automatically forwarded ...
Color () : Enum<10> () {
}
private:
Color (uint raw) : Enum<10> (raw) {
}
};
const Color Color::red = Color(0);
int main() {
Color red = Color::red;
for (Color c; c.TryInc();) {
std::cout << c.GetRaw() << std::endl;
}
}