Stop an ever-growing, unneeded recursive template instantiation - C++

I'm implementing a graph class in which each vertex has a Label, and the Labels are not necessarily all of the same type. I want the user to be able to provide any Labels (at compile time), without the Graph or the Vertex knowing what the type is. For this, I used templated polymorphism, which I've hidden inside a Label class so that the Labels have value semantics. It works like a charm, and the relevant code is this (ignore the commented parts for now):
//Label.hpp:
#include <memory>

class Label {
public:
    template<class T> Label(const T& name) : m_pName(new Name<T>(name)) {}
    Label(const Label& other) : m_pName(other.m_pName->copy()) {}
    // Label(const Label& other, size_t extraInfo) : m_pName(other.m_pName->copyAndAddInfo(extraInfo)) {}
    bool operator==(const Label& other) const { return *m_pName == *other.m_pName; }

private:
    struct NameBase {
    public:
        virtual ~NameBase() = default;
        virtual NameBase* copy() const = 0;
        // virtual NameBase* copyAndAddInfo(size_t info) const = 0;
        virtual bool operator==(const NameBase& other) const = 0;
    };

    template<class T> struct Name : NameBase {
    public:
        Name(T name) : m_name(std::move(name)) {}
        NameBase* copy() const override { return new Name<T>(m_name); }
        // NameBase* copyAndAddInfo(size_t info) const override {
        //     return new Name<std::pair<T, size_t>>(std::make_pair(m_name, info));
        // }
        bool operator==(const NameBase& other) const override {
            const auto pOtherCasted = dynamic_cast<const Name<T>*>(&other);
            if(pOtherCasted == nullptr) return false;
            return m_name == pOtherCasted->m_name;
        }
    private:
        T m_name;
    };

    std::unique_ptr<NameBase> m_pName;
};
One requirement of the user (aka me) is to be able to create disjoint unions of Graphs (he is already able to create dual Graphs, unions of Graphs where vertices having the same Label are mapped to the same vertex, etc.). The wish is that the labels of the new Graph are pairs of the old label and some integer denoting which graph the label came from (this also ensures that the new labels are all distinct). For this, I thought I could use the commented parts of the Label class, but the problem my g++ (C++17) compiler has is that the moment I define the first Label with some type T, it tries to instantiate everything that could ever be used:
Name<T>, Name<std::pair<T, size_t>>, Name<std::pair<std::pair<T, size_t>, size_t>>, ...
Try, for example, to compile this (just an example that otherwise works):
// testLabel.cpp:
#include "Label.hpp"
#include <utility> // std::make_pair
#include <vector>
#include <iostream>

int main() {
    std::vector<Label> labels;
    labels.emplace_back(5);
    labels.emplace_back(2.1);
    labels.emplace_back(std::make_pair(true, 2));

    Label testLabel(std::make_pair(true, 2));
    for(const auto& label : labels)
        std::cout << (label == testLabel) << std::endl;
    return 0;
}
The compilation just freezes. (I do not get the message "maximum template recursion capacity exceeded" that I saw others get, but it obviously tries to instantiate everything.) I've tried to separate the function into another class and explicitly instantiate only the needed templates, in order to trick the compiler, but with no effect.
The desired behaviour (I do not know if it is possible) is to instantiate the used template classes (together with their member function declarations), but define the member functions lazily, i.e. only if they really get called. For example, if I call Label(3), there should be a class Name<int>, but the function
NameBase* Name<int>::copyAndAddInfo(size_t info) const;
should only be defined if I actually call it at some point (thus, Name<std::pair<int, size_t>> would only be instantiated on demand).
It feels like something that should be doable, since the compiler already defines templated functions on demand.
An idea would be to completely change the implementation and use variants, but:
I do not want to manually keep track of the types the user needs, and
I quite like this implementation approach and want to see its limits before changing it.
Does anyone have any hints on how I could solve this problem?

To directly answer your question: the virtual-and-template combination makes it impossible for the compiler to lazily instantiate the body of copyAndAddInfo. The pointer to the virtual base type hides the type information, so when the compiler sees other.m_pName->copyAndAddInfo, it cannot know which type it would need to lazily instantiate.
EDIT:
Ok, so based on your rationale for using templates, it seems like you only want to accept labels of different types, and might not actually care whether the disjoint-union information is part of the type. If that's the case, you could move it from the name to the label and make it run-time information:
class Label {
public:
    template<class T> Label(const T& name) : m_pName(new Name<T>(name)) {}
    Label(const Label& other) : m_extraInfo(other.m_extraInfo), m_pName(other.m_pName->copy()) {}
    Label(const Label& other, size_t extraInfo) : m_extraInfo(other.m_extraInfo), m_pName(other.m_pName->copy()) {
        m_extraInfo.push_back(extraInfo);
    }
    bool operator==(const Label& other) const {
        return *m_pName == *other.m_pName && std::equal(
            m_extraInfo.begin(), m_extraInfo.end(),
            other.m_extraInfo.begin(), other.m_extraInfo.end());
    }
private:
    struct NameBase { /* same as before, Name<T> as well */ };
    std::vector<size_t> m_extraInfo;   // needs <vector>; std::equal above needs <algorithm>
    std::unique_ptr<NameBase> m_pName;
};
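With this version the tagging becomes run-time data; a small usage sketch (assuming the Label above, and <cassert> for the checks):

Label original(5);
Label fromGraph0(original, 0);       // same name, extra info {0}
Label fromGraph1(original, 1);       // same name, extra info {1}
assert(original == original);        // equal name, equal (empty) extra info
assert(!(fromGraph0 == fromGraph1)); // names match, but the extra info differs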
If the disjoint union info being part of the type is important, then please enjoy my original sarcastic answer below.
ORIGINAL ANSWER:
That said, if you're willing to put a cap on the recursion depth, I have an evil solution for you that works for up to N levels of nesting: use template tricks to count the level of nesting, then use SFINAE to throw an error after N levels instead of recursing forever.
First, to count the levels of nesting:
template <typename T, size_t Level>
struct CountNestedPairsImpl
{
static constexpr size_t value = Level;
};
template <typename T, size_t Level>
struct CountNestedPairsImpl<std::pair<T, size_t>, Level> : CountNestedPairsImpl<T, Level + 1>
{
using CountNestedPairsImpl<T, Level + 1>::value;
};
template <typename T>
using CountNestedPairs = CountNestedPairsImpl<T, 0>;
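A quick sanity check of the counter (using static_assert; <utility> provides std::pair):

#include <utility>
static_assert(CountNestedPairs<int>::value == 0, "no nesting");
static_assert(CountNestedPairs<std::pair<int, size_t>>::value == 1, "one level");
static_assert(CountNestedPairs<std::pair<std::pair<int, size_t>, size_t>>::value == 2, "two levels");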
Then, use std::enable_if<> to generate different bodies based on the nesting level:
constexpr size_t NESTING_LIMIT = 4;

NameBase* copyAndAddInfo(size_t info) const override {
    return copyAndAddInfoImpl(info);
}

template <typename U = T, typename std::enable_if<CountNestedPairs<U>::value < NESTING_LIMIT, std::nullptr_t>::type = nullptr>
NameBase* copyAndAddInfoImpl(size_t info) const {
    return new Name<std::pair<T, size_t>>(std::make_pair(m_name, info));
}

template <typename U = T, typename std::enable_if<CountNestedPairs<U>::value >= NESTING_LIMIT, std::nullptr_t>::type = nullptr>
NameBase* copyAndAddInfoImpl(size_t info) const {
    throw std::runtime_error("too much disjoint union nesting"); // needs <stdexcept>
}
Why did I call this evil? It's going to generate every possible level of nesting allowed, so if you use NESTING_LIMIT=20 it will generate 20 classes per label type. But hey, at least it compiles!
https://godbolt.org/z/eaQTzB

Related

Can static polymorphism (templates) be used despite type erasure?

Having returned relatively recently to C++ after decades of Java, I am currently struggling with a template-based approach to data conversion for instances where type erasure has been applied. Please bear with me; my nomenclature may still be off for C++ natives.
This is what I am trying to achieve:
Implement dynamic variables which are able to hold essentially any value type
Access the content of those variables using various other representations (string, ints, binary, ...)
Be able to hold variable instances in containers, independent of their value type
Convert between variable value and representation using conversion functions
Be able to introduce new representations just by providing new conversion functions
Constraints: use only C++-11 features if possible, no use of libraries like boost::any etc.
A rough sketch of this might look like this:
#include <iostream>
#include <string>
#include <vector>

void convert(const std::string &f, std::string &t) { t = f; }
void convert(const int &f, std::string &t) { t = std::to_string(f); }
void convert(const std::string &f, int &t) { t = std::stoi(f); }
void convert(const int &f, int &t) { t = f; }

struct Variable {
    virtual void get(int &i) = 0;
    virtual void get(std::string &s) = 0;
};

template <typename T> struct VariableImpl : Variable {
    T value;
    VariableImpl(const T &v) : value{v} {};
    void get(int &i) { convert(value, i); };
    void get(std::string &s) { convert(value, s); };
};

int main() {
    VariableImpl<int> v1{42};
    VariableImpl<std::string> v2{"1234"};
    std::vector<Variable *> vars{&v1, &v2};
    for (auto &v : vars) {
        int i;
        v->get(i);
        std::string s;
        v->get(s);
        std::cout << "int representation: " << i <<
            ", string representation: " << s << std::endl;
    }
    return 0;
}
The code does what it is supposed to do, but obviously I would like to get rid of Variable::get(int/std::string/...) and instead template them, because otherwise every new representation requires a declaration and an implementation, with the latter being exactly the same as all the others.
I've played with various approaches so far, like virtual templated methods, applying CRTP with an intermediate type, and various forms of wrappers, yet in all of them I get bitten by the erased value type of VariableImpl. On one hand, I think there might not be a solution, because after type erasure the compiler cannot possibly know which templated getters and converter calls it must generate. On the other hand, I think I might be missing something really essential here and there should be a solution despite the constraints mentioned above.
This is a classical double dispatch problem. The usual solution to this problem is to have some kind of dispatcher class with multiple implementations of the function you want to dispatch (get in your case). This is called the visitor pattern. The well-known drawback of it is the dependency cycle it creates (each class in the hierarchy depends on all other classes in the hierarchy). Thus there's a need to revisit it each time a new type is added. No amount of template wizardry eliminates it.
You don't have a specialised Visitor class, your Variable serves as a Visitor of itself, but this is a minor detail.
Since you don't like this solution, there is another one. It uses a registry of functions populated at run time and keyed on type identification of their arguments. This is sometimes called "Acyclic Visitor".
Here's a half-baked C++11-friendly implementation for your case.
#include <map>
#include <vector>
#include <typeinfo>
#include <typeindex>
#include <utility>
#include <functional>
#include <string>
#include <stdexcept>
struct Variable
{
virtual void convertValue(Variable& to) const = 0;
virtual ~Variable() {};
virtual std::type_index getTypeIdx() const = 0;
template <typename K> K get() const;
static std::map<std::pair<std::type_index, std::type_index>,
std::function<void(const Variable&, Variable&)>>
conversionMap;
template <typename T, typename K>
static void registerConversion(K (*fn)(const T&));
};
template <typename T>
struct VariableImpl : Variable
{
T value;
VariableImpl(const T &v) : value{v} {};
VariableImpl() : value{} {}; // this is needed for the declaration of
// VariableImpl<K> below.
// It can be avoided, but that is
// a story for another day.
void convertValue(Variable& to) const override
{
auto typeIdxFrom = getTypeIdx();
auto typeIdxTo = to.getTypeIdx();
if (typeIdxFrom == typeIdxTo) // no conversion needed
{
dynamic_cast<VariableImpl<T>&>(to).value = value;
}
else
{
auto fcnIter = conversionMap.find({getTypeIdx(), to.getTypeIdx()});
if (fcnIter != conversionMap.end())
{
fcnIter->second(*this, to);
}
else
throw std::logic_error("no conversion");
}
}
std::type_index getTypeIdx() const override
{
return std::type_index(typeid(T));
}
};
template <typename K> K Variable::get() const
{
VariableImpl<K> vk;
convertValue(vk);
return vk.value;
}
template <typename T, typename K>
void Variable::registerConversion(K (*fn)(const T&))
{
// add a mutex if you ever spread this over multiple threads
conversionMap[{std::type_index(typeid(T)), std::type_index(typeid(K))}] =
[fn](const Variable& from, Variable& to) {
dynamic_cast<VariableImpl<K>&>(to).value =
fn(dynamic_cast<const VariableImpl<T>&>(from).value);
};
}
Now of course you need to call registerConversion e.g. at the beginning of main and pass it each conversion function.
Variable::registerConversion(int_to_string);
Variable::registerConversion(string_to_int);
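int_to_string and string_to_int are not shown above; any functions with the K (*)(const T&) shape that registerConversion expects will do, for instance something along these lines:

std::string int_to_string(const int& i) { return std::to_string(i); }
int string_to_int(const std::string& s) { return std::stoi(s); }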
This is not ideal, but hardly anything is ever ideal.
Having said all that, I would recommend you revisit your design. Do you really need all these conversions? Why not pick one representation and stick with it?
Implement dynamic variables which are able to hold essentially any value type
Be able to hold variable instances in containers, independent of their value type
These two requirements are quite challenging on their own. Class templates don't really encourage inheritance, and you already did the right thing to get what you asked for: you introduced a common base class for the class template, which you can later refer to in order to store pointers of said type in a collection.
Access the content of those variables using various other representations (string, ints, binary, ...)
Be able to introduce new representations just by providing new conversion functions
This is where it breaks. Function templates assume a common implementation for different types, while inheritance assumes different implementations for the same type.
Your goal is to introduce different implementations for different types, so to make your requirements viable you have to switch to one of those two options instead (or put up with a number of functions for each case, which you have already introduced yourself).
Edit:
One of the strategies you may employ to enforce the inheritance approach is generalisation of the arguments, to the extent where they can be used interchangeably by the abstract interface. E.g. you may wrap the converting arguments inside a union like this:
struct Variable {
struct converter_type {
enum { INT, STRING } type;
union {
int* m_int;
std::string* m_string;
};
};
virtual void get(converter_type& var) = 0;
virtual ~Variable() = default;
};
And then take whatever part of it inside of the implementation:
void get(converter_type& var) override {
    switch (var.type) {
    case converter_type::INT:
        convert(value, *var.m_int);    // the union stores pointers, so dereference
        break;
    case converter_type::STRING:
        convert(value, *var.m_string);
        break;
    }
}
To be honest I don't think this is a less verbose approach compared to just having a number of functions for each type combination, but I think you got the idea that you can just wrap your arguments somehow to cement the abstract class interface.
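For illustration, a call site for this union-based interface could look roughly like this (someVariable standing for any Variable*):

int i = 0;
Variable::converter_type request;
request.type = Variable::converter_type::INT;
request.m_int = &i;            // point the union at the caller's storage
someVariable->get(request);    // the override fills i through the pointer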
Implement std::any. It is similar to boost::any.
Create a conversion dispatcher based off typeids. Store your any alongside the conversion dispatcher.
"new conversion functions" have to be passed to the dispatcher.
When asked to convert to a type, pass that typeid to the dispatcher.
So we start with these types:
using any = std::any; // implement this
using converter = std::function<any(any const&)>;
using convert_table = std::map<std::type_index, converter>;
using convert_lookup = convert_table(*)();
template<class T>
convert_table& lookup_convert_table() {
static convert_table t;
return t;
}
struct converter_any: any {
template<class T,
typename std::enable_if<
!std::is_same<typename std::decay<T>::type, converter_any>::value, bool
>::type = true
>
converter_any( T&& t ):
any(std::forward<T>(t)),
table(&lookup_convert_table<typename std::decay<T>::type>())
{}
converter_any(converter_any const&)=default;
converter_any(converter_any &&)=default;
converter_any& operator=(converter_any const&)=default;
converter_any& operator=(converter_any&&)=default;
~converter_any()=default;
converter_any()=default;
convert_table const* table = nullptr;
template<class U>
U convert_to() const {
if (!table)
throw 1; // make a better exception than int
auto it = table->find(typeid(U));
if (it == table->end())
throw 2; // make a better exception than int
any const& self = *this;
return any_cast<U>((it->second)(self));
}
};
template<class Dest, class Src>
bool add_converter_to_table( Dest(*f)(Src const&) ) {
lookup_convert_table<Src>()[typeid(Dest)] = [f](any const& s)->any {
Src src = std::any_cast<Src>(s);
auto r = f(src);
return r;
};
return true;
}
now your code looks like:
const bool bStringRegistered =
add_converter_to_table(+[](std::string const& f)->std::string{ return f; })
&& add_converter_to_table(+[](std::string const& f)->int{ return std::stoi(f); });
const bool bIntRegistered =
add_converter_to_table(+[](int const& i)->int{ return i; })
&& add_converter_to_table(+[](int const& i)->std::string{ return std::to_string(i); });
int main() {
converter_any v1{42};
converter_any v2{std::string("1234")};
std::vector<converter_any> vars{v1, v2}; // copies!
for (auto &v : vars) {
int i = v.convert_to<int>();
std::string s = v.convert_to<std::string>();
std::cout << "int representation: " << i <<
", string representation: " << s << std::endl;
}
}
live example.
...
Ok, what did I do?
I used any to be a smart void* that can store anything. Rewriting this is a bad idea, use someone else's implementation.
Then, I augmented it with a manually written virtual function table. Which table I add is determined by the constructor of my converter_any; here, I know the type stored, so I can store the right table.
Typically when using this technique, I'd know what functions are in there. For your implementation we do not; so the table is a map from the type id of the destination, to a conversion function.
The conversion function takes anys and returns anys -- again, don't repeat this work. And now it has a fixed signature.
To add support for a type, you independently register conversion functions. Here, my conversion function registration helper deduces the from type (to determine which table to register it in) and the destination type (to determine which entry in the table), and then automatically writes the any boxing/unboxing code for you.
...
At a higher level, what I'm doing is writing my own type erasure and object model. C++ has enough power that you can write your own object models, and when you want features that the default object model doesn't solve, well, roll a new object model.
Second, I'm using value types. A Java programmer isn't used to value types having polymorphic behavior, but much of C++ works much better if you write your code using value types.
So my converter_any is a polymorphic value type. You can store copies of them in vectors etc, and it just works.

Is there a way to simultaneously assign a type to multiple templates in C++?

This question is based on the example code below, which is inspired by Sean Parent's talk.
The goal of the code below is to provide an object wrapper similar to boost::any. I wrote this code to educate myself about type erasure, so there are no practical uses intended for it (considering boost::any already exists).
class ObjWrap {
public:
    template <typename T>
    ObjWrap(T O) : Self(new Obj<T>(std::move(O))) {}

    template <typename T>
    friend T * getObjPtr(ObjWrap O) {
        return static_cast<T*>(O.Self->getObjPtr_());
    }

private:
    struct Concept {
        virtual ~Concept() = 0;
        virtual void* getObjPtr_() = 0;
    };

    template <typename T>
    struct Obj : Concept {
        Obj(T O) : Data(std::move(O)) {}
        void* getObjPtr_() { return static_cast<void*>(&Data); }
        T Data;
    };

    std::unique_ptr<Concept> Self;
};
Before I can really ask my question, let's examine the code in the following aspects:
Concept::getObjPtr_ returns void* because a) Concept cannot be a template otherwise unique_ptr<Concept> Self would not work; b) void* is the only way I know how to return Obj::Data in a type-agnostic way in C++. Please correct me if this is wrong...
T * getObjPtr(ObjWrap O) is a template that needs instantiation separately from the ObjWrap constructor.
The use of ObjWrap basically includes: a) make a new ObjWrap over an existing object; b) retrieve the underlying object given an ObjWrap. For example:
ObjWrap a(1);
ObjWrap b(std::string("b"));
int* p_a = getObjPtr<int>(a);
std::string* p_b = getObjPtr<std::string>(b);
This works but it is obvious that getObjPtr<int>(b) does not work as intended.
So, my question is:
Is there a way to fix the above code so that we can simply use int* p_a = getObjPtr(a) and std::string* p_b = getObjPtr(b) or better yet auto p_a = getObjPtr(a) and auto p_b = getObjPtr(b)? In other words, is there a way in C++ to instantiate two templates at the same time (if so, we can instantiate the ObjWrap constructor and T* getObjPtr(ObjWrap) at compile time of a ObjWrap object, e.g., ObjWrap a(1))?
Edit 1:
Making ObjWrap a templated class does not help since it defeats the purpose of type erasure.
template <typename T>
class ObjWrap {
/* ... */
};
ObjWrap<int> a(1); // this is no good for type erasure.
Edit 2:
I was reading the code and realize that it can be modified to reflect the idea a little better. So, please also look at the following code:
class ObjWrap {
public:
template <typename T>
ObjWrap(T O) : Self(new Obj<T>(std::move(O))) {}
template <typename T>
T * getObjPtr() {
return static_cast<T*>(Self->getObjPtr_());
}
private:
struct Concept {
virtual ~Concept() = 0;
virtual void* getObjPtr_() = 0;
};
template <typename T>
struct Obj : Concept {
Obj(T O) : Data(std::move(O)) {}
void* getObjPtr_() { return static_cast<void*>(&Data); }
T Data;
};
std::unique_ptr<Concept> Self;
};
int main() {
ObjWrap a(1);
ObjWrap b(std::string("b"));
int* p_a = a.getObjPtr<int>();
std::string* p_b = b.getObjPtr<std::string>();
std::cout << *p_a << " " << *p_b << "\n";
return 0;
}
The main difference between this version of the code versus the one above is that T * getObjPtr() is a member function that is encapsulated by the ObjWrap object.
Edit 3:
My question regarding type erasure is answered by the accepted answer. However, the question on simultaneous type instantiation for multiple templates is yet to be answered. My guess is that C++ currently does not allow it, but it would be nice to hear from people with more experience on that.
There are a few things that may help.
First thing to say is that if Obj ever needs to expose the address of the object, it's not Sean Parent's 'inheritance is the root of all evil' type-erasing container.
The trick is to ensure that the interface of Obj offers all semantic actions and queries the wrapper will ever need.
In order to provide this, it's often a reasonable idea to cache the address of the object and its type_id in the concept.
Consider the following updated example, in which there is one public method - operator==. The rule is that two ObjWraps are equal if they contain the same type of object and those objects compare equal.
Note that the address and type_id:
1) are implementation details and not exposed on the interface of ObjWrap;
2) are accessible without virtual calls, which short-circuits the not-equal case.
#include <memory>
#include <utility>
#include <typeinfo>
#include <utility>
#include <cassert>
#include <iostream>
class ObjWrap
{
public:
template <typename T>
ObjWrap(T O) : Self(new Model<T>(std::move(O))) {}
// objects are equal if they contain the same type of model
// and the models compare equal
bool operator==(ObjWrap const& other) const
{
// note the short-circuit when the types are not the same
// this means is_equal can guarantee that the address can be cast
// without a further check
return Self->info == other.Self->info
&& Self->is_equal(other.Self->addr);
}
bool operator!=(ObjWrap const& other) const
{
return !(*this == other);
}
friend std::ostream& operator<<(std::ostream& os, ObjWrap const& o)
{
return o.Self->emit(os);
}
private:
struct Concept
{
// cache the address and type here in the concept.
void* addr;
std::type_info const& info;
Concept(void* address, std::type_info const& info)
: addr(address)
, info(info)
{}
virtual ~Concept() = default;
// this is the concept's interface
virtual bool is_equal(void const* other_address) const = 0;
virtual std::ostream& emit(std::ostream& os) const = 0;
};
template <typename T>
struct Model : Concept
{
Model(T O)
: Concept(std::addressof(Data), typeid(T))
, Data(std::move(O)) {}
// no need to check the pointer before casting it.
// Obj takes care of that
/// #pre other_address is a valid pointer to a T
bool is_equal(void const* other_address) const override
{
return Data == *(static_cast<T const*>(other_address));
}
std::ostream& emit(std::ostream& os) const override
{
return os << Data;
}
T Data;
};
std::unique_ptr<Concept> Self;
};
int main()
{
auto x = ObjWrap(std::string("foo"));
auto y = ObjWrap(std::string("foo"));
auto z = ObjWrap(int(2));
assert(x == y);
assert(y != z);
std::cout << x << " " << y << " " << z << std::endl;
}
http://coliru.stacked-crooked.com/a/dcece2a824a42948
(etc. etc.) Please correct me if this is wrong...
Your premise is wrong at least in principle, if not also in practice. You're insisting on making getObjPtr() a virtual method, and using an abstract base class. But - you've not established this is necessary. Remember - using virtual methods is expensive! Why should I pay for virtuals just to get type erasure?
Is there a way to fix the above code so that we can simply use int* p_a = getObjPtr(a)
Take Sean Parent's talk title to heart (as opposed to the fact that he does use inheritance in the talk), drop the inheritance and the answer should be Yes. Edit: It's sufficient for the code that erases the type and the code that un-erases the type to know what the type is - as long as you don't need to act on the type-erased data in a type-specific way. In Sean Parent's talk, you need to be able to make non-trivial copies of it, to move it, to draw it etc. With std::any/boost::any you might need copying and moving, which may require virtuals - but that's the most general use case.
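To make the no-virtuals point concrete, here is a minimal sketch (not from the talk) that erases the type with nothing more than a void*, a type_info pointer and a deleter function pointer; un-erasing still requires the caller to name the type, which is exactly the premise above:

#include <typeinfo>
#include <utility>

class PlainWrap {
public:
    template <typename T>
    PlainWrap(T o)
        : Self(new T(std::move(o))),
          Info(&typeid(T)),
          Deleter([](void* p) { delete static_cast<T*>(p); }) {}
    ~PlainWrap() { Deleter(Self); }
    PlainWrap(const PlainWrap&) = delete;            // copying would need extra machinery
    PlainWrap& operator=(const PlainWrap&) = delete;

    template <typename T>
    T* getObjPtr() {
        // Un-erase only if the stored type matches; no virtual dispatch involved.
        return *Info == typeid(T) ? static_cast<T*>(Self) : nullptr;
    }

private:
    void* Self;
    const std::type_info* Info;
    void (*Deleter)(void*);
};

Usage mirrors the question: PlainWrap a(1); int* p_a = a.getObjPtr<int>(); asking for the wrong type, e.g. a.getObjPtr<std::string>(), returns nullptr instead of an invalid cast.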
Even std::any limits what you can and can't do, as is discussed in this question:
why doesn't std::any_cast support implicit conversion?

Runtime type resolution on a non-default constructible class without polymorphism

I have a bit of a puzzle. I have a class template graph with a template parameter - a vertex class that can be either symmetric or asymmetric, compressed or raw - and I only know which at runtime.
So if I wanted to get a graph of the appropriate type from disk, run Bellman-Ford on it and then free the memory, I would need to repeat the template instantiation in all four branches of the conditionals, like so:
#include "graph.h"
int main(){
    // parse cmd-line args, to get `compressed` `symmetric`
    // TODO get rid of conditionals.
    if (compressed) {
        if (symmetric) {
            graph<compressedSymmetricVertex> G =
                readCompressedGraph<compressedSymmetricVertex>(iFile, symmetric, mmap);
            bellman_ford(G,P);
        } else {
            graph<compressedAsymmetricVertex> G =
                readCompressedGraph<compressedAsymmetricVertex>(iFile, symmetric, mmap);
            bellman_ford(G,P);
            if(G.transposed) G.transpose();
            G.del();
        }
    } else {
        if (symmetric) {
            graph<symmetricVertex> G =
                readGraph<symmetricVertex>(iFile, compressed, symmetric, binary, mmap);
            bellman_ford(G,P);
            G.del();
        } else {
            graph<asymmetricVertex> G =
                readGraph<asymmetricVertex>(iFile, compressed, symmetric, binary, mmap);
            bellman_ford(G,P);
            if(G.transposed) G.transpose();
            G.del();
        }
    }
    return 0;
}
QUESTION: How can I extract everything except the call to the readGraph functions outside the conditionals, given the following restrictions?
I cannot modify the graph template. Otherwise I would have simply moved the Vertex type into a union.
I cannot use std::variant because graph<T> is not default constructible.
Call overhead is an issue. If there are subtype-polymorphism-based solutions that don't involve making compressedAsymmetricVertex a subtype of vertex, I'm all ears.
Edit: Here is a sample header graph.h:
#pragma once
template <typename T>
struct graph { T Data; graph(int a): Data(a) {} };

template <typename T>
graph<T> readGraph(char*, bool, bool, bool, bool);   // iFile, compressed, symmetric, binary, mmap
template <typename T>
graph<T> readCompressedGraph(char*, bool, bool);     // iFile, symmetric, mmap

class compressedAsymmetricVertex {};
class compressedSymmetricVertex {};
class symmetricVertex {};
class asymmetricVertex {};
Since you did not spell out all the types, and did not explain what is going on with the binary parameter, I can only give an approximate solution; refine it according to your exact needs. It should be something along these lines:
class GraphWorker
{
public:
GraphWorker(bool compressed, bool symmetric)
: m_compressed(compressed), m_symmetric(symmetric)
{}
virtual void work(const PType & P, const char * iFile, bool binary, bool mmap ) const = 0;
protected:
const bool m_compressed;
const bool m_symmetric;
};
template <class GraphType>
class ConcreteGraphWorker : public GraphWorker
{
public:
ConcreteGraphWorker(bool compressed, bool symmetric)
: GraphWorker(compressed, symmetric)
{}
void work(const PType & P, const char * iFile, bool binary, bool mmap) const override
{
graph<GraphType> G =
readGraph<GraphType>(iFile, m_compressed, m_symmetric,
binary, mmap);
bellman_ford(G,P);
G.del();
}
};
static const std::unique_ptr<GraphWorker> workers[2][2] = {
{
std::make_unique<ConcreteGraphWorker<asymmetricVertex>>(false, false),
std::make_unique<ConcreteGraphWorker<symmetricVertex>>(false, true),
},
{
std::make_unique<ConcreteGraphWorker<compressedAsymmetricVertex>>(true, false),
std::make_unique<ConcreteGraphWorker<compressedSymmetricVertex>>(true, true),
}
};
int main()
{
workers[compressed][symmetric]->work(P, iFile, binary, mmap);
}
Some comments: It is better to avoid bool altogether, and use specific enumeration types. This means that instead of my two-dimensional array, you should use something like:
std::map<std::pair<Compression, Symmetry>, std::unique_ptr<GraphWorker>> workers;
But since there could be other unknown dependencies, I have decided to stick with the confusing bool variables. Also, having workers as a static variable has its drawbacks, and since I don't know your other requirements I did not know what to do with it. Another issue is the protected Boolean variables in the base class. Usually, I'd go with accessors instead.
I'm not sure if all this jumping through hoops, just to avoid a couple of conditionals, is worth it. This is much longer and trickier than the original code, and unless there are more than 4 options, or the code in work() is much longer, I'd recommend sticking with the conditionals.
edit: I have just realized that using lambda functions is arguably clearer (it is up for debate). Here it is:
int main()
{
using workerType = std::function<void(PType & P, const char *, bool, bool)>;
auto makeWorker = [](bool compressed, bool symmetric, auto *nullGraph)
{
auto worker = [=](PType & P, const char *iFile, bool binary, bool mmap)
{
// decltype(*nullGraph) is a reference; std::decay_t fixes that.
using GraphType = std::decay_t<decltype(*nullGraph)>;
auto G = readGraph<GraphType>(iFile, compressed, symmetric,
binary, mmap);
bellman_ford(G,P);
G.del();
};
return workerType(worker);
};
workerType workers[2][2] {
{
makeWorker(false, false, (asymmetricVertex*)nullptr),
makeWorker(false, true, (symmetricVertex*)nullptr)
},
{
makeWorker(true, false, (compressedAsymmetricVertex*)nullptr),
makeWorker(true, true, (compressedSymmetricVertex*)nullptr)
}
};
workers[compressed][symmetric](P, iFile, binary, mmap);
}
The simple baseline is that whenever you want to cross from "type only known at runtime" to "type must be known at compile-time" (i.e. templates), you will need a series of such conditionals. If you cannot modify graph at all, then you will be stuck with needing four different G variables (and branches) whenever you want to handle a G object in a non-templated function, as all the graph template variants are unrelated types and cannot be treated uniformly (std::variant aside).
One solution would be to do this transition exactly once, right after reading in compressed and symmetric, and stay fully templated from there:
template<class VertexT>
graph<VertexT> readTypedGraph()
{
if constexpr (isCompressed<VertexT>::value)
return readCompressedGraph<VertexT>(/*...*/);
else
return readGraph<VertexT>(/*...*/);
}
template<class VertexT>
void main_T()
{
// From now on you are fully compile-time type-informed.
graph<VertexT> G = readTypedGraph<VertexT>();
bellman_ford(G);
transposeGraphIfTransposed(G);
G.del();
}
// non-template main
int main()
{
// Read parameters.
bool compressed = true;
bool symmetric = false;
// Switch to fully-templated code.
if (compressed)
if (symmetric)
main_T<compressedSymmetricVertex>();
else
main_T<compressedAsymmetricVertex>();
// else
// etc.
return 0;
}
Demo
You will probably have to write a lot of meta-functions (such as isCompressed) but can otherwise code as normal (albeit your IDE won't help you as much). You're not locked down in any way.
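For reference, the isCompressed meta-function mentioned above could be as simple as a few specializations over the vertex types from the question's graph.h (assuming the set of vertex types is closed):

#include <type_traits>
template <class VertexT> struct isCompressed : std::false_type {};
template <> struct isCompressed<compressedSymmetricVertex> : std::true_type {};
template <> struct isCompressed<compressedAsymmetricVertex> : std::true_type {};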

Using std::unique_ptr of a polymorphic class as key in std::unordered_map

My problem comes from a project that I'm supposed to finish. I have to create an std::unordered_map<T, unsigned int> where T is a pointer to a base, polymorphic class. After a while, I figured that it will also be a good practice to use an std::unique_ptr<T> as a key, since my map is meant to own the objects. Let me introduce some backstory:
Consider a class hierarchy with polymorphic sell_obj as the base class, and book and table inheriting from it. We now know that we need to create a std::unordered_map<std::unique_ptr<sell_obj*>, unsigned int>. Therefore, erasing a pair from that map will automatically free the memory pointed to by the key. The whole idea is to have keys pointing to books/tables, and the value of each key represents how many of that product our shop contains.
As we are dealing with std::unordered_map, we should specify hashes for all three classes. To simplify things, I specified them in main like this:
namespace std{
template <> struct hash<book>{
size_t operator()(const book& b) const
{
return 1; // simplified
}
};
template <> struct hash<table>{
size_t operator()(const table& b) const
{
return 2; // simplified
}
};
// The standard provides a specialization so that std::hash<unique_ptr<T>> is the same as std::hash<T*>.
template <> struct hash<sell_obj*>{
size_t operator()(const sell_obj *s) const
{
const book *b_p = dynamic_cast<const book*>(s);
if(b_p != nullptr) return std::hash<book>()(*b_p);
else{
const table *t_p = static_cast<const table*>(s);
return std::hash<table>()(*t_p);
}
}
};
}
Now let's look at the implementation of the map. We have a class called Shop which looks like this:
#include "sell_obj.h"
#include "book.h"
#include "table.h"
#include <unordered_map>
#include <memory>
class Shop
{
public:
Shop();
void add_sell_obj(sell_obj&);
void remove_sell_obj(sell_obj&);
private:
std::unordered_map<std::unique_ptr<sell_obj>, unsigned int> storeroom;
};
and implementation of two, crucial functions:
void Shop::add_sell_obj(sell_obj& s_o)
{
std::unique_ptr<sell_obj> n_ptr(&s_o);
storeroom[std::move(n_ptr)]++;
}
void Shop::remove_sell_obj(sell_obj& s_o)
{
std::unique_ptr<sell_obj> n_ptr(&s_o);
auto target = storeroom.find(std::move(n_ptr));
if(target != storeroom.end() && target->second > 0) target->second--;
}
in my main I try to run the following code:
int main()
{
book *b1 = new book("foo", "bar", 10);
sell_obj *ptr = b1;
Shop S_H;
S_H.add_sell_obj(*ptr); // works fine I guess
S_H.remove_sell_obj(*ptr); // usually (not always) crashes [SIGSEGV]
return 0;
}
my question is - where does my logic fail? I heard that it's fine to use std::unique_ptr in STL containers since C++11. What's causing the crash? The debugger does not provide any information besides the crash occurrence.
If more information about the project is needed, please point it out. Thank you for reading.
There are quite a few problems with logic in the question. First of all:
Consider class hierarchy with polymorphic sell_obj as base class. book and table inheriting from that class. We now know that we need to create a std::unordered_map<std::unique_ptr<sell_obj*>, unsigned int>.
In such cases std::unique_ptr<sell_obj*> is not what we would want. We would want std::unique_ptr<sell_obj>. Without the *. std::unique_ptr is already "a pointer".
As we are dealing with std::unordered_map, we should specify hashes for all three classes. To simplify things, I specified them in main like this: [...]
This is also quite an undesirable approach. It would require changing that part of the code every time we add another subclass to the hierarchy. It would be best to delegate the hashing (and comparing) polymorphically to avoid such problems, exactly as #1201programalarm suggested.
[...] implementation of two, crucial functions:
void Shop::add_sell_obj(sell_obj& s_o)
{
std::unique_ptr<sell_obj> n_ptr(&s_o);
storeroom[std::move(n_ptr)]++;
}
void Shop::remove_sell_obj(sell_obj& s_o)
{
std::unique_ptr<sell_obj> n_ptr(&s_o);
auto target = storeroom.find(std::move(n_ptr));
if(target != storeroom.end() && target->second > 0) target->second--;
}
This is wrong for a couple of reasons. First of all, taking an argument by non-const reference suggests modification of the object. Second of all, creating n_ptr from a pointer obtained by applying & to an argument is incredibly risky. It assumes that the object is allocated on the heap and is unowned - a situation that generally should not take place and is incredibly dangerous. In cases where the passed object is on the stack and/or is already managed by some other owner, this is a recipe for disaster (like a segfault).
What's more, it is more or less guaranteed to end in disaster, since both add_sell_obj() and remove_sell_obj() create std::unique_ptrs to potentially the same object. This is exactly what happens in the original question's main(). Two std::unique_ptrs pointing to the same object result in a double delete.
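Distilled to its essence, the crash scenario from the question's main() looks like this:

auto* raw = new book("foo", "bar", 10);
std::unique_ptr<sell_obj> p1(raw);   // created inside add_sell_obj
std::unique_ptr<sell_obj> p2(raw);   // created again inside remove_sell_obj
// both p1 and p2 believe they own raw; when the second one is destroyed,
// the object is deleted twice - undefined behaviour, typically a SIGSEGV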
While this is not necessarily the best approach to the problem in C++ (as compared to Java), there are a couple of interesting tools that can be used for this task. The code below assumes C++20.
The class hierarchy
First of all, we need a base class that will be used when referring to all the objects stored in the shop:
struct sell_object { };
And then we need to introduce classes that will represent concrete objects:
class book : public sell_object {
std::string title;
public:
book(std::string title) : title(std::move(title)) { }
};
class table : public sell_object {
int number_of_legs = 0;
public:
table(int number_of_legs) : number_of_legs(number_of_legs) { }
};
For simplicity (but to still have some distinction) I chose to give each of them just one field (title and number_of_legs).
The storage
The shop class that will represent storage for any sell_object needs to somehow store, well, any sell_object. For that we either need to use pointers or references to the base class. You can't have a container of references, so it's best to use pointers. Smart pointers.
Originally the question suggested the usage of std::unordered_map. Let us stick with it:
class shop {
std::unordered_map<std::unique_ptr<sell_object>, int> storage;
public:
auto add(...) -> void {
...
}
auto remove(...) -> void {
...
}
};
It is worth mentioning that we chose std::unique_ptr as the key for our map. That means that the storage is going to copy the passed objects and use the copies it owns to compare with the elements we query (add or remove). No more than one copy of any given equal object will be kept, though.
The fixed version of storage
There is a problem, however. std::unordered_map uses hashing and we need to provide a hash strategy for std::unique_ptr<sell_object>. Well, there already is one and it uses the hash strategy for T*. The problem is that we want to have custom hashing. Those particular std::unique_ptr<sell_object>s should be hashed according to the associated sell_objects.
Because of this, I opt to choose a different approach than the one proposed in the question. Instead of providing a global specialization in the std namespace, I will choose a custom hashing object and a custom comparator:
class shop {
struct sell_object_hash {
auto operator()(std::unique_ptr<sell_object> const& object) const -> std::size_t {
return object->hash();
}
};
struct sell_object_equal {
auto operator()(
std::unique_ptr<sell_object> const& lhs,
std::unique_ptr<sell_object> const& rhs
) const -> bool {
return (*lhs <=> *rhs) == 0;
}
};
std::unordered_map<
std::unique_ptr<sell_object>, int,
sell_object_hash, sell_object_equal
> storage;
public:
auto add(...) -> void {
...
}
auto remove(...) -> void {
...
}
};
Notice a few things. First of all, the type of storage has changed. No longer it is an std::unordered_map<std::unique_ptr<T>, int>, but an std::unordered_map<std::unique_ptr<T>, int, sell_object_hash, sell_object_equal>. This is to indicate that we are using custom hasher (sell_object_hash) and custom comparator (sell_object_equal).
The lines we need to pay extra attention are:
return object->hash();
return (*lhs <=> *rhs) == 0;
Onto them:
return object->hash();
This is a delegation of hashing. Instead of being an observer and trying to have one type that implements a different hashing for each and every possible type derived from sell_object, we require that those objects supply the hashing themselves. In the original question, the std::hash specialization was the said "observer". It certainly did not scale as a solution.
In order to achieve the aforementioned, we modify the base class to impose the listed requirement:
struct sell_object {
virtual auto hash() const -> std::size_t = 0;
};
Thus we also need to change our book and table classes:
class book : public sell_object {
std::string title;
public:
book(std::string title) : title(std::move(title)) { }
auto hash() const -> std::size_t override {
return std::hash<std::string>()(title);
}
};
class table : public sell_object {
int number_of_legs = 0;
public:
table(int number_of_legs) : number_of_legs(number_of_legs) { }
auto hash() const -> std::size_t override {
return std::hash<int>()(number_of_legs);
}
};
return (*lhs <=> *rhs) == 0;
This is a C++20 feature called the three-way comparison operator, sometimes called the spaceship operator. I opted into using it, since starting with C++20, most types that desire to be comparable will be using this operator. That means we also need our concrete classes to implement it. What's more, we need to be able to call it with base references (sell_object&). Yet another virtual function (operator, actually) needs to be added to the base class:
struct sell_object {
virtual auto hash() const -> std::size_t = 0;
virtual auto operator<=>(sell_object const&) const -> std::partial_ordering = 0;
};
Every subclass of sell_object is going to be required to be comparable with other sell_objects. The main reason is that we need to compare sell_objects in our storage map. For completeness, I used std::partial_ordering, since we require every sell_object to be comparable with every other sell_object. While comparing two books or two tables yields strong ordering (total ordering where two equivalent objects are indistinguishable), we also - by design - need to support comparing a book to a table. This is somewhat meaningless (always returns false). Fortunately, C++20 helps us here with std::partial_ordering::unordered. Those elements are not equal and neither of them is greater or less than the other. Perfect for such scenarios.
Our concrete classes need to change accordingly:
class book : public sell_object {
std::string title;
public:
book(std::string title) : title(std::move(title)) { }
auto hash() const -> std::size_t override {
return std::hash<std::string>()(title);
}
auto operator<=>(book const& other) const {
return title <=> other.title;
};
auto operator<=>(sell_object const& other) const -> std::partial_ordering override {
if (auto book_ptr = dynamic_cast<book const*>(&other)) {
return *this <=> *book_ptr;
} else {
return std::partial_ordering::unordered;
}
}
};
class table : public sell_object {
int number_of_legs = 0;
public:
table(int number_of_legs) : number_of_legs(number_of_legs) { }
auto hash() const -> std::size_t override {
return std::hash<int>()(number_of_legs);
}
auto operator<=>(table const& other) const {
return number_of_legs <=> other.number_of_legs;
};
auto operator<=>(sell_object const& other) const -> std::partial_ordering override {
if (auto table_ptr = dynamic_cast<table const*>(&other)) {
return *this <=> *table_ptr;
} else {
return std::partial_ordering::unordered;
}
}
};
The overridden operator<=>s are required due to the base class' requirements. They are quite simple - if the other object (the one we are comparing this object to) is of the same type, we delegate to the <=> version that uses the concrete type. If not, we have a type mismatch and report the unordered ordering.
For those of you who are curious why the <=> implementation that compares two, identical types is not = defaulted: it would use the base-class comparison first, which would delegate to the sell_object version. That would dynamic_cast again and delegate to the defaulted implementation. Which would compare the base class and... result in an infinite recursion.
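Spelled out in code, the problematic defaulted version would be (shown only to illustrate the recursion, it is not part of the final design):

auto operator<=>(book const& other) const = default;
// 1. the defaulted operator compares the sell_object base subobjects first,
// 2. which calls the virtual operator<=>(sell_object const&),
// 3. whose dynamic_cast succeeds, so it evaluates *this <=> *book_ptr,
// 4. which picks the defaulted book overload again - infinite recursion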
add() and remove() implementation
Everything seems great, so we can move on to adding and removing items to and from our shop. However, we immediately arrive at a hard design decision. What arguments should add() and remove() accept?
std::unique_ptr<sell_object>? That would make their implementation trivial, but it would require the user to construct a potentially useless, dynamically allocated object just to call a function.
sell_object const&? That seems correct, but there are two problems with it: 1) we would still need to construct an std::unique_ptr with a copy of passed argument to find the appropriate element to remove; 2) we wouldn't be able to correctly implement add(), since we need the concrete type to construct an actual std::unique_ptr to put into our map.
Let us go with the second option and fix the first problem. We certainly do not want to construct a useless and expensive object just to look for it in the storage map. Ideally we would like to find a key (std::unique_ptr<sell_object>) that matches the passed object. Fortunately, transparent hashers and comparators come to the rescue.
By supplying additional overloads for hasher and comparator (and providing a public is_transparent alias), we allow for looking for a key that is equivalent, without needing the types to match:
struct sell_object_hash {
auto operator()(std::unique_ptr<sell_object> const& object) const -> std::size_t {
return object->hash();
}
auto operator()(sell_object const& object) const -> std::size_t {
return object.hash();
}
using is_transparent = void;
};
struct sell_object_equal {
auto operator()(
std::unique_ptr<sell_object> const& lhs,
std::unique_ptr<sell_object> const& rhs
) const -> bool {
return (*lhs <=> *rhs) == 0;
}
auto operator()(
sell_object const& lhs,
std::unique_ptr<sell_object> const& rhs
) const -> bool {
return (lhs <=> *rhs) == 0;
}
auto operator()(
std::unique_ptr<sell_object> const& lhs,
sell_object const& rhs
) const -> bool {
return (*lhs <=> rhs) == 0;
}
using is_transparent = void;
};
Thanks to that, we can now implement shop::remove() like so:
auto remove(sell_object const& to_remove) -> void {
if (auto it = storage.find(to_remove); it != storage.end()) {
it->second--;
if (it->second == 0) {
storage.erase(it);
}
}
}
Since our comparator and hasher are transparent, we can find() an element that is equivalent to the argument. If we find it, we decrement the corresponding count. If it reaches 0, we remove the entry completely.
Great, onto the second problem. Let us list the requirements for the shop::add():
we need the concrete type of the object (merely a reference to the base class is not enough, since we need to create matching std::unique_ptr).
we need that type to be derived from sell_object.
We can achieve both with a constrained* template:
template <std::derived_from<sell_object> T>
auto add(T const& to_add) -> void {
if (auto it = storage.find(to_add); it != storage.end()) {
it->second++;
} else {
storage[std::make_unique<T>(to_add)] = 1;
}
}
This is, again, quite simple.
*References: {1} {2}
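Putting add() and remove() together, usage of the shop could look like this (with the book and table classes defined earlier):

shop s;
s.add(book("Dune"));      // inserts a new entry with count 1
s.add(book("Dune"));      // equivalent book found, count becomes 2
s.add(table(4));          // different type, different entry
s.remove(book("Dune"));   // count drops back to 1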
Correct destruction semantics
There is only one more thing separating us from a correct implementation: if an object is deallocated through a pointer (smart or not) to its base class, the base class destructor needs to be virtual.
This leads us to the final version of the sell_object class:
struct sell_object {
virtual auto hash() const -> std::size_t = 0;
virtual auto operator<=>(sell_object const&) const -> std::partial_ordering = 0;
virtual ~sell_object() = default;
};
See full implementation with example and additional printing utilities.

How is LLVM isa<> implemented?

From http://llvm.org/docs/CodingStandards.html#ci_rtti_exceptions
LLVM does make extensive use of a hand-rolled form of RTTI that use templates like isa<>, cast<>, and dyn_cast<>. This form of RTTI is opt-in and can be added to any class. It is also substantially more efficient than dynamic_cast<>.
How is isa and the others implemented?
First of all, the LLVM system is extremely specific and not at all a drop-in replacement for the RTTI system.
Premises
For most classes, it is unnecessary to generate RTTI information
When it is required, the information only makes sense within a given hierarchy
We preclude multiple inheritance from this system
Identifying an object class
Take a simple hierarchy, for example:
struct Base {}; /* abstract */
struct DerivedLeft: Base {}; /* abstract */
struct DerivedRight:Base {};
struct MostDerivedL1: DerivedLeft {};
struct MostDerivedL2: DerivedLeft {};
struct MostDerivedR: DerivedRight {};
We will create an enum specific to this hierarchy, with an enumerator for each member of the hierarchy that can be instantiated (the others would be useless).
enum BaseId {
DerivedRightId,
MostDerivedL1Id,
MostDerivedL2Id,
MostDerivedRId
};
Then, the Base class will be augmented with a method that will return this enum.
struct Base {
static inline bool classof(Base const*) { return true; }
Base(BaseId id): Id(id) {}
BaseId getValueID() const { return Id; }
BaseId Id;
};
And each concrete class is augmented too, in this manner:
struct DerivedRight: Base {
static inline bool classof(DerivedRight const*) { return true; }
static inline bool classof(Base const* B) {
switch(B->getValueID()) {
case DerivedRightId: case MostDerivedRId: return true;
default: return false;
}
}
DerivedRight(BaseId id = DerivedRightId): Base(id) {}
};
Now, it is possible, simply, to query the exact type, for casting.
Hiding implementation details
Having the users mucking around with getValueID would be troublesome though, so in LLVM this is hidden behind classof methods.
A given class should implement two classof methods: one for its deepest base (with a test of the suitable values of BaseId) and one for itself (pure optimization). For example:
struct MostDerivedL1: DerivedLeft {
static inline bool classof(MostDerivedL1 const*) { return true; }
static inline bool classof(Base const* B) {
return B->getValueID() == MostDerivedL1Id;
}
MostDerivedL1(): DerivedLeft(MostDerivedL1Id) {}
};
This way, we can check whether a cast is possible or not through the templates:
template <typename To, typename From>
bool isa(From const& f) {
return To::classof(&f);
}
Imagine for a moment that To is MostDerivedL1:
if From is MostDerivedL1, then we invoke the first overload of classof, and it works;
if From is anything else, then we invoke the second overload of classof, and the check uses the enum to determine whether the concrete type matches.
Hope it's clearer.
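To put the pieces together, using this hand-rolled isa<> with the classes above could look like this (<cassert> assumed):

DerivedRight d;
Base const& b = d;
assert(isa<DerivedRight>(b));    // classof(Base const*): DerivedRightId matches
assert(!isa<MostDerivedL1>(b));  // getValueID() != MostDerivedL1Id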
Just adding to osgx's answer: basically, each class should implement a classof() method which does all the necessary work. For example, Value's classof() routine looks like this:
// Methods for support type inquiry through isa, cast, and dyn_cast:
static inline bool classof(const Value *) {
return true; // Values are always values.
}
To check whether we have a class of the appropriate type, each class has its unique ValueID. You can check the full list of ValueIDs inside the include/llvm/Value.h file. This ValueID is used as follows (excerpt from Function.h):
/// Methods for support type inquiry through isa, cast, and dyn_cast:
static inline bool classof(const Function *) { return true; }
static inline bool classof(const Value *V) {
return V->getValueID() == Value::FunctionVal;
}
So, in short: every class should implement a classof() method which makes the necessary decision. The implementation relies on the set of unique ValueIDs, so to implement classof() one simply compares the ValueID of the argument with the class's own ValueID.
If I remember correctly, the first implementation of isa<> and friends was adopted from Boost ~10 years ago. Right now the implementations diverge significantly :)
I should mention that http://llvm.org/docs/ProgrammersManual.html#isa has some additional description.
The source code of isa, cast and dyn_cast is located in a single file and is heavily commented:
http://llvm.org/doxygen/Casting_8h_source.html
// isa<X> - Return true if the parameter to the template is an instance of the
// template type argument. Used like this:
//
//  if (isa<Type*>(myVal)) { ... }
//
template <typename To, typename From>
struct isa_impl {
  static inline bool doit(const From &Val) {
    return To::classof(&Val);
  }
};

// cast<X> - Return the argument parameter cast to the specified type. This
// casting operator asserts that the type is correct, so it does not return null
// on failure. It does not allow a null argument (use cast_or_null for that).
// It is typically used like this:
//
//  cast<Instruction>(myVal)->getParent()
//
template <class X, class Y>
inline typename cast_retty<X, Y>::ret_type cast(const Y &Val) {
  assert(isa<X>(Val) && "cast<Ty>() argument of incompatible type!");
  return cast_convert_val<X, Y,
                          typename simplify_type<Y>::SimpleType>::doit(Val);
}

// dyn_cast<X> - Return the argument parameter cast to the specified type. This
// casting operator returns null if the argument is of the wrong type, so it can
// be used to test for a type as well as cast if successful. This should be
// used in the context of an if statement like this:
//
//  if (const Instruction *I = dyn_cast<Instruction>(myVal)) { ... }
//
template <class X, class Y>
inline typename cast_retty<X, Y>::ret_type dyn_cast(const Y &Val) {
  return isa<X>(Val) ? cast<X, Y>(Val) : 0;
}