I was searching for an implementation of extension methods in C++ and came upon this comp.std.c++ discussion, which mentions that polymorphic_map can be used to associate methods with a class, but the provided link seems to be dead. Does anyone know what that answer was referring to, or whether there is another way to extend classes in a manner similar to extension methods (perhaps through some usage of mixins)?
I know the canonical C++ solution is to use free functions; this is more out of curiosity than anything else.
Different languages approach development in different ways. In particular, C# and Java have a strong point of view with respect to OO that leads to an "everything is an object" mindset (C# is a little more lax here). In that approach, extension methods provide a simple way of extending an existing object or interface to add new features.
There are no extension methods in C++, nor are they needed. When developing in C++, forget the "everything is an object" paradigm --which, by the way, is false even in Java/C# [*]. C++ takes a different mindset: there are objects, and the objects have operations that are inherently part of the object, but there are also other operations that form part of the interface and need not be part of the class. A must-read by Herb Sutter is What's In a Class?, where the author argues (and I agree) that you can easily extend any given class with simple free functions.
As a particularly simple example, the standard class template basic_ostream has a few member functions to dump the contents of some primitive types, and it is then enhanced with (also templated) free functions that extend that functionality to other types by using the existing public interface. For example, std::cout << 1; is implemented as a member function, while std::cout << "Hi"; is a free function implemented in terms of other, more basic members.
Extensibility in C++ is achieved by means of free functions, not by way of adding new methods to existing objects.
[*] Everything is not an object.
A given domain will contain a set of actual objects that can be modeled and operations that can be applied to them. In some cases those operations will be part of the object, but in other cases they will not. In particular, you will find utility classes in the languages that claim that everything is an object, and those utility classes are nothing but a layer trying to hide the fact that those methods don't belong to any particular object.
Even some operations that are implemented as member functions are not really operations on the object. Consider addition for a Complex number class: how is sum (or +) more of an operation on the first argument than on the second? Why a.sum(b); or b.sum(a);? Should it not be sum( a, b )?
Forcing the operations to be member methods actually produces weird effects --but we are just used to them: a.equals(b); and b.equals(a); might have completely different results even if the implementation of equals is fully symmetric. (Consider what happens when either a or b is a null pointer.)
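To make the symmetry argument concrete, here is a minimal sketch. The Complex struct and the sum function are hypothetical names chosen for illustration, not taken from any library; the point is only that a free function treats both arguments alike and needs nothing beyond the public interface.

```cpp
// Hypothetical value type used only for illustration.
struct Complex {
    double re;
    double im;
};

// Free function: neither argument is privileged, and the class itself
// does not have to change to gain this operation.
inline Complex sum(const Complex& a, const Complex& b) {
    return Complex{a.re + b.re, a.im + b.im};
}

// The operator form is just another free function built on top of sum.
inline Complex operator+(const Complex& a, const Complex& b) {
    return sum(a, b);
}
```

Calling sum(a, b) and sum(b, a) goes through exactly the same code path, which is harder to guarantee when the operation is a member of one of the operands.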
The Boost Range library's approach uses operator|().
r | filtered(p);
I can write trim for std::string in the same way:
#include <string>
namespace string_extension {
struct trim_t {
std::string operator()(const std::string& s) const
{
...
return s;
}
};
const trim_t trim = {};
std::string operator|(const std::string& s, trim_t f)
{
return f(s);
}
} // namespace string_extension
int main()
{
const std::string s = " abc ";
const std::string result = s | string_extension::trim;
}
This is the closest thing that I have ever seen to extension methods in C++. Personally I like the way it can be used, and possibly this is the closest we can get to extension methods in this language. But there are some disadvantages:
It may be complicated to implement.
Operator precedence may not always be what you expect, which may cause surprises.
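For reference, a compilable variant of the pipe-style trim might look like the sketch below. The body of operator() is an assumption: the original elides it, and here "trim" is taken to mean stripping leading and trailing spaces, tabs, and newlines.

```cpp
#include <cstddef>
#include <string>

namespace string_extension {

// Tag type whose call operator does the actual work.
struct trim_t {
    std::string operator()(const std::string& s) const {
        // Assumption: whitespace means space, tab, CR and LF.
        const char* ws = " \t\r\n";
        const std::size_t first = s.find_first_not_of(ws);
        if (first == std::string::npos) return std::string();
        const std::size_t last = s.find_last_not_of(ws);
        return s.substr(first, last - first + 1);
    }
};

const trim_t trim = {};

// Found through argument-dependent lookup because trim_t lives here.
inline std::string operator|(const std::string& s, trim_t f) {
    return f(s);
}

} // namespace string_extension
```

With this in place, an expression such as s | string_extension::trim yields a trimmed copy and leaves s untouched.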
A solution:
#include <iostream>
using namespace std;
class regular_class {
public:
void simple_method(void) const {
cout << "simple_method called." << endl;
}
};
class ext_method {
private:
// arguments of the extension method
int x_;
public:
// arguments get initialized here
ext_method(int x) : x_(x) {
}
// just a dummy overload to return a reference to itself
ext_method& operator-(void) {
return *this;
}
// extension method body is implemented here. The return type of this op. overload
// should be the return type of the extension method
friend const regular_class& operator<(const regular_class& obj, const ext_method& mthd) {
cout << "Extension method called with: " << mthd.x_ << " on " << &obj << endl;
return obj;
}
};
int main()
{
regular_class obj;
cout << "regular_class object at: " << &obj << endl;
obj.simple_method();
obj<-ext_method(3)<-ext_method(8);
return 0;
}
This is not my own invention; a friend of mine recently mailed it to me and said he got it from a university mailing list.
The short answer is that you cannot do that. The long answer is that you can simulate it, but be aware that you'll have to write a lot of code as a workaround (actually, I don't think there is an elegant solution).
In the discussion, a very complex workaround is provided using operator- (which is a bad idea, in my opinion). I guess that the solution provided in the dead link was more or less similar (since it was based on operator|).
This is based on the ability to do more or less the same thing as an extension method with operators. For example, if you want to overload the ostream's operator<< for your new class Foo, you could do:
class Foo {
friend ostream &operator<<(ostream &o, const Foo &foo);
// more things...
};
ostream &operator<<(ostream &o, const Foo &foo)
{
// write foo's info to o
return o; // return the stream so that calls can be chained
}
As I said, this is the only mechanism available in C++ that resembles extension methods. If you can naturally translate your function to an overloaded operator, then it is fine. The only other possibility is to artificially overload an operator that has nothing to do with your objective, but this is going to make you write very confusing code.
The closest approach I can think of is to create an extension class and put your new methods there. Unfortunately, this means that you'll need to "adapt" your objects:
class stringext {
public:
stringext(std::string &s) : str( &s )
{}
string trim()
{ ...; return *str; }
private:
string * str;
};
And then, when you want to use it:
void fie(string &str)
{
// ...
cout << stringext( str ).trim() << endl;
}
As said, this is not perfect, and I don't think that kind of perfect solution exists.
Sorry.
To elaborate more on @Akira's answer, operator| can be used to extend existing classes with functions that take parameters too. Here is an example that I'm using to extend the Xerces XML library with find functionality that can be easily chained:
#pragma once
#include <string>
#include <stdexcept>
#include <xercesc/dom/DOMElement.hpp>
#define _U16C // macro that converts string to char16_t array
XERCES_CPP_NAMESPACE_BEGIN
struct FindFirst
{
FindFirst(const std::string& name);
DOMElement * operator()(const DOMElement &el) const;
DOMElement * operator()(const DOMElement *el) const;
private:
std::string m_name;
};
struct FindFirstExisting
{
FindFirstExisting(const std::string& name);
DOMElement & operator()(const DOMElement &el) const;
private:
std::string m_name;
};
inline DOMElement & operator|(const DOMElement &el, const FindFirstExisting &f)
{
return f(el);
}
inline DOMElement * operator|(const DOMElement &el, const FindFirst &f)
{
return f(el);
}
inline DOMElement * operator|(const DOMElement *el, const FindFirst &f)
{
return f(el);
}
inline FindFirst::FindFirst(const std::string & name)
: m_name(name)
{
}
inline DOMElement * FindFirst::operator()(const DOMElement &el) const
{
auto list = el.getElementsByTagName(_U16C(m_name));
if (list->getLength() == 0)
return nullptr;
return static_cast<DOMElement *>(list->item(0));
}
inline DOMElement * FindFirst::operator()(const DOMElement *el) const
{
if (el == nullptr)
return nullptr;
auto list = el->getElementsByTagName(_U16C(m_name));
if (list->getLength() == 0)
return nullptr;
return static_cast<DOMElement *>(list->item(0));
}
inline FindFirstExisting::FindFirstExisting(const std::string & name)
: m_name(name)
{
}
inline DOMElement & FindFirstExisting::operator()(const DOMElement & el) const
{
auto list = el.getElementsByTagName(_U16C(m_name));
if (list->getLength() == 0)
throw std::runtime_error(std::string("Missing element with name ") + m_name);
return static_cast<DOMElement &>(*list->item(0));
}
XERCES_CPP_NAMESPACE_END
It can be used this way:
auto packetRate = *elementRoot | FindFirst("Header") | FindFirst("PacketRate");
auto &decrypted = *elementRoot | FindFirstExisting("Header") | FindFirstExisting("Decrypted");
You can enable a kind of extension method for your own class/struct, or for some specific type in some scope. See the rough solution below.
class Extensible
{
public:
template<class TRes, class T, class... Args>
std::function<TRes(Args...)> operator|
(std::function<TRes(T&, Args...)>& extension)
{
return [this, &extension](Args... args) -> TRes
{
return extension(*static_cast<T*>(this), std::forward<Args>(args)...);
};
}
};
Then inherit your class from this and use it like this:
class SomeExtensible : public Extensible { /*...*/ };
std::function<int(SomeExtensible&, int)> fn;
SomeExtensible se;
int i = (se | fn)(4);
Or you can declare this operator in a .cpp file or a namespace.
//for std::string, for example
template<class TRes, class... Args>
std::function<TRes(Args...)> operator|
(std::string& s, std::function<TRes(std::string&, Args...)>& extension)
{
return [&s, &extension](Args... args) -> TRes
{
return extension(s, std::forward<Args>(args)...);
};
}
std::string s = "newStr";
std::function<std::string(std::string&)> init = [](std::string& s) {
return s = "initialized";
};
(s | init)();
Or even wrap it in a macro (I know, it's generally a bad idea, but you can):
#define ENABLE_EXTENSIONS_FOR(x) \
template<class TRes, class... Args> \
std::function<TRes(Args...)> operator| (x s, std::function<TRes(x, Args...)>& extension) \
{ \
return [&s, &extension](Args... args) -> TRes \
{ \
return extension(s, std::forward<Args>(args)...); \
}; \
}
ENABLE_EXTENSIONS_FOR(std::vector<int>&);
This syntactic sugar isn't available in C++, but you can define your own namespace and write pure static classes, using const references as the first parameter.
For example, I was struggling with the STL implementation of some array operations, and I didn't like the syntax; I was used to JavaScript's functional style of array methods.
So I made my own namespace wh with the class vector in it, since that's the class I expected to use these methods with, and this is the result:
//#ifndef __WH_HPP
//#define __WH_HPP
#include <vector>
#include <functional>
#include <algorithm>
namespace wh{
template<typename T>
class vector{
public:
static T reduce(const std::vector<T> &array, const T &accumulatorInitiator, const std::function<T(T,T)> &functor){
T accumulator = accumulatorInitiator;
for(auto &element: array) accumulator = functor(element, accumulator);
return accumulator;
}
static T reduce(const std::vector<T> &array, const T &accumulatorInitiator){
return wh::vector<T>::reduce(array, accumulatorInitiator, [](T element, T acc){return element + acc;});
}
static std::vector<T> map(const std::vector<T> &array, const std::function<T(T)> &functor){
std::vector<T> ret;
transform(array.begin(), array.end(), std::back_inserter(ret), functor);
return ret;
}
static std::vector<T> filter(const std::vector<T> &array, const std::function<bool(T)> &functor){
std::vector<T> ret;
copy_if(array.begin(), array.end(), std::back_inserter(ret), functor);
return ret;
}
static bool all(const std::vector<T> &array, const std::function<bool(T)> &functor){
return all_of(array.begin(), array.end(), functor);
}
static bool any(const std::vector<T> &array, const std::function<bool(T)> &functor){
return any_of(array.begin(), array.end(), functor);
}
};
}
//#undef __WH_HPP
I wouldn't inherit from it nor compose a class with it, since I've never been able to do that cleanly without side effects, so I came up with this: just const references.
The problem, of course, is the extremely verbose code you have to write in order to use these static methods:
int main()
{
std::vector<int> numbers = {1,2,3,4,5,6};
numbers = wh::vector<int>::filter(numbers, [](int number){return number < 3;});
numbers = wh::vector<int>::map(numbers,[](int number){return number + 3;});
for(const auto& number: numbers) std::cout << number << std::endl;
return 0;
}
If only there were syntactic sugar that could give my static methods a more familiar syntax, like:
myvector.map([](int number){return number+2;}); //...
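One way to get somewhat closer to that call syntax is to borrow the operator| idea from the other answers. The sketch below is only an illustration under assumptions: the namespace wh_pipe and the names map_t and mapped are hypothetical and not part of the wh header above.

```cpp
#include <algorithm>
#include <iterator>
#include <utility>
#include <vector>

namespace wh_pipe {

// Hypothetical adapter that only carries the function to apply.
template <typename F>
struct map_t {
    F fn;
};

// Helper so the lambda type is deduced: numbers | wh_pipe::mapped(f)
template <typename F>
map_t<F> mapped(F fn) {
    return map_t<F>{std::move(fn)};
}

// Applies fn to every element and returns a new vector of the same type.
// Found through argument-dependent lookup because map_t lives here.
template <typename T, typename F>
std::vector<T> operator|(const std::vector<T>& in, const map_t<F>& m) {
    std::vector<T> out;
    out.reserve(in.size());
    std::transform(in.begin(), in.end(), std::back_inserter(out), m.fn);
    return out;
}

} // namespace wh_pipe
```

A call then reads numbers | wh_pipe::mapped([](int n){ return n + 2; }), and several adapters can be chained left to right. It is still not member syntax, and the usual operator precedence caveats apply.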
Having returned relatively recently to C++ after decades of Java, I am currently struggling with a template-based approach to data conversion for instances where type erasure has been applied. Please bear with me, my nomenclature may still be off for C++-natives.
This is what I am trying to achieve:
Implement dynamic variables which are able to hold essentially any value type
Access the content of those variables using various other representations (string, ints, binary, ...)
Be able to hold variable instances in containers, independent of their value type
Convert between variable value and representation using conversion functions
Be able to introduce new representations just by providing new conversion functions
Constraints: use only C++11 features if possible, no use of libraries like boost::any etc.
A rough sketch of this might look like this:
#include <iostream>
#include <string>
#include <vector>
void convert(const std::string &f, std::string &t) { t = f; }
void convert(const int &f, std::string &t) { t = std::to_string(f); }
void convert(const std::string &f, int &t) { t = std::stoi(f); }
void convert(const int &f, int &t) { t = f; }
struct Variable {
virtual void get(int &i) = 0;
virtual void get(std::string &s) = 0;
};
template <typename T> struct VariableImpl : Variable {
T value;
VariableImpl(const T &v) : value{v} {};
void get(int &i) { convert(value, i); };
void get(std::string &s) { convert(value, s); };
};
int main() {
VariableImpl<int> v1{42};
VariableImpl<std::string> v2{"1234"};
std::vector<Variable *> vars{&v1, &v2};
for (auto &v : vars) {
int i;
v->get(i);
std::string s;
v->get(s);
std::cout << "int representation: " << i <<
", string representation: " << s << std::endl;
}
return 0;
}
The code does what it is supposed to do, but obviously I would like to get rid of Variable::get(int/std::string/...) and instead template them, because otherwise every new representation requires a declaration and an implementation, with the latter being exactly the same as all the others.
I've played with various approaches so far, like virtual templated methods, applying the CRTP with an intermediate type, and various forms of wrappers, yet in all of them I get bitten by the erased value type of VariableImpl. On one hand, I think there might not be a solution, because after type erasure, the compiler cannot possibly know what templated getters and converter calls it must generate. On the other hand, I think I might be missing something really essential here and there should be a solution despite the constraints mentioned above.
This is a classical double dispatch problem. The usual solution to this problem is to have some kind of dispatcher class with multiple implementations of the function you want to dispatch (get in your case). This is called the visitor pattern. The well-known drawback of it is the dependency cycle it creates (each class in the hierarchy depends on all other classes in the hierarchy). Thus there's a need to revisit it each time a new type is added. No amount of template wizardry eliminates it.
You don't have a specialised Visitor class, your Variable serves as a Visitor of itself, but this is a minor detail.
Since you don't like this solution, there is another one. It uses a registry of functions populated at run time and keyed on type identification of their arguments. This is sometimes called "Acyclic Visitor".
Here's a half-baked C++11-friendly implementation for your case.
#include <map>
#include <vector>
#include <typeinfo>
#include <typeindex>
#include <utility>
#include <functional>
#include <string>
#include <stdexcept>
struct Variable
{
virtual void convertValue(Variable& to) const = 0;
virtual ~Variable() {};
virtual std::type_index getTypeIdx() const = 0;
template <typename K> K get() const;
static std::map<std::pair<std::type_index, std::type_index>,
std::function<void(const Variable&, Variable&)>>
conversionMap;
template <typename T, typename K>
static void registerConversion(K (*fn)(const T&));
};
template <typename T>
struct VariableImpl : Variable
{
T value;
VariableImpl(const T &v) : value{v} {};
VariableImpl() : value{} {}; // this is needed for a declaration of
// `VariableImpl<K>` below.
// It can be avoided, but that is
// a story for another day
void convertValue(Variable& to) const override
{
auto typeIdxFrom = getTypeIdx();
auto typeIdxTo = to.getTypeIdx();
if (typeIdxFrom == typeIdxTo) // no conversion needed
{
dynamic_cast<VariableImpl<T>&>(to).value = value;
}
else
{
auto fcnIter = conversionMap.find({getTypeIdx(), to.getTypeIdx()});
if (fcnIter != conversionMap.end())
{
fcnIter->second(*this, to);
}
else
throw std::logic_error("no conversion");
}
}
std::type_index getTypeIdx() const override
{
return std::type_index(typeid(T));
}
};
template <typename K> K Variable::get() const
{
VariableImpl<K> vk;
convertValue(vk);
return vk.value;
}
template <typename T, typename K>
void Variable::registerConversion(K (*fn)(const T&))
{
// add a mutex if you ever spread this over multiple threads
conversionMap[{std::type_index(typeid(T)), std::type_index(typeid(K))}] =
[fn](const Variable& from, Variable& to) {
dynamic_cast<VariableImpl<K>&>(to).value =
fn(dynamic_cast<const VariableImpl<T>&>(from).value);
};
}
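Now of course you need to call registerConversion, e.g. at the beginning of main, and pass it each conversion function. (The static member conversionMap also needs a definition outside the class, in exactly one translation unit.)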
Variable::registerConversion(int_to_string);
Variable::registerConversion(string_to_int);
This is not ideal, but hardly anything is ever ideal.
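The registry idea can also be looked at in isolation. The following is a minimal C++11 sketch, not the implementation above: it drops the Variable hierarchy and works on raw pointers, and the names ErasedFn, registry, register_conversion, convert_erased, int_to_string and string_to_int are all hypothetical.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <utility>

// Erased signature: pointers to the source and destination values.
using ErasedFn = std::function<void(const void*, void*)>;
using Key = std::pair<std::type_index, std::type_index>;

// A function-local static avoids a separate out-of-class definition.
inline std::map<Key, ErasedFn>& registry() {
    static std::map<Key, ErasedFn> r;
    return r;
}

// T and K are deduced from the function pointer, so registration
// is the only place where the concrete types are known.
template <typename T, typename K>
void register_conversion(K (*fn)(const T&)) {
    registry()[Key(std::type_index(typeid(T)), std::type_index(typeid(K)))] =
        [fn](const void* from, void* to) {
            *static_cast<K*>(to) = fn(*static_cast<const T*>(from));
        };
}

// Lookup happens at run time, keyed only on type identity.
inline void convert_erased(std::type_index from_type, const void* from,
                           std::type_index to_type, void* to) {
    auto it = registry().find(Key(from_type, to_type));
    if (it == registry().end()) throw std::logic_error("no conversion");
    it->second(from, to);
}

// Hypothetical conversion functions with the K (*)(const T&) shape.
inline std::string int_to_string(const int& i) { return std::to_string(i); }
inline int string_to_int(const std::string& s) { return std::stoi(s); }
```

After register_conversion(int_to_string), a call such as convert_erased(typeid(int), &i, typeid(std::string), &s) fills s from i, while an unregistered pair throws std::logic_error. The caller is responsible for passing pointers that really match the given type ids.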
Having said all that, I would recommend you revisit your design. Do you really need all these conversions? Why not pick one representation and stick with it?
Implement dynamic variables which are able to hold essentially any value type
Be able to hold variable instances in containers, independent of their value type
These two requirements are quite challenging on their own. Class templates don't really encourage inheritance, and you already did the right thing to hold what you asked for: you introduced a common base class for the class template, which you can later refer to in order to store pointers of that type in a collection.
Access the content of those variables using various other representations (string, ints, binary, ...)
Be able to introduce new representations just by providing new conversion functions
This is where it breaks. Function templates assume a common implementation for different types, while inheritance assumes different implementations for the same types.
Your goal is to introduce different implementations for different types, and in order to make your requirements viable you have to switch to one of those two options instead (or put up with a number of functions for each case, which you have already introduced yourself).
Edit:
One of the strategies you may employ to enforce inheritance approach is generalisation of the arguments to the extent where they can be used interchangeably by the abstract interface. E.g. you may wrap the converting arguments inside of a union like this:
struct Variable {
struct converter_type {
enum { INT, STRING } type;
union {
int* m_int;
std::string* m_string;
};
};
virtual void get(converter_type& var) = 0;
virtual ~Variable() = default;
};
And then take whatever part of it inside of the implementation:
void get(converter_type& var) override {
switch (var.type) {
case converter_type::INT:
convert(value, *var.m_int); // convert takes a reference, so dereference
break;
case converter_type::STRING:
convert(value, *var.m_string);
break;
}
}
To be honest, I don't think this is a less verbose approach compared to just having a number of functions for each type combination, but I think you get the idea: you can wrap your arguments somehow to cement the abstract class interface.
Implement std::any. It is similar to boost::any.
Create a conversion dispatcher based off typeids. Store your any alongside the conversion dispatcher.
"new conversion functions" have to be passed to the dispatcher.
When asked to convert to a type, pass that typeid to the dispatcher.
So we start with these 3 types:
using any = std::any; // implement this
using converter = std::function<any(any const&)>;
using convert_table = std::map<std::type_index, converter>;
using convert_lookup = convert_table(*)();
template<class T>
convert_table& lookup_convert_table() {
static convert_table t;
return t;
}
struct converter_any: any {
template<class T,
typename std::enable_if<
!std::is_same<typename std::decay<T>::type, converter_any>::value, bool
>::type = true
>
converter_any( T&& t ):
any(std::forward<T>(t)),
table(&lookup_convert_table<typename std::decay<T>::type>())
{}
converter_any(converter_any const&)=default;
converter_any(converter_any &&)=default;
converter_any& operator=(converter_any const&)=default;
converter_any& operator=(converter_any&&)=default;
~converter_any()=default;
converter_any()=default;
convert_table const* table = nullptr;
template<class U>
U convert_to() const {
if (!table)
throw 1; // make a better exception than int
auto it = table->find(typeid(U));
if (it == table->end())
throw 2; // make a better exception than int
any const& self = *this;
return std::any_cast<U>((it->second)(self));
}
};
template<class Dest, class Src>
bool add_converter_to_table( Dest(*f)(Src const&) ) {
lookup_convert_table<Src>()[typeid(Dest)] = [f](any const& s)->any {
Src src = std::any_cast<Src>(s);
auto r = f(src);
return r;
};
return true;
}
Now your code looks like:
const bool bStringRegistered =
add_converter_to_table(+[](std::string const& f)->std::string{ return f; })
&& add_converter_to_table(+[](std::string const& f)->int{ return std::stoi(f); });
const bool bIntRegistered =
add_converter_to_table(+[](int const& i)->int{ return i; })
&& add_converter_to_table(+[](int const& i)->std::string{ return std::to_string(i); });
int main() {
converter_any v1{42};
converter_any v2{std::string("1234")};
std::vector<converter_any> vars{v1, v2}; // copies!
for (auto &v : vars) {
int i = v.convert_to<int>();
std::string s = v.convert_to<std::string>();
std::cout << "int representation: " << i <<
", string representation: " << s << std::endl;
}
}
...
Ok, what did I do?
I used any to be a smart void* that can store anything. Rewriting this is a bad idea, use someone else's implementation.
Then, I augmented it with a manually written virtual function table. Which table I add is determined by the constructor of my converter_any; here, I know the type stored, so I can store the right table.
Typically when using this technique, I'd know what functions are in there. For your implementation we do not; so the table is a map from the type id of the destination, to a conversion function.
The conversion function takes anys and returns anys -- again, don't repeat this work. And now it has a fixed signature.
To add support for a type, you independently register conversion functions. Here, my conversion function registration helper deduces the from type (to determine which table to register it in) and the destination type (to determine which entry in the table), and then automatically writes the any boxing/unboxing code for you.
...
At a higher level, what I'm doing is writing my own type erasure and object model. C++ has enough power that you can write your own object models, and when you want features that the default object model doesn't solve, well, roll a new object model.
Second, I'm using value types. A Java programmer isn't used to value types having polymorphic behavior, but much of C++ works much better if you write your code using value types.
So my converter_any is a polymorphic value type. You can store copies of them in vectors etc, and it just works.
Imagine a larger project, containing some parameter struct:
struct pars {
int foo;
};
With this struct as parameter, other functionality is implemented, e.g.:
// (de)serialization into different formats
static pars FromString(string const &text);
static string ToString(pars const &data);
static pars FromFile(string const &filename);
// [...]
// comparison / calculation / verification
static bool equals(pars l, pars r);
static pars average(pars a, pars b);
static bool isValid(pars p);
// [...]
// you-name-it
Now imagine a new member needs to be added to that struct:
struct pars {
int foo;
int bar; // new member
};
Is there a design pattern to break the build or issue warnings until all necessary code places are adapted?
Example:
If I were to change int foo into string foo, I would not miss any code line which needs to be changed.
If int foo needed to change into unsigned int foo, I could rename foo to foo_u and have the compiler point me to where adaptations are necessary.
One partial solution is to make the members private and settable only from the constructor, which has to be called with all parameters:
pars::pars(int _foo, int _bar)
: foo(_foo), bar(_bar)
{ }
This ensures the correct creation of pars, but not the usage - so this catches missing adaptations in FromString(), but not in ToString().
Unit tests would reveal such problems only during the test (I'm searching for a compile time method), and also only the (de)serialization part, and not that new bar is being considered everywhere (in the comparison / calculation / verification / ... functions as well).
Decouple the streaming operations from the source or destinations of the streams.
A very simple example:
#include <sstream>
#include <fstream>
struct pars
{
int foo;
int bar;
static constexpr auto current_version = 2;
};
std::istream &deserialise(std::istream &is, pars &model)
{
int version;
is >> version;
is >> model.foo;
if (version > 1) {
is >> model.bar;
}
return is;
}
std::ostream &serialise(std::ostream &os, const pars &model)
{
os << model.current_version << " ";
os << model.foo << " ";
// a version 2 addition
os << model.bar<< " ";
return os;
}
static pars FromString(std::string const &text)
{
std::istringstream iss(text);
auto result = pars();
deserialise(iss, result);
return result;
}
static std::string ToString(pars const &data)
{
std::ostringstream oss;
serialise(oss, data);
return oss.str();
}
static pars FromFile(std::string const &filename)
{
auto file = std::ifstream(filename);
auto result = pars();
deserialise(file, result);
return result;
}
Also have a look at:
boost.serialization http://www.boost.org/doc/libs/1_64_0/libs/serialization/doc/index.html
cereal https://github.com/USCiLab/cereal
etc.
A pattern that enforces this would be a for-each-member operation.
Pick a name, like members_of. Using ADL and a tag, make members_of(tag<T>) return a tuple of integral constant member pointers to the members of T.
This has to be written once. Then it can be used in many spots.
I will write it in C++17, as in C++14 and earlier it is just more verbose.
template<class T>struct tag_t{constexpr tag_t(){}};
template<class T>constexpr tag_t<T> tag{};
template<auto X>using val_t=std::integral_constant<decltype(X), X>;
template<auto X>constexpr val_t<X> val{};
struct pars {
int foo;
friend constexpr auto members_of( tag_t<pars> ){
return std::make_tuple( val<&pars::foo> );
}
};
When you add a member you must also add it to the friend members_of.
template<class...Fs>
struct overload:Fs...{
using Fs::operator()...;
overload(Fs...fs):Fs(std::move(fs))... {}
};
overload lets you overload lambdas.
Finally write a foreach_tuple_element.
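The answer leaves foreach_tuple_element unwritten. One possible C++17 implementation, given here as an assumption about what is intended rather than the author's own code, forwards each tuple element to the callable through std::apply and a fold expression:

```cpp
#include <tuple>
#include <utility>

// Calls f once per tuple element, in order. If f has no overload that
// accepts some element type, the call fails to compile, which is
// exactly the property the pattern relies on.
template <class Tuple, class F>
void foreach_tuple_element(Tuple&& t, F&& f) {
    std::apply(
        [&f](auto&&... elems) {
            // Fold over the comma operator: one call per element.
            (f(std::forward<decltype(elems)>(elems)), ...);
        },
        std::forward<Tuple>(t));
}
```

When the callable is an overload set of lambdas taking val_t<&pars::foo> and so on, a member listed in members_of but missing from the overload set turns into a compile error at the call site.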
static pars FromString(string const &text){
pars retval;
foreach_tuple_element( members_of(tag<pars>), overload{
[&](val_t<&pars::foo>){
// code to handle pars.foo
}
});
return retval;
}
When you add a new member bar to both pars and members_of, the above code breaks, as the foreach cannot find an overload for val_t<&pars::bar>.
static pars FromString(string const &text){
pars retval;
foreach_tuple_element( members_of(tag<pars>), overload{
[&](val_t<&pars::foo>){
// code to handle pars.foo
},
[&](val_t<&pars::bar>){
// code to handle pars.bar
}
});
return retval;
}
And now it would compile.
For serialization / deserialization specifically, you want a single method for both (where the type of one arg says if it is in or out), and string to/from is just a special case of serialization/deserialization.
template<class A, class Self,
std::enable_if_t<std::is_same<pars, std::decay_t<Self>>{}, int> =0
>
friend void Archive(A& a, Self& self) {
ArchiveBlock(a, archive_tag("pars"), 3, [&]{
Archive(a, self.foo);
Archive(a, self.bar);
});
}
This is an example of how a unified serialize/deserialize method (without the above member pointers) works. You override Archive on your output stream and on primitive const&, on your input stream and primitive&.
For almost everything else, you use common structure for both reading and writing from the Archive. This keeps the structure of your input and output identical.
ArchiveBlock( Archive&, tag, tag version, lambda ) wraps the lambda in whatever archiving block structure you have. As an example, your archive blocks might have length information in their header, allowing earlier deserializers to skip over added data at the end. It would also both read and write blocks; on writing, it would write out the block header and whatever else before writing the body (maybe keeping track of length and backing up to record length once they know it). On reading, it would ensure the tag exists (and deal with missing tags however you choose; skip?) and fast forward over newer block contents if you want to support older readers reading what newer writers write.
In more general cases where you need to keep code aligned with data this answer might solve things. Serialization and deserialization are very special cases, because unlike most bits of C++ code you have to future-proof the binary layout of everything. It is like writing library interfaces; there is lots more care required.
My problem comes from a project that I'm supposed to finish. I have to create an std::unordered_map<T, unsigned int> where T is a pointer to a base, polymorphic class. After a while, I figured that it will also be a good practice to use an std::unique_ptr<T> as a key, since my map is meant to own the objects. Let me introduce some backstory:
Consider class hierarchy with polymorphic sell_obj as a base class. book and table inheriting from that class. We now know that we need to create a std::unordered_map<std::unique_ptr<sell_obj*>, unsigned int>. Therefore, erasing a pair from that map will automatically free the memory pointed by key. The whole idea is to have keys pointing to books/tables and value of those keys will represent the amount of that product that our shop contains.
As we are dealing with std::unordered_map, we should specify hashes for all three classes. To simplify things, I specified them in main like this:
namespace std{
template <> struct hash<book>{
size_t operator()(const book& b) const
{
return 1; // simplified
}
};
template <> struct hash<table>{
size_t operator()(const table& b) const
{
return 2; // simplified
}
};
// The standard provides a specialization so that std::hash<unique_ptr<T>> is the same as std::hash<T*>.
template <> struct hash<sell_obj*>{
size_t operator()(const sell_obj *s) const
{
const book *b_p = dynamic_cast<const book*>(s);
if(b_p != nullptr) return std::hash<book>()(*b_p);
else{
const table *t_p = static_cast<const table*>(s);
return std::hash<table>()(*t_p);
}
}
};
}
Now let's look at implementation of the map. We have a class called Shop which looks like this:
#include "sell_obj.h"
#include "book.h"
#include "table.h"
#include <unordered_map>
#include <memory>
class Shop
{
public:
Shop();
void add_sell_obj(sell_obj&);
void remove_sell_obj(sell_obj&);
private:
std::unordered_map<std::unique_ptr<sell_obj>, unsigned int> storeroom;
};
and implementation of two, crucial functions:
void Shop::add_sell_obj(sell_obj& s_o)
{
std::unique_ptr<sell_obj> n_ptr(&s_o);
storeroom[std::move(n_ptr)]++;
}
void Shop::remove_sell_obj(sell_obj& s_o)
{
std::unique_ptr<sell_obj> n_ptr(&s_o);
auto target = storeroom.find(std::move(n_ptr));
if(target != storeroom.end() && target->second > 0) target->second--;
}
in my main I try to run the following code:
int main()
{
book *b1 = new book("foo", "bar", 10);
sell_obj *ptr = b1;
Shop S_H;
S_H.add_sell_obj(*ptr); // works fine I guess
S_H.remove_sell_obj(*ptr); // usually (not always) crashes [SIGSEGV]
return 0;
}
My question is: where does my logic fail? I heard that it's fine to use std::unique_ptr in STL containers since C++11. What's causing the crash? The debugger does not provide any information besides the crash occurrence.
If more information about the project is needed, please point it out. Thank you for reading.
There are quite a few problems with logic in the question. First of all:
Consider class hierarchy with polymorphic sell_obj as base class. book and table inheriting from that class. We now know that we need to create a std::unordered_map<std::unique_ptr<sell_obj*>, unsigned int>.
In such cases std::unique_ptr<sell_obj*> is not what we would want. We would want std::unique_ptr<sell_obj>. Without the *. std::unique_ptr is already "a pointer".
As we are dealing with std::unordered_map, we should specify hashes for all three classes. To simplify things, I specified them in main like this: [...]
This is also an undesirable approach. It would require changing that part of the code every time we add another subclass to the hierarchy. It would be best to delegate the hashing (and comparing) polymorphically to avoid such problems, exactly as @1201ProgramAlarm suggested.
[...] implementation of two, crucial functions:
void Shop::add_sell_obj(sell_obj& s_o)
{
std::unique_ptr<sell_obj> n_ptr(&s_o);
storeroom[std::move(n_ptr)]++;
}
void Shop::remove_sell_obj(sell_obj& s_o)
{
std::unique_ptr<sell_obj> n_ptr(&s_o);
auto target = storeroom.find(std::move(n_ptr));
if(target != storeroom.end() && target->second > 0) target->second--;
}
This is wrong for a couple of reasons. First of all, taking an argument by non-const reference suggests that the object will be modified. Second, creating n_ptr from a pointer obtained by applying & to an argument is incredibly risky. It assumes that the object is allocated on the heap and is unowned, a situation that generally should not occur. If the passed object is on the stack and/or is already managed by some other owner, this is a recipe for disaster (like a segfault).
What's more, it is more or less guaranteed to end up in a disaster, since both add_sell_obj() and remove_sell_obj() create std::unique_ptrs to potentially the same object. This is exactly the case from the original question's main(). Two std::unique_ptrs pointing to the same object result in double delete.
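One way to rule out the accidental second owner is to make the ownership transfer visible in the signature. The sketch below reuses the names sell_obj, book and Shop from the question, but the std::vector storage and the size() helper are assumptions made only to keep the example short.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct sell_obj { virtual ~sell_obj() = default; };
struct book : sell_obj {};

class Shop {
public:
    // Taking the unique_ptr by value forces the caller to write std::move,
    // so it is obvious at the call site that the shop becomes the only owner.
    void add_sell_obj(std::unique_ptr<sell_obj> s_o) {
        items.push_back(std::move(s_o));
    }
    std::size_t size() const { return items.size(); }

private:
    std::vector<std::unique_ptr<sell_obj>> items;  // simplified storage for this sketch
};
```

A caller would write something like shop.add_sell_obj(std::make_unique<book>()), and no second std::unique_ptr to the same object is ever created from a raw address.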
While it's not necessarily the best approach for this problem if one uses C++ (as compared to Java), there are a couple of interesting tools that can be used for this task. The code below assumes C++20.
The class hierarchy
First of all, we need a base class that will be used when referring to all the objects stored in the shop:
struct sell_object { };
And then we need to introduce classes that will represent concrete objects:
class book : public sell_object {
std::string title;
public:
book(std::string title) : title(std::move(title)) { }
};
class table : public sell_object {
int number_of_legs = 0;
public:
table(int number_of_legs) : number_of_legs(number_of_legs) { }
};
For simplicity (but to still have some distinctions) I chose for them to have just one, distinct field (title and number_of_legs).
The storage
The shop class that will represent storage for any sell_object needs to somehow store, well, any sell_object. For that we either need to use pointers or references to the base class. You can't have a container of references, so it's best to use pointers. Smart pointers.
Originally the question suggested the usage of std::unordered_map. Let us stick with it:
class shop {
std::unordered_map<
std::unique_ptr<sell_object>, int
> storage;
public:
auto add(...) -> void {
...
}
auto remove(...) -> void {
...
}
};
It is worth mentioning that we chose std::unique_ptr as key for our map. That means that the storage is going to copy the passed objects and use the copies it owns to compare with elements we query (add or remove). No more than one equal object will be copied, though.
The fixed version of storage
There is a problem, however. std::unordered_map uses hashing and we need to provide a hash strategy for std::unique_ptr<sell_object>. Well, there already is one and it uses the hash strategy for T*. The problem is that we want to have custom hashing. Those particular std::unique_ptr<sell_object>s should be hashed according to the associated sell_objects.
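The effect of the default hashing can be seen in a small sketch. The function names are made up for illustration, and std::string stands in for sell_object so the example stays self-contained.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_set>

// The standard guarantees that hashing a unique_ptr gives the same result
// as hashing the raw pointer it holds, so the pointee's value plays no role.
inline bool hashes_like_raw_pointer(std::unique_ptr<std::string> const& p) {
    return std::hash<std::unique_ptr<std::string>>{}(p)
        == std::hash<std::string*>{}(p.get());
}

// Two separately allocated strings with equal contents are still two
// different keys, because unique_ptr equality compares addresses.
inline std::size_t distinct_keys_for_equal_values() {
    std::unordered_set<std::unique_ptr<std::string>> keys;
    keys.insert(std::make_unique<std::string>("same"));
    keys.insert(std::make_unique<std::string>("same"));
    return keys.size();
}
```

With the default hasher and comparator, a shop keyed this way would treat every newly allocated book as a new product, no matter what its title is.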
Because of this, I opt for a different approach than the one proposed in the question. Instead of providing a global specialization in the std namespace, I will use a custom hashing object and a custom comparator:
class shop {
struct sell_object_hash {
auto operator()(std::unique_ptr<sell_object> const& object) const -> std::size_t {
return object->hash();
}
};
struct sell_object_equal {
auto operator()(
std::unique_ptr<sell_object> const& lhs,
std::unique_ptr<sell_object> const& rhs
) const -> bool {
return (*lhs <=> *rhs) == 0;
}
};
std::unordered_map<
std::unique_ptr<sell_object>, int,
sell_object_hash, sell_object_equal
> storage;
public:
auto add(...) -> void {
...
}
auto remove(...) -> void {
...
}
};
Notice a few things. First of all, the type of storage has changed. It is no longer an std::unordered_map<std::unique_ptr<T>, int>, but an std::unordered_map<std::unique_ptr<T>, int, sell_object_hash, sell_object_equal>. This indicates that we are using a custom hasher (sell_object_hash) and a custom comparator (sell_object_equal).
The lines we need to pay extra attention to are:
return object->hash();
return (*lhs <=> *rhs) == 0;
Onto them:
return object->hash();
This is a delegation of hashing. Instead of being an observer and trying to have a type that for each and every possible type derived from sell_object implements a different hashing, we require that those objects supply the sufficient hashing themselves. In the original question, the std::hash specialization was the said "observer". It certainly did not scale as a solution.
In order to achieve the aforementioned, we modify the base class to impose the listed requirement:
struct sell_object {
virtual auto hash() const -> std::size_t = 0;
};
Thus we also need to change our book and table classes:
class book : public sell_object {
std::string title;
public:
book(std::string title) : title(std::move(title)) { }
auto hash() const -> std::size_t override {
return std::hash<std::string>()(title);
}
};
class table : public sell_object {
int number_of_legs = 0;
public:
table(int number_of_legs) : number_of_legs(number_of_legs) { }
auto hash() const -> std::size_t override {
return std::hash<int>()(number_of_legs);
}
};
return (*lhs <=> *rhs) == 0;
This is a C++20 feature called the three-way comparison operator, sometimes called the spaceship operator. I opted into using it, since starting with C++20, most types that desire to be comparable will be using this operator. That means we also need our concrete classes to implement it. What's more, we need to be able to call it with base references (sell_object&). Yet another virtual function (operator, actually) needs to be added to the base class:
struct sell_object {
virtual auto hash() const -> std::size_t = 0;
virtual auto operator<=>(sell_object const&) const -> std::partial_ordering = 0;
};
Every subclass of sell_object is going to be required to be comparable with other sell_objects. The main reason is that we need to compare sell_objects in our storage map. For completeness, I used std::partial_ordering, since we require every sell_object to be comparable with every other sell_object. While comparing two books or two tables yields strong ordering (total ordering where two equivalent objects are indistinguishable), we also - by design - need to support comparing a book to a table. This is somewhat meaningless (always returns false). Fortunately, C++20 helps us here with std::partial_ordering::unordered. Those elements are not equal and neither of them is greater or less than the other. Perfect for such scenarios.
Our concrete classes need to change accordingly:
class book : public sell_object {
std::string title;
public:
book(std::string title) : title(std::move(title)) { }
auto hash() const -> std::size_t override {
return std::hash<std::string>()(title);
}
auto operator<=>(book const& other) const {
return title <=> other.title;
};
auto operator<=>(sell_object const& other) const -> std::partial_ordering override {
if (auto book_ptr = dynamic_cast<book const*>(&other)) {
return *this <=> *book_ptr;
} else {
return std::partial_ordering::unordered;
}
}
};
class table : public sell_object {
int number_of_legs = 0;
public:
table(int number_of_legs) : number_of_legs(number_of_legs) { }
auto hash() const -> std::size_t override {
return std::hash<int>()(number_of_legs);
}
auto operator<=>(table const& other) const {
return number_of_legs <=> other.number_of_legs;
};
auto operator<=>(sell_object const& other) const -> std::partial_ordering override {
if (auto table_ptr = dynamic_cast<table const*>(&other)) {
return *this <=> *table_ptr;
} else {
return std::partial_ordering::unordered;
}
}
};
The overridden operator<=>s are required due to the base class' requirements. They are quite simple - if the other object (the one we are comparing this object to) is of the same type, we delegate to the <=> version that uses the concrete type. If not, we have a type mismatch and we report the unordered ordering.
For those of you who are curious why the <=> implementation that compares two, identical types is not = defaulted: it would use the base-class comparison first, which would delegate to the sell_object version. That would dynamic_cast again and delegate to the defaulted implementation. Which would compare the base class and... result in an infinite recursion.
add() and remove() implementation
Everything seems great, so we can move on to adding and removing items to and from our shop. However, we immediately arrive at a hard design decision. What arguments should add() and remove() accept?
std::unique_ptr<sell_object>? That would make their implementation trivial, but it would require the user to construct a potentially useless, dynamically allocated object just to call a function.
sell_object const&? That seems correct, but there are two problems with it: 1) we would still need to construct an std::unique_ptr with a copy of passed argument to find the appropriate element to remove; 2) we wouldn't be able to correctly implement add(), since we need the concrete type to construct an actual std::unique_ptr to put into our map.
Let us go with the second option and fix the first problem. We certainly do not want to construct a useless and expensive object just to look for it in the storage map. Ideally we would like to find a key (std::unique_ptr<sell_object>) that matches the passed object. Fortunately, transparent hashers and comparators come to the rescue.
By supplying additional overloads for hasher and comparator (and providing a public is_transparent alias), we allow for looking for a key that is equivalent, without needing the types to match:
struct sell_object_hash {
auto operator()(std::unique_ptr<sell_object> const& object) const -> std::size_t {
return object->hash();
}
auto operator()(sell_object const& object) const -> std::size_t {
return object.hash();
}
using is_transparent = void;
};
struct sell_object_equal {
auto operator()(
std::unique_ptr<sell_object> const& lhs,
std::unique_ptr<sell_object> const& rhs
) const -> bool {
return (*lhs <=> *rhs) == 0;
}
auto operator()(
sell_object const& lhs,
std::unique_ptr<sell_object> const& rhs
) const -> bool {
return (lhs <=> *rhs) == 0;
}
auto operator()(
std::unique_ptr<sell_object> const& lhs,
sell_object const& rhs
) const -> bool {
return (*lhs <=> rhs) == 0;
}
using is_transparent = void;
};
Thanks to that, we can now implement shop::remove() like so:
auto remove(sell_object const& to_remove) -> void {
if (auto it = storage.find(to_remove); it != storage.end()) {
it->second--;
if (it->second == 0) {
storage.erase(it);
}
}
}
Since our comparator and hasher are transparent, we can find() an element that is equivalent to the argument. If we find it, we decrement the corresponding count. If it reaches 0, we remove the entry completely.
Great, onto the second problem. Let us list the requirements for the shop::add():
we need the concrete type of the object (merely a reference to the base class is not enough, since we need to create matching std::unique_ptr).
we need that type to be derived from sell_object.
We can achieve both with a constrained* template:
template <std::derived_from<sell_object> T>
auto add(T const& to_add) -> void {
if (auto it = storage.find(to_add); it != storage.end()) {
it->second++;
} else {
storage[std::make_unique<T>(to_add)] = 1;
}
}
This is, again, quite simple.
*References: {1} {2}
Correct destruction semantics
There is only one more thing that separates us from the correct implementation. It's the fact that if we have a pointer (either smart or not) to a base class that is used to deallocate it, the destructor needs to be virtual.
This leads us to the final version of the sell_object class:
struct sell_object {
virtual auto hash() const -> std::size_t = 0;
virtual auto operator<=>(sell_object const&) const -> std::partial_ordering = 0;
virtual ~sell_object() = default;
};
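The effect of the virtual destructor can be observed with a small counter. The names base, derived and derived_destroyed are made up for this sketch, and the inline variable assumes C++17 or later.

```cpp
#include <memory>

// Counts how many times the derived destructor has run.
inline int derived_destroyed = 0;

struct base {
    virtual ~base() = default;  // virtual, so deleting through base* reaches ~derived()
};

struct derived : base {
    ~derived() override { ++derived_destroyed; }
};

inline void destroy_through_base() {
    std::unique_ptr<base> p = std::make_unique<derived>();
}   // p goes out of scope here and deletes the object through a base pointer
```

Without the virtual keyword on ~base(), the same deletion would be undefined behaviour.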
See full implementation with example and additional printing utilities.
tl;dr: Is there a way to add a default argument from the current scope to all implicit constructors in C++?
I am currently designing an interface for an embedded language in C++. The goal is to make the creation of syntactically correct expressions both typesafe and convenient. Right now, I think that learning a heavyweight implementation like boost::proto would introduce too much latency into the development, so I am attempting to roll my own implementation.
Here is a small demo:
#include <iostream>
#include <string>
#include <sstream>
class ExprBuilder
{
public:
ExprBuilder(const int val) : val(std::to_string(val)) {}
ExprBuilder(const std::string val) : val(val) {}
ExprBuilder(const char* val) : val(val) {}
ExprBuilder(const ExprBuilder& lhs, const ExprBuilder& arg) {
std::stringstream ss;
ss << "(" << lhs.val << " " << arg.val << ")";
val = ss.str();
}
const ExprBuilder operator()(const ExprBuilder& l) const {
return ExprBuilder(*this, l);
}
template<typename... Args>
const ExprBuilder operator()(const ExprBuilder& arg, Args... args) const
{
return (*this)(arg)(args...) ;
}
std::string val;
};
std::ostream& operator<<(std::ostream& os, const ExprBuilder& e)
{
os << e.val;
return os;
}
int main() {
ExprBuilder f("f");
std::cout << f(23, "foo", "baz") << std::endl;
}
As you can see, it is fairly simple to embed expressions thanks to C++ overloading and implicit conversions.
I am facing a practical problem however: In the example above, all data was allocated in the form of std::string objects. In practice, I need something more complex (AST nodes) that are allocated on the heap and managed by a dedicated owner (legacy code, cannot be changed). So I have to pass a unique argument (said owner) and use it for the allocations. I'd rather not use a static field here.
What I am searching for is a way to ask the user to provide such an owner every time the builder is used, but in a convenient way. Something like a dynamically scoped variable would be great. Is there a way to obtain the following in C++:
class ExprBuilder
{
...
ExprBuilder(const ExprBuilder& lhs, const ExprBuilder& arg) {
return ExprBuilder(owner.allocate(lhs, arg)); // use the owner implicitly
}
...
};
int main() {
Owner owner; // used in all ExprBuilder instances in the current scope
ExprBuilder f("f");
std::cout << f(23, "foo", "baz") << std::endl;
}
Is this possible?
edit: I'd like to clarify why I have not (up until now) considered a global variable. The owner has to be manually released by the user of the builder at some point, hence I cannot create one ad hoc. As a result, the user might "forget" the owner altogether. To avoid this, I am searching for a way to have the typechecker enforce the presence of the owner.
This is hardly possible without global / static variables, because without global / static information, the local variables Owner owner and ExprBuilder f cannot know anything about each other.
I think the cleanest way is to add a
static Owner* current_owner;
to the ExprBuilder class. Then you can add a new class ScopedCurrentOwnerLock, which sets the current_owner in the constructor and sets it to nullptr in the destructor. Then you can use it similar to a mutex lock:
class ScopedCurrentOwnerLock {
public:
ScopedCurrentOwnerLock(Owner& owner) {
ExprBuilder::current_owner = &owner;
}
~ScopedCurrentOwnerLock() {
ExprBuilder::current_owner = nullptr;
}
};
int main() {
Owner owner;
ScopedCurrentOwnerLock lock(owner);
ExprBuilder f("f");
}
If you have access to the Owner code, you can omit the ScopedCurrentOwnerLock class and directly set / unset the pointer in the constructor/destructor of Owner.
Please be aware of the following two problems with this solution:
If the owner goes out of scope before the lock goes out of scope, you have an invalid pointer.
The static pointer has unpredictable behaviour if you have multiple locks at the same time, e. g. due to multithreading.
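Both problems can be softened somewhat. The following is only a sketch under assumptions: Owner is an empty stand-in, the guard is renamed ScopedCurrentOwner, the pointer is made thread_local so that each thread sees its own current owner, and the guard restores the previous owner instead of nullptr so that nested scopes behave sensibly. It does not help if the owner is destroyed before the guard.

```cpp
struct Owner {};  // hypothetical stand-in for the real owner type

class ExprBuilder {
public:
    // One "current owner" per thread instead of one per process.
    static thread_local Owner* current_owner;
};
thread_local Owner* ExprBuilder::current_owner = nullptr;

class ScopedCurrentOwner {
public:
    explicit ScopedCurrentOwner(Owner& owner)
        : previous_(ExprBuilder::current_owner) {
        ExprBuilder::current_owner = &owner;
    }
    // Restore whatever was current before, so guards can be nested.
    ~ScopedCurrentOwner() { ExprBuilder::current_owner = previous_; }

    ScopedCurrentOwner(const ScopedCurrentOwner&) = delete;
    ScopedCurrentOwner& operator=(const ScopedCurrentOwner&) = delete;

private:
    Owner* previous_;
};
```

Because guards are destroyed in reverse order of construction within a scope, the saved pointers unwind like a stack.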
All your ExprBuilders have a dependency on Owner, and you rightly don't want global state. So you have to pass owner to every constructor.
If you really don't want to add owner to all your instantiations in a block, you can create a factory to pass it for you.
struct ExprBuilderFactory
{
Owner & owner;
ExprBuilder operator()(int val) { return ExprBuilder(owner, val); }
ExprBuilder operator()(const char * val) { return ExprBuilder(owner, val); }
// etc
};
int main() {
Owner owner;
ExprBuilderFactory factory{ owner };
ExprBuilder f = factory("f");
}
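To show how the factory could fit together with the call syntax from the question, here is a self-contained sketch. The Owner type below is a made-up stand-in that merely stores strings, and the templated operator() that converts plain arguments using the owner already carried by the callee is an assumption of this sketch, not something the legacy code is known to support.

```cpp
#include <deque>
#include <string>
#include <utility>

// Hypothetical stand-in for the legacy owner; it only stores strings here.
struct Owner {
    std::deque<std::string> nodes;  // deque: push_back keeps earlier references valid
    const std::string& allocate(std::string text) {
        nodes.push_back(std::move(text));
        return nodes.back();
    }
};

class ExprBuilder {
public:
    ExprBuilder(Owner& owner, int val)
        : owner_(&owner), node_(&owner.allocate(std::to_string(val))) {}
    ExprBuilder(Owner& owner, const char* val)
        : owner_(&owner), node_(&owner.allocate(val)) {}

    // Plain arguments are converted with the owner this builder already has,
    // so the call site never has to mention the owner.
    template <typename Arg>
    ExprBuilder operator()(const Arg& arg) const { return apply(ExprBuilder(*owner_, arg)); }
    ExprBuilder operator()(const ExprBuilder& arg) const { return apply(arg); }

    template <typename Arg, typename... Rest>
    ExprBuilder operator()(const Arg& arg, const Rest&... rest) const {
        return (*this)(arg)(rest...);
    }

    const std::string& str() const { return *node_; }

private:
    ExprBuilder(Owner& owner, const std::string& node) : owner_(&owner), node_(&node) {}
    ExprBuilder apply(const ExprBuilder& arg) const {
        return ExprBuilder(*owner_, owner_->allocate("(" + *node_ + " " + *arg.node_ + ")"));
    }
    Owner* owner_;
    const std::string* node_;
};

struct ExprBuilderFactory {
    Owner& owner;
    ExprBuilder operator()(int val) const { return ExprBuilder(owner, val); }
    ExprBuilder operator()(const char* val) const { return ExprBuilder(owner, val); }
};
```

With this, only the first builder in a chain needs the factory: after ExprBuilder f = factory("f"), the expression f(23, "foo", "baz") allocates every intermediate node through the same owner.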
I want to define structs to hold various application parameters:
struct Params
{
String fooName;
int barCount;
bool widgetFlags;
// ... many more
};
but I want to be able to enumerate, get and set these fields by name, eg so that I can expose them to automation APIs and for ease in serialisation:
Params p;
cout << p.getField("barCount");
p.setField("fooName", "Roger");
for (String fieldName : p.getFieldNames()) {
cout << fieldName << "=" << p.getField(fieldName);
}
Is there a good way of defining a binding from a string label to a get/set function? Along the lines of this (very much pseudocode):
Params() {
addBinding("barCount", setter(&Params::barCount), getter(&Params::barCount));
...
I know that other options are to auto-generate the struct from an external metadata file, and another is to store the struct as a table of (key,value) pairs, but I would rather keep the data in a struct.
I do have a Variant type which all fields are convertible to.
C++ doesn't have reflection, so this isn't something you can do cleanly. Also, by referring to members as strings, you have to try to side-step the strongly typed nature of the language. Using a serialization library like Boost.Serialization or Google Protobuf might be more useful.
That said, if we allow some horribleness, one could do something with an XMacro. (Disclaimer: I wouldn't recommend actually doing this). First you put all the information you need into a macro
#define FIELD_PARAMS \
FIELD_INFO(std::string, Name, "Name") \
FIELD_INFO(int, Count, "Count")
Or alternatively into a header file
<defs.h>
FIELD_INFO(std::string, Name, "Name")
FIELD_INFO(int, Count, "Count")
Then you'll define FIELD_INFO inside your class to either mean the member declaration, or adding them to a map
struct Params{
Params() {
#define FIELD_INFO(TYPE,NAME,STRNAME) names_to_members.insert(std::make_pair(STRNAME,&NAME));
FIELD_PARAMS
#undef FIELD_INFO
}
template <typename T>
T& get(std::string field){
return *(T*)names_to_members[field];
}
std::map<std::string, void*> names_to_members;
#define FIELD_INFO(TYPE,NAME,STRNAME) TYPE NAME;
FIELD_PARAMS
#undef FIELD_INFO
};
And then you could use it like this
int main (int argc, char** argv){
Params myParams;
myParams.get<std::string>("Name") = "Mike";
myParams.get<int>("Count") = 38;
std::cout << myParams.get<std::string>("Name"); // or myParams.Name
std::cout << std::endl;
std::cout << myParams.get<int>("Count"); // or myParams.Count
return 0;
}
Unfortunately you still need to tell the compiler what the type is. If you have a good variant class and libraries that play well with it, you may be able to get around this.
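As an illustration of the variant idea, here is a sketch that keeps the same X-macro but stores typed pointers instead of void*. It assumes C++17 and uses std::variant as a stand-in for a project-specific variant class; the Value and FieldRef aliases and the set() member are names made up for this example.

```cpp
#include <map>
#include <string>
#include <type_traits>
#include <variant>

#define FIELD_PARAMS \
    FIELD_INFO(std::string, Name, "Name") \
    FIELD_INFO(int, Count, "Count")

using Value = std::variant<std::string, int>;        // stand-in for a project Variant type
using FieldRef = std::variant<std::string*, int*>;   // typed pointer to a registered member

struct Params {
    Params() {
#define FIELD_INFO(TYPE, NAME, STRNAME) fields[STRNAME] = &NAME;
        FIELD_PARAMS
#undef FIELD_INFO
    }
    Params(const Params&) = delete;             // the map stores pointers into *this
    Params& operator=(const Params&) = delete;

    // Throws std::out_of_range for an unknown name.
    Value get(const std::string& name) const {
        return std::visit([](auto* member) -> Value { return *member; }, fields.at(name));
    }

    // Returns false for an unknown name or a value of the wrong type.
    bool set(const std::string& name, const Value& value) {
        auto it = fields.find(name);
        if (it == fields.end()) return false;
        return std::visit([&](auto* member) {
            using T = std::remove_pointer_t<decltype(member)>;
            if (auto* source = std::get_if<T>(&value)) { *member = *source; return true; }
            return false;
        }, it->second);
    }

#define FIELD_INFO(TYPE, NAME, STRNAME) TYPE NAME{};
    FIELD_PARAMS
#undef FIELD_INFO

    std::map<std::string, FieldRef> fields;
};
```

The caller no longer names the type: p.set("Count", Value{38}) succeeds, while passing a string for "Count" is rejected at run time instead of being reinterpreted through a cast.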
I'm using a slightly different storage for this: here. The tags I use are ints for some reason, but you could use std::string keys just as well.
There is no really good way (with "good" being a very subjective aspect anyway), because whatever technique you choose is not part of the C++ language itself, but if your goal is serialisation, have a look at Boost Serialization.
I've managed to come up with something that satisfies my particular need. Ari's answer was closest in terms of mapping strings to references to member variables, though it relied on casting from void*. I've got something that's a bit more type-safe:
There's an interface for an individual PropertyAccessor that has a templated class derived from it which binds to a reference to a specific member variable and converts to and from the Variant representation:
class IPropertyAccessor
{
public:
virtual ~IPropertyAccessor() {}
virtual Variant getValueAsVariant() const =0;
virtual void setValueAsVariant(const Variant& variant) =0;
};
typedef std::shared_ptr<IPropertyAccessor> IPropertyAccessorPtr;
template <class T>
class PropertyAccessor : public IPropertyAccessor
{
public:
PropertyAccessor(T& valueRef_) : valueRef(valueRef_) {}
virtual Variant getValueAsVariant() const {return VariantConverter<T>().toVariant(valueRef); }
virtual void setValueAsVariant(const Variant& variant) { valueRef = VariantConverter<T>().toValue(variant); }
T& valueRef;
};
// Helper function to create a PropertyAccessor templated on a type
template <class T>
static IPropertyAccessorPtr createAccessor(T& valueRef_)
{
return std::make_shared<PropertyAccessor<T>>(valueRef_);
}
The class exposing a collection can now define an ID -> PropertyAccessor map and bind its values by reference:
#define REGISTER_PROPERTY(field) accessorMap.insert(AccessorMap::value_type(#field, createAccessor(field)))
class TestPropertyCollection
{
public:
typedef std::map<PropertyID, IPropertyAccessorPtr> AccessorMap;
TestPropertyCollection()
{
REGISTER_PROPERTY(stringField1);
// expands to
// accessorMap.insert(AccessorMap::value_type("stringField1", createAccessor(stringField1)));
REGISTER_PROPERTY(stringField2);
REGISTER_PROPERTY(intField1);
}
bool getPropertyVariant(const PropertyID& propertyID, Variant& retVal)
{
auto it = accessorMap.find(propertyID);
if (it != accessorMap.end()) {
auto& accessor = it->second;
retVal = accessor->getValueAsVariant();
return true;
}
return false;
}
String stringField1;
String stringField2;
int intField1;
AccessorMap accessorMap;
};
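Since Variant, VariantConverter and PropertyID are project types that are not shown, here is a self-contained sketch of the same accessor idea, including the setter path. It assumes C++17 and substitutes std::variant<int, std::string> for Variant and std::string for the property ID. The Settings class and its bind() helper are made up for illustration.

```cpp
#include <map>
#include <memory>
#include <string>
#include <variant>

using Variant = std::variant<int, std::string>;  // assumption: stand-in for the project's Variant

class IPropertyAccessor {
public:
    virtual ~IPropertyAccessor() = default;
    virtual Variant get() const = 0;
    virtual bool set(const Variant& value) = 0;
};

template <class T>
class PropertyAccessor : public IPropertyAccessor {
public:
    explicit PropertyAccessor(T& ref) : ref_(ref) {}
    Variant get() const override { return ref_; }
    bool set(const Variant& value) override {
        if (const T* source = std::get_if<T>(&value)) { ref_ = *source; return true; }
        return false;  // wrong alternative: leave the member untouched
    }
private:
    T& ref_;
};

class Settings {  // hypothetical example class
public:
    Settings() {
        bind("name", name);
        bind("count", count);
    }
    Settings(const Settings&) = delete;             // accessors hold references into *this
    Settings& operator=(const Settings&) = delete;

    bool setProperty(const std::string& id, const Variant& value) {
        auto it = accessors_.find(id);
        return it != accessors_.end() && it->second->set(value);
    }
    bool getProperty(const std::string& id, Variant& out) const {
        auto it = accessors_.find(id);
        if (it == accessors_.end()) return false;
        out = it->second->get();
        return true;
    }

    std::string name;
    int count = 0;

private:
    template <class T>
    void bind(const char* id, T& field) {
        accessors_.emplace(id, std::make_unique<PropertyAccessor<T>>(field));
    }
    std::map<std::string, std::unique_ptr<IPropertyAccessor>> accessors_;
};
```

Copying is disabled here because each accessor holds a reference to a member of the object that created it; a copied map would still point at the original object's fields.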