I'm trying to understand when to use some of the structures that come with Boost, and I have a question about using boost::optional with a reference.
Suppose I have the following class, using boost::optional:
class MyClass {
public:
    MyClass() {}
    void initialise(Helper& helper) {
        this->helper = helper;
    }
    boost::optional<Helper&> getHelper() {
        return helper;
    }
private:
    boost::optional<Helper&> helper;
};
Why would I use the above instead of:
class MyClass {
public:
    MyClass() : helper(nullptr) {}
    void initialise(Helper& helper) {
        this->helper = &helper;
    }
    Helper* getHelper() {
        return helper;
    }
private:
    Helper* helper;
};
They both convey the same intent, i.e. that getHelper could return null, and the caller still needs to test if a helper was returned.
Should you only be using boost::optional if you need to know the difference between 'a value', nullptr and 'not a value'?
Compared to a raw pointer, an optional reference may suggest that (1) pointer arithmetic is not used, and (2) ownership of the referent is maintained elsewhere (so delete will clearly not be used with the variable).
Great question, and John Zwinck's answer above is right. However, some people (e.g., many on the standardization committee) doubt whether these reasons are enough to justify the existence of optional<T&>, given how confusing its semantics can be. Consider what should happen when you assign to one of these. Should it re-seat the reference (i.e., make it refer to a different object), or assign through the reference, as a real T& does? A case can be made for either, which would cause confusion and subtle bugs. Support for optional<T&> was removed from the proposal that recently got accepted into C++14.
In short, if you want to make your code portable to C++14's std::optional, prefer T* over boost::optional<T&>.
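To make the two candidate meanings of assignment concrete, here is a minimal sketch. OptionalRef is a hypothetical stand-in written only for this illustration; it is not boost::optional and makes no claim about how Boost implements optional references. It stores a pointer and gives assignment "re-seat" semantics, which is then contrasted with what a plain T& does.

```cpp
#include <cassert>

// Hypothetical stand-in for an optional reference; NOT boost::optional.
// It stores a pointer and gives assignment "re-seat" semantics.
template <typename T>
class OptionalRef {
public:
    OptionalRef() : ptr_(nullptr) {}
    OptionalRef(T& ref) : ptr_(&ref) {}

    // Re-seating assignment: refer to a different object from now on.
    OptionalRef& operator=(T& ref) {
        ptr_ = &ref;
        return *this;
    }

    explicit operator bool() const { return ptr_ != nullptr; }

    T& get() const {
        assert(ptr_ != nullptr);
        return *ptr_;
    }

private:
    T* ptr_;
};

// Re-seat semantics: the original referent is left untouched.
inline int rebind_demo() {
    int a = 1;
    int b = 2;
    OptionalRef<int> r(a);
    r = b;     // r now refers to b
    return a;  // a is unchanged
}

// Assign-through semantics of a real reference: the referent is overwritten.
inline int assign_through_demo() {
    int a = 1;
    int b = 2;
    int& r = a;
    r = b;     // copies b's value into a
    return a;  // a now holds b's value
}
```

The same-looking statement `r = b;` leaves `a` alone in the first function and overwrites it in the second, which is the ambiguity described above.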
I stumbled upon an implementation of Optional<T> which is based on LLVM's Optional.h class and couldn't quite figure out why it is implemented the way it is.
To keep it short, I'm only pasting the parts I don't understand:
template <typename T>
class Optional
{
private:
inline void* getstg() const { return const_cast<void*>(reinterpret_cast<const void*>(&_stg)); }
typedef typename std::aligned_storage<sizeof(T), std::alignment_of<T>::value>::type storage_type;
storage_type _stg;
bool _hasValue;
public:
Optional(const T &y) : _hasValue(true)
{
new (getstg()) T(y);
}
T* Get() { return reinterpret_cast<T*>(getstg()); }
};
And the most naive implementation I could think of:
template <typename T>
class NaiveOptional
{
private:
T* _value;
bool _hasValue;
public:
NaiveOptional(const T &y) : _value(new T(y)), _hasValue(true) // initializers in declaration order
{
}
T* Get() { return _value; }
};
Questions:
How do I interpret the storage_type? what was the author's intention?
What is the semantics of this line: new (getstg()) T(y); ?
Why doesn't the naive implementation work (or, what pros does the Optional<T> class have over NaiveOptional<T>) ?
The short answer is "performance".
Longer answer:
storage_type provides an in-memory region that is (a) big enough to fit the type T and (b) is aligned properly for type T. Unaligned memory access is slower. See also the doc.
new (getstg()) T(y) is a placement new. It does not allocate memory, but instead it constructs an object in memory region passed to it. The doc (on all forms of new - search for the "placement new").
The naive implementation does work, but it has worse performance. It uses dynamic memory allocation, which often can be a bottleneck. The Optional<T> implementation does not use dynamic memory allocation (see the point above).
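The three points above can be put together in a small sketch. InPlaceOptional is a hypothetical name, and this is not LLVM's implementation: it uses an alignas buffer instead of std::aligned_storage, and it keeps the pointer returned by placement new (nullptr meaning "empty") instead of a separate bool, purely to keep the example short. Copying is deliberately left out.

```cpp
#include <cassert>
#include <new>
#include <string>

// Hypothetical sketch: the value lives inside the object itself,
// so constructing it never touches the heap on behalf of the wrapper.
template <typename T>
class InPlaceOptional {
public:
    InPlaceOptional() : ptr_(nullptr) {}

    // Placement new constructs a T inside storage_ without allocating.
    explicit InPlaceOptional(const T& value)
        : ptr_(::new (static_cast<void*>(storage_)) T(value)) {}

    // Copying is omitted to keep the sketch short.
    InPlaceOptional(const InPlaceOptional&) = delete;
    InPlaceOptional& operator=(const InPlaceOptional&) = delete;

    ~InPlaceOptional() { reset(); }

    // Objects created with placement new must be destroyed explicitly.
    void reset() {
        if (ptr_ != nullptr) {
            ptr_->~T();
            ptr_ = nullptr;
        }
    }

    bool has_value() const { return ptr_ != nullptr; }

    T& value() {
        assert(has_value());
        return *ptr_;
    }

private:
    // Big enough for a T and aligned as strictly as a T requires.
    alignas(T) unsigned char storage_[sizeof(T)];
    T* ptr_;  // nullptr doubles as the "no value" flag
};
```

An empty InPlaceOptional<T> never runs a T constructor, which is why T does not need to be default-constructible.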
std::optional is supposed to be returned from functions. That means you would have to store the content referenced by the pointer somewhere, which defeats the purpose of the class.
Further, you can't just use a plain T member, because then you would have to construct it in some way. Optional allows the content to be uninitialized, and some types can't be default-constructed.
To make the class more flexible in terms of supported types, suitably sized and correctly aligned storage is used. Only when the optional is active is the real type constructed in it.
What you had in mind is probably something like std::variant?
Probably I am not the first person finding out that std::exception_ptr could be used to implement an any type (performance considerations being put aside), as it is probably the only type in C++ that can hold anything. Googling did not, however, bring any result in this direction.
Does anybody know whether the following approach has been used anywhere for anything useful?
#include <exception>
#include <iostream>
#include <string>
struct WrongTypeError : std::exception { };
class Any {
public:
template <class T>
void set (T t) {
try { throw t; }
catch (...) { m_contained = std::current_exception(); }
}
template <class T>
T const & get () {
try { std::rethrow_exception (m_contained); }
catch (T const & t) { return t; }
catch (...) { throw WrongTypeError {}; }
}
private:
std::exception_ptr m_contained = nullptr;
};
int main () {
auto a = Any {};
a.set (7);
std::cout << a.get<int> () << std::endl;
a.set (std::string {"Wonderful weather today"});
std::cout << a.get<std::string> () << std::endl;
return 0;
}
as it is probably the only type in C++ that can hold anything.
I'm afraid this is not the case. boost::any can hold any type, and even copies it correctly (assuming the type is copyable). It is implemented (broadly speaking) using a base class and a templated child:
class any_base {
...
};
template <class T>
class any_holder : public any_base
{
private:
T m_data;
};
From this you can imagine that you can stuff any type into an any_holder (with the right interface), and then you can hold an any_holder by pointer to any_base. This technique is a type of type erasure; once we have an any_base pointer we are holding an object but don't know anything about the type. You could say this is total type erasure, something like std::function provides partial type erasure (and may use similar techniques under the hood, I'm not sure off the top of my head).
boost::any provides additional interface to support its usage of holding any type, and it probably provides better performance as throwing exceptions is crazy slow. Also, as I mentioned before, it correctly copies the underlying object, which is pretty cool. exception_ptr is a shared ownership pointer, so I believe it makes shallow copies instead.
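As a rough illustration of the base-class-plus-templated-holder idea and of deep copying, here is a minimal sketch. SimpleAny, Base, Holder, set and get are all hypothetical names chosen for this example; this is not boost::any's actual interface or implementation.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <typeinfo>
#include <utility>

// Hypothetical sketch of type erasure; not boost::any.
class SimpleAny {
    // The non-template base is all the outer class ever sees.
    struct Base {
        virtual ~Base() {}
        virtual std::unique_ptr<Base> clone() const = 0;
        virtual const std::type_info& type() const = 0;
    };

    // The templated child remembers the concrete type.
    template <class T>
    struct Holder : Base {
        explicit Holder(T value) : data(std::move(value)) {}
        std::unique_ptr<Base> clone() const override {
            return std::unique_ptr<Base>(new Holder<T>(data));
        }
        const std::type_info& type() const override { return typeid(T); }
        T data;
    };

    std::unique_ptr<Base> held_;

public:
    SimpleAny() {}

    // Deep copy: the held object is cloned, not shared.
    SimpleAny(const SimpleAny& other) {
        if (other.held_) held_ = other.held_->clone();
    }
    SimpleAny& operator=(const SimpleAny& other) {
        if (this != &other) {
            held_ = other.held_ ? other.held_->clone() : std::unique_ptr<Base>();
        }
        return *this;
    }

    template <class T>
    void set(T value) {
        held_.reset(new Holder<T>(std::move(value)));
    }

    // Returns nullptr when empty or when T is not the stored type.
    template <class T>
    T* get() {
        if (held_ && held_->type() == typeid(T)) {
            return &static_cast<Holder<T>*>(held_.get())->data;
        }
        return nullptr;
    }
};
```

Because each copy owns its own clone, modifying one SimpleAny does not affect another, unlike copies of a shared-ownership handle.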
Boost any website: http://www.boost.org/doc/libs/1_59_0/doc/html/any.html
It's being considered for the standard I believe: http://en.cppreference.com/w/cpp/experimental/any
It seems like the implementation is similar to boost but adds a small object optimization.
exception_ptr is quite a strange beast as far as I can tell, I've come across it before and googled it and there's surprisingly little information out there. I'm pretty sure however that it's magical, i.e. it cannot be implemented in userspace. I say this because when you throw it, the type seems to magically unerase itself, this isn't generally possible.
You are certainly the first person I have come across who has thought of it.
I'm definitely impressed by your lateral thinking skills :)
There is a problem with this approach however, (other than the obvious performance problem).
This stems from the fact that throw is permitted to make a copy of the object thrown.
Firstly this places a restriction on what you may store in your 'any' class, secondly it will have further performance implications and thirdly, each time you access your object, the compiler is not obliged to give you the same one. It's allowed to give you a copy. This means that at the very least you should only store immutable objects this way. (Note: when I say 'should' I really mean 'absolutely should not!') :)
You could get around this by crafting a class that allocates memory to store the object, records its type and deletes it properly... But if you did that you'd be better off without the exception complication anyway.
In any case, this is what boost::any does under the covers.
So, I have something along the lines of these structs:
struct Generic {};
struct Specific : Generic {};
At some point I have the need to downcast, i.e.:
Specific s = (Specific) GetGenericData();
This is a problem because I get error messages stating that no user-defined cast was available.
I can change the code to be:
Specific s = (*(Specific *)&GetGenericData());
or using reinterpret_cast, it would be:
Specific s = *reinterpret_cast<Specific *>(&GetGenericData());
But, is there a way to make this cleaner? Perhaps using a macro or template?
I looked at this post C++ covariant templates, and I think it has some similarities, but not sure how to rewrite it for my case. I really don't want to define things as SmartPtr. I would rather keep things as the objects they are.
It looks like GetGenericData() from your usage returns a Generic by-value, in which case a cast to Specific will be unsafe due to object slicing.
To do what you want to do, you should make it return a pointer or reference:
Generic* GetGenericData();
Generic& GetGenericDataRef();
And then you can perform a cast:
// safe, returns nullptr if it's not actually a Specific*
auto safe = dynamic_cast<Specific*>(GetGenericData());
// for references, this will throw std::bad_cast
// if you try the wrong type
auto& safe_ref = dynamic_cast<Specific&>(GetGenericDataRef());
// unsafe, undefined behavior if it's the wrong type,
// but faster if it is
auto unsafe = static_cast<Specific*>(GetGenericData());
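One detail worth spelling out: dynamic_cast only compiles when the base class is polymorphic, so Generic needs at least one virtual function (a virtual destructor is the usual choice). The sketch below assumes that change; Other, z_or_minus_one and ref_cast_throws are hypothetical names added only to exercise the casts.

```cpp
#include <cassert>
#include <typeinfo>

// Assumption: Generic is made polymorphic so that dynamic_cast is allowed.
struct Generic {
    virtual ~Generic() {}
};

struct Specific : Generic {
    int z = 42;
};

// Hypothetical unrelated sibling, used to show a failing cast.
struct Other : Generic {};

// Pointer form: a failed cast yields nullptr.
inline int z_or_minus_one(Generic* g) {
    if (Specific* s = dynamic_cast<Specific*>(g)) {
        return s->z;
    }
    return -1;
}

// Reference form: a failed cast throws std::bad_cast.
inline bool ref_cast_throws(Generic& g) {
    try {
        Specific& s = dynamic_cast<Specific&>(g);
        (void)s;
        return false;
    } catch (const std::bad_cast&) {
        return true;
    }
}
```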
I assume here that your data is simple.
struct Generic {
int x=0;
int y=0;
};
struct Specific:Generic{
int z=0;
explicit Specific(Generic const&o):Generic(o){}
// boilerplate, some may not be needed, but good habit:
Specific()=default;
Specific(Specific const&)=default;
Specific(Specific &&)=default;
Specific& operator=(Specific const&)=default;
Specific& operator=(Specific &&)=default;
};
and bob is your uncle. It is somewhat important that int z have a default initializer, so we don't have to repeat it in the from-parent ctor.
I made the ctor explicit so it will be called only explicitly, instead of by accident.
This is a suitable solution for simple data.
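A short usage sketch of the explicit converting constructor follows. The body of GetGenericData() is a hypothetical stand-in, since the question never shows it, and MakeSpecific is a helper invented for the example. The static_assert documents what explicit buys: Generic no longer converts to Specific implicitly.

```cpp
#include <cassert>
#include <type_traits>

struct Generic {
    int x = 0;
    int y = 0;
};

struct Specific : Generic {
    int z = 0;
    Specific() = default;
    explicit Specific(Generic const& o) : Generic(o) {}
};

// Hypothetical stand-in for the question's GetGenericData().
inline Generic GetGenericData() {
    Generic g;
    g.x = 1;
    g.y = 2;
    return g;
}

// The conversion has to be spelled out at the call site.
inline Specific MakeSpecific() {
    return Specific(GetGenericData());
}

// With an explicit ctor, `Specific s = GetGenericData();` would not compile.
static_assert(!std::is_convertible<Generic, Specific>::value,
              "Generic must not convert to Specific implicitly");
```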
So the first step is to realize you have a dynamic state problem. The nature of the state you store changes based off dynamic information.
struct GenericState { virtual ~GenericState() {} }; // data in here
struct Generic;
template<class D>
struct GenericBase {
D& self() { return static_cast<D&>(*this); }
D const& self() const { return static_cast<D const&>(*this); }
// code to interact with GenericState here via self().pImpl
// if you have `virtual` behavior, have a non-virtual method forward to
// a `virtual` method in GenericState.
};
struct Generic:GenericBase<Generic> {
// ctors go here, creates a GenericState in the pImpl below, or whatever
~Generic() {} // not virtual
private:
friend struct GenericBase<Generic>;
std::unique_ptr<GenericState> pImpl;
};
struct SpecificState : GenericState {
// specific stuff in here, including possible virtual method overrides
};
struct Specific : GenericBase<Specific> {
// different ctors, creates a SpecificState in a pImpl
// upcast operators:
operator Generic() && { /* move pImpl into return value */ }
operator Generic() const& { /* copy pImpl into return value */ }
private:
friend struct GenericBase<Specific>;
std::unique_ptr<SpecificState> pImpl;
};
If you want the ability to copy, implement a virtual GenericState* clone() const method in GenericState, and in SpecificState override it covariantly.
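A minimal sketch of such a covariant clone() is shown here. The data members x and z and the copy_state helper are assumptions added for illustration; only the clone() signatures come from the description above.

```cpp
#include <cassert>
#include <memory>

struct GenericState {
    int x = 0;  // hypothetical data member
    virtual ~GenericState() {}
    virtual GenericState* clone() const { return new GenericState(*this); }
};

struct SpecificState : GenericState {
    int z = 0;  // hypothetical data member
    // Covariant return type: overrides clone() but returns the derived pointer.
    SpecificState* clone() const override { return new SpecificState(*this); }
};

// Hypothetical helper: copies through the base interface and
// takes ownership of the result immediately.
inline std::unique_ptr<GenericState> copy_state(const GenericState& s) {
    return std::unique_ptr<GenericState>(s.clone());
}
```

Copying through a GenericState reference preserves the dynamic type, while code that already holds a SpecificState gets a SpecificState* back without casting.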
What I have done here is regularized the type (or semiregularized if we don't support move). The Specific and Generic types are unrelated, but their back end implementation details (GenericState and SpecificState) are related.
Interface duplication is avoided mostly via CRTP and GenericBase.
Downcasting now can either involve a dynamic check or not. You go through the pImpl and cast it over. If done in an rvalue context, it moves -- if in an lvalue context, it copies.
You could use shared pointers instead of unique pointers if you prefer. That would permit non-copy non-move based casting.
Ok, after some additional study, I am wondering what is wrong with doing this:
struct Generic {};
struct Specific : Generic {
Specific( const Generic &obj ) : Generic(obj) {}
};
Correct me if I am wrong, but this works using the implicit copy constructors.
Assuming that is the case, I can avoid having to write one myself, the conversion is performed automatically, and I can now write:
Specific s = GetGenericData();
Granted, for large objects, this is probably not a good idea, but for smaller ones, will this be a "correct" solution?
I hope the headline isn't too confusing. What I have is a class StorageManager containing a list of objects of classes derived from Storage. Here is an example.
struct Storage { virtual ~Storage() {} }; // abstract base; virtual destructor so deleting through Storage* is safe
class StorageManager
{
private:
map<string, unique_ptr<Storage>> List; // store all types of storage
public:
template <typename T>
void Add(string Name) // add new storage with name
{
List.insert(make_pair(Name, unique_ptr<Storage>(new T())));
}
Storage* Get(string Name) // get storage by name
{
return List[Name].get();
}
};
Say Position is a special storage type.
struct Position : public Storage
{
int X;
int Y;
};
Thanks to the great answers on my last question, the Add function already works. What I want to improve is the Get function. It reasonably returns a Storage* pointer, which I can use as follows.
int main()
{
StorageManager Manager;
Manager.Add<Position>("pos"); // add a new storage of type position
auto Strge = Manager.Get("pos"); // get pointer to base class storage
auto Pstn = (Position*)Strge; // convert pointer to derived class position
Pstn->X = 5;
Pstn->Y = 42;
}
Is there a way to get rid of this pointer casting by automatically returning a pointer to the derived class? Maybe using templates?
Use:
template< class T >
T* Get(std::string const& name)
{
auto i = List.find(name);
return i == List.end() ? nullptr : static_cast<T*>(i->second.get());
}
And then in your code:
Position* p = Manager.Get<Position>("pos");
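For completeness, here is how that Get<T> might sit inside the class from the question. This is a sketch under two assumptions that are not in the original code: Storage gets a virtual destructor (so that unique_ptr<Storage> destroys derived objects correctly), and the members of Position get default initializers.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Assumption: virtual destructor added to the question's Storage.
struct Storage {
    virtual ~Storage() {}
};

struct Position : Storage {
    int X = 0;
    int Y = 0;
};

class StorageManager {
    std::map<std::string, std::unique_ptr<Storage>> List;

public:
    template <typename T>
    void Add(const std::string& Name) {
        List.insert(std::make_pair(Name, std::unique_ptr<Storage>(new T())));
    }

    // The caller names the expected type; static_cast does no runtime check,
    // so asking for the wrong T is undefined behavior.
    template <typename T>
    T* Get(const std::string& name) {
        auto i = List.find(name);
        return i == List.end() ? nullptr : static_cast<T*>(i->second.get());
    }
};
```

Unlike the original Get, which used List[Name], the find-based version does not insert an empty entry when the name is unknown.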
I don't see what you can do for your Get member function besides what @BigBoss already pointed out, but you can improve your Add member to return the used storage.
template <typename T>
T* Add(string Name) // add new storage with name
{
T* t = new T();
List.insert(make_pair(Name, unique_ptr<Storage>(t)));
return t;
}
// create the pointer directly in a unique_ptr
template <typename T>
T* Add(string Name) // add new storage with name
{
std::unique_ptr<T> x{new T{}};
T* t = x.get();
List.insert(make_pair(Name, std::move(x)));
return t;
}
EDIT The temporary prevents us from having to dynamic_cast.
EDIT2 Implement MatthieuM's suggestion.
You can also further improve the function by accepting a value of the
type to be inserted, with a default argument, but that might incur an
additional copy.
When you have a pointer or reference to an object of some class, all you know is that the actual runtime object it references is either of that class or of some derived class. auto cannot know the runtime type of an object at compile time, because the piece of code containing the auto variable could be in a function that is run twice -- once handling an object of one runtime type, another handling an object with a different runtime type! The type system can't tell you what exact types are in play in a language with polymorphism -- it can only provide some constraints.
If you know that the runtime type of an object is some particular derived class (as in your example), you can (and must) use a cast. (It's considered preferable to use a cast of the form static_cast<Position*>, since casts are dangerous, and this makes it easier to search for casts in your code.)
But generally speaking, doing this a lot is a sign of poor design. The purpose of declaring a base class and deriving other class types from it is to enable objects of all of those types to be treated the same way, without casting to a particular type.
If you want to always have the correct derived type at compile time without ever using casts, you have no choice but to use a separate collection of that type. In this case, there is probably no point deriving Position from Storage.
If you can rearrange things so that everything that a caller of StorageManager::Get() needs to do with a Position can be done by calling functions that don't specify Position-specific information (such as co-ordinates), you can make these functions into virtual functions in Storage, and implement Position-specific versions of them in Position. For example, you could make a function Storage::Dump() which writes its object to stdout. Position::Dump() would output X and Y, while the implementations of Dump() for other conceivable derived classes would output different information.
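A sketch of that virtual Dump() idea follows. Two things are assumptions made for the example: Dump takes a std::ostream& instead of writing straight to stdout (passing std::cout gives the behavior described), and Label is a hypothetical second storage type.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

struct Storage {
    virtual ~Storage() {}
    // Each derived type decides what "dumping itself" means.
    virtual void Dump(std::ostream& out) const = 0;
};

struct Position : Storage {
    int X = 0;
    int Y = 0;
    void Dump(std::ostream& out) const override {
        out << "Position(" << X << ", " << Y << ")";
    }
};

// Hypothetical second storage type with different contents.
struct Label : Storage {
    std::string Text;
    void Dump(std::ostream& out) const override {
        out << "Label(" << Text << ")";
    }
};

// The caller only needs the base interface; no cast is involved.
inline std::string DumpToString(const Storage& s) {
    std::ostringstream out;
    s.Dump(out);
    return out.str();
}
```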
Sometimes you need to be able to work with an object that could be one of several essentially unrelated types. I suspect that may be the case here. In that case, boost::variant<> is a good way to go. This library provides a powerful mechanism called the Visitor pattern, which allows you to specify what action should be taken for each of the types that a variant object could possibly be.
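To give a feel for the visitor mechanism, here is a small sketch. Note the swap: it uses std::variant and std::visit from C++17 rather than boost::variant, so that it needs only the standard library; the idea is the same, but the names and details differ from Boost's API. Velocity, Describe and describe are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Unrelated types: no common base class is required.
struct Position {
    int X;
    int Y;
};

struct Velocity {  // hypothetical second alternative
    double Speed;
};

using AnyStorage = std::variant<Position, Velocity>;

// The visitor provides one overload per alternative; leaving one out
// is a compile-time error rather than a runtime surprise.
struct Describe {
    std::string operator()(const Position& p) const {
        return "position " + std::to_string(p.X) + "," + std::to_string(p.Y);
    }
    std::string operator()(const Velocity&) const {
        return "velocity";
    }
};

inline std::string describe(const AnyStorage& s) {
    return std::visit(Describe{}, s);
}
```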
Apart from the fact that this looks like a terrible idea... let's see what we can do to improve the situation.
=> It's a bad idea to require default construction
template <typename T>
T& add(std::string const& name, std::unique_ptr<T> element) {
T& t = *element;
auto result = map.insert(std::make_pair(name, std::move(element)));
if (result.second == false) {
// FIXME: somehow add the name here, for easier diagnosis
throw std::runtime_error("Duplicate element");
}
return t;
}
=> It's a bad idea to downcast blindly
template <typename T>
T* get(std::string const& name) const {
auto it = map.find(name);
return it != map.end() ? dynamic_cast<T*>(it->second.get()) : nullptr;
}
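Put together, the two functions above might look like this inside a class. This is a sketch under assumptions: the member is renamed map_ to avoid clashing with std::map, Storage is given a virtual destructor (dynamic_cast requires a polymorphic base), Other is a hypothetical second type, and the duplicate name is appended to the error message as one possible answer to the FIXME.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

struct Storage {
    virtual ~Storage() {}
};

struct Position : Storage {
    int X = 0;
    int Y = 0;
};

struct Other : Storage {};  // hypothetical, used to show a failed lookup

class StorageManager {
    std::map<std::string, std::unique_ptr<Storage>> map_;

public:
    // The caller constructs the element, so T need not be default-constructible.
    template <typename T>
    T& add(std::string const& name, std::unique_ptr<T> element) {
        T& t = *element;
        auto result = map_.insert(std::make_pair(name, std::move(element)));
        if (!result.second) {
            throw std::runtime_error("Duplicate element: " + name);
        }
        return t;
    }

    // Checked downcast: nullptr for an unknown name or a type mismatch.
    template <typename T>
    T* get(std::string const& name) const {
        auto it = map_.find(name);
        return it != map_.end() ? dynamic_cast<T*>(it->second.get()) : nullptr;
    }
};
```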
But frankly, this system is quite full of holes. And probably unnecessary in the first place. I encourage you to review the general problem and come up with a much better design.
Are there any established patterns for checking class invariants in C++?
Ideally, the invariants would be automatically checked at the beginning and at the end of each public member function. As far as I know, C with classes provided special before and after member functions, but unfortunately, design by contract wasn't quite popular at the time and nobody except Bjarne used that feature, so he removed it.
Of course, manually inserting check_invariants() calls at the beginning and at the end of each public member function is tedious and error-prone. Since RAII is the weapon of choice to deal with exceptions, I came up with the following scheme of defining an invariance checker as the first local variable, and that invariance checker checks the invariants both at construction and destruction time:
template <typename T>
class invariants_checker
{
const T* p;
public:
invariants_checker(const T* p) : p(p)
{
p->check_invariants();
}
~invariants_checker()
{
p->check_invariants();
}
};
void Foo::bar()
{
// class invariants checked by construction of _
invariants_checker<Foo> _(this);
// ... mutate the object
// class invariants checked by destruction of _
}
Question #0: I suppose there is no way to declare an unnamed local variable? :)
We would still have to call check_invariants() manually at the end of the Foo constructor and at the beginning of the Foo destructor. However, many constructor bodies and destructor bodies are empty. In that case, could we use an invariants_checker as the last member?
#include <string>
#include <stdexcept>
class Foo
{
std::string str;
std::string::size_type cached_length;
invariants_checker<Foo> _;
public:
Foo(const std::string& str)
: str(str), cached_length(str.length()), _(this) {}
void check_invariants() const
{
if (str.length() != cached_length)
throw std::logic_error("wrong cached length");
}
// ...
};
Question #1: Is it valid to pass this to the invariants_checker constructor which immediately calls check_invariants via that pointer, even though the Foo object is still under construction?
Question #2: Do you see any other problems with this approach? Can you improve it?
Question #3: Is this approach new or well-known? Are there better solutions available?
Answer #0: You can have unnamed local variables, but you give up control over the lifetime of the object, and the whole point of the object is that you know exactly when it goes out of scope. You can use
void Foo::bar()
{
invariants_checker<Foo>(this); // goes out of scope at the semicolon
new invariants_checker<Foo>(this); // the constructed object is never destructed
// ...
}
but neither is what you want.
Answer #1: No, I believe it's not valid. The object referenced by this is only fully constructed (and thus starts to exist) when the constructor has finished. You're playing a dangerous game here.
Answer #2 & #3: This approach is not new, a simple google query for e.g. "check invariants C++ template" will yield a lot of hits on this topic. In particular, this solution can be improved further if you don't mind overloading the -> operator, like this:
template <typename T>
class invariants_checker {
public:
class ProxyObject {
public:
ProxyObject(T* x) : m(x) { m->check_invariants(); }
~ProxyObject() { m->check_invariants(); }
T* operator->() { return m; }
const T* operator->() const { return m; }
private:
T* m;
};
invariants_checker(T* x) : m(x) { }
ProxyObject operator->() { return m; }
const ProxyObject operator->() const { return m; }
private:
T* m;
};
The idea is that for the duration of a member function call, you create an anonymous proxy object which performs the check in its constructor and destructor. You can use the above template like this:
void f() {
Foo f;
invariants_checker<Foo> g( &f );
g->bar(); // this constructs and destructs the ProxyObject, which does the checking
}
Ideally, the invariants would be automatically checked at the beginning and at the end of each public member function
I think this is overkill; I instead check invariants judiciously. The data members of your class are private (right?), so only its member functions can change the data members and therefore invalidate invariants. So you can get away with checking an invariant just after a change to a data member that participates in that invariant.
Question #0: I suppose there is no way to declare an unnamed local variable? :)
You can usually whip up something using macros and __LINE__, but if you just pick a strange enough name, it should already do, since you shouldn't have more than one (directly) in the same scope. This
class invariants_checker {};
template<class T>
class invariants_checker_impl : public invariants_checker {
public:
invariants_checker_impl(T* that) : that_(that) {that_->check_invariants();}
~invariants_checker_impl() {that_->check_invariants();}
private:
T* that_;
};
template<class T>
inline invariants_checker_impl<T> get_invariant_checker(T* that)
{return invariants_checker_impl<T>(that);}
#define CHECK_INVARIANTS const invariants_checker& \
my_fancy_invariants_checker_object_ = get_invariant_checker(this)
works for me.
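The __LINE__ variant mentioned at the start of this answer could look roughly like the following. All names here (scoped_invariants_check, INVARIANTS_CONCAT, the guard prefix, the example class Foo) are hypothetical, and the sketch assumes C++14 for std::remove_pointer_t. The example check uses assert instead of throwing, so the guard's destructor never throws.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical guard: checks on construction and again on destruction.
template <typename T>
class scoped_invariants_check {
public:
    explicit scoped_invariants_check(T* that) : that_(that) {
        that_->check_invariants();
    }
    ~scoped_invariants_check() { that_->check_invariants(); }

    scoped_invariants_check(const scoped_invariants_check&) = delete;
    scoped_invariants_check& operator=(const scoped_invariants_check&) = delete;

private:
    T* that_;
};

// Two-step concatenation so that __LINE__ is expanded before pasting.
#define INVARIANTS_CONCAT_IMPL(a, b) a##b
#define INVARIANTS_CONCAT(a, b) INVARIANTS_CONCAT_IMPL(a, b)

// Declares a guard whose name contains the current line number.
#define CHECK_INVARIANTS()                                            \
    scoped_invariants_check<std::remove_pointer_t<decltype(this)>>    \
        INVARIANTS_CONCAT(invariants_guard_, __LINE__)(this)

// Hypothetical class that counts how often its invariants were checked.
class Foo {
public:
    void bar() {
        CHECK_INVARIANTS();  // checked here and again when bar() returns
        ++value_;
    }

    void check_invariants() const {
        ++checks_;
        assert(value_ >= 0);
    }

    int checks() const { return checks_; }

private:
    int value_ = 0;
    mutable int checks_ = 0;
};
```

The line number only guards against name clashes; the guard still lives until the end of the enclosing scope, like any other local variable.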
Question #1: Is it valid to pass this to the invariants_checker constructor which immediately calls check_invariants via that pointer, even though the Foo object is still under construction?
I'm not sure whether it technically invokes UB. In practice it would certainly be safe to do so, were it not for the fact that, in practice, a class member that has to be declared at a specific position in relation to other class members is going to be a problem sooner or later.
Question #2: Do you see any other problems with this approach? Can you improve it?
See #1. Take a moderately sized class, add half a decade of extending and bug-fixing by two dozen developers, and I consider the chances of messing this up at least once to be about 98%.
You can somewhat mitigate this by adding a shouting comment to the data member. Still.
Question #3: Is this approach new or well-known? Are there better solutions available?
I hadn't seen this approach, but given your description of before() and after() I immediately thought of the same solution.
I think Stroustrup had an article many (~15?) years ago, where he described a handle class overloading operator->() to return a proxy. This could then, in its ctor and dtor, perform before- and after-actions while being oblivious to the methods being invoked through it.
Edit: I see that Frerich has added an answer fleshing this out. Of course, unless your class already needs to be used through such a handle, this is a burden onto your class' users. (IOW: It won't work.)
#0: No, but things could be slightly better with a macro (if you're ok with that)
#1: No, but it depends. You cannot do anything that would cause this to be dereferenced before the constructor body runs (which yours would, but only just before, so it could work). This means that you can store this, but not access fields or virtual functions. Calling check_invariants() is not OK if it's virtual. I think it would work on most implementations, but it is not guaranteed to work.
#2: I think it will be tedious, and not worth it. This has been my experience with invariant checking. I prefer unit tests.
#3: I've seen it. It seems like the right way to me if you're going to do it.
Unit testing is a better alternative that leads to smaller code with better performance.
I clearly see the issue that your destructor is calling a function that will often throw, that's a no-no in C++ isn't it?
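It is a real concern: since C++11 destructors are noexcept by default, so a check_invariants() that throws from the checker's destructor would call std::terminate. One possible way around it, sketched below under the assumption that C++17 is available, is to mark the destructor noexcept(false) and to skip the exit check while another exception is already propagating. InvariantsChecker and Counter are hypothetical names; this is not the code from the question.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>

// Hypothetical variant of the checker whose destructor is allowed to throw.
template <typename T>
class InvariantsChecker {
    const T* p_;
    int uncaught_on_entry_;

public:
    explicit InvariantsChecker(const T* p)
        : p_(p), uncaught_on_entry_(std::uncaught_exceptions()) {
        p_->check_invariants();
    }

    // Destructors are implicitly noexcept; opt out explicitly.
    ~InvariantsChecker() noexcept(false) {
        // Skip the check if the scope is being left because of an exception:
        // throwing a second one during unwinding would terminate the program.
        if (std::uncaught_exceptions() == uncaught_on_entry_) {
            p_->check_invariants();
        }
    }
};

// Hypothetical class: shadow_ must always equal value_.
class Counter {
    int value_ = 0;
    int shadow_ = 0;

public:
    void check_invariants() const {
        if (value_ != shadow_) throw std::logic_error("shadow out of sync");
    }

    void increment() {
        InvariantsChecker<Counter> guard(this);
        ++value_;
        ++shadow_;
    }

    void broken_increment() {  // forgets to update shadow_
        InvariantsChecker<Counter> guard(this);
        ++value_;
    }

    void broken_and_throwing() {  // breaks the invariant, then fails
        InvariantsChecker<Counter> guard(this);
        ++value_;
        throw std::runtime_error("boom");
    }

    int value() const { return value_; }
};
```

Whether a throwing destructor is acceptable at all is a design decision; reporting the violation with assert or a log call avoids the question entirely.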