Why is LLVM's Optional<T> implemented this way? - c++

I stumbled upon an implementation of Optional<T> which is based on LLVM's Optional.h class and couldn't quite figure out why it is implemented the way it is.
To keep it short, I'm only pasting the parts I don't understand:
template <typename T>
class Optional
{
private:
inline void* getstg() const { return const_cast<void*>(reinterpret_cast<const void*>(&_stg)); }
typedef typename std::aligned_storage<sizeof(T), std::alignment_of<T>::value>::type storage_type;
storage_type _stg;
bool _hasValue;
public:
Optional(const T &y) : _hasValue(true)
{
new (getstg()) T(y);
}
T* Get() { return reinterpret_cast<T*>(getstg()); }
};
And the most naive implementation I could think of:
template <typename T>
class NaiveOptional
{
private:
T* _value;
bool _hasValue;
public:
NaiveOptional(const T &y) : _hasValue(true), _value(new T(y))
{
}
T* Get() { return _value; }
};
Questions:
How do I interpret the storage_type? What was the author's intention?
What are the semantics of this line: new (getstg()) T(y);?
Why doesn't the naive implementation work (or, what advantages does the Optional<T> class have over NaiveOptional<T>)?

The short answer is "performance".
Longer answer:
storage_type provides a region of memory inside the object that is (a) big enough to hold a T and (b) properly aligned for T. Alignment matters: constructing or accessing a T at a misaligned address is undefined behavior in C++, and even on hardware that tolerates it the access can be slower. See also the documentation for std::aligned_storage.
new (getstg()) T(y) is a placement new. It does not allocate memory; instead it constructs an object in the memory region passed to it. See the documentation on the forms of new and search for "placement new".
The naive implementation does work, but it has worse performance. It uses dynamic memory allocation, which often can be a bottleneck. The Optional<T> implementation does not use dynamic memory allocation (see the point above).
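To make the three points above concrete, here is a minimal sketch of the same idea. It is not LLVM's code: the name InPlaceOptional and the demo_length helper are made up for illustration, and it uses an alignas-qualified byte array where the quoted code uses std::aligned_storage. It also adds the explicit destructor call that the excerpt in the question leaves out.

```cpp
#include <cstddef>
#include <new>
#include <string>

// Hypothetical name; a sketch of the technique, not LLVM's Optional.
template <typename T>
class InPlaceOptional {
    // Raw bytes, big enough and suitably aligned for one T. No T exists yet.
    alignas(T) unsigned char storage_[sizeof(T)];
    bool has_value_;

public:
    InPlaceOptional() : has_value_(false) {}

    // Placement new: constructs a T inside storage_ without allocating.
    InPlaceOptional(const T& value) : has_value_(true) {
        ::new (static_cast<void*>(storage_)) T(value);
    }

    // Copying is left out to keep the sketch short.
    InPlaceOptional(const InPlaceOptional&) = delete;
    InPlaceOptional& operator=(const InPlaceOptional&) = delete;

    ~InPlaceOptional() { reset(); }

    // Objects created with placement new must be destroyed by hand.
    void reset() {
        if (has_value_) {
            get()->~T();
            has_value_ = false;
        }
    }

    bool has_value() const { return has_value_; }

    // Only meaningful while has_value() is true. Stricter code (C++17 and
    // later) would typically go through std::launder here.
    T* get() { return reinterpret_cast<T*>(storage_); }
};

// Usage: the std::string object itself lives inside the optional.
inline std::size_t demo_length() {
    InPlaceOptional<std::string> name(std::string("llvm"));
    return name.has_value() ? name.get()->size() : 0;
}
```

The empty state costs nothing beyond the bool: no T is constructed and nothing is allocated until a value is stored.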

std::optional is meant to be returned from functions by value. With the pointer-based version, the object the pointer refers to would have to live somewhere else, for example on the heap, which defeats the purpose of the class.
Further, you can't just use a plain T member, because then a T would have to be constructed even when the optional is empty. Optional allows the content to stay uninitialized, and some types can't be default constructed.
To make the class more flexible in terms of supported types, suitably sized and correctly aligned storage is used, and the real object is constructed in it only when the optional holds a value.
What you had in mind is probably something like std::variant?
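As an illustration of the "returned from functions" and "can't be default constructed" points, here is a small sketch using std::optional from C++17. The Port type and the parse_port function are hypothetical, invented only for this example.

```cpp
#include <optional>
#include <string>

// Hypothetical type with no default constructor.
struct Port {
    explicit Port(int n) : number(n) {}
    int number;
};

// Returns an empty optional on bad input. No Port is ever constructed in
// that case, and nothing is allocated on the heap in either case.
std::optional<Port> parse_port(const std::string& text) {
    if (text.empty() || text.size() > 5) {
        return std::nullopt;
    }
    int value = 0;
    for (char c : text) {
        if (c < '0' || c > '9') {
            return std::nullopt;
        }
        value = value * 10 + (c - '0');
    }
    if (value > 65535) {
        return std::nullopt;
    }
    return Port(value);
}
```

A plain Port member could not represent the "no value" case here, because there is no way to construct a Port without a number.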

Related

C++11 make_shared instancing

Apologies for the long question, but some context is necessary. I have a bit of code that seems to be a useful pattern for the project I'm working on:
class Foo
{
public:
Foo( int bar = 1 );
~Foo();
typedef std::shared_ptr< Foo > pointer_type;
static pointer_type make( int bar = 1 )
{
return std::make_shared< Foo >( bar );
}
...
};
As you can see, it provides a straightforward way of constructing any class as a PointerType which encapsulates a shared_ptr to that type:
auto oneFoo = Foo::make( 2 );
And therefore you get the advantages of shared_ptr without putting references to make_shared and shared_ptr all over the code base.
Encapsulating the smart pointer type within the class provides several advantages:
It lets you control the copyability and moveability of the pointer types.
It hides the shared_ptr details from callers, so that non-trivial object constructions, such as those that throw exceptions, can be placed within the Instance() call.
You can change the underlying smart pointer type when you're working with projects that use multiple smart pointer implementations. You could switch to a unique_ptr or even to raw pointers for a particular class, and calling code would remain the same.
It concentrates the details about (smart) pointer construction and aliasing within the class that knows most about how to do it.
It lets you decide which classes can use smart pointers and which classes must be constructed on the stack. The existence of the PointerType field provides a hint to callers about what types of pointers can be created that correspond for the class. If there is no PointerType defined for a class, this would indicate that no pointers to that class may be created; therefore that particular class must be created on the stack, RAII style.
However, I see no obvious way of applying this bit of code to all the classes in my project without typing the requisite typedef and static PointerType Instance() functions directly. I suspect there should be some consistent, C++11 standard, cross-platform way of doing this with policy-based templates, but a bit of experimentation has not turned up an obvious way of applying this trivially to a bunch of classes in a way that compiles cleanly on all modern C++ compilers.
Can you think of an elegant way to add these concepts to a bunch of classes, without a great deal of cutting and pasting? An ideal solution would conceptually limit what types of pointers can be created for which types of classes (one class uses shared_ptr and another uses raw pointers), and it would also handle instancing of any supported type by its own preferred method. Such a solution might even handle and/or limit coercion, by failing appropriately at compile time, between non-standard and standard smart and dumb pointer types.
One way is to use the curiously recurring template pattern.
template<typename T>
struct shared_factory
{
using pointer_type = std::shared_ptr<T>;
template<typename... Args>
static pointer_type make(Args&&... args)
{
return std::make_shared<T>(std::forward<Args>(args)...);
}
};
struct foo : public shared_factory<foo>
{
foo(char const*, int) {}
};
I believe this gives you what you want.
foo::pointer_type f = foo::make("hello, world", 42);
However...
I wouldn't recommend using this approach. Attempting to dictate how users of a type instantiate the type is unnecessarily restrictive. If they need a std::shared_ptr, they can create one. If they need a std::unique_ptr, they can create one. If they want to create an object on the stack, they can. I see nothing to be gained by mandating how your users' objects are created and managed.
To address your points:
It lets you control the copyability and moveability of the pointer types.
Of what benefit is this?
It hides the shared_ptr details from callers, so that non-trivial object constructions, such as those that throw exceptions, can be placed within the Instance() call.
I'm not sure what you mean here. Hopefully not that you can catch the exception and return a nullptr. That would be Java-grade bad.
You can change the underlying smart pointer type when you're working with projects that use multiple smart pointer implementations. You could switch to a unique_ptr or even to raw pointers for a particular class, and calling code would remain the same.
If you are working with multiple kinds of smart pointer, perhaps it would be better to let the user choose the appropriate kind for a given situation. Besides, I'd argue that having the same calling code but returning different kinds of handle is potentially confusing.
It concentrates the details about (smart) pointer construction and aliasing within the class that knows most about how to do it.
In what sense does a class know "most" about how to do pointer construction and aliasing?
It lets you decide which classes can use smart pointers and which classes must be constructed on the stack. The existence of the PointerType field provides a hint to callers about what types of pointers can be created that correspond for the class. If there is no PointerType defined for a class, this would indicate that no pointers to that class may be created; therefore that particular class must be created on the stack, RAII style.
Again, I disagree fundamentally with the idea that objects of a certain type must be created and managed in a certain way. This is one of the reasons why the singleton pattern is so insidious.
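To illustrate the point about leaving the choice to the caller, here is a small sketch (C++14 for std::make_unique). Widget and sum_three_ways are hypothetical names; the class itself says nothing about how it is owned.

```cpp
#include <memory>
#include <utility>

// Hypothetical class with an ordinary public constructor.
struct Widget {
    explicit Widget(int v) : value(v) {}
    int value;
};

int sum_three_ways() {
    Widget on_stack(1);                         // automatic storage
    auto unique = std::make_unique<Widget>(2);  // sole ownership
    auto shared = std::make_shared<Widget>(3);  // shared ownership

    // A unique_ptr can later be handed over to a shared_ptr if needed.
    std::shared_ptr<Widget> promoted = std::move(unique);

    return on_stack.value + promoted->value + shared->value;
}
```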
I wouldn't advise adding those static functions. Among other drawbacks, they really get pretty burdensome to create and maintain when there are multiple constructors. This is a case where auto can help as well as a typedef outside the class. Plus, you can use the std namespace (but please not in the header):
class Foo
{
public:
Foo( int bar = 1 ); // also serves as the default constructor; a separate Foo() would make the call Foo() ambiguous
~Foo();
...
};
typedef std::shared_ptr<Foo> FooPtr;
In the C++ file:
using namespace std;
auto oneFoo = make_shared<Foo>( 2 );
FooPtr anotherFoo = make_shared<Foo>( 2 );
I think you'll find this to not be too burdensome on typing. Of course, this is all a matter of style.
This is a refinement of Joseph's answer for the sake of making the kind of pointer more configurable:
#include <memory>
#include <utility> // std::forward
template <typename T, template <typename...> class PtrT = std::shared_ptr>
struct ptr_factory {
using pointer_type = PtrT<T>;
template <typename... Args>
static pointer_type make(Args&&... args) {
return pointer_type{new T{std::forward<Args>(args)...}};
}
};
template <typename T>
struct ptr_factory<T, std::shared_ptr> {
using pointer_type = std::shared_ptr<T>;
template <typename... Args>
static pointer_type make(Args&&... args) {
return std::make_shared<T>(std::forward<Args>(args)...);
}
};
struct foo : public ptr_factory<foo> {
foo(char const*, int) {}
};
struct bar : public ptr_factory<bar, std::unique_ptr> {
bar(char const*, int) {}
};
ptr_factory defaults to using std::shared_ptr, but can be configured to use different smart pointer templates, thanks to template template parameters, as illustrated by struct bar.

Automatic Return Type for Pointers in C++

I hope the headline isn't too confusing. What I have is a class StorageManager containing a list of objects of classes derived from Storage. Here is an example.
struct Storage {}; // abstract
class StorageManager
{
private:
map<string, unique_ptr<Storage>> List; // store all types of storage
public:
template <typename T>
void Add(string Name) // add new storage with name
{
List.insert(make_pair(Name, unique_ptr<Storage>(new T())));
}
Storage* Get(string Name) // get storage by name
{
return List[Name].get();
}
};
Say Position is a special storage type.
struct Position : public Storage
{
int X;
int Y;
};
Thanks to the great answers on my last question, the Add function already works. What I want to improve is the Get function. It reasonably returns a Storage* pointer, which I can use as follows.
int main()
{
StorageManager Manager;
Manager.Add<Position>("pos"); // add a new storage of type position
auto Strge = Manager.Get("pos"); // get pointer to base class storage
auto Pstn = (Position*)Strge; // convert pointer to derived class position
Pstn->X = 5;
Pstn->Y = 42;
}
Is there a way to get rid of this pointer casting by automatically returning a pointer to the derived class? Maybe using templates?
Use:
template< class T >
T* Get(std::string const& name)
{
auto i = List.find(name);
return i == List.end() ? nullptr : static_cast<T*>(i->second.get());
}
And then in your code:
Position* p = Manager.Get<Position>("pos");
I don't see what you can do for your Get member function besides what @BigBoss already pointed out, but you can improve your Add member to return the used storage.
template <typename T>
T* Add(string Name) // add new storage with name
{
T* t = new T();
List.insert(make_pair(Name, unique_ptr<Storage>(t)));
return t;
}
// create the pointer directly in a unique_ptr
template <typename T>
T* Add(string Name) // add new storage with name
{
std::unique_ptr<T> x{new T{}};
T* t = x.get();
List.insert(make_pair(Name, std::move(x)));
return t;
}
EDIT The temporary prevents us from having to dynamic_cast.
EDIT2 Implement MatthieuM's suggestion.
You can also further improve the function by accepting a value of the
type to be inserted, with a default argument, but that might incur an
additional copy.
When you have a pointer or reference to an object of some class, all you know is that the actual runtime object it references is either of that class or of some derived class. auto cannot know the runtime type of an object at compile time, because the piece of code containing the auto variable could be in a function that is run twice -- once handling an object of one runtime type, another handling an object with a different runtime type! The type system can't tell you what exact types are in play in a language with polymorphism -- it can only provide some constraints.
If you know that the runtime type of an object is some particular derived class (as in your example), you can (and must) use a cast. (It's considered preferable to use a cast of the form static_cast<Position*>, since casts are dangerous, and this makes it easier to search for casts in your code.)
But generally speaking, doing this a lot is a sign of poor design. The purpose of declaring a base class and deriving other class types from it is to enable objects of all of those types to be treated the same way, without casting to a particular type.
If you want to always have the correct derived type at compile time without ever using casts, you have no choice but to use a separate collection of that type. In this case, there is probably no point deriving Position from Storage.
If you can rearrange things so that everything that a caller of StorageManager::Get() needs to do with a Position can be done by calling functions that don't specify Position-specific information (such as co-ordinates), you can make these functions into virtual functions in Storage, and implement Position-specific versions of them in Position. For example, you could make a function Storage::Dump() which writes its object to stdout. Position::Dump() would output X and Y, while the implementations of Dump() for other conceivable derived classes would output different information.
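A minimal sketch of that virtual-function approach follows. Label and DumpAll are hypothetical additions, and Dump takes a std::ostream& rather than writing straight to stdout, which is an assumption made so the output can be checked. Note the virtual destructor, which is needed once derived objects are deleted through a Storage pointer.

```cpp
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

struct Storage {
    virtual ~Storage() = default;
    virtual void Dump(std::ostream& out) const = 0;
};

struct Position : Storage {
    int X = 0;
    int Y = 0;
    void Dump(std::ostream& out) const override {
        out << "Position(" << X << ", " << Y << ")";
    }
};

// Hypothetical second storage type, only here to show the dispatch.
struct Label : Storage {
    std::string Text;
    void Dump(std::ostream& out) const override {
        out << "Label(" << Text << ")";
    }
};

// The caller never needs to know the concrete type, so no cast appears.
std::string DumpAll(const std::vector<std::unique_ptr<Storage>>& items) {
    std::ostringstream out;
    for (const auto& item : items) {
        item->Dump(out);
        out << '\n';
    }
    return out.str();
}
```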
Sometimes you need to be able to work with an object that could be one of several essentially unrelated types. I suspect that may be the case here. In that case, boost::variant<> is a good way to go. This library provides a powerful mechanism called the Visitor pattern, which allows you to specify what action should be taken for each of the types that a variant object could possibly be.
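For a feel of what visiting looks like, here is a sketch that uses std::variant and std::visit from C++17 as a stand-in for boost::variant and boost::apply_visitor; the idea is the same but the spelling differs. Label, AnyStorage and Describe are hypothetical names.

```cpp
#include <string>
#include <variant>

struct Position {
    int X;
    int Y;
};

// Hypothetical second type, unrelated to Position by inheritance.
struct Label {
    std::string Text;
};

using AnyStorage = std::variant<Position, Label>;

// The visitor supplies one overload per type the variant can hold.
struct Describe {
    std::string operator()(const Position& p) const {
        return "Position(" + std::to_string(p.X) + ", " + std::to_string(p.Y) + ")";
    }
    std::string operator()(const Label& l) const {
        return "Label(" + l.Text + ")";
    }
};

std::string describe(const AnyStorage& s) {
    // Calls the overload matching whatever s currently holds.
    return std::visit(Describe{}, s);
}
```

If a type is added to the variant and the visitor has no overload for it, the code stops compiling, which is the main advantage over a cast.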
Apart from the fact that this looks like a terrible idea... let's see what we can do to improve the situation.
=> It's a bad idea to require default construction
template <typename T>
T& add(std::string const& name, std::unique_ptr<T> element) {
T& t = *element;
auto result = map.insert(std::make_pair(name, std::move(element)));
if (result.second == false) {
// FIXME: somehow add the name here, for easier diagnosis
throw std::runtime_error("Duplicate element");
}
return t;
}
=> It's a bad idea to downcast blindly
template <typename T>
T* get(std::string const& name) const {
auto it = map.find(name);
return it != map.end() ? dynamic_cast<T*>(it->second.get()) : nullptr;
}
But frankly, this system is quite full of holes, and probably unnecessary in the first place. I encourage you to review the general problem and come up with a much better design.
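For reference, the add and get members above can be assembled into a self-contained sketch like the following. The member name items_ and the Velocity type are assumptions made for the example, and Storage is given a virtual destructor, which both dynamic_cast and deletion through unique_ptr<Storage> rely on.

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

struct Storage {
    virtual ~Storage() = default; // makes Storage polymorphic
};
struct Position : Storage {
    int X = 0;
    int Y = 0;
};
struct Velocity : Storage { // hypothetical second type
    int DX = 0;
    int DY = 0;
};

class StorageManager {
    std::map<std::string, std::unique_ptr<Storage>> items_;

public:
    // The caller builds the object, so no default constructor is required.
    template <typename T>
    T& add(const std::string& name, std::unique_ptr<T> element) {
        T& ref = *element;
        auto result = items_.insert(std::make_pair(name, std::move(element)));
        if (!result.second) {
            throw std::runtime_error("Duplicate element: " + name);
        }
        return ref;
    }

    // Returns nullptr if the name is unknown or the stored type is not a T.
    template <typename T>
    T* get(const std::string& name) const {
        auto it = items_.find(name);
        return it != items_.end() ? dynamic_cast<T*>(it->second.get()) : nullptr;
    }
};
```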

Writing containers that can handle implicit sharing, but turn it off for non-copyable types (like unique_ptr)?

I dug up an old Grid class, which is just a simple 2-D container templated with a type. To make one you would do this:
Grid<SomeType> myGrid (QSize (width, height));
I tried to make it "Qt-ish"...for instance it does size operations in terms of QSize, and you index into it with myGrid[QPoint (x, y)]. It can take boolean masks and do operations on elements whose mask bit was set. There's also a specialization where if your elements are QColor it can generate a QImage for you.
But one major Qt idiom I adopted was that it did implicit sharing under the hood. This turned out to be very useful in the QColor-based grids for the Thinker-Qt-based program I had.
However :-/ I also happened to have some cases where I'd written the likes of:
Grid< auto_ptr<SomeType> > myAutoPtrGrid (QSize (width, height));
When I moved up from auto_ptr to C++11's unique_ptr, the compiler rightfully complained. Implicit sharing requires the ability to make an identical copy if needed...and auto_ptr had swept this bug under the rug by conflating copying with transfer-of-ownership. Non-copyable types and implicit sharing simply do not mix, and unique_ptr is kind enough to tell us.
(Note: It so happened that I hadn't noticed the problem in practice, because the use cases for the auto_ptr were passing grids by reference...never by value. Still, this was bad code...and the proactive nature of C++11 is pointing out the potential problem before it happens.)
Ok, so...how might I design a generic container that can flip implicit sharing on and off? I really did want many of the Grid features when I was using the auto_ptr and it's great if copying is disabled for non-copyable types...that catches errors! But having the implicit sharing work is nice as a default, when the type happens to be copyable.
Some ideas:
I could make separate types (NonCopyableGrid, CopyableGrid)...or (UniqueGrid, Grid) depending on your tastes...
I could pass a flag into the Grid constructor
I could use static factory methods (Grid::newNonCopyable, Grid::newCopyable) but which would call the relevant constructor under the hood...maybe more descriptive
If possible, I might "detect" copyability on the contained type, and then either leverage a QSharedDataPointer in the implementation or not, depending?
Any good reasons to pick one of these methods over the others, or have people adopted something altogether better for this kind of situation?
If you were going to do it in a single container, I think the easiest way would be to use std::is_copy_constructible to choose whether your data struct inherits from QSharedData, and to replace QSharedDataPointer with std::unique_ptr for non-copyable types (QScopedPointer doesn't support move semantics).
This is only a rough example of what I'm thinking as I don't have Qt and C++11 available together:
template<class T>
class Grid
{
struct EmptyStruct
{
};
typedef typename std::conditional<
std::is_copy_constructible<T>::value,
QSharedData,
EmptyStruct
>::type GridDataBase;
struct GridData : public GridDataBase
{
// data goes here
};
typedef typename std::conditional<
std::is_copy_constructible<T>::value,
QSharedDataPointer<GridData>,
std::unique_ptr<GridData>
>::type GridDataPointer;
public:
Grid() : data_(new GridData) {}
private:
GridDataPointer data_;
};
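The selection mechanism can also be tried out without Qt. In the sketch below std::shared_ptr stands in for QSharedDataPointer, purely so the example is self-contained; it shares the buffer between copies but does not detach on write the way QSharedDataPointer does. The shares_data and same_buffer_as members are hypothetical helpers added for inspection.

```cpp
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

template <class T>
class Grid {
    struct GridData {
        std::vector<T> cells;
    };

    // Copyable element type: shared buffer. Otherwise: unique ownership.
    using GridDataPointer = typename std::conditional<
        std::is_copy_constructible<T>::value,
        std::shared_ptr<GridData>,
        std::unique_ptr<GridData>>::type;

    GridDataPointer data_;

public:
    Grid() : data_(new GridData) {}

    static constexpr bool shares_data() {
        return std::is_copy_constructible<T>::value;
    }

    bool same_buffer_as(const Grid& other) const {
        return data_.get() == other.data_.get();
    }
};
```

Because the unique_ptr member has no copy constructor, Grid<std::unique_ptr<X>> automatically becomes move-only, so an accidental copy is a compile-time error rather than a silent transfer of ownership.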
Disclaimer
I don't really understand your Grid template or your use cases. However I do understand containers in general. So maybe this answer applies to your Grid<T> and maybe it doesn't.
Since you've already stated the intent that Grid< unique_ptr<T> > would indicate unique ownership and a non-copyable T, what about doing something similar with copy on write?
What about explicitly stating when you want to use copy on write with:
Grid< cow_ptr<T> >
A cow_ptr<T> would offer reference counting copies, but on a "non-const dereference" would do a copy of T if the refcount is not 1. So Grid need not worry about memory management to such an extent. It would need only to handle its data buffer, and perhaps move or copy its members around in Grid's copy and/or move members.
A cow_ptr<T> is fairly easily cobbled together by wrapping std::shared_ptr<T>. Here is a partial implementation I put together about a month ago when dealing with a similar issue:
template <class T>
class cow_ptr
{
std::shared_ptr<T> ptr_;
public:
template <class ...Args,
class = typename std::enable_if
<
std::is_constructible<std::shared_ptr<T>, Args...>::value
>::type
>
explicit cow_ptr(Args&& ...args)
: ptr_(std::forward<Args>(args)...)
{}
explicit operator bool() const noexcept {return ptr_ != nullptr;}
T const* read() const noexcept {return ptr_.get();}
T * write()
{
if (ptr_.use_count() > 1)
ptr_.reset(ptr_->clone());
return ptr_.get();
}
T const& operator*() const noexcept {return *read();}
T const* operator->() const noexcept {return read();}
void reset() {ptr_.reset();}
template <class Y>
void
reset(Y* p)
{
ptr_.reset(p);
}
};
I chose to make the "write" syntax very explicit, since COW tends to be more effective when there are very few writes, but many reads/copies. To gain const access, you use it just like any other pointer:
p->inspect(); // compile time error if inspect() isn't const
But to do some modifying operation you have to call it out with the write member function:
p.write()->modify();
shared_ptr has a bunch of really handy constructors and I didn't want to have to replicate all of them in cow_ptr. So the one cow_ptr constructor you see is a poor man's implementation of inheriting constructors that also works for data members.
You may need to fill this out with other smart pointer functionality such as relational operators. You may also want to change how cow_ptr copies a T. I'm currently assuming a virtual clone() function but you could easily substitute into write the use of T's copy constructor instead.
If an explicit Grid< cow_ptr<T> > doesn't really fit your needs, that's all good. I figured I'd share just in case it did.
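To show the copy-on-write behaviour in isolation, here is a reduced sketch that uses T's copy constructor instead of clone(), as suggested above. The name cow_value and the shares_with helper are hypothetical, and the use_count() check is only dependable when a single thread touches the copies.

```cpp
#include <memory>
#include <string>
#include <utility>

template <class T>
class cow_value {
    std::shared_ptr<T> ptr_;

public:
    explicit cow_value(T v) : ptr_(std::make_shared<T>(std::move(v))) {}

    // Reads never copy.
    const T& read() const { return *ptr_; }

    // Writes detach first if anyone else still shares the object.
    T& write() {
        if (ptr_.use_count() > 1) {
            ptr_ = std::make_shared<T>(*ptr_);
        }
        return *ptr_;
    }

    bool shares_with(const cow_value& other) const {
        return ptr_ == other.ptr_;
    }
};
```

Copying a cow_value only bumps the reference count; the T itself is copied lazily, on the first write() through a shared handle.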

WrapperPointer class and deallocation of stack-allocated objects in C++

I am designing a wrapper class (a bit similar to std::auto_ptr, but with a different purpose) for scalar values:
template <typename T>
class ScalarPtr
{
private:
T* m_data;
...
public:
ScalarPtr(T *data): m_data(data)
{ ... }
T& operator* ();
T* operator -> ();
~ScalarPtr()
{
if(m_data)
delete m_data; ...
}
};
Now the problem is that when I also want to use this class for stack-allocated memory objects like this:
float temp=...
ScalarPtr<float> fltPtr(&temp);
The naive way is to pass boolean in constructor to specify whether to deallocate or not but is there any better way?
I am not sure there is a better approach than the boolean flag.
As you are aware (and hence ask the question), this makes the interface rather non-intuitive for the end user.
The purpose of a wrapper/resource-managing class is to implement RAII, where the wrapper itself takes care of releasing its resource (in this case dynamic memory) implicitly. Given that stack variables are automatically destroyed when they go out of scope, it seems rather odd to use a resource-managing wrapper for them. I would prefer not to do so.
But given that you want to maintain uniform access to your objects through this wrapper class, the simplest, if not so elegant, way seems to be the boolean flag.
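For completeness, the flag-based version could look roughly like this. It is only a sketch: the m_owns member, the Counted helper and demo_borrowed are made up for the example, and copying is disabled so that two wrappers never delete the same object.

```cpp
// Hypothetical helper type, used only to observe whether delete ran.
struct Counted {
    static int alive;
    Counted() { ++alive; }
    ~Counted() { --alive; }
};
int Counted::alive = 0;

template <typename T>
class ScalarPtr {
    T* m_data;
    bool m_owns; // true: delete in the destructor; false: leave it alone

public:
    ScalarPtr(T* data, bool owns) : m_data(data), m_owns(owns) {}

    ScalarPtr(const ScalarPtr&) = delete;
    ScalarPtr& operator=(const ScalarPtr&) = delete;

    ~ScalarPtr() {
        if (m_owns) {
            delete m_data;
        }
    }

    T& operator*() { return *m_data; }
    T* operator->() { return m_data; }
};

// Usage with a stack variable: the wrapper must not delete it.
inline float demo_borrowed() {
    float temp = 1.5f;
    {
        ScalarPtr<float> fltPtr(&temp, false);
        *fltPtr = 2.5f;
    }
    return temp;
}
```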

How do boost::variant and boost::any work?

How do variant and any from the boost library work internally? In a project I am working on, I currently use a tagged union. I want to use something else, because unions in C++ don't let you use objects with constructors, destructors or overloaded assignment operators.
I queried the size of any and variant, and did some experiments with them. On my platform, variant takes the size of its largest possible type plus 8 bytes: I think it may just be 8 bytes of type information, with the rest being the stored value. On the other hand, any takes just 8 bytes. Since I'm on a 64-bit platform, I guess any just holds a pointer.
How does Any know what type it holds? How does Variant achieve what it does through templates? I would like to know more about these classes before using them.
If you read the boost::any documentation they provide the source for the idea: http://www.two-sdg.demon.co.uk/curbralan/papers/ValuedConversions.pdf
It's basic information hiding, an essential C++ skill to have. Learn it!
Since the highest voted answer here is totally incorrect, and I have my doubts that people will actually go look at the source to verify that fact, here's a basic implementation of an any like interface that will wrap any type with an f() function and allow it to be called:
struct f_any
{
f_any() : ptr() {}
~f_any() { delete ptr; }
bool valid() const { return ptr != 0; }
void f() { assert(ptr); ptr->f(); }
struct placeholder
{
virtual ~placeholder() {}
virtual void f() const = 0;
};
template < typename T >
struct impl : placeholder
{
impl(T const& t) : val(t) {}
void f() const { val.f(); }
T val;
};
// ptr can now point to the entire family of
// struct types generated from impl<T>
placeholder * ptr;
template < typename T >
f_any(T const& t) : ptr(new impl<T>(t)) {}
// assignment, etc...
};
boost::any does the same basic thing, except that its virtual function returns a std::type_info const& instead of forwarding a call to f(), and it exposes additional access to the stored value so that the any_cast function can work.
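A stripped-down sketch of that variation follows. It is not boost::any's source: tiny_any, its cast member and the decision to forbid copying are assumptions made to keep the example short.

```cpp
#include <string>
#include <typeinfo>

class tiny_any {
    // Non-templated base: the only thing tiny_any itself knows about.
    struct placeholder {
        virtual ~placeholder() {}
        virtual const std::type_info& type() const = 0;
    };

    // One derived class per stored type, generated by the constructor.
    template <typename T>
    struct holder : placeholder {
        explicit holder(const T& v) : value(v) {}
        const std::type_info& type() const override { return typeid(T); }
        T value;
    };

    placeholder* content_;

public:
    tiny_any() : content_(nullptr) {}

    template <typename T>
    explicit tiny_any(const T& v) : content_(new holder<T>(v)) {}

    // Copying would need a virtual clone(); left out of this sketch.
    tiny_any(const tiny_any&) = delete;
    tiny_any& operator=(const tiny_any&) = delete;

    ~tiny_any() { delete content_; }

    const std::type_info& type() const {
        return content_ ? content_->type() : typeid(void);
    }

    // Checked access: nullptr unless the stored type is exactly T.
    template <typename T>
    T* cast() {
        return type() == typeid(T)
                   ? &static_cast<holder<T>*>(content_)->value
                   : nullptr;
    }
};
```

The object itself contains nothing but one pointer, which fits the observation in the question that any is pointer-sized.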
The key difference between boost::any and boost::variant is that any can store any type, while variant can store only one of a set of enumerated types. The any type stores a void* pointer to the object, as well as a typeinfo object to remember the underlying type and enforce some degree of type safety. boost::variant computes the size of the largest alternative and uses "placement new" to construct the object within a buffer of that size. It also stores the type or the type index.
Note that if you have Boost installed, you should be able to see the source files in "any.hpp" and "variant.hpp". Just search for "include/boost/variant.hpp" and "include/boost/any.hpp" in "/usr", "/usr/local", and "/opt/local" until you find the installed headers, and you can take a look.
Edit
As has been pointed out in the comments below, there was a slight inaccuracy in my description of boost::any. While it can be implemented using void* (and a templated destroy callback to properly delete the pointer), the actual implementation uses any::placeholder*, with the any::holder<T> class templates deriving from any::placeholder to unify the type.
boost::any just snapshots the type information while the templated constructor runs: it holds a pointer to a non-templated base class that provides access to the std::type_info, and the constructor instantiates a type-specific derived class satisfying that interface. The same technique can actually be used to capture other common capabilities of a set of types (e.g. streaming, common operators, specific functions), though boost::any doesn't offer control over this.
boost::variant is conceptually similar to what you've done before, but by not literally using a union and instead taking a manual approach to placement construction/destruction of objects in its buffer (while handling alignment issues explicitly), it works around the restrictions that C++ places on complex types in actual unions.
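A two-type toy version of that buffer-plus-tag idea is sketched below. boost::variant's real implementation is far more general; the class IntOrString and its members are hypothetical, and copying and reassignment are left out to keep the sketch short.

```cpp
#include <cstddef>
#include <new>
#include <string>

class IntOrString {
    using String = std::string;

    static constexpr std::size_t kSize =
        sizeof(String) > sizeof(int) ? sizeof(String) : sizeof(int);

    // One buffer, sized for the larger type and aligned for both.
    alignas(int) alignas(String) unsigned char buffer_[kSize];
    bool holds_string_; // the "tag": which type currently lives in buffer_

public:
    explicit IntOrString(int v) : holds_string_(false) {
        ::new (static_cast<void*>(buffer_)) int(v);
    }
    explicit IntOrString(const String& s) : holds_string_(true) {
        ::new (static_cast<void*>(buffer_)) String(s);
    }

    IntOrString(const IntOrString&) = delete;
    IntOrString& operator=(const IntOrString&) = delete;

    // Only the string needs an explicit destructor call.
    ~IntOrString() {
        if (holds_string_) {
            reinterpret_cast<String*>(buffer_)->~String();
        }
    }

    bool holds_string() const { return holds_string_; }

    // Callers must check holds_string() first; stricter code (C++17 and
    // later) would typically go through std::launder in these accessors.
    int as_int() const { return *reinterpret_cast<const int*>(buffer_); }
    const String& as_string() const {
        return *reinterpret_cast<const String*>(buffer_);
    }
};
```

The size of such an object is the size of the largest alternative plus the tag and padding, which is consistent with the "largest type plus 8 bytes" measurement in the question.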