std::shared_ptr in a union - c++

I'm implementing a "variant" class which must have the smallest possible memory footprint and store some objects with a shared pointer mechanism.
For this, I would like to make a union within the class of all variable types. This includes some shared_ptr's.
The operator= and copy constructors must change the data type of the variant, hence switching to another member in the union. Upon switching to a shared_ptr, this one should be reset to null without deleting/deowning the pointer. Is there a way to do this?
Of course, there would be other ways to implement this but they are generally more complex, less safe or more memory consuming in my case. Any suggestion welcome though.
Thanks!

Resetting to null isn't sufficient; the implementations of
std::shared_ptr I know also have a pointer to the reference
count, which must be deleted as well.
You need to keep track of what is currently in the union, and
use explicit calls to the destructor and placement new for
construction any time the type changes (and in the constructors
and the destructor).
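A minimal sketch of that approach, assuming a hypothetical two-alternative Variant (int or std::shared_ptr<int>); the class name, the Tag enum and the accessors are illustrative and not from the question. It needs C++11 or later, since earlier standards do not allow a union member with a non-trivial destructor.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <utility>

// Hypothetical variant: holds either an int or a shared_ptr<int>.
class Variant {
    using IntPtr = std::shared_ptr<int>;
    enum class Tag { Int, Ptr };

public:
    Variant() : tag_(Tag::Int), i_(0) {}
    explicit Variant(int v) : tag_(Tag::Int), i_(v) {}
    explicit Variant(IntPtr p) : tag_(Tag::Ptr) {
        new (&p_) IntPtr(std::move(p));  // placement new starts p_'s lifetime
    }
    Variant(const Variant& other) : tag_(Tag::Int), i_(0) { copy_from(other); }
    Variant& operator=(const Variant& other) {
        if (this != &other) {
            destroy();         // end the lifetime of the current member
            copy_from(other);  // construct the new one in place
        }
        return *this;
    }
    ~Variant() { destroy(); }

    bool holds_ptr() const { return tag_ == Tag::Ptr; }
    int as_int() const { return i_; }            // valid only if !holds_ptr()
    const IntPtr& as_ptr() const { return p_; }  // valid only if holds_ptr()

private:
    void destroy() {
        if (tag_ == Tag::Ptr) p_.~IntPtr();  // explicit destructor call
        tag_ = Tag::Int;
    }
    void copy_from(const Variant& other) {
        if (other.tag_ == Tag::Ptr) new (&p_) IntPtr(other.p_);
        else i_ = other.i_;
        tag_ = other.tag_;
    }

    Tag tag_;  // tracks which union member is currently alive
    union {
        int i_;
        IntPtr p_;
    };
};
```

The tag costs a few bytes on top of the union, which is the price of knowing which destructor to run.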

Related

Stack allocating intermediate objects in constructors

When a constructor allocates intermediate objects that need to be passed to other constructors with longer lifetimes, can the intermediate objects be stack-allocated?
For example, I have a class Reader that has various utilities built atop a std::wistream and has several constructors for various use cases:
Reader(std::unique_ptr<std::istream> bytestream)
Reader(char buffer[], size_t count)
Reader(const std::string& str)
The only relevant member data that Reader has is:
std::unique_ptr<std::wistream> m_character_stream
Note: wistream, not istream. The constructors construct the wistream in various ways depending on their argument types.
For example, the first constructor form, looks like this:
Reader::Reader(std::unique_ptr<std::istream> bytestream) {
auto conversion = std::wbuffer_convert<std::codecvt_utf8<wchar_t>, wchar_t, std::char_traits<wchar_t>>(bytestream->rdbuf());
m_character_stream = std::make_unique<std::wistream>(&conversion);
}
My questions are:
I guess since I'm being passed a unique_ptr, that I have no choice but to std::move it to some otherwise useless member variable to keep it alive until we're destructed? Even though the rest of my class will never use that variable directly, but only indirectly through m_character_stream?
The conversion object is stack-allocated. Is that going to be a problem? I assume that when the constructor returns, that this object will be deleted. Will that cause std::wistream to malfunction? Does that mean I have to store conversion as otherwise useless member data to keep it alive as well? If so, is there a common pattern or naming convention for useless member data that exist only to keep things alive?
Since I have multiple constructors, I'd rather not have a bunch of constructor-specific member data attached to my class since that data won't be initialized most of the time. This just all smells wrong, but this is my first C++ project, so move semantics, ownership semantics, smart pointers, RAII, and all that crazy stuff is all pretty new to me and I'm trying to wrap my brain around it all.
I come from a Java/Python/Go background.
Yes, you have to keep bytestream alive and the typical way is storing it in a unique_ptr member variable.
Yes it is a problem that m_character_stream is going to use the pointer you pass beyond the lifetime of conversion. So yes, make conversion a member variable.
C++ does not have garbage collection. Lifetime management is absolutely essential when programming in C++. Read about RAII.
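As a rough sketch of what this can look like for the first constructor: the byte stream and the conversion buffer become members, declared before the stream that uses them, because members are initialized in declaration order and destroyed in reverse. The read_word helper is hypothetical, and holding the std::wistream directly instead of through a unique_ptr is a simplification. Note that std::wbuffer_convert and std::codecvt_utf8 are deprecated since C++17, so a compiler may warn.

```cpp
#include <cassert>
#include <codecvt>
#include <istream>
#include <locale>
#include <memory>
#include <sstream>
#include <string>
#include <utility>

class Reader {
public:
    explicit Reader(std::unique_ptr<std::istream> bytestream)
        : m_bytestream(std::move(bytestream)),   // owns the byte stream
          m_conversion(m_bytestream->rdbuf()),   // borrows its buffer
          m_character_stream(&m_conversion) {}   // borrows the converter

    // Hypothetical helper, only here to show the stream is usable.
    std::wstring read_word() {
        std::wstring word;
        m_character_stream >> word;
        return word;
    }

private:
    using Converter = std::wbuffer_convert<std::codecvt_utf8<wchar_t>>;

    // Declaration order matters: each member depends on the one above it.
    std::unique_ptr<std::istream> m_bytestream;
    Converter m_conversion;
    std::wistream m_character_stream;
};
```

A caller would write something like Reader r(std::make_unique<std::istringstream>("hello world")); and the byte stream then lives exactly as long as the Reader.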

C++:using pointer to unordered_map or just defining it as a member variable from this type in a class?

I have a problem which I cannot understand:
Let's say I have a class System with several member fields, and one of them is of type unordered_map, so when I declare the class in the header file, I write #include <unordered_map> at the beginning of the header.
Now, I have two ways of declaring this field:
1. std::unordered_map<std::string,int> umap;
2. std::unordered_map<std::string,int>* p_umap;
Now in the constructor of the class, if I choose the first option, there is no need to initialize that field in the initializer list since the constructor of class System will call the default constructor for the field umap as part of constructing an instance of type class System.
If I choose the second option, I should initialize the field p_umap in the constructor (in the initializer list) with operator new, and delete this dynamic allocation in the destructor.
What is the difference between these two options? If you have a class where one of its fields is of type unordered_map, how do you declare this field? As a pointer or as a variable of type unordered_map?
In a situation like the one you are describing, it seems like the first option is preferable. Most likely, in fact, the unordered map is intended to be owned by the class it is a data member of. In other words, its lifetime should not be extended beyond the lifetime of the encapsulating class, and the encapsulating class has the responsibility of creating and destroying the unordered map.
While with option 1 all this work is done automatically, with option 2 you would have to take care of it manually (and take care of correct copy-construction, copy-assignment, exception-safety, lack of memory leaks, and so on). Surely you could use smart pointers (e.g. std::unique_ptr<>) to encapsulate this responsibility into a wrapper that would take care of deleting the wrapped object when the smart pointer itself goes out of scope (this idiom is called RAII, which is an acronym for Resource Acquisition Is Initialization).
However, it seems to me like you do not really need a pointer at all here. You have an object whose lifetime is completely bounded by the lifetime of the class that contains it. In these situations, you should just not use pointers and prefer declaring the variable as:
std::unordered_map<std::string, int> umap;
Make it not a pointer until you need to make it a pointer.
Pointers are rife with user error.
For example, you forgot to mention that your class System would also need to implement
System( const System& )
and
System& operator= ( const System& )
or Bad Behavior will arise when you try to copy your object.
The difference is in how you want to be able to access umap. Pointers can allow for a bit more flexibility, but they obviously add complexity in terms of allocation (stack vs heap, destructors and such). If you use a pointer to umap, you can do some pretty convoluted stuff such as making two System's with the same umap. In the end though, go with KISS unless there's a compelling reason not to.
There is no need to define it as pointer. If you do it, you must also make sure to implement copy constructor and assignment operator, or disable them completely.
If there is no specific reason to make it a pointer (and you don't show any) just make it a normal member variable.
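To illustrate the first option, here is a small sketch in which System holds the map by value. The set and get member functions are made up for the example. No constructor, destructor, copy constructor or assignment operator is written, and copying still gives each object its own map.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

class System {
public:
    // Hypothetical accessors, just to make the map observable.
    void set(const std::string& key, int value) { umap[key] = value; }
    int get(const std::string& key) const {
        auto it = umap.find(key);
        return it == umap.end() ? 0 : it->second;
    }

private:
    // Default-constructed, copied and destroyed together with System.
    std::unordered_map<std::string, int> umap;
};
```

With the pointer version, the same copy would leave two System objects pointing at one map unless the copy operations were written by hand.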

Will memcpy or memmove cause problems copying classes?

Suppose I have any kind of class or structure. No virtual functions or anything, just some custom constructors, as well as a few pointers that would require cleanup in the destructor.
Would there be any adverse effects to using memcpy or memmove on this structure? Will deleting a moved structure cause problems? The question assumes that the memory alignment is also correct, and we are copying to safe memory.
In the general case, yes, there will be problems. Both memcpy and memmove are bitwise operations with no further semantics. That might not be sufficient to move the object*, and it is clearly not enough to copy.
In the case of the copy it will break as multiple objects will be referring to the same dynamically allocated memory, and more than one destructor will try to release it. Note that solutions like shared_ptr will not help here, as sharing ownership is part of the further semantics that memcpy/memmove don't offer.
For moving, and depending on the type you might get away with it in some cases. But it won't work if the objects hold pointers/references to the elements being moved (including self-references) as the pointers will be bitwise copied (again, no further semantics of copying/moving) and will refer to the old locations.
The general answer is still the same: don't.
* Don't take move here in the exact C++11 sense. I have seen an implementation of the standard library containers that used special tags to enable moving objects while growing buffers through the use of memcpy, but it required explicit annotations in the stored types that marked the objects as safely movable through memcpy, after the objects were placed in the new buffer the old buffer was discarded without calling any destructors (C++11 move requires leaving the object in a destructible state, which cannot be achieved through this hack)
Generally using memcpy on a class based object is not a good idea. The most likely problem would be copying a pointer and then deleting it. You should use a copy constructor or assignment operator instead.
No, don't do this.
If you memcpy a structure whose destructor deletes a pointer within itself, you'll wind up doing a double delete when the second instance of the structure is destroyed in whatever manner.
The C++ idiom is copy constructor for classes and std::copy or any of its friends for copying ranges/sequences/containers.
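A minimal sketch of that idiom, using a hypothetical Buffer class that owns a heap array. The copy constructor allocates fresh storage and uses std::copy for the elements, so each object deletes its own array exactly once. A memcpy of the whole object would instead duplicate the pointer and lead to the double delete described above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new int[n]()) {}

    // Deep copy: new storage, then element-wise copy.
    Buffer(const Buffer& other)
        : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    Buffer& operator=(const Buffer& other) {
        if (this != &other) {
            int* fresh = new int[other.size_];  // allocate before releasing
            std::copy(other.data_, other.data_ + other.size_, fresh);
            delete[] data_;
            data_ = fresh;
            size_ = other.size_;
        }
        return *this;
    }

    ~Buffer() { delete[] data_; }

    int& operator[](std::size_t i) { return data_[i]; }
    const int* data() const { return data_; }

private:
    std::size_t size_;
    int* data_;
};
```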
If you are using C++11, you can use std::is_trivially_copyable to determine if an object can be copied or moved using memcpy or memmove. From the documentation:
Objects of trivially-copyable types are the only C++ objects that may
be safely copied with std::memcpy or serialized to/from binary files
with std::ofstream::write()/std::ifstream::read(). In general, a
trivially copyable type is any type for which the underlying bytes can
be copied to an array of char or unsigned char and into a new object
of the same type, and the resulting object would have the same value
as the original.
Many classes don't fit this description, and you must beware that classes can change. I would suggest that if you are going to use memcpy/memmove on C++ objects, you somehow protect against unwanted usage. For example, if you're implementing a container class, it's easy for the type that the container holds to be modified such that it is no longer trivially copyable (e.g. somebody adds a virtual function). You could do this with a static_assert:
#include <type_traits>

template<typename T>
class MemcopyableArray
{
static_assert(std::is_trivially_copyable<T>::value, "MemcopyableArray used with object type that is not trivially copyable.");
// ...
};
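The same guard can be put on a single function. The sketch below uses made-up types Point and Named and a hypothetical helper copy_bytes; it assumes T is default-constructible.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>

struct Point { int x; int y; };     // plain data: trivially copyable
struct Named { std::string name; }; // owns memory: not trivially copyable

static_assert(std::is_trivially_copyable<Point>::value,
              "Point can be copied with memcpy");
static_assert(!std::is_trivially_copyable<Named>::value,
              "Named must not be copied with memcpy");

// Hypothetical helper: refuses to compile for unsafe types.
template <typename T>
T copy_bytes(const T& src) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "copy_bytes used with a type that is not trivially copyable");
    T dst;
    std::memcpy(&dst, &src, sizeof(T));
    return dst;
}
```

Calling copy_bytes on a Named would fail at compile time instead of corrupting memory at run time.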
Aside from safety, which is the most important issue as the other answers have already pointed out, there may also be an issue of performance, especially for small objects.
Even for simple POD types, you may discover that doing proper initialization in the initializer list of your copy constructor (or assignments in the assignment operator depending on your usage) is actually faster than even an intrinsic version of memcpy. This may very well be due to memcpy's entry code which may check for word alignment, overlaps, buffer/memory access rights, etc.... In Visual C++ 10.0 and higher, to give you a specific example, you would be surprised by how much preamble code that tests various things executes before memcpy even begins its logical function.

C++ - when should I use a pointer member in a class

One of the things that has been confusing for me while learning C++ (and Direct3D, but that was some time ago) is when you should use a pointer member in a class. For example, I can use a non-pointer declaration:
private:
SomeClass instance_;
Or I could use a pointer declaration
private:
SomeClass* instance_;
And then use new() on it in the constructor.
I understand that if SomeClass could be derived from another class, a COM object or is an ABC then it should be a pointer. Are there any other guidelines that I should be aware of?
A pointer has following advantages:
a) You can do lazy initialization, that is, create the object only shortly before its first real use.
b) The design: if you use pointers for members of an external class type, you can place a forward declaration above your class and thus don't need to include the headers of those types in your header. Instead, you include the third-party headers in your .cpp, which reduces compile time and prevents side effects from including too many other headers.
class ExtCamera; // forward declaration to external class type in "ExtCamera.h"
class MyCamera {
public:
MyCamera() : m_pCamera(0) { }
void init(const ExtCamera &cam);
private:
ExtCamera *m_pCamera; // do not use it in inline code inside header!
};
c) A pointer can be deleted at any time, so you have more control over the lifetime and can re-create an object, for example in case of a failure.
The advantages of using a pointer are outlined by 3DH: lazy initialization, reduction in header dependencies, and control over the lifetime of the object.
There are also disadvantages. When you have a pointer data member, you probably have to write your own copy constructor and assignment operator, to make sure that a copy of the object is created properly. Of course, you also must remember to delete the object in the destructor. Also, if you add a pointer data member to an existing class, you must remember to update the copy constructor and operator=. In short, having a pointer data member is more work for you.
Another disadvantage is really the flip side of the control over the lifetime of the object pointed to by the pointer. Non-pointer data members are destroyed automagically when the object is destroyed, meaning that you can always be sure that they exist as long as the object exists. With the pointer, you have to check for it being nullptr, meaning also that you have to make sure to set it to nullptr whenever it doesn't point to anything. Having to deal with all this may easily lead to bugs.
Finally, accessing non-pointer members is likely to be faster, because they are contiguous in memory. On the other hand, accessing pointer data member pointing to an object allocated on the heap is likely to cause a cache miss, making it slower.
There is no single answer to your question. You have to look at your design, and decide whether the advantages of pointer data members outweigh the additional headache. If reducing compile time and header dependencies is important, use the pimpl idiom. If your data member may not be necessary for your object in certain cases, use a pointer, and allocate it when needed. If these do not sound like compelling reasons, and you do not want to do extra work, then do not use a pointer.
If lazy initialization and the reduction of header dependencies are important, then you should first consider using a smart pointer, like std::unique_ptr or std::shared_ptr, instead of a raw pointer. Smart pointers save you from many of the headaches of using raw pointers described above.
Of course, there are still caveats. std::unique_ptr cleans up after itself, so you do not need to add or modify the destructor of your class. However, it is non-copyable, so having a unique pointer as a data member makes your class non-copyable as well.
With std::shared_ptr, you do not have to worry about the destructor or copying or assignment. However, the shared pointer incurs a performance penalty for reference counting.
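A short sketch of lazy initialization with std::unique_ptr, using hypothetical Camera and Recorder classes (std::make_unique needs C++14). No destructor is written, and the static_asserts show the copy caveat mentioned above.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

struct Camera {
    int frames = 0;
    void capture() { ++frames; }
};

class Recorder {
public:
    bool initialized() const { return camera_ != nullptr; }

    void capture() {
        if (!camera_) camera_ = std::make_unique<Camera>();  // created on first use
        camera_->capture();
    }

    int frames() const { return camera_ ? camera_->frames : 0; }

private:
    std::unique_ptr<Camera> camera_;  // deleted automatically with Recorder
};

static_assert(!std::is_copy_constructible<Recorder>::value,
              "a unique_ptr member makes the class non-copyable");
static_assert(std::is_move_constructible<Recorder>::value,
              "but it can still be moved");
```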
Allocate it on the stack if you can, from the free-store if you have to. There is a similar question here, where you will find all the "why's".
The reason you see lots of pointer usage when it comes to games and stuff is because DirectX is a COM interface, and in honesty, most games programmers from back in the day aren't really C++ programmers, they are C-with-classes programmers, and in C pointer usage is very common.
Another reason to use pointers would be dynamic binding. If you have a base class with a virtual method and some derived classes, you can only get dynamic binding through pointers or references; a by-value member of the base type is always exactly the base type.
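A small sketch of dynamic binding with made-up Shape and Circle classes. The virtual call is dispatched to the derived class through a pointer and also through a reference, while passing by value slices the object down to the base.

```cpp
#include <cassert>
#include <string>

struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

std::string via_ptr(const Shape* s) { return s->name(); }  // dynamic binding
std::string via_ref(const Shape& s) { return s.name(); }   // dynamic binding
std::string via_value(Shape s) { return s.name(); }        // sliced copy
```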

Should I prefer pointers or references in member data?

This is a simplified example to illustrate the question:
class A {};
class B
{
B(A& a) : a(a) {}
A& a;
};
class C
{
C() : b(a) {}
A a;
B b;
};
So B is responsible for updating a part of C. I ran the code through lint and it whinged about the reference member: lint#1725.
This talks about taking care over default copy and assignments which is fair enough, but default copy and assignment is also bad with pointers, so there's little advantage there.
I always try to use references where I can since naked pointers introduce uncertainty about who is responsible for deleting that pointer. I prefer to embed objects by value but if I need a pointer, I use auto_ptr in the member data of the class that owns the pointer, and pass the object around as a reference.
I would generally only use a pointer in member data when the pointer could be null or could change. Are there any other reasons to prefer pointers over references for data members?
Is it true to say that an object containing a reference should not be assignable, since a reference should not be changed once initialised?
My own rule of thumb :
Use a reference member when you want the life of your object to be dependent on the life of other objects: it's an explicit way to say that you don't allow the object to be alive without a valid instance of another class, because there is no assignment and the reference must be initialized via the constructor. It's a good way to design your class without assuming anything about its instance being a member or not of another class. You only assume that their lives are directly linked to other instances. It allows you to change later how you use your class instance (with new, as a local instance, as a class member, generated by a memory pool in a manager, etc.)
Use a pointer in other cases: When you want the member to be changed later, use a pointer or a const pointer to be sure to only read the pointed instance. If that type is supposed to be copyable, you cannot use references anyway. Sometimes you also need to initialize the member after a special function call (init() for example) and then you simply have no choice but to use a pointer. BUT: use asserts in all your member functions to quickly detect wrong pointer state!
In cases where you want the object lifetime to be dependent on an external object's lifetime, and you also need that type to be copyable, then use pointer members but a reference argument in the constructor. That way you are indicating on construction that the lifetime of this object depends on the argument's lifetime, BUT the implementation uses pointers so the type is still copyable. As long as these members are only changed by copy, and your type doesn't have a default constructor, the type should fulfil both goals.
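That last rule could be sketched as follows, reusing the names A and B from the question; the value field and the bump and target members are added only for illustration.

```cpp
#include <cassert>
#include <type_traits>

class A {
public:
    int value = 0;
};

class B {
public:
    // Reference parameter: the caller must supply a live A.
    explicit B(A& a) : a_(&a) {}

    void bump() { ++a_->value; }
    const A& target() const { return *a_; }

private:
    A* a_;  // stored as a pointer so B stays assignable; never null
};

static_assert(std::is_copy_assignable<B>::value,
              "pointer member keeps B assignable");
static_assert(!std::is_default_constructible<B>::value,
              "B cannot exist without an A");
```

Assigning one B to another simply rebinds it to the other A; the A objects themselves are untouched.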
Avoid reference members, because they restrict what the implementation of a class can do (including, as you mention, preventing the implementation of an assignment operator) and provide no benefits to what the class can provide.
Example problems:
you are forced to initialise the reference in each constructor's initialiser list: there's no way to factor out this initialisation into another function (until C++0x, anyway edit: C++ now has delegating constructors)
the reference cannot be rebound or be null. This can be an advantage, but if the code ever needs changing to allow rebinding or for the member to be null, all uses of the member need to change
unlike pointer members, references can't easily be replaced by smart pointers or iterators as refactoring might require
Whenever a reference is used it looks like value type (. operator etc), but behaves like a pointer (can dangle) - so e.g. Google Style Guide discourages it
Objects rarely should allow assign and other stuff like comparison. If you consider some business model with objects like 'Department', 'Employee', 'Director', it is hard to imagine a case when one employee will be assigned to other.
So for business objects it is very good to describe one-to-one and one-to-many relationships as references and not pointers.
And probably it is OK to describe one-or-zero relationship as a pointer.
So no 'we can't assign' then factor.
A lot of programmers are just used to pointers, and that's why they will find any argument to avoid using references.
Having a pointer as a member will force you or member of your team to check the pointer again and again before use, with "just in case" comment. If a pointer can be zero then pointer probably is used as kind of flag, which is bad, as every object have to play its own role.
Use references when you can, and pointers when you have to.
In a few important cases, assignability is simply not needed. These are often lightweight algorithm wrappers that facilitate calculation without leaving the scope. Such objects are prime candidates for reference members since you can be sure that they always hold a valid reference and never need to be copied.
In such cases, make sure to make the assignment operator (and often also the copy constructor) non-usable (by inheriting from boost::noncopyable or declaring them private).
However, as user pts already commented, the same is not true for most other objects. Here, using reference members can be a huge problem and should generally be avoided.
As everyone seems to be handing out general rules, I'll offer two:
Never, ever use references as class members. I have never done so in my own code (except to prove to myself that I was right in this rule) and cannot imagine a case where I would do so. The semantics are too confusing, and it's really not what references were designed for.
Always, always, use references when passing parameters to functions, except for the basic types, or when the algorithm requires a copy.
These rules are simple, and have stood me in good stead. I leave making rules on using smart pointers (but please, not auto_ptr) as class members to others.
Yes to: Is it true to say that an object containing a reference should not be assignable, since a reference should not be changed once initialised?
My rules of thumb for data members:
never use a reference, because it prevents assignment
if your class is responsible for deleting, use boost's scoped_ptr (which is safer than an auto_ptr)
otherwise, use a pointer or const pointer
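The assignment point can be checked directly. In this sketch (RefHolder and its read member are hypothetical names), a class with a reference member is still copy-constructible, but the compiler deletes its implicit copy assignment operator.

```cpp
#include <cassert>
#include <type_traits>

struct A {
    int value = 0;
};

class RefHolder {
public:
    explicit RefHolder(A& a) : a_(a) {}
    int read() const { return a_.value; }

private:
    A& a_;  // bound once in the constructor, cannot be rebound
};

static_assert(std::is_copy_constructible<RefHolder>::value,
              "copy construction binds the new reference to the same A");
static_assert(!std::is_copy_assignable<RefHolder>::value,
              "the implicit copy assignment operator is deleted");
```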
I would generally only use a pointer in member data when the pointer could be null or could change. Are there any other reasons to prefer pointers over references for data members?
Yes. Readability of your code. A pointer makes it more obvious that the member is a reference (ironically :)), and not a contained object, because when you use it you have to de-reference it. I know some people think that is old fashioned, but I still think that it simply prevents confusion and mistakes.
I advise against reference data members because you never know who is going to derive from your class and what they might want to do. They might not want to make use of the referenced object, but being a reference you have forced them to provide a valid object.
I've done this to myself enough to stop using reference data members.