destruct and construct a class in order to clean its members


I have noticed that, in our codebase, some classes are used only to hold data in their members. To clean the members, we call each class's clear() method. However, I've also noticed that these clear methods call other clear methods, which in cascade ends up calling clear() on standard types like std::string and std::vector. There therefore seems to be a huge amount of redundant code written.
To minimize the amount of work in the reduction process, I am planning to turn each clear method into the following. What are other details I may be missing and your suggestions?
void aClass::clear() {
this->~aClass();
*this = aClass();
}
Thank you

You are missing that your call to the destructor is wrong here. You could simply assign:
*this = aClass();
However, then:
This is not what a clear method is supposed to do. You are creating a new object, not clearing the existing one. If you clear a string, no reallocation needs to happen. Destroying a string and creating a new one is really more work, not less.
As a corollary to the first point, clear does not take parameters, while your constructor probably does. Again, default constructing a new instance is not what is expected from clear, and sometimes it isn't even possible (when there is no default constructor).
Moreover, let's say you do implement clear like this. Then the method adds nothing that the user of the class could not do with a call to a constructor. Instead of x.clear(); they could write x = aClass();.
Summary: If you have lots of code to clear objects in your code base, then this is very likely for a reason. Clearing an instance can usually be done with much less computational effort than creating a new instance. On the other hand if this is not the case, then there is no point in implementing a clear method (provided that there is an accessible constructor to create a new instance).
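To illustrate why clearing is usually cheaper than re-creating, here is a minimal sketch. The Samples struct and the function name are hypothetical, made up for this example. It relies only on the fact that std::vector::clear() leaves the capacity unchanged, so the buffer can be reused without a new allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical data-holding class, used only for illustration.
struct Samples {
    std::vector<int> values;

    // Member-wise clear: empties the vector but keeps its allocated buffer.
    void clear() { values.clear(); }
};

// Fills a Samples object, clears it, and returns the capacity that is left.
std::size_t capacity_after_clear() {
    Samples s;
    s.values.assign(1000, 42);
    const std::size_t before = s.values.capacity();
    s.clear();
    assert(s.values.empty());
    assert(s.values.capacity() == before);  // the buffer is kept for reuse
    return s.values.capacity();
}
```

Refilling the object after clear() can then reuse the existing buffer, whereas a freshly constructed object would typically have to allocate again.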

What are other details I may be missing
The destructor still clears the standard containers that are destroyed. Not only that, but it also re-creates them at a cost.
References to the cleared object will become invalid. This can easily lead to undefined behaviour unless you are careful.
and your suggestions?
Write a normal function that doesn't involve the destruction of the object.

Nooooooooooooooooooooo.
Don't do that.
The destructor isn't just a function that calls some operations on your members.
It ends the lifetime of your object.
Period. Full stop.
You cannot call the destructor then keep using the object. Invoking its assignment operator with a temporary doesn't bring it back to life, either; that just, well, invokes the assignment operator to change some values. Values that no longer exist.
There are a few cases where invoking the destructor is reasonable, but they all involve "very manual memory management" using placement new. There are basically no other acceptable times to do it.
Review the notion of object lifetime in C++.
If your copy assignment operator is correctly written, then just *this = aClass(); is already fine for resetting your objects. You might also consider aClass().swap(*this), with the appropriate move-capable swap being added to your class.
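A minimal sketch of both variants, assuming a hypothetical default-constructible Record class (the name and members are invented for illustration). Neither version calls the destructor explicitly.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical data-holding class, used only for illustration.
struct Record {
    std::string name;
    std::vector<int> values;
    int count = 0;

    // Member-wise swap; std::swap on these members moves, it does not copy.
    void swap(Record& other) noexcept {
        using std::swap;
        swap(name, other.name);
        swap(values, other.values);
        swap(count, other.count);
    }

    // Reset by assigning a default-constructed temporary.
    void clear() { *this = Record(); }

    // Alternative: swap with a temporary; the temporary then takes the
    // old contents with it when it is destroyed.
    void clear_by_swap() { Record().swap(*this); }
};

bool is_reset(const Record& r) {
    return r.name.empty() && r.values.empty() && r.count == 0;
}

void demo_reset() {
    Record r;
    r.name = "abc";
    r.values = {1, 2, 3};
    r.count = 3;
    r.clear();
    assert(is_reset(r));
}
```

Both approaches require the class to be default-constructible, which is one of the limitations mentioned in the other answers.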
These are not "redundant" operations.

Related

What is the practice for handling a complex resource pointer in C++ when allocating a new version of the resource?

In most of my programming nowadays I put everything in a smart pointer and forget about it. The resource is properly managed 99.9% of the time. It's really great and way better than a garbage collection mechanism.
However, once in a while, the resource being held by a smart pointer needs to be explicitly freed before we can reallocate a new instance of it. Something like this:
r = std::make_shared<my_resource>(with_this_id);
r->do_work();
...
r->do_more_work();
...
r->do_even_more_work();
r.reset();
r = std::make_shared<my_resource>(with_this_id);
...
If I miss the r.reset() call, the resource may be using a large amount of memory or disk space, and re-allocating without first resetting is likely to cause problems on smaller computers. Either that, or the resource is locked so it can't be reallocated until explicitly freed.
Is there a pattern/algorithm/something which handles such a case in a cleaner manner?

I see basically two ways to approach this. The first is to wrap the reset-assign sequence into one function and never assign directly. I would probably do it like this:
template<typename T, typename... Args>
void reset_and_assign(std::shared_ptr<T>& ptr, Args&&... args) { // in-out parameter to avoid a copy, since RVO does not apply to parameters
    ptr.reset(); // release the old resource first
    ptr.reset(new T(std::forward<Args>(args)...)); // then create the new one
}
This is pretty easy and fast to do, but it won't save you from accidentally calling the assignment directly, and I don't see a way to prevent that if you keep using shared_ptr.
The other alternative is writing a wrapper around shared_ptr that just forwards most function calls and changes reset and assignment such that it will first deallocate and then create the new resource. This is a bit of work to do and it is easy to introduce a bug (especially if you mess up some universal references while trying to save yourself some constructors). It will also be annoying to interact with other code that uses std smart pointers and will be a major refactoring process. But you cannot mess it up by accidentally calling an assignment (at least probably not).
Also note that the standard library does reset intentionally in this order such that we don't delete the old resource if the new allocation throws.
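The difference in ordering can be made visible with a small counting sketch. Tracked is a hypothetical stand-in for my_resource that only counts how many instances are alive at once; it is not part of the question's code.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>

// Hypothetical stand-in for my_resource: tracks live and peak instance counts.
struct Tracked {
    static int live;
    static int peak;
    Tracked() { ++live; peak = std::max(peak, live); }
    ~Tracked() { --live; }
};
int Tracked::live = 0;
int Tracked::peak = 0;

// Plain assignment: the new object is built while the old one still exists.
int peak_with_plain_assignment() {
    Tracked::live = Tracked::peak = 0;
    auto r = std::make_shared<Tracked>();
    r = std::make_shared<Tracked>();
    assert(Tracked::live == 1);
    return Tracked::peak;
}

// Reset first: the old object is destroyed before the new one is built.
int peak_with_reset_first() {
    Tracked::live = Tracked::peak = 0;
    auto r = std::make_shared<Tracked>();
    r.reset();
    r = std::make_shared<Tracked>();
    assert(Tracked::live == 1);
    return Tracked::peak;
}
```

With plain assignment two instances briefly coexist, which is exactly the situation the question wants to avoid; resetting first keeps the peak at one, at the price of ending up with an empty pointer if the second construction throws.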

Although it may sound bizarre, I believe this can be better expressed as move semantics. Here is why:
The new resource object you create is a replacement of the old one. Assume that the replacement is transparent to your users (i.e. they perceive the new resource object and the old one as the same). Therefore, it is as if you made an imagined copy of the old object to obtain the new one, and then destroyed the old one. This matches the intention of move semantics.
So my_resource should have a move constructor:
my_resource(my_resource&& old) {
old.reset();
/* code to init the new resource object */
}
// replace
ptr = std::make_unique<my_resource>(std::move(*ptr));

Normally you just put the reset() inside the destructor of the my_resource class.
If this isn't suitable for whatever reason, you can pass a custom deleter to the std::shared_ptr upon instantiation that calls reset() and then deletes the resource.
If the shared_ptr doesn't reach refcount 0, are you sure you are using it correctly? There is std::weak_ptr, a utility class for shared_ptr, made precisely for cases where you need both safety and timely deallocation.
Also, why use make_shared and instantiate a new my_resource instead of simply using some init function of my_resource that automatically calls reset() if needed?
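A minimal sketch of the custom deleter idea. The my_resource below is a made-up stand-in whose reset() only sets a flag, and make_resource is a hypothetical helper; the real class would release its memory, disk space, or lock in reset().

```cpp
#include <cassert>
#include <memory>

// Made-up stand-in for the question's my_resource.
struct my_resource {
    explicit my_resource(bool* released_flag) : released(released_flag) {}
    void reset() { *released = true; }  // hypothetical explicit release step
    bool* released;
};

// Hypothetical factory: attaches a deleter that calls reset() before delete.
std::shared_ptr<my_resource> make_resource(bool* released_flag) {
    assert(released_flag != nullptr);
    return std::shared_ptr<my_resource>(
        new my_resource(released_flag),
        [](my_resource* p) {
            p->reset();  // release explicitly first
            delete p;    // then free the object itself
        });
}
```

Note that this only guarantees reset() runs when the last owner goes away; it does not change the order in which an old and a new resource are created and destroyed during assignment.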

Move semantics, standard collections, and construction time address

Of course I would like to know some magic fix to this but I am open to restructuring.
So I have a class DeviceDependent, with the following constructor
DeviceDependent(Device& device);
which stores a reference to the device. The device can change state, which will necessitate a change in all DeviceDependent instances dependent on that device. (You guessed it, this is my paltry attempt to ride the DirectX beast.)
To handle this I have the functions DeviceDependent::createDeviceResources(), DeviceDependent::onDeviceLost().
I planned to register each DeviceDependent instance with the device specified in the DeviceDependent constructor. The Device would keep a std::vector<DeviceDependent*> of all DeviceDependent instances so registered. It would then iterate through that vector and call the above functions when appropriate.
This seemed simple enough, but what I especially liked about it was that I could have a std::vector<DeviceDependent (or child)> somewhere else in the code and iterate over them quickly. For instance I have a class Renderable which, as the name suggests, represents a renderable object. I need to iterate over these at least once a frame, and because of this I did not want the objects to be scattered throughout memory.
Down to business, here is the problem:
When I create the solid objects I rely on move semantics. This was purely by instinct; I did not consider copying large objects like these to add them to the std::vector<DeviceDependent (or child)> collection (and I still abhor the idea).
However, with move semantics (and I have tested this, for those who don't believe it) the address of the object changes. What's more, it changes after the default constructor is called. That means my code inside the constructor of DeviceDependent calling device.registerDeviceDependent(this) compiles and runs fine, but the device accumulates a list of pointers which are invalidated as soon as the object is moved into the vector.
I want to know if there is some way I can stick to this plan and make it work.
Things I thought of:
Making the 'real' vector a collection of shared pointers, no issue copying. The object presumably will not change address. I don't like this plan because I am afraid that leaving things out on the heap will harm iteration performance.
Calling register after the object has been moved. It's what I'm doing provisionally, but I don't like it because I feel the constructor is the proper place to do this. There should not exist an instance of DeviceDependent that is not on some device's manifest.
Writing my own move constructor or move assignment functions. This way I could remove the old address from the device and change it to the new one. I don't want to do this because I don't want to keep updating it as the class evolves.

This has nothing to do with move constructors. The issue is std::vector. When you add a new item to that vector, it may reallocate its memory, and that will cause all the DeviceDependent objects to be transferred to a new memory block internal to the vector. New versions of each item are then constructed, and the old ones destroyed. Whether the construction is copy-construction or move-construction is irrelevant; the objects effectively change their address either way.
To make your code correct, DeviceDependent objects need to unregister themselves in their destructor, and register themselves in both the copy and move constructors. You should do this regardless of what else you decide about storage, if you have not deleted those constructors. Otherwise those constructors, if called, will do the wrong thing.
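A minimal sketch of that registration scheme. The method names registerDependent, unregisterDependent, isRegistered and count are assumptions made for this example, not the question's actual API, and the Device must outlive every DeviceDependent that refers to it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

class DeviceDependent;

// Simplified Device that only tracks its dependents (hypothetical API).
class Device {
public:
    void registerDependent(DeviceDependent* d) {
        assert(d != nullptr);
        dependents_.push_back(d);
    }
    void unregisterDependent(DeviceDependent* d) {
        dependents_.erase(
            std::remove(dependents_.begin(), dependents_.end(), d),
            dependents_.end());
    }
    bool isRegistered(const DeviceDependent* d) const {
        return std::find(dependents_.begin(), dependents_.end(), d) !=
               dependents_.end();
    }
    std::size_t count() const { return dependents_.size(); }

private:
    std::vector<DeviceDependent*> dependents_;
};

class DeviceDependent {
public:
    explicit DeviceDependent(Device& device) : device_(device) {
        device_.registerDependent(this);
    }
    // Copies and moves create a new object at a new address: register it too.
    DeviceDependent(const DeviceDependent& other) : device_(other.device_) {
        device_.registerDependent(this);
    }
    // Not noexcept, because registration may allocate.
    DeviceDependent(DeviceDependent&& other) : device_(other.device_) {
        device_.registerDependent(this);
    }
    DeviceDependent& operator=(const DeviceDependent&) = delete;
    // Every object removes its own address when it dies.
    ~DeviceDependent() { device_.unregisterDependent(this); }

private:
    Device& device_;
};
```

When the vector reallocates, each element is re-created at its new address (registering itself) and the old element is destroyed (unregistering itself), so the device's list stays consistent. The linear search in unregisterDependent is kept for brevity only.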
One approach not on your list would be to prevent the vector from reallocating by calling reserve() with the maximum number of items you will store. This is only practical if you know a reasonable upper bound on the number of DeviceDependent objects. However, you may find that reserving an estimate, while not eliminating the vector reallocations entirely, makes them rare enough that the cost of un-registering and re-registering becomes insignificant.
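A quick sketch of the reserve() approach, using int elements as a stand-in for the real objects (the function name is made up for this example). As long as the size never exceeds the reserved capacity, push_back does not reallocate, so element addresses stay valid.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if the first element kept its address while the vector was
// filled up to the reserved size.
bool addresses_stable_with_reserve(std::size_t count) {
    std::vector<int> items;
    items.reserve(count);
    assert(items.capacity() >= count);
    items.push_back(0);
    const int* first = &items[0];
    for (std::size_t i = 1; i < count; ++i) {
        items.push_back(static_cast<int>(i));
    }
    return first == &items[0];
}
```

Adding even one element beyond the reserved capacity may reallocate and move everything, so the upper bound has to be a real one if pointers are stored elsewhere.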
It sounds like your goal is getting good cache locality for the DeviceDependent objects. You might find that using a std::deque as main storage avoids the reallocations while still giving enough cache locality. Or you could gain cache locality by writing a custom allocator or operator new().
As an aside, it sounds like your design is being driven by performance costs that you are only guessing at. If you actually measure it, you might find that using std::vector<std::unique_ptr<DeviceDependent>> is fine, and doesn't significantly increase the time it takes to iterate over them. (Note you don't need shared pointers here, since the vector is the only owner, so you can avoid the overheads of reference-counting.)

Is it safe to log the value of this in constructor

I am working on tracing the construction and destruction of instances, and for that I am planning to log the value of "this" in the constructor and destructor. I don't know whether it is safe to log the value of "this" in the constructor. If it is not safe, I want to know the scenarios in which it will fail.

If by "logging" you mean "writing out the value as e.g. a hexadecimal address to a log file", it is fine and safe. If not, please clarify.
Objects are not fully constructed until the constructor call is finished. So before that (i.e. from within the constructor) it is not safe to publish this to the rest of the program, because that might result in someone trying to actually use the half-constructed object. This may lead to subtle and hard-to-find bugs.
Publishing this may mean one of the following things:
passing it as a parameter to an external (non-member) function,
storing it in a data structure available to other objects,
(for the sake of completeness: returning it from a function call - which does not apply in this specific case, because you can't return anything from a constructor).
Writing out the address of this to a file is thus not publishing it to the rest of your program* so it should be fine.
*well, unless you do some very arcane things afterwards, like loading back the address from the file in a different thread/process and casting it back to an object pointer... which is already unsafe enough by itself :-)
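For illustration, a minimal sketch of this kind of tracing. The Traced class and the helper function are hypothetical, and the log is an std::ostream so it can be a file or an in-memory stream. Only the address is written out; the pointer is never handed to other code.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical traced class: logs its own address on construction and
// destruction, without publishing `this` to any other object.
class Traced {
public:
    explicit Traced(std::ostream& log) : log_(log) {
        log_ << "ctor " << static_cast<const void*>(this) << '\n';
    }
    ~Traced() { log_ << "dtor " << static_cast<const void*>(this) << '\n'; }

    // Copying is disabled so every logged address maps to one ctor/dtor pair.
    Traced(const Traced&) = delete;
    Traced& operator=(const Traced&) = delete;

private:
    std::ostream& log_;
};

// Creates and destroys one object, then checks that both log lines carry
// the same address.
bool ctor_and_dtor_log_same_address() {
    std::ostringstream log;
    { Traced t(log); }
    std::istringstream in(log.str());
    std::string tag1, addr1, tag2, addr2;
    in >> tag1 >> addr1 >> tag2 >> addr2;
    assert(tag1 == "ctor" && tag2 == "dtor");
    return addr1 == addr2;
}
```

If the class were copyable or movable, the copy and move constructors would need the same logging, otherwise some destructor lines would have no matching constructor line.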

Memory is allocated first, then this is set, then the constructor(s) is called. So you're fine to use this during the constructor, as it points to the right place - the construction won't change this. However if construction fails (throws) the memory will disappear and the value pointed to by this will be garbage so you shouldn't store it and use it for anything outside the constructor until you know the construction will succeed.

Why would you think it is not safe? It is no different from logging the address of any other object, so long as those objects are valid.
The long and short of it is that it is safe in the scenarios you are intending to use it for.

Best practice when calling initialize functions multiple times?

This may be a subjective question, but I'm more or less asking it and hoping that people share their experiences. (As that is the biggest thing which I lack in C++)
Anyway, suppose I have, for some obscure reason, an initialize function that initializes a data structure from the heap:
void initialize() {
initialized = true;
pointer = new T;
}
Now, if I call the initialize function twice, a memory leak will happen (right?). I can prevent this in multiple ways:
ignore the call (just check whether I am initialized, and if I am, don't do anything)
throw an error
automatically "clean up" and then reinitialize the thing.
Now what is generally the "best" method, which helps keep my code manageable in the future?
EDIT: thank you for the answers so far. However, I'd like to know how people handle this in a more generic way: how do people handle "simple" errors which can be ignored (like calling the same function twice when it only makes sense to call it once)?

You're the only one who can truly answer the question: do you consider that the initialize function could eventually be called twice, or would this mean that your program followed an unexpected execution flow?
If the initialize function can be called multiple times: just ignore the call by testing whether the allocation has already taken place.
If the initialize function has no decent reason to be called several times: I believe that would be a good candidate for an exception.
Just to be clear, I don't believe cleanup and regenerate to be a viable option (or you should seriously consider renaming the function to reflect this behavior).

This pattern is not unusual for on-demand or lazy initialization of costly data structures that might not always be needed. A singleton is one example; a class data member that meets those criteria is another.
What I would do is just skip the init code if the struct is already in place.
void initialize() {
if (!initialized)
{
initialized = true;
pointer = new T;
}
}
If your program has multiple threads you would have to include locking to make this thread-safe.
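One possible way to add that locking, as a sketch only: the Lazy wrapper below is a hypothetical class that guards the check with a std::mutex. It uses a std::unique_ptr as both the storage and the "initialized" flag, which is a small departure from the separate bool in the code above.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical wrapper: repeated or concurrent initialize() calls create
// the object exactly once.
template <typename T>
class Lazy {
public:
    void initialize() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!pointer_) {           // a null pointer means "not initialized yet"
            pointer_.reset(new T());
        }
        assert(pointer_);
    }

    // Returns nullptr until initialize() has been called.
    T* get() {
        std::lock_guard<std::mutex> lock(mutex_);
        return pointer_.get();
    }

private:
    std::mutex mutex_;
    std::unique_ptr<T> pointer_;
};
```

Because the unique_ptr owns the object, a second call cannot leak the first allocation, and the object is freed automatically when the wrapper is destroyed.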

I'd look at using boost or STL smart pointers.

I think the answer depends entirely on T (and other members of this class). If they are lightweight and there is no side-effect of re-creating a new one, then by all means cleanup and re-create (but use smart pointers). If on the other hand they are heavy (say a network connection or something like that), you should simply bypass if the boolean is set...
You should also investigate boost::optional, this way you don't need an overall flag, and for each object that should exist, you can check to see if instantiated and then instantiate as necessary... (say in the first pass, some construct okay, but some fail..)

The idea of setting a data member later than the constructor is quite common, so don't worry you're definitely not the first one with this issue.
There are two typical use cases:
On demand / Lazy instantiation: if you're not sure it will be used and it's costly to create, then better NOT to initialize it in the constructor
Caching data: to cache the result of a potentially expensive operation so that subsequent calls need not compute it once again
You are in the "Lazy" category, in which case the simpler way is to use a flag or a nullable value:
flag + value combination: reuse of existing class without heap allocation, however this requires default construction
smart pointer: this bypasses the default construction issue, at the cost of a heap allocation. Check the copy semantics you need...
boost::optional<T>: similar to a pointer, but with deep copy semantics and no heap allocation. Requires the type to be fully defined though, so heavier on dependencies.
I would strongly recommend the boost::optional<T> idiom, or if you wish to provide dependency insulation you might fall back to a smart pointer like std::unique_ptr<T> (or boost::scoped_ptr<T> if you do not have access to a C++0x compiler).

I think that this could be a scenario where the Singleton pattern could be applied.

Can I make a bitwise copy of a C++ object?

Can C++ objects be copied using a bitwise copy? I mean using memcpy_s? Is there a scenario in which that can go wrong?

If they're Plain Old Data (POD) types, then this should work. Any class that has instances of other classes inside it will potentially fail, since you're copying them without invoking their copy constructors. The most likely way it will fail is one of their destructors will free some memory, but you've duplicated pointers that point to it, so you then try to use it from one of your copied objects and get a segfault. In short, don't do it unless it's a POD and you're sure it will always be a POD.
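One way to enforce the "only if it is a POD" rule at compile time is sketched below. It uses std::is_trivially_copyable, which is the standard trait that decides whether a memcpy-style copy is valid (a slightly broader category than POD). The bitwise_copy helper and the Point struct are hypothetical names for this example.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical helper: refuses to compile for types where a bitwise copy
// would bypass copy constructors or resource management.
template <typename T>
void bitwise_copy(T& dst, const T& src) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "bitwise copy is only valid for trivially copyable types");
    std::memcpy(&dst, &src, sizeof(T));
}

// Plain aggregate of ints: trivially copyable.
struct Point {
    int x;
    int y;
};

bool demo_point_copy() {
    Point a{1, 2};
    Point b{0, 0};
    bitwise_copy(b, a);
    assert(b.x == 1);
    return b.x == a.x && b.y == a.y;
}
```

If someone later adds a std::string or a smart pointer member to Point, the static_assert fails at compile time instead of the program misbehaving at run time.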

No, doing so can cause a lot of problems. You should always copy C++ types by using the assignment operator or copy constructor.
Using a bitwise copy breaks any kind of resource management because at the end of the day you are left with 2 objects for which 1 constructor has run and 2 destructors will run.
Consider as an example a ref counted pointer.
void foo() {
RefPointer<int> p1(new int());
RefPointer<int> p2;
memcpy(&p2, &p1, sizeof(RefPointer<int>));
}
Now both p1 and p2 are holding onto the same data yet the internal ref counting mechanism has not been notified. Both destructors will run thinking they are the sole owner of the data potentially causing the value to be destructed twice.

It depends on the implementation of the C++ object you are trying to copy. In general the owner of the C++ object's memory is the object itself, so trying to "move" or "copy" it with something like memcpy_s is going behind its back, which is going to get you in trouble more often than not.
Usually if a C++ object is intended to be copied or moved, there are APIs within the class itself that facilitate this.

If it is a single object, why not use the assignment operator? (I suppose the compiler-generated assignment operator could be implemented in terms of memcpy if that is so advantageous, and the compiler knows better whether your class is a POD.)
If you want to copy an array of objects, you can use std::copy. Depending on the implementation, this may end up using memmove (one more thing that you can mess up - the buffers may overlap; I don't know whether the nonstandard memcpy_s somehow checks for that) if the involved types allow that. Again, the decision is done by the compiler, which will get it right even if the types are modified.

In general, if your structure contains pointers, you can't memcpy it, because a proper copy would most likely need to allocate new memory and point to that. A memcpy can't handle that.
If however your class only has primitive, non-pointer types, you should be able to.

In addition to the problem of unbalanced resource management calls in the two instances you end up with after a memcpy (as @JaredPar and @rmeador pointed out), if the object supports a notion of an instance ID, doing a memcpy will leave you with two instances with the same ID. This can lead to all sorts of "interesting" problems to hunt down later on, especially if your objects are mapped to a database.
