Can I make a bitwise copy of a C++ object? - c++

Can C++ objects be copied using a bitwise copy? I mean using memcpy_s. Is there a scenario in which that can go wrong?

If they're Plain Old Data (POD) types, then this should work. Any class that has instances of other classes inside it will potentially fail, since you're copying them without invoking their copy constructors. The most likely way it will fail is one of their destructors will free some memory, but you've duplicated pointers that point to it, so you then try to use it from one of your copied objects and get a segfault. In short, don't do it unless it's a POD and you're sure it will always be a POD.
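A minimal sketch of the "only if it's a POD" rule, offered as an illustration rather than a definitive recipe. Since C++11, the property that makes memcpy valid is that the type is trivially copyable, and std::is_trivially_copyable can check that at compile time. The names bitwise_copy, Point and Named are hypothetical and exist only for this example.

```cpp
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical helper: a bitwise copy that refuses to compile for types
// that are not trivially copyable.
template <typename T>
void bitwise_copy(T& dst, const T& src) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    std::memcpy(&dst, &src, sizeof(T));
}

struct Point {   // plain data: safe to memcpy
    int x;
    int y;
};

struct Named {   // owns a std::string: not safe to memcpy
    std::string name;
};

static_assert(std::is_trivially_copyable<Point>::value, "Point is plain data");
static_assert(!std::is_trivially_copyable<Named>::value, "Named manages a resource");
```

With this guard in place, bitwise_copy on a Named fails to compile, so the check keeps working if a class stops being plain data later on.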

No, doing so can cause a lot of problems. You should always copy C++ types by using the assignment operator or copy constructor.
Using a bitwise copy breaks any kind of resource management because at the end of the day you are left with 2 objects for which 1 constructor has run and 2 destructors will run.
Consider as an example a ref counted pointer.
void foo() {
RefPointer<int> p1(new int());
RefPointer<int> p2;
memcpy(&p2, &p1, sizeof(RefPointer<int>));
}
Now both p1 and p2 are holding onto the same data yet the internal ref counting mechanism has not been notified. Both destructors will run thinking they are the sole owner of the data potentially causing the value to be destructed twice.
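For contrast, here is a rough sketch of what a proper copy does and memcpy skips. RefPointer is not defined in the answer, so this minimal version is an assumption made for illustration (single-threaded, no custom deleters); use_count() and get() are hypothetical accessors.

```cpp
#include <cstddef>

// Hypothetical minimal RefPointer, for illustration only (not thread-safe).
template <typename T>
class RefPointer {
public:
    RefPointer() : ptr_(nullptr), count_(nullptr) {}
    explicit RefPointer(T* p) : ptr_(p), count_(new std::size_t(1)) {}

    // Copying shares the object and records the extra owner.
    RefPointer(const RefPointer& other) : ptr_(other.ptr_), count_(other.count_) {
        if (count_) ++*count_;
    }

    // Copy-and-swap: the by-value parameter does the counting for us.
    RefPointer& operator=(RefPointer other) {
        swap(other);
        return *this;
    }

    // The last owner frees both the object and the counter.
    ~RefPointer() {
        if (count_ && --*count_ == 0) {
            delete ptr_;
            delete count_;
        }
    }

    void swap(RefPointer& other) {
        T* p = ptr_; ptr_ = other.ptr_; other.ptr_ = p;
        std::size_t* c = count_; count_ = other.count_; other.count_ = c;
    }

    std::size_t use_count() const { return count_ ? *count_ : 0; }
    T* get() const { return ptr_; }

private:
    T* ptr_;
    std::size_t* count_;
};
```

Writing p2 = p1 goes through the copy constructor and bumps the count to 2. A memcpy would duplicate ptr_ and count_ while leaving the count at 1, which is exactly the double destruction described above.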

It depends on the implementation of the C++ object you are trying to copy. In general the owner of the C++ object's memory is the object itself, so trying to "move" or "copy" it with something like memcpy_s is going behind its back, which is going to get you in trouble more often than not.
Usually if a C++ object is intended to be copied or moved, there are APIs within the class itself that facilitate this.

If it is a single object, why not use the assignment operator? (I suppose the compiler-generated assignment operator could be implemented in terms of memcpy if that is so advantageous, and the compiler knows better whether your class is a POD.)
If you want to copy an array of objects, you can use std::copy. Depending on the implementation, this may end up using memmove if the involved types allow that (one more thing that you can mess up: the buffers may overlap, and I don't know whether the nonstandard memcpy_s somehow checks for that). Again, the decision is made by the compiler, which will get it right even if the types are modified.
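A small sketch of the std::copy suggestion. The helper name copy_all is hypothetical, and the sketch assumes the element type is default constructible so that the destination can be sized up front.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Copies element by element using each type's own copy assignment.
// For trivially copyable element types an implementation is free to
// lower this to memmove; that choice is made by the library, not by us.
template <typename T>
std::vector<T> copy_all(const std::vector<T>& src) {
    std::vector<T> dst(src.size());
    std::copy(src.begin(), src.end(), dst.begin());
    return dst;
}
```

The same call works for std::vector<int> and std::vector<std::string>; only the second one needs real copy assignments, and the caller does not have to know which case applies.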

In general, if your structure contains pointers, you can't memcpy it, because a correct copy would most likely have to allocate new memory and point to that. A memcpy can't do that; it only duplicates the pointer values.
If however your class only has primitive, non-pointer types, you should be able to.

In addition to the problem of unbalanced resource management calls in the two instances you end up with after a memcpy (as @JaredPar and @rmeador pointed out), if the object supports a notion of an instance ID, doing a memcpy will leave you with two instances with the same ID. This can lead to all sorts of "interesting" problems to hunt later on, especially if your objects are mapped to a database.

Related

destruct and construct a class in order to clean its members

I have noticed that, in our codebase, there are some classes that are only used to hold data in their members. In order to clean the members, we call each class' clear() method. However, I've also noticed that clear methods call other clear methods, which in cascade results in calling clear() on standard types like std::string and std::vector. Therefore, there seems to be a huge amount of redundant code written.
To minimize the amount of work in the reduction process I am planning to turn each clear method into the following. What are other details I may be missing and your suggestions?
void aClass::clear() {
this->~aClass();
*this = aClass();
}
Thank you
You are missing that your call to the destructor is wrong here. You could simply assign:
*this = aClass();
However, then:
This is not what a clear method is supposed to do. You are creating a new object, not clearing it. If you clear a string then no reallocations need to happen. Deleting a string and creating a new one is really more work, not less.
As a corollary to the first point, clear does not take parameters, while your constructor probably does. Again, default constructing a new instance is not what is expected from clear and sometimes it isn't even possible (when there is no default constructor).
Moreover, let's say you do implement clear like this. Then the method adds nothing that the user of the class could not do with a call to a constructor. Instead of x.clear(); they could write x = aClass();.
Summary: If you have lots of code to clear objects in your code base, then this is very likely for a reason. Clearing an instance can usually be done with much less computational effort than creating a new instance. On the other hand if this is not the case, then there is no point in implementing a clear method (provided that there is an accessible constructor to create a new instance).
What are other details I may be missing
The destructor still clears the standard containers that are destroyed. Not only that, but it also re-creates them at a cost.
References to the cleared object will become invalid. This can easily lead to undefined behaviour unless you are careful.
and your suggestions?
Write a normal function that doesn't involve the destruction of the object.
Nooooooooooooooooooooo.
Don't do that.
The destructor isn't just a function that calls some operations on your members.
It ends the lifetime of your object.
Period. Full stop.
You cannot call the destructor then keep using the object. Invoking its assignment operator with a temporary doesn't bring it back to life, either; that just, well, invokes the assignment operator to change some values. Values that no longer exist.
There are a few cases where invoking the destructor is reasonable, but they all involve "very manual memory management" using placement new. There are basically no other acceptable times to do it.
Review the notion of object lifetime in C++.
If your copy assignment operator is correctly written, then just *this = aClass(); is already fine for resetting your objects. You might also consider aClass().swap(*this), with the appropriate move-capable swap being added to your class.
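A minimal sketch of both reset styles mentioned above. The members name and values, and the method clear_by_swap, are hypothetical stand-ins for whatever aClass really holds; no destructor is called explicitly, so the object's lifetime never ends.

```cpp
#include <string>
#include <vector>

// Hypothetical data-holding class standing in for aClass.
class aClass {
public:
    std::string name;
    std::vector<int> values;

    void swap(aClass& other) noexcept {
        name.swap(other.name);
        values.swap(other.values);
    }

    // Reset by assigning a default-constructed temporary. The object is
    // never destroyed, so references to it stay valid.
    void clear() { *this = aClass(); }

    // Alternative: swap with a temporary, which takes the old state with it
    // when it is destroyed at the end of the statement.
    void clear_by_swap() { aClass().swap(*this); }
};
```

Either version leaves an existing reference to the object usable afterwards, which is the property the explicit destructor call breaks.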
These are not "redundant" operations.

When to delete copy constructor

I'm building a simple neural network. I have two main classes: NeuralNetwork and Level. I don't have neurons since it's a simple feedforward network with all units in a level sharing the same activation function.
I've organized my levels in this way:
Class NeuralNetwork has a vector of levels (values, not pointers) for fast access, and every object of class Level has a pointer to the prec and next level, some matrices and so on.
The question which is more general is:
What should the copy/move constructors and assignment operators do for a class organized as a doubly linked list, like Level?
Copy the entire structure, following the next and prec pointers, and return the instance of the just-copied object.
Copy the single level, leaving the next/prec pointers set to nullptr, and return a standalone level with just the copies of the matrices etc.
Delete the copy constructor/assignment operator.
What your class does is up to you. With that said, people will generally expect generic containers such as linked lists to be copyable.
When designing such classes, more generally, ask yourself the following:
What does copying this class mean?
Does it make sense to copy this class?
Will users be surprised if this class is copied?
If it's not clear what copying this class means, don't make it copyable. If it doesn't make sense to copy this class, don't make it copyable. If people will be surprised to see the class getting copied (think unique_ptr), don't make it copyable without some serious thought. These aren't hard rules, these are just some thinking points to help you work out what's appropriate.
If you don't intend to make something copyable, it does indeed make sense to delete the associated operators (this acts as documentation if nothing else).
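As a rough illustration of deleting the copy operations, here is a hypothetical Level. The members prec, next and weights are assumptions based on the question, not the asker's real class, and leaving the move operations defaulted is a choice made only for this sketch.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical Level whose neighbour pointers make copying ambiguous.
class Level {
public:
    Level() = default;

    Level(const Level&) = delete;             // no copying
    Level& operator=(const Level&) = delete;

    // Moving is left defaulted here; the owner would still need to
    // re-link prec/next after a move, since the address changes.
    Level(Level&&) = default;
    Level& operator=(Level&&) = default;

    Level* prec = nullptr;
    Level* next = nullptr;
    std::vector<double> weights;
};

static_assert(!std::is_copy_constructible<Level>::value, "copy is disabled");
static_assert(std::is_move_constructible<Level>::value, "move is allowed");
```

Any attempt to copy a Level now fails at compile time with a message that points at the deleted function, which is the documentation effect mentioned above.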
If you do make your class copyable, then it's up to you how you implement it. You can make shared instances that copy on write, you can eagerly copy, you can do whatever you want; it all depends on what your users (including you) will expect to happen, and what the trade-offs are for each.
"Class NeuralNetwork has a vector of levels (not pointers, values.), for fast access them and every object of class Level has a pointer to the prec and next level"
That's a bit pointless. The previous layer is *(this-1) and the next layer is *(this+1). That's because vector stores its elements contiguously. Of course, there's the minor challenge of knowing whether there is a previous or next layer, but that question doesn't tend to come up. The input layer is a special layer since you set its values directly. All the next layers can safely pull their input from the previous layer, so no layer needs to push its input to the next layer.
Training is a bit harder because there you have a backpropagation phase and need to walk in both directions. However, here you control both the inputs and the desired outputs, so you explicitly use layers.front() and layers.back(), never going past them.
Now, when you copy the whole vector, each layer is a copy and has a new this, but since the new vector is again contiguous the *(this-1) / *(this+1) rule for neighbours still holds.

Move semantics, standard collections, and construction time address

Of course I would like to know some magic fix to this but I am open to restructuring.
So I have a class DeviceDependent, with the following constructor
DeviceDependent(Device& device);
which stores a reference to the device. The device can change state, which will necessitate a change in all DeviceDependent instances dependent on that device. (You guessed it, this is my paltry attempt to ride the DirectX beast.)
To handle this I have the functions DeviceDependent::createDeviceResources(), DeviceDependent::onDeviceLost().
I planned to register each DeviceDependent instance with the device specified in the DeviceDependent constructor. The Device would keep a std::vector<DeviceDependent*> of all DeviceDependent instances so registered. It would then iterate through that vector and call the above functions when appropriate.
This seemed simple enough, but what I especially liked about it was that I could have a std::vector<DeviceDependent (or child)> somewhere else in the code and iterate over them quickly. For instance, I have a class Renderable which, as the name suggests, represents a renderable object. I need to iterate over these at least once a frame, and because of this I did not want the objects to be scattered throughout memory.
Down to business, here is the problem:
When I create the solid objects I rely on move semantics. This was purely by instinct; I did not consider copying large objects like these to add them to the std::vector<DeviceDependent (or child)> collection (and I still abhor the idea).
However, with move semantics (and I have tested this for those who don't believe it) the address of the object changes. What's more, it changes after the default constructor is called. That means my code inside the constructor of DeviceDependent calling device.registerDeviceDependent(this) compiles and runs fine, but the device accumulates a list of pointers which are invalidated as soon as the object is moved into the vector.
I want to know if there is some way I can stick to this plan and make it work.
Things I thought of:
Making the 'real' vector a collection of shared pointers, no issue copying. The object presumably will not change address. I don't like this plan because I am afraid that leaving things out on the heap will harm iteration performance.
Calling register after the object has been moved. It's what I'm doing provisionally, but I don't like it because I feel the constructor is the proper place to do this. There should not exist an instance of DeviceDependent that is not on some device's manifest.
Writing my own move constructor or move assignment functions. This way I could remove the old address from the device and change it to the new one. I don't want to do this because I don't want to keep updating it as the class evolves.
This has nothing to do with move constructors. The issue is std::vector. When you add a new item to that vector, it may reallocate its memory, and that will cause all the DeviceDependent objects to be transferred to a new memory block internal to the vector. Then new versions of each item will be constructed, and the old ones destroyed. Whether the construction is copy-construction or move-construction is irrelevant; the objects effectively change their address either way.
To make your code correct, DeviceDependent objects need to unregister themselves in their destructor, and register themselves in both copy- and move-constructors. You should do this regardless of what else you decide about storage, if you have not deleted those constructors. Otherwise those constructors, if called, will do the wrong thing.
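A sketch of that registration scheme, under some stated assumptions: unregisterDeviceDependent() and dependents() are hypothetical additions to Device, and the sketch stores a Device* instead of the Device& from the question so that assignment can be written.

```cpp
#include <algorithm>
#include <vector>

class DeviceDependent;

// Hypothetical Device that tracks its dependents by address.
class Device {
public:
    void registerDeviceDependent(DeviceDependent* d) { dependents_.push_back(d); }
    void unregisterDeviceDependent(DeviceDependent* d) {
        dependents_.erase(std::remove(dependents_.begin(), dependents_.end(), d),
                          dependents_.end());
    }
    const std::vector<DeviceDependent*>& dependents() const { return dependents_; }

private:
    std::vector<DeviceDependent*> dependents_;
};

class DeviceDependent {
public:
    explicit DeviceDependent(Device& device) : device_(&device) {
        device_->registerDeviceDependent(this);
    }
    // Every new object, however it is created, registers its own address.
    DeviceDependent(const DeviceDependent& other) : device_(other.device_) {
        device_->registerDeviceDependent(this);
    }
    DeviceDependent(DeviceDependent&& other) : device_(other.device_) {
        device_->registerDeviceDependent(this);
    }
    // Assignment changes no address; only the device may differ.
    DeviceDependent& operator=(const DeviceDependent& other) {
        if (device_ != other.device_) {
            device_->unregisterDeviceDependent(this);
            device_ = other.device_;
            device_->registerDeviceDependent(this);
        }
        return *this;
    }
    // The old object removes its (soon to be stale) address.
    ~DeviceDependent() { device_->unregisterDeviceDependent(this); }

private:
    Device* device_;  // assumption: pointer rather than reference
};
```

With this in place, a std::vector<DeviceDependent> can grow and reallocate freely: each relocated element registers its new address and the destroyed original removes the old one, so the device's list tracks the live objects.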
One approach not on your list would be to prevent the vector from reallocating by calling reserve() with the maximum number of items you will store. This is only practical if you know a reasonable upper bound on the number of DeviceDependent objects. However, you may find that reserving an estimate, while not eliminating the vector reallocations entirely, makes them rare enough that the cost of un-registering and re-registering becomes insignificant.
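A tiny sketch of the reserve() idea, using int elements to keep it short. The function name address_stays_stable is hypothetical. The standard guarantees that push_back does not reallocate while the size stays within the reserved capacity.

```cpp
#include <cstddef>
#include <vector>

// Fills a vector after reserving room, and reports whether the first
// element stayed at the same address throughout.
bool address_stays_stable(std::size_t count) {
    std::vector<int> items;
    items.reserve(count);                      // one allocation up front
    items.push_back(0);
    const int* first = &items.front();
    for (std::size_t i = 1; i < count; ++i) {
        items.push_back(static_cast<int>(i));  // never exceeds the capacity
    }
    return first == &items.front();
}
```

Pointers handed to the device before the capacity is exceeded stay valid; the first push_back past the reserved amount may invalidate all of them again.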
It sounds like your goal is getting good cache locality for the DeviceDependents. You might find that using a std::deque as main storage avoids the re-allocations while still giving enough cache locality. Or you could gain cache locality by writing a custom allocator or operator new().
As an aside, it sounds like your design is being driven by performance costs that you are only guessing at. If you actually measure it, you might find that using a std::vector of smart pointers (for example std::unique_ptr) is fine, and doesn't significantly increase the time it takes to iterate over them. (Note you don't need shared pointers here, since the vector is the only owner, so you can avoid the overheads of reference-counting.)

Is it safe to log the value of this in constructor

I am working on tracing the construction and destruction of instances, and for that I am planning to log the value of "this" in the constructor and destructor. I don't know whether it is safe to log the value of "this" in a constructor. If it is not safe, I want to know the scenarios in which it will fail.
If by "logging" you mean "writing out the value as e.g. a hexadecimal address to a log file", it is fine and safe. If not, please clarify.
Objects are not fully constructed until the constructor call is finished. So before that (i.e. from within the constructor) it is not safe to publish this to the rest of the program. Because that might result in someone trying to actually use the half-constructed object. This may lead to subtle and hard to find bugs.
Publishing this may mean one of the following things:
passing it as a parameter to an external (non-member) function,
storing it in a data structure available to other objects,
(for the sake of completeness: returning it from a function call - which does not apply in this specific case, because you can't return anything from a constructor).
Writing out the address of this to a file is thus not publishing it to the rest of your program* so it should be fine.
*well, unless you do some very arcane things afterwards, like loading back the address from the file in a different thread/process and casting it back to an object pointer... which is already unsafe enough by itself :-)
Memory is allocated first, then this is set, then the constructor(s) is called. So you're fine to use this during the constructor, as it points to the right place - the construction won't change this. However if construction fails (throws) the memory will disappear and the value pointed to by this will be garbage so you shouldn't store it and use it for anything outside the constructor until you know the construction will succeed.
Why would you think it is not safe? It is no different from logging the address of any other object, so long as those objects are valid.
The long and short of it is that it is safe in the scenarios you are intending to use it for.

Know what references an object

I have an object which implements reference counting mechanism. If the number of references to it becomes zero, the object is deleted.
I found that my object is never deleted, even when I am done with it. This is leading to memory overuse. All I have is the number of references to the object and I want to know the places which reference it so that I can write appropriate cleanup code.
Is there some way to accomplish this without having to grep in the source files? (That would be very cumbersome.)
A huge part of getting reference counting (refcounting) done correctly in C++ is to use Resource Acquisition Is Initialization (RAII) so it's much harder to accidentally leak references. However, this doesn't solve everything with refcounts.
That said, you can implement a debug feature in your refcounting which tracks what is holding references. You can then analyze this information when necessary, and remove it from release builds. (Use a configuration macro similar in purpose to how DEBUG macros are used.)
Exactly how you should implement it is going to depend on all your requirements, but there are two main ways to do this (with a brief overview of differences):
Store the information on the referenced object itself:
    accessible from your debugger
    easier to implement
Output to a special trace file every time a reference is acquired or released:
    still available after the program exits (even abnormally)
    possible to use while the program is running, without running in your debugger
    can be used even in special release builds and sent back to you for analysis
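The first option might look roughly like the sketch below. Everything here is hypothetical: the RefCounted class, the TRACK_REFS macro, and the idea of passing a holder name to addRef()/release(). For simplicity, release() reports whether the count hit zero instead of deleting the object.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Assumed configuration macro; defined here so the sketch exercises the
// tracking path. A normal release build would leave it undefined.
#define TRACK_REFS 1

// Hypothetical reference-counted object with optional holder tracking.
class RefCounted {
public:
    virtual ~RefCounted() = default;

    void addRef(const std::string& holder) {
        ++count_;
#if TRACK_REFS
        holders_.insert(holder);  // remember who took this reference
#endif
    }

    // Returns true when the last reference was released.
    bool release(const std::string& holder) {
#if TRACK_REFS
        auto it = holders_.find(holder);
        if (it != holders_.end()) holders_.erase(it);  // drop one entry only
#endif
        return --count_ == 0;
    }

    std::size_t count() const { return count_; }

#if TRACK_REFS
    // Inspect from a debugger, or dump to a log, to see who still holds on.
    const std::multiset<std::string>& holders() const { return holders_; }
#endif

private:
    std::size_t count_ = 0;
#if TRACK_REFS
    std::multiset<std::string> holders_;
#endif
};
```

When an object refuses to die, holders() lists the names that never called release(), which narrows the search without grepping the whole source tree.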
The basic problem, of knowing what is referencing a given object, is hard to solve in general, and will require some work. Compare: can you tell me every person and business that knows your postal address or phone number?
One known weakness of reference counting is that it does not work when there are cyclic references, i.e. (in the simplest case) when one object has a reference to another object which in turn has a reference to the former object. This sounds like a non-issue, but in data structures such as binary trees with back-references to parent nodes, there you are.
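One common remedy, not mentioned in the answer itself, is to make the back-reference non-owning. The sketch below uses the standard std::shared_ptr and std::weak_ptr rather than the asker's own counting scheme; Node and attach are hypothetical names.

```cpp
#include <memory>
#include <vector>

// Hypothetical tree node. Children are owned; the parent link is a
// std::weak_ptr, so it does not contribute to the reference count.
struct Node {
    std::vector<std::shared_ptr<Node>> children;
    std::weak_ptr<Node> parent;
};

// Attaches child to parent and sets the non-owning back-reference.
void attach(const std::shared_ptr<Node>& parent, const std::shared_ptr<Node>& child) {
    child->parent = parent;
    parent->children.push_back(child);
}
```

Because the parent link is weak, dropping the last outside reference to the root frees the whole tree. If parent were a std::shared_ptr instead, root and child would keep each other alive forever.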
If you don't explicitly provide for a list of "reverse" references in the referenced (un-freed) object, I don't see a way to figure out who is referencing it.
In the following suggestions, I assume that you don't want to modify your source, or if so, just a little.
You could of course walk the whole heap / freestore and search for the memory address of your un-freed object, but if its address turns up, it's not guaranteed to actually be a memory address reference; it could just as well be any random floating point number, or anything else. However, if the found value lies inside a block of memory that your application allocated for an object, chances improve a little that it's indeed a pointer to another object.
One possible improvement over this approach would be to modify the memory allocator you use -- e.g. your global operator new -- so that it keeps a list of all allocated memory blocks and their sizes. (In a complete implementation of this, operator delete would have to remove the list entry for the freed block of memory.) Now, at the end of your program, you have a clue where to search for the un-freed object's memory address, since you have a list of memory blocks that your program actually used.
The above suggestions don't sound very reliable to me, to be honest; but maybe defining a custom global operator new and operator delete that does some logging / tracing goes in the right direction to solve your problem.
I am assuming you have some class with say addRef() and release() member functions, and you call these when you need to increase and decrease the reference count on each instance, and that the instances that cause problems are on the heap and referred to with raw pointers. The simplest fix may be to replace all pointers to the controlled object with boost::shared_ptr. This is surprisingly easy to do and should enable you to dispense with your own reference counting - you can just make those functions I mentioned do nothing. The main change required in your code is in the signatures of functions that pass or return your pointers. Other places to change are in initializer lists (if you initialize pointers to null) and if()-statements (if you compare pointers with null). The compiler will find all such places after you change the declarations of the pointers.
If you do not want to use the shared_ptr - maybe you want to keep the reference count intrinsic to the class - you can craft your own simple smart pointer just to deal with your class. Then use it to control the lifetime of your class objects. So for example, instead of pointer assignment being done with raw pointers and you "manually" calling addRef(), you just do an assignment of your smart pointer class which includes the addRef() automatically.
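A rough sketch of such a hand-written smart pointer, assuming the class exposes addRef() and release() as described. Resource and RefHandle are hypothetical names, and the sketch is not thread-safe.

```cpp
// Hypothetical class with intrinsic reference counting.
class Resource {
public:
    void addRef() { ++refs_; }
    void release() { if (--refs_ == 0) delete this; }
    int refs() const { return refs_; }

private:
    ~Resource() = default;  // only release() may destroy the object
    int refs_ = 0;
};

// Minimal smart pointer that calls addRef()/release() automatically.
template <typename T>
class RefHandle {
public:
    RefHandle() : ptr_(nullptr) {}
    explicit RefHandle(T* p) : ptr_(p) { if (ptr_) ptr_->addRef(); }
    RefHandle(const RefHandle& other) : ptr_(other.ptr_) { if (ptr_) ptr_->addRef(); }

    RefHandle& operator=(const RefHandle& other) {
        if (other.ptr_) other.ptr_->addRef();  // acquire first: safe for self-assignment
        if (ptr_) ptr_->release();
        ptr_ = other.ptr_;
        return *this;
    }

    ~RefHandle() { if (ptr_) ptr_->release(); }

    T* operator->() const { return ptr_; }
    T* get() const { return ptr_; }

private:
    T* ptr_;
};
```

Once every raw pointer is replaced by a RefHandle, each addRef() is paired with a release() by construction, so a count that never reaches zero points to a handle that is itself still alive somewhere.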
I don't think it's possible to do something without code change. With code change you can for example remember the pointers of the objects which increase reference count, and then see what pointer is left and examine it in the debugger. If possible - store more verbose information, such as object name.
I have created one for my needs. You can compare your code with this one and see what's missing. It's not perfect but it should work in most of the cases.
http://sites.google.com/site/grayasm/autopointer
when I use it I do:
util::autopointer<A> aptr=new A();
I never do it like this:
A* ptr = new A();
util::autopointer<A> aptr = ptr;
and then later start fooling around with ptr; that's not allowed.
Further I am using only aptr to refer to this object.
If I am wrong I have now the chance to get corrections. :) See ya!