Is it safe to use memmove/memcpy to initialize an object with constructor parameters?
No-one seems to use this method but it works fine when I tried it.
Do parameters being passed on the stack cause problems?
Say I have a class foo as follows,
class foo
{
int x,y;
float z;
public:
foo();
foo(int,int,float);
};
Can I initialize the variables using memmove as follows?
foo::foo(int x,int y,float z)
{
memmove(this,&x, sizeof(foo));
}
This is undefined behavior.
The shown code does not attempt to initialize class variables. It attempts to memmove() onto the class pointer, and assumes that the size of the class is 2*sizeof(int)+sizeof(float). The C++ standard does not guarantee that.
Furthermore, the shown code assumes that the constructor's parameters are laid out in memory exactly like the members of this POD. That, again, is not specified by the C++ standard; parameters may be passed in registers, or in any order on the stack.
It is safe to use memmove to initialize individual class members. For example, the following is safe:
foo::foo(int x_,int y_,float z_)
{
memmove(&x, &x_, sizeof(x));
memmove(&y, &y_, sizeof(y));
memmove(&z, &z_, sizeof(z));
}
Of course, this does nothing useful, but this would be safe.
No, it is not safe, because the standard does not guarantee that the members are placed immediately after one another; alignment may introduce padding between them.
After your update, this is even worse because the location of passed arguments and their order are not safe to use.
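To make the padding argument concrete, here is a minimal sketch. The struct Padded and the helper names are hypothetical and only for illustration. It compares the size of an object with the sum of its member sizes; the exact amount of padding is implementation-specific, so no particular number is assumed.

```cpp
#include <cassert>   // assert, used in the usage note below
#include <cstddef>   // offsetof, std::size_t

// Hypothetical type for illustration: a char followed by an int usually
// makes the compiler insert padding so that the int is properly aligned.
struct Padded {
    char tag;
    int  value;
};

// Sum of the member sizes, ignoring any padding.
constexpr std::size_t member_bytes = sizeof(char) + sizeof(int);

// An object is at least as large as its members, and often larger.
static_assert(sizeof(Padded) >= member_bytes,
              "an object cannot be smaller than the sum of its members");

// Number of padding bytes this implementation added to Padded.
inline std::size_t padding_bytes() {
    return sizeof(Padded) - member_bytes;
}

// True if the int member starts at an offset that respects its alignment.
inline bool value_is_aligned() {
    return offsetof(Padded, value) % alignof(int) == 0;
}
```

For example, assert(value_is_aligned()) should hold everywhere, while padding_bytes() is often non-zero. That is why copying sizeof(foo) bytes from an unrelated location cannot be relied upon.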
We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. - Donald Knuth
Do not optimize code unless you know you need to. Profile first, and only then attempt this kind of optimization. That way you don't spend time speeding up code that has no impact on the overall performance of your application.
Compilers are usually smart enough to work out what your code is doing and to generate highly efficient code with the same behavior. To benefit from that, make sure compiler optimizations are enabled (the -Olevel flag, or individual optimizations toggled through command-line arguments).
For example, I've seen that some compilers transform std::copy into a memcpy when the compiler is sure that doing so is straightforward (e.g. data is contiguous).
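As a small sketch of that last point, the two hypothetical helpers below copy the same contiguous, trivially copyable data with std::copy and with std::memcpy. Whether a given compiler actually lowers the std::copy version to a memcpy depends on the implementation and the optimization level; the only claim here is that both produce the same result.

```cpp
#include <algorithm>   // std::copy
#include <array>
#include <cassert>     // assert, used in the usage note below
#include <cstring>     // std::memcpy
#include <type_traits> // std::is_trivially_copyable

// A bytewise copy is only valid for trivially copyable element types.
static_assert(std::is_trivially_copyable<int>::value,
              "int can be copied bytewise");

// Type-safe copy; an optimizing compiler may turn this into a block copy.
inline std::array<int, 4> copy_with_std_copy(const std::array<int, 4>& src) {
    std::array<int, 4> dst{};
    std::copy(src.begin(), src.end(), dst.begin());
    return dst;
}

// Explicit bytewise copy of the same data.
inline std::array<int, 4> copy_with_memcpy(const std::array<int, 4>& src) {
    std::array<int, 4> dst{};
    std::memcpy(dst.data(), src.data(), sizeof(int) * src.size());
    return dst;
}
```

With std::array<int, 4> src{{1, 2, 3, 4}}, the check assert(copy_with_std_copy(src) == copy_with_memcpy(src)) passes, so there is little reason to hand-write the memcpy version.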
No it is not safe. It is undefined behavior.
And the code
foo::foo(int x,int y,float z)
{
memmove(this,&x, sizeof(foo));
}
is not even saving you any typing compared to using an initializer list
foo::foo(int x,int y,float z) : x(x), y(y), z(z)
{ }
A local variable (say an int) can be stored in a processor register, at least as long as its address is not needed anywhere. Consider a function computing something, say, a complicated hash:
int foo(int const* buffer, int size)
{
int a; // local variable
// perform heavy computations involving frequent reads and writes to a
return a;
}
Now assume that the buffer does not fit into memory. We write a class for computing the hash from chunks of data, calling foo multiple times:
struct A
{
void foo(int const* buffer, int size)
{
// perform heavy computations involving frequent reads and writes to a
}
int a;
};
A object;
while (...more data...)
{
object.foo(buffer, size);
}
// do something with object.a
The example may be a bit contrived. The important difference here is that a was a local variable in the free function and now is a member variable of the object, so the state is preserved across multiple calls.
Now the question: would it be legal for the compiler to load a at the beginning of the foo method into a register and store it back at the end? In effect this would mean that a second thread monitoring the object could never observe an intermediate value of a (synchronization and undefined behavior aside). Provided that speed is a major design goal of C++, this seems to be reasonable behavior. Is there anything in the standard that would keep a compiler from doing this? If no, do compilers actually do this? In other words, can we expect a (possibly small) performance penalty for using a member variable, aside from loading and storing it once at the beginning and the end of the function?
As far as I know, the C++ language itself does not even specify what a register is. However, I think that the question is clear anyway. Wherever this matters, I would appreciate answers for a standard x86 or x64 architecture.
The compiler can do that if (and only if) it can prove that nothing else will access a during foo's execution.
That's a non-trivial problem in general; I don't think any compiler attempts to solve it.
Consider the (even more contrived) example
struct B
{
B (int& y) : x(y) {}
void bar() { x = 23; }
int& x;
};
struct A
{
int a;
void foo(B& b)
{
a = 12;
b.bar();
}
};
Looks innocent enough, but then we say
A baz;
B b(baz.a);
baz.foo(b);
"Optimising" this would leave 12 in baz.a, not 23, and that is clearly wrong.
Short answer to "Can a member variable (attribute) reside in a register?": yes.
When iterating through a buffer and writing the temporary result to any sort of primitive, wherever it resides, keeping the temporary result in a register would be a good optimization. This is done frequently in compilers. However, it is implementation based, even influenced by passed flags, so to know the result, you should check the generated assembly.
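If you do not want to rely on the optimizer, one common pattern is to copy the member into a local variable, work on the local, and store it back once. The sketch below assumes a hypothetical Hasher type with a toy hash step; it is not the code from the question, just an illustration of the load-once/store-once idea.

```cpp
#include <cassert>  // assert
#include <cstddef>  // std::size_t

// Hypothetical chunked hasher: 'state' survives across calls to update().
struct Hasher {
    unsigned state = 0;

    void update(const int* buffer, std::size_t size) {
        assert(buffer != nullptr || size == 0);
        unsigned local = state;  // load the member once
        for (std::size_t i = 0; i < size; ++i) {
            // 'local' has no aliases, so it can live in a register
            local = local * 31u + static_cast<unsigned>(buffer[i]);
        }
        state = local;           // store the result back once
    }
};
```

Because nothing else can refer to local, the compiler does not have to prove anything about aliasing of the member inside the loop. Feeding the data in one chunk or in several chunks gives the same final state.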
Consider the following example,
Aclass.h
class Aclass
{
public:
Aclass(int x, double y);
private:
int something;
double nothing;
};
Aclass.cpp
#include "Aclass.h"
Aclass::Aclass (int x, double y) {
something = x;
nothing = y;
}
//Write some functions to manipulate x and y.
So now, what is the difference if I skip initializing y in the constructor? What is the downside and how does it affect the remainder of the code? Is this a good way to code? What I know is that a constructor will create an object anyway whether x and y are initialized or even if both are not (default constructor) and constructors are used to create versatile objects.
If there is no reason to initialize a variable, you don't need this variable
=> Delete it entirely. Seriously, what is an uninitialized variable good for? Nothing (except being initialized).
If you plan to initialize it later before it is used:
Can you guarantee that it will get a value before it is first read, regardless of how often and in what order the class methods are called? Then it's not "wrong", but instead of tediously checking that (and risking bugs because it's complicated), it's far easier to give it a value in the constructor.
No, making it more complicated on purpose is not a good way to code.
Leaving a variable uninitialized leaves it holding an indeterminate (garbage) value.
Reading that value is Undefined Behaviour. And leaving it uninitialized has no pros.
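One way to make sure no member is ever left uninitialized is to give each one a default member initializer (C++11 and later). The sketch below reuses the Aclass name from the question; the accessors and the default values of 0 are assumptions made for illustration.

```cpp
#include <cassert>  // assert, used in the usage note below

class Aclass {
public:
    Aclass() = default;
    // Only 'something' is set here; 'nothing' keeps its default below.
    explicit Aclass(int x) : something(x) {}

    int    getSomething() const { return something; }
    double getNothing()   const { return nothing; }

private:
    int    something = 0;    // default member initializers: every
    double nothing   = 0.0;  // constructor starts from these values
};
```

With this, Aclass a(5) has a.getSomething() == 5 and a.getNothing() == 0.0, so a constructor that "skips" a member still leaves the object in a defined state.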
I've been programming C++ for a long time so I feel silly for not knowing this but...
I frequently write performance-sensitive code, and when I do I try to avoid heap allocations as much as possible. To that end I often re-use pre-allocated arrays of small objects instead of calling new and delete for each individual object.
In such cases I usually do this:
class MyClass
{
private:
int x, y;
public:
inline void Set(int _x, int _y) { x = _x; y = _y; }
};
...
MyClass &objectToReuse = someArray[someIndex];
objectToReuse.Set(someXValue, someYValue);
However I suspect this better-looking version would generate the same code:
class MyClass
{
private:
int x, y;
public:
inline MyClass(int _x, int _y) : x(_x), y(_y) {}
};
...
MyClass &objectToReuse = someArray[someIndex];
objectToReuse = MyClass(someXValue, someYValue);
Would a modern C++ compiler "get" this, or would it construct a temporary object and then copy it?
Yes, a good compiler will eliminate the extra overhead in this case.
I say "in this case" because it very much depends on exactly what happens in the constructor (and the assignment operator; where it says "constructor/construction" below, read that as including "or assignment operator"). If the constructor affects (or "might affect") global state, then the compiler can't remove the construction. Affecting global state means things like reading or writing files or updating a global variable; almost any call to a function that the compiler doesn't "know" (doesn't have the source code for) will cause the constructor/copy elimination to "fail".
Naturally, if the constructor/copy is not eliminated, the code using a setter may well be more efficient. The exact difference in a real scenario can only be determined by benchmarking, as it's often hard to judge what effect one or many lines of code have when compiled with optimisation: something that looks really simple can sometimes have quite an impact, whereas something that looks complex can (although less often) end up taking hardly any time at all.
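For reference, here is a compilable sketch of the two styles being compared. The default constructor, the accessors and the two helper functions are additions made for illustration. Both constructor bodies are visible to the compiler and have no side effects, which is the situation in which the temporary can typically be optimised away; whether it actually is should be checked in the generated assembly or with a benchmark.

```cpp
#include <cassert>  // assert, used in the usage note below

class MyClass {
public:
    MyClass() : x(0), y(0) {}                  // needed for arrays of MyClass
    MyClass(int x_, int y_) : x(x_), y(y_) {}
    void Set(int x_, int y_) { x = x_; y = y_; }
    int X() const { return x; }
    int Y() const { return y; }

private:
    int x, y;
};

// Style 1: overwrite the existing object through a setter.
inline void reuse_with_setter(MyClass& slot, int x, int y) {
    slot.Set(x, y);
}

// Style 2: construct a temporary and assign it over the existing object.
inline void reuse_with_assignment(MyClass& slot, int x, int y) {
    slot = MyClass(x, y);
}
```

Both helpers leave the slot in the same state, for example assert(slot.X() == 3) after either call with (3, 4), so the choice between them is about readability unless measurement shows a difference.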
I've noticed on a number of occasions in the past, C and C++ code that uses the following format for these structures:
class Vector3
{
float components[3];
//etc.
};
class Matrix4x4
{
float components[16];
//etc.
};
class Quaternion
{
float components[4];
//etc.
};
My question is, will this lead to any better cache performance than say, this:
class Quaternion
{
float x;
float y;
float z;
//etc.
};
...since I'd assume the class members and functions are in contiguous memory anyway? I currently use the latter form because I find it more convenient (however, I can also see the practical sense in the array form, since it allows one to treat axes as arbitrary, depending on the operation being performed).
After taking some advice from the respondents, I tested the difference and it is actually slower with the array: I get about a 3% difference in framerate. I implemented operator[] to wrap the array access inside the Vector3. Not sure if this has anything to do with it, but I doubt it, since that should be inlined anyway. The only factor I could see was that I could no longer use a constructor initializer list on Vector3(x, y, z). However, when I took the original version and changed it to no longer use constructor initializer lists, it ran very marginally slower than before (less than 0.05%). No clue, but at least now I know the original approach was faster.
These declarations are not equivalent with respect to memory layout.
class Quaternion
{
float components[4];
//etc.
};
The above guarantees that the elements are contiguous in memory, whereas, if they are individual members as in your last example, the compiler is allowed to insert padding between them (for instance, to align the members to certain address boundaries).
Whether this results in better or worse performance depends mostly on your compiler, so you'd have to profile it.
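A quick way to see what your own compiler does is to inspect the layout directly. The sketch below uses two hypothetical structs, QuatArray and QuatMembers (the member w is added for illustration). The array guarantee is checked at compile time; the member layout is only reported, since it is implementation-specific.

```cpp
#include <cassert>  // assert, used in the usage note below
#include <cstddef>  // offsetof

struct QuatArray   { float components[4]; };
struct QuatMembers { float x, y, z, w; };

// Array elements are always contiguous, with no padding between them.
static_assert(sizeof(float[4]) == 4 * sizeof(float),
              "array elements are contiguous");

// Reports whether this implementation happens to lay out the individual
// members back to back. Typical compilers do, but it is not guaranteed.
inline bool members_are_packed() {
    return offsetof(QuatMembers, y) == 1 * sizeof(float)
        && offsetof(QuatMembers, z) == 2 * sizeof(float)
        && offsetof(QuatMembers, w) == 3 * sizeof(float);
}
```

If members_are_packed() returns true on your platform, the two forms have the same memory layout there and any measured speed difference has to come from somewhere else.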
I imagine the performance difference from an optimization like this is minimal. I would say something like this falls into premature optimization for most code. However, if you plan to do vector processing over your structs, say by using CUDA, struct composition makes an important difference. Look at page 23 on this if interested: http://www.eecis.udel.edu/~mpellegr/eleg662-09s/li.pdf
I am not sure whether the compiler manages to optimize code better when using an array in this context (think of unions, for example), but when using APIs like OpenGL, it can be an optimisation to call functions like
void glVertex3fv(const GLfloat* v);
instead of calling
void glVertex3f(GLfloat x, GLfloat y, GLfloat z);
because, in the latter case, each parameter is passed by value, whereas in the first case only a pointer to the whole array is passed and the function can decide what to copy and when, thereby reducing unnecessary copy operations.
I wrote a small coordinate class to handle both int and float coordinates.
template <class T>
class vector2
{
public:
vector2() { memset(this, 0, sizeof(this)); }
T x;
T y;
};
Then in main() I do:
vector2<int> v;
But according to my MSVC debugger, only the x value is set to 0; the y value is untouched. I've never used sizeof() in a template class before. Could that be what's causing the trouble?
No, don't use memset. Here it zeroes out only sizeof(this) bytes, i.e. the size of a pointer (4 bytes on my x86 Intel machine), starting at the location pointed to by this. It is also a bad habit: with a more complex class, memset will zero out virtual table pointers and pointers to virtual bases as well. Instead do:
template <class T>
class vector2
{
public:
// use initializer lists
vector2() : x(0), y(0) {}
T x;
T y;
};
As others are saying, memset() is not the right way to do this.
There are some subtleties, however, about why not.
First, your attempt to use memset() is only clearing sizeof(void *) bytes. For your sample case, that apparently is coincidentally the bytes occupied by the x member.
The simple fix would be to write memset(this, 0, sizeof(*this)), which in this case would set both x and y.
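The difference between the two expressions can be checked directly. This is a minimal sketch with a stripped-down vector2 (no constructor) and two hypothetical helper members added only to expose the values of the sizeof expressions.

```cpp
#include <cassert>  // assert, used in the usage note below
#include <cstddef>  // std::size_t

template <class T>
struct vector2 {
    T x;
    T y;

    // sizeof(this): size of a pointer to the object
    std::size_t pointer_size() const { return sizeof(this); }

    // sizeof(*this): size of the object itself
    std::size_t object_size() const { return sizeof(*this); }
};
```

For vector2<int> v, v.object_size() equals sizeof(vector2<int>) and covers both members, while v.pointer_size() is just the size of a pointer. On a 32-bit build that happens to match sizeof(int), which is why only x appeared to be cleared.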
However, if your vector2 class has any virtual methods and the usual mechanism is used to represent them by your compiler, then that memset will destroy the vtable and break the instance by setting the vtable pointer to NULL. Which is bad.
Another problem is that if the type T requires some constructor action more complex than just setting its bits to 0, then the work done by the members' constructors is ruined by overwriting the content of the members with memset().
The only correct action is to write your default constructor as
vector2(): x(0), y(0) {}
and to just forget about trying to use memset() for this at all.
Edit: D.Shawley pointed out in a comment that the default constructors for x and y were actually called before the memset() in the original code as presented. While technically true, calling memset() overwrites the members, which is at best really, really bad form, and at worst invokes the demons of Undefined Behavior.
As written, the vector2 class is POD, as long as the type T is also plain old data as would be the case if T were int or float.
However, all it would take is for T to be some sort of bignum value class to cause problems that could be really hard to diagnose. If you were lucky, they would manifest early through access violations from dereferencing the NULL pointers created by memset(). But Lady Luck is a fickle mistress, and the more likely outcome is that some memory is leaked, and the application gets "shaky". Or more likely, "shakier".
The OP asked in a comment on another answer "...Isn't there a way to make memset work?"
The answer there is simply, "No."
Having chosen the C++ language, and chosen to take full advantage of templates, you have to pay for those advantages by using the language correctly. It simply isn't correct to bypass the constructor (in the general case). While there are circumstances under which it is legal, safe, and sensible to call memset() in a C++ program, this just isn't one of them.
The problem is that this is a pointer type, which is 4 bytes (on 32-bit systems), and ints are also 4 bytes (on 32-bit systems), so only x gets cleared. Try:
sizeof(*this)
Edit: Though I agree with others that initializer lists in the constructor are probably the correct solution here.
Don't use memset. It'll break horribly on non-POD types (and won't necessarily be easy to debug), and in this case, it's likely to be much slower than simply initializing both members to zero (two assignments versus a function call).
Moreover, you do not usually want to zero out all members of a class. You want to zero out the ones for which zero is a meaningful default value. And you should get into the habit of initializing your members to a meaningful value in any case. Blanket zeroing everything and pretending the problem doesn't exist just guarantees a lot of headaches later. If you add a member to a class, decide whether that member should be initialized, and how.
If and when you do want memset-like functionality, at least use std::fill, which is compatible with non-POD types.
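As a rough illustration of the std::fill suggestion, the sketch below resets an array of std::string, a non-POD type on which memset would be invalid. The function name and the sample values are made up for the example.

```cpp
#include <algorithm>  // std::fill
#include <array>
#include <cassert>    // assert, used in the usage note below
#include <string>

// Resets every element by assignment, so each std::string stays a valid
// object. memset over the same array would corrupt the strings' internals.
inline std::array<std::string, 3> make_reset_names() {
    std::array<std::string, 3> names = {{"alpha", "beta", "gamma"}};
    std::fill(names.begin(), names.end(), std::string());
    return names;
}
```

After the call every element is an empty string, e.g. assert(make_reset_names()[0].empty()).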
If you're programming in C++, use the tools C++ makes available. Otherwise, call it C.
dirkgently is correct. However, rather than constructing x and y with 0, an explicit call to the default constructor (value-initialization) will set intrinsic types to 0 and allows the template to be used for structs and classes with a default constructor.
template <class T>
class vector2
{
public:
// use initializer lists
vector2() : x(), y() {}
T x;
T y;
};
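A short sketch of how this behaves for different T, repeating the class so that the snippet is self-contained. The choice of int, double and std::string as test types is only for illustration.

```cpp
#include <cassert>  // assert, used in the usage note below
#include <string>

template <class T>
class vector2 {
public:
    // x() and y() value-initialize the members: arithmetic types become
    // zero, class types run their default constructor.
    vector2() : x(), y() {}
    T x;
    T y;
};
```

With this, vector2<int> and vector2<double> start out with both members equal to zero, and vector2<std::string> starts out with two empty strings, e.g. assert(vector2<std::string>().x.empty()).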
Don't try to be smarter than the compiler. Use the initializer lists as intended by the language. The compiler knows how to efficiently initialize basic types.
If you tried your memset hack on a class with virtual functions, you would most likely overwrite the vtable pointer, ending in disaster. Don't use hacks like that; they are a maintenance nightmare.
This might work instead:
alignas(vector2<int>) char buffer[sizeof(vector2<int>)];
memset(buffer, 0, sizeof(buffer));
vector2<int> *v2 = new (buffer) vector2<int>();
...or overloading operator new for vector2 to do something like that.
Still seems weird to me though.
Definitely go with
vector2(): x(0), y(0) {}