In a small game I'm writing, I have a class Weapon with two constructors, one which takes in some parameters to produce a custom weapon, and one that grabs a default one (the CHAIN_GUN):
Weapon::Weapon (void) {
// Standard weapon
*this = getWeapon(CHAIN_GUN);
return;
}
Question: Are there any negative consequences from using *this and operator= to initialise a class?
Imagine that someone asked you to draw a painting... would you;
first draw your default (1st) (that familiar smilie face you like so much),
then draw what that someone asked for (2nd),
only to draw the same thing one more time, but on the canvas containing your default,
and then burn the 2nd painting?
This post will try to explain why this simile is relevant.
WHY IS IT A BAD IDEA?
I've never seen a default-constructor implemented with the use of the assignment-operator, and honestly it's not something that I'd recommend, nor support during a code review.
The major problem with such code is that we are, by definition, constructing two objects (instead of one) and calling a member-function, meaning that we construct all our members two times, and later having to copy/move initialize all members by calling the assignment operator.
It's unintuitive that upon requesting construction of 1 object, we construct 2 objects, only to later copy the values from the 2nd to the 1st and discard the 2nd.
Conclusion: Don't do it.
( Note: In a case where Weapon has base-classes it will be even worse )
( Note: Another potential danger is that the factory function accidentially uses the default-constructor, resulting in an infinite recursion not caught during compilation, as noted by #Ratchet Freat )
PROPOSED SOLUTION
In your specific case you are far better off using a default-argument in your constructor, as in the below example.
class Weapon {
public:
Weapon(WeaponType w_type = CHAIN_GUN);
...
}
Weapon w1; // w_type = CHAIN_GUN
Weapon w2 (KNOWLEDGE); // the most powerful weapon
( Note: An alternative to the above would be to use a delegating constructor, available in C++11 )
Using the assignment operator to implement a constructor is rarely a good idea. In your case, for example, you could just use a default parameter:
Weapon::Weapon(GunType g = CHAIN_GUN)
: // Initialize members based on g
{
}
In other cases, you might use a delegating constructor (with C++11 or later):
Weapon::Weapon(GunType g)
: // Initialize members based on g
{
}
Weapon::Weapon()
: Weapon(CHAIN_GUN) // Delegate to other constructor
{
}
One thing to keep in mind is that if operator= - or any function it calls - is virtual, the derived-class version won't be invoked. This could result in uninitialised fields and later Undefined Behaviour, but it all depends on your data members.
More generally, your bases and data members are guaranteed to have been initialised if they have constructors or appear in the initialiser list (or with C++11 are assigned to in the class declaration) - so apart from the virtual issue above, operator= will often work without Undefined Behaviour.
If a base or member has been initialised before operator=() is invoked, then the initial value is overwritten before it's used anyway, the optimiser may or may not be able to remove the first initialisation. For example:
std::string s_;
Q* p_;
int i_;
X(const X& rhs)
: p_(nullptr) // have to initialise as operator= will delete
{
// s_ is default initialised then assigned - optimiser may help
// i_ not initialised but if operator= sets without reading, all's good
*this = rhs;
}
As you can see, it's a bit error prone, and even if you get it right someone coming along later to update operator= might not check for a constructor (ab)using it....
You could end up with infinite recursion leading to stack overflow if getWeapon() uses the Prototype or Flyweight Patterns and tries to copy-construct the Weapon it returns.
Taking a step back, there's the question of why getWeapon(CHAIN_GUN); exists in that form. If we need a function that creates a weapon based on a weapon type, on the face of it a Weapon(Weapon_Type); constructor seems a reasonable option. That said, there are rare but plentiful edge cases where getWeapon might return something other than a Weapon object that can never-the-less be assigned to a Weapon, or might be kept separate for build/deployment reasons....
If you have defined a non-copy = assignment operator that lets the Weapon change its type after construction, then implementing the constructor in terms of assignment works just fine, and is a good way to centralize your initialization code. But if a Weapon is not meant to change type after construction, then a non-copy = assignment operator does not make much sense to have, let alone use for initialization.
I'm sure yes.
You already created object inside your "getWeapon" function, and then you copying it, that may be long operation. So, at least you have to try move semantic.
But.
If inside "getWeapon" you call to constructor (and you do, somehow "getWeapon" have to create your class to return it to your copy operation), you create very unclear architecture, when one constructor calls function that calls another constructor.
I believe you have to separate your parameters initialization to private functions, that have to be called from your constructors the way you want to.
Related
I'm writing a piece of code that inherits from a user provided class (not for polymorphism reasons).
I need to write an explicit move constructor/assignment for it because I store pointers to some internal data of my class.
I don't know how to call move constructor well.
Example:
template <typename UserType>
class my_class : UserType {
other_data data;
public:
my_class(my_class&& x) UserType(/*...*/), data(std::move(x.data)) {}
};
Options:
I can call move(x) => but then it's use after move.
I can do some static_cast<UserType&&>(x) - but that's quite unusual.
What's a good solution here?
As you most likely already know, std::move is equal to your static_cast. Hence, technically, there is no difference.
As the underlying move constructor only knows your base class, it will be safe to access your own members. I guess the best solution here is to simply use = default; and ensure that all special behavior gets pushed into a separate class that doesn't have to deal with special behavior you need.
Call move(x) is the most canonical way: the base move constructor would never mess up with the derive class members without invoking undefined behavior.
Note that "use after move" is usually legit, the c++ standard demands that "Unless otherwise specified, all standard library objects that have been moved from are placed in a valid but unspecified state." This means their member functions are still callable if they require no precondition. Consider this example:
vector<int> a = { 1, 2, 3 };
vector<int> b(std::move(a));
//std::cout << a.back(); <- Invalid call. We don't know whether a is empty.
a.clear(); // Ok. clear() doesn't require preconditions.
a.push_back(0); //Ok. Now a is { 0 }
Also, the validity of an object is usually required for a destructor call (which cannot be avoided if the object has automatic storage duration).
why does creation of an object needs a constructor?
Even if I don't define a constructor a default constructor is generated ..but why a constructor necessary?
why can't objects be created without a constructor?
This is more of a discussion on terminology than a real argument about the behavior. As you mention, in some cases there is nothing to be done during construction, so there should be no need for a constructor, right? Well, there is the concept of a trivial-constructor which is a constructor that does not do anything at all. For the sake of the standard document, it is easier to treat everything as having a (possibly trivial) constructor than to have to provide all the cases and exceptions in all places where it currently just states 'constructor'.
Consider that every use of 'constructor' would have to be replaced by 'constructor or nothing if the type does not have any virtual functions or bases and no members that require the generation of a constructor'.
This is the same reason why all virtual functions are called overrider even the first one in the hierarchy that by definition does not override anything. This form of generalization makes the language slightly easier to interpret, although not too many people will claim that section 8.5 on initialization is simple...
Also note that, while all user defined types by definition have a constructor, the constructor is not required for the creation of objects. In particular for objects with trivial-constructor the lifetime starts when the memory for the object is allocated (the standard, knowing that trivial means nothing to be done does not even go through the hop of requiring that the constructor is run in that case.
3.8 [basic.life]/1
The lifetime of an object is a runtime property of the object. An object is said to have non-trivial initialization if it is of a class or aggregate type and it or one of its members is initialized by a constructor other than a trivial default constructor. [ Note: initialization by a trivial copy/move constructor is non-trivial initialization. — end note ] The lifetime of an object of type T begins when:
-- storage with the proper alignment and size for type T is obtained, and
-- if the object has non-trivial initialization, its initialization is complete.
That second bullet used to read (C++03): if T is a class type with a non-trivial constructor (12.1), the constructor call has completed. Which more clearly stated that the constructor need not be executed. But the new wording expresses the intent in basically the same way. Only if the object has non-trivial initialization, the initialization needs to complete. For objects with trivial-constructor (trivial initialization) allocating the storage creates the object. Where does it matter?
struct T { int x; }; // implicitly defined trivial-constructor
T *p = static_cast<T*>(malloc(sizeof *p));
// *p is alive at this point, no need to do
T *q;
{ void *tmp = malloc(sizeof T);
q = new (tmp) T; // call constructor
}
In a way, this is a slightly philosophical question. You can think of a constructor as a subroutine that turns some uninitialized memory into an object. Or you can think of it as a language feature that makes initialization easier to follow, write, and understand. You could even answer the question circularly: why does creation of an object needs a constructor? Because that's what creation of an object is, in a sense. If you don't have a constructor, what you're creating isn't an object.
It may be that a particular constructor does nothing, but that's an implementation detail of that class. The fact that every class has a constructor means that the class encapsulates what initialization is necessary: to use the class safely, you don't need to know whether the constructor does anything. In fact, in the presence of inheritance, vtables, debug-tracking, and other compiler features, you might not even know whether the constructor does anything. (C++ complicates this slightly by calling some classes POD types, but the encapsulation holds as long as you don't need to know that something is of POD type.)
The invocation of a constructor defines the point at which an object is created. In terms of language semantics, when the constructor finishes, the constructed object exists: before that, the object does not exist, and it is an error to use it as if it did. This is why construction order (that is, the order that member object and base class sub-object constructors are called) is so important in C++ and similar languages: if your constructor can throw an exception, it's necessary to destroy exactly those objects that have been constructed.
To end up with a working program, the programmer, anyone who tries to understand the source code, the compiler and linker, the runtime library, and any other compilation tools, all need to have a shared idea of what the program means: the semantics of the program. Agreeing on the lifetime of an object—when the compiler can run extra code to help create it, and when you can safely use it—is actually a big part of this agreement. Constructors and destructors are part of defining this lifetime. Even if they happen to be empty sometimes, they provide a way for us to agree when an object is valid, thus making it easier (possible) to specify and understand the language.
Say you have a simple class :
class Foo
{
int bar;
}
What is the value of bar exactly? Maybe you don't care when your object gets its memory allocated, but the machine running your program needs to give it some value. That's what the constructor is for : initializing class members to some value.
Constructor is necessary to call the constructors on class members, except built-in types see
Parametrized Constructors can take arguments. For example:
class example
{
int p, q;
public:
example();
example(int a, int b); //parameterized constructor
};
example :: example()
{
}
example :: example(int a, int b)
{
p = a;
q = b;
}
When an object is declared in a parameterized constructor, the initial values have to be passed as arguments to the constructor function. The normal way of object declaration may not work. The constructors can be called explicitly or implicitly. The method of calling the constructor implicitly is also called the shorthand method.
example e = example(0, 50); //explicit call
example e(0, 50); //implicit call
This is particularly useful to provide initial values to the object.
Also you will find important stuff on this page :
http://en.wikipedia.org/wiki/Constructor_%28object-oriented_programming%29
is that correct to write a constructor like this?
class A
{
A(const A& a)
{
....
}
};
if yes, then is it correct to invoke it like this:
A* other;
...
A* instance = new A(*(other));
if not, what do you suggest?
Thanks
Almost correct. When declaring the constructor in the class, simply write A, not A::A; you would use A::A when giving a definition for the constructor outside of the class declaration. Otherwise, yes.
Also, as James points out, unless you are copying from an object that you are accessing via a pointer, you don't need to do any dereferencing (if it is a value or a reference). One typically does not use pointers unless it is necessary to do so. On that principle, you would have something like:
A x; // Invoke default constructor
// ...
// do some thing that modify x's state
// ...
A cpy(x); // Invokes copy constructor
// cpy now is a copy of x.
Note, though, that the first statement A x invokes the default constructor. C++ will provide a default implementation of that constructor, but it might not be what you want and even if it is what you want, it is better style, IMHO, to give one explicitly to let other programmers know that you've thought about it.
Edit
C++ will automatically provide an implementation of the default constructor, but only if you don't provide any user-defined constructors -- once you provide a constructor of your own, the compiler won't automatically generate the default constructor. Truth be told, I forgot about this as I've been in the habit of giving all constructor definitions myself, even when they aren't strictly necessary. In C++0x, it will be possible to use = default, which will provide the simplicity of using the compiler-generated constructor while at the same time making the intention to use it clear to other developers.
No, you are writing a copy constructor. It will not act as you expect (according to the examples provided).
class A{
A(arguments){ ...}
}
I would suggest reading a C++ book, differences between a constructor and a copy constructor are well explained.
When is it used ? When an instance is copied. For example, by returning /passing an instance by value
void foo (A a){
//copy constructor will be called to create a
}
A bar(){
return a; //Copy constructor will be called
}
I searched how to implement + operator properly all over the internet and all the results i found do the following steps :
const MyClass MyClass::operator+(const MyClass &other) const
{
MyClass result = *this; // Make a copy of myself. Same as MyClass result(*this);
result += other; // Use += to add other to the copy.
return result; // All done!
}
I have few questions about this "process" :
Isn't that stupid to implement + operator this way, it calls the assignment operator(which copies the class) in the first line and then the copy constructor in the return (which also copies the class , due to the fact that the return is by value, so it destroys the first copy and creates a new one.. which is frankly not really smart ... )
When i write a=b+c, the b+c part creates a new copy of the class, then the 'a=' part copies the copy to himself.
who deletes the copy that b+c created ?
Is there a better way to implement + operator without coping the class twice, and also without any memory issues ?
thanks in advance
That's effectively not an assignment operator, but a copy constructor. An operation like addition creates a new value, after all, so it has to be created somewhere. This is more efficient than it seems, since the compiler is free to do Return Value Optimization, which means it can construct the value directly where it will next be used.
The result is declared as a local variable, and hence goes away with the function call - except if RVO (see above) is used, in which case it was never actually created in the function, but in the caller.
Not really; this method is much more efficient than it looks at first.
Under the circumstances, I'd probably consider something like:
MyClass MyClass::operator+(MyClass other) {
other += *this;
return other;
}
Dave Abrahams wrote an article a while back explaining how this works and why this kind of code is usually quite efficient even though it initially seems like it shouldn't be.
Edit (thank you MSalters): Yes, this does assume/depend upon the commutative property holding for MyClass. If a+b != b+a, then the original code is what you want (most of the same reasoning applies).
it calls the assignment operator(which copies the class) in the first line
No, this is copy-initialization (through constructor).
then the copy constructor in the return (which also copies the class
Compilers can (and typically do) elide this copy using NRVO.
When i write a=b+c, the b+c part creates a new copy of the class, then the 'a=' part copies the copy to himself. who deletes the copy that b+c created
The compiler, as any other temporary value. They are deleted at the end of full-expression (in this case, it means at or after ; at the end of line.
Is there a better way to implement + operator without coping the class twice, and also without any memory issues ?
Not really. It's not that inefficient.
This appears to be the correct way to implement operator+. A few points:
MyClass result = *this does not use the assignment operator, it should be calling the copy constructor, as if it were written MyClass result(*this).
The returned value when used in a = b + c is called a temporary, and the compiler is responsible for deleting it (which will probably happen at the end of the statement ie. the semicolon, after everything else has been done). You don't have to worry about that, the compiler will always clean up temporaries.
There's no better way, you need the copy. The compiler, however, is allowed to optimise away the temporary copies, so not as many as you think may be made. In C++0x though, you can use move constructors to improve performance by transfering ownership of the content of a temporary rather than copying it in its entirity.
I'll try my best to answer:
Point (1): No, it does not call the assignment operator. Instead it calls a constructor. Since you need to construct the object anyway (since operator+ returns a copy), this does not introduce extra operations.
Point (2): The temporary result is created in stack and hence does not introduce memory problem (it is destroyed when function exits). On return, a temporary is created so that an assignment (or copy constructor) can be used to assign the results to a (in a=b+c;) even after result is destroyed. This temporary is destroyed automatically by the compiler.
Point (3): The above is what the standard prescribes. Remember that compiler implementors are allowed to optimize the implementation as long as the effect is the same as what the standard prescribed. I believe, compilers in reality optimize away many of the copying that occurs here. Using the idiom above is readable and is not actually inefficient.
P.S. I sometime prefer to implement operator+ as a non-member to leverage implicit conversion for both sides of the operators (only if it makes sense).
There are no memory issues (provided that the assignment operator, and copy constructor are well written). Simply because all the memory for these objects is taken on the stack and managed by the compiler. Furthermore, compilers do optimize this and perform all the operations directly on the final a instead of copying twice.
This is the proper way of implementing the operator+ in C++. Most of the copies you are so afraid of will get elided by the compiler and will be subject to move semantics in C++0x.
The class is a temporary and will be deleted. If you bind the temporary to a const& the life time of the temporary will be extended to the life time of the const reference.
May implementing it as a freefunction is a little more obvious. The first parameter in MyClass::operator+ is an implicit this and the compiler will rewrite the function to operator+(const MyClass&, const MyClass&) anyway.
As far as I remember, Stroustrup's 'The C++ Programming Language' recommends to implement operators as member functions only when internal representation is affected by operation and as external functions when not. operator+ does not need to access internal representation if implemented based on operator+=, which does.
So you would have:
class MyClass
{
public:
MyClass& operator+=(const MyClass &other)
{
// Implementation
return *this;
}
};
MyClass operator+(const MyClass &op1, const MyClass &op2)
{
MyClass r = op1;
return r += op2;
}
class Foo {
public:
Foo() { Foo(1)}
Foo(int x, int y = 0):i(x) {}
private:
int i;
}
Can anybody give me some reasonas about can I do this? If not why?
Because the language specification doesn't allow it. Just the way the language is. Very annoying if you're used to Java or other languages that allow it. However, you get used to it after a while. All languages have their quirks, this is just one of C++'s. I'm sure the writers of the specs have their reasons.
Best way around this I've found is to make a common initialization function and have both constructors call that.
Something like this:
class Foo {
public:
Foo() {initialize(1);}
Foo(int nX) { initialize(nx); }
private:
void initialize(int nx) { x=nx; }
int x;
};
It's a language design choice.
A constructor is a one time (per-object) operation that creates a new object in uninitialized memory. Only one constructor can be called for an object, once it has completed the object's lifetime begins and no other constructor can be called or resumed on that object.
At the other end of its life a destructor can only (validly) be called once per object and as soon as the destructor is entered the object's lifetime is over.
A prinicipal reason for this is to make explicit when an object destructor will be run and what state it can expect the object to be in.
If a class constructor completes successfully then it's destructor will be called, otherwise the object's lifetime has never begun and the destructor will not be called. This guarantee can be important when an object acquires resources in its constructor that need to be released in its destructor. If the resource acquisition fails then the constructor will usually be made to fail; if the destructor ran anyway it might attempt to release an resource that had never been successfully acquired.
If you allow constructors to call each other it may not be clear if a calling or a called constructor is responsible for the resource. For example, if the calling constructor fails after the called constructor returns, should the destructor run? The called constructor may have acquired something that needs releasing or perhaps that was what caused the calling construtor to fail and the destructor shouldn't be called because the resource handle was never valid.
For simplicity of the destruction rules it is simpler if each object is created by a single constructor and - if created successfully - destroyed by a single destructor.
Note that in C++11 a constructor will be able delegate to a different constructor, but there are limitations that don't really relax the principal of one construction per object. (The prinicipal constructor can forward to a target constructor, but if it does it must not name anything else (base classes or members) in its initializer list. These will be initialized by the target constructor, once the target constructor returns the body of the prinicipal constructor will complete (further initialization). It is not possible to re-construct any bases or members, although it allows you to share constructor code between constuctors.)
You cant do this. See section 10.3: http://www.parashift.com/c++-faq-lite/ctors.html#faq-10.3. You can try to do it but doing so will construct a new temp object (not this) and be destroyed once control moves on.
What you can do however is to create a private function that initializes variables, one that your default constructor or a parameterized constructor can both call.
There is a really hideous hack I have seen used to call another ctor. It uses the placement new operation on the this pointer. erk
Like this:
Class::Class(int x) : x_(x) {}
Class::Class() {
new (this) Class(0);
}
Class::Class(const Class &other) : x_(other.x_) {}
Class& operator=(const Class &other) {
new (this) Class(other);
return *this;
}
Note that I am not recommending this, and I don't know what horrible effects it might have on C++ structures like virtual base classes, etc. But I expect that there are some.
Although as per standards vendors are free to implement data binding in their own ways, if we consider the most popular implementation: this pointer, we can see a reason why this can't be implemented.
Assume you have a class:
class A
{
public:
A(){}
A(int a){}
} ;
With the this pointer implementation, this class would look something like:
class A
{
public:
A(A *this){}
A(A *this,int a){}
} ;
Typically you would create objects like this:
A ob ;
Now when compiler sees this, it allocates memory for this and passes the address of this allocated memory block to the constructor of A which then constructs the object. It would try to do the same every time for each constructor called.
Now when you try calling a constructor within another constructor, instead of allocating new memory the compiler should pass the current objects this. Hence inconsistency!
Then another reason which i see is that even though you might want to call a constructor within another, u would still want a constructor to call default constructors for all the objects within the class. Now if one constructor were to call another, the default construction should happen for the first constructor and not for the subsequent one's. Implementing this behavior means that there would be several permutations which need to be handled. If not, then degraded performance as each constructor would default construct all the objects enclosed.
This is what i can think of as possible reasons for this behavior and do not have any standards to support.