Initialisation method versus constructor plus assignment -- any performance difference? (C++)

I've been programming C++ for a long time so I feel silly for not knowing this but...
I frequently write performance-sensitive code, and when I do I try to avoid heap allocations as much as possible. To that end I often re-use pre-allocated arrays of small objects instead of calling new and delete for each individual object.
In such cases I usually do this:
class MyClass
{
private:
int x, y;
public:
inline void Set(int _x, int _y) { x = _x; y = _y; }
};
...
MyClass &objectToReuse = someArray[someIndex];
objectToReuse.Set(someXValue, someYValue);
However I suspect this better-looking version would generate the same code:
class MyClass
{
private:
int x, y;
public:
inline MyClass(int _x, int _y) : x(_x), y(_y) {}
};
...
MyClass &objectToReuse = someArray[someIndex];
objectToReuse = MyClass(someXValue, someYValue);
Would a modern C++ compiler "get" this, or would it construct a temporary object and then copy it?

Yes, a good compiler will eliminate the extra overhead in this case.
I say "in this case" because it does very much depend on exactly what happens in the constructor (and the assignment operator - where it says "constructor/construction below, read as "or assignment operator"). If the constructor affects (or "might affect") global state, then the compiler can't remove the construction. Affecting global state would be reading or writing files, updating a global variable, almost any call to a function that the compiler doesn't "know" (doesn't have the source code for) will cause the constructor/copy elimination to "fail".
Naturally, if the constructor/copy is not eliminated, the code using a setter may well be more efficient. The exact cost, in a real scenario, can only be determined by benchmarking, as it's often hard to judge what effect one or many lines of code actually have when compiled with optimisation - something that looks really simple can sometimes have quite an impact, while something that looks complex can (although less often) end up not taking much time at all.
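For illustration, here is a rough sketch of the kind of constructor that defeats the elimination, versus a setter that doesn't. Tracked and g_log are hypothetical names; g_log is assumed to live in another translation unit, so the compiler can't prove the call has no observable effect:

void g_log(const char* msg); // defined elsewhere, opaque to the optimiser

class Tracked
{
private:
    int x, y;
public:
    Tracked(int _x, int _y) : x(_x), y(_y) { g_log("Tracked ctor"); } // touches global state
    void Set(int _x, int _y) { x = _x; y = _y; }                      // no side effects
};

void Reuse(Tracked& objectToReuse)
{
    objectToReuse = Tracked(1, 2); // temporary + copy cannot be fully eliminated
    objectToReuse.Set(3, 4);       // typically compiles to two plain stores
}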

Related

Using memmove to initialize entire object in constructor in C++

Is it safe to use memmove/memcpy to initialize an object with constructor parameters?
No-one seems to use this method, but it worked fine when I tried it.
Do the parameters being passed on the stack cause problems?
Say I have a class foo as follows,
class foo
{
int x,y;
float z;
foo();
foo(int,int,float);
};
Can I initialize the variables using memmove as follows?
foo::foo(int x,int y,float z)
{
memmove(this,&x, sizeof(foo));
}
This is undefined behavior.
The shown code does not attempt to initialize class variables. It attempts to memmove() onto the class pointer, and assumes that the size of the class is 2*sizeof(int)+sizeof(float). The C++ standard does not guarantee that.
Furthermore, the shown code also assumes the layout of the parameters that are passed to the constructor will be the same layout as the layout of the members of this POD. That, again, is not specified by the C++ standard.
It is safe to use memmove to initialize individual class members. For example, the following is safe:
foo::foo(int x_,int y_,float z_)
{
memmove(&x, &x_, sizeof(x));
memmove(&y, &y_, sizeof(y));
memmove(&z, &z_, sizeof(z));
}
Of course, this does nothing useful, but this would be safe.
No, it is not safe: the standard does not guarantee that the members sit immediately after one another in memory, because of alignment/padding.
After your update, it is even worse, because the location of the passed arguments and their order in memory are not something you can rely on either.
We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. - Donald Knuth
You should not try to optimize code unless you are sure you need to. I would suggest profiling your code before attempting this kind of optimization. That way you don't waste time improving the performance of code that will not affect the overall performance of your application.
Usually, compilers are smart enough to work out what you are trying to do with your code and generate highly efficient code that keeps the same functionality. For that to happen, make sure you enable compiler optimizations (the -O<level> flag, or toggling individual optimizations through compiler command-line arguments).
For example, I've seen that some compilers transform std::copy into a memcpy when the compiler is sure that doing so is straightforward (e.g. data is contiguous).
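As a rough sketch of the kind of call that commonly gets lowered that way (the function name copyInts is just for illustration):

#include <algorithm>
#include <vector>

// With optimisation enabled, copies of trivially copyable, contiguous data like
// this are commonly turned into a single memcpy/memmove by the compiler.
void copyInts(const std::vector<int>& src, std::vector<int>& dst)
{
    dst.resize(src.size());
    std::copy(src.begin(), src.end(), dst.begin());
}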
No it is not safe. It is undefined behavior.
And the code
foo::foo(int x,int y,float z)
{
memmove(this,&x, sizeof(foo));
}
is not even saving you any typing compared to using an initializer list
foo::foo(int x,int y,float z) : x(x), y(y), z(z)
{ }

Comparison between constant accessors of private members

The main portion of this question regards the proper and most computationally efficient method of creating a public read-only accessor for a private data member inside a class. Specifically, utilizing a const type & reference to access the variable, such as:
class MyClassReference
{
private:
int myPrivateInteger;
public:
const int & myIntegerAccessor;
// Assign myPrivateInteger to the constant accessor.
MyClassReference() : myIntegerAccessor(myPrivateInteger) {}
};
However, the current established method for solving this problem is to utilize a constant "getter" function as seen below:
class MyClassGetter
{
private:
int myPrivateInteger;
public:
int getMyInteger() const { return myPrivateInteger; }
};
The necessity (or lack thereof) of "getters/setters" has already been hashed out time and again in questions such as Conventions for accessor methods (getters and setters) in C++. That, however, is not the issue at hand.
Both of these methods offer the same functionality using the syntax:
MyClassGetter a;
MyClassReference b;
int SomeValue = 5;
int A_i = a.getMyInteger(); // Allowed.
a.getMyInteger() = SomeValue; // Not allowed.
int B_i = b.myIntegerAccessor; // Allowed.
b.myIntegerAccessor = SomeValue; // Not allowed.
After discovering this, and finding nothing on the internet concerning it, I asked several of my mentors and professors which is appropriate and what the relative advantages/disadvantages of each are. However, all responses I received fell nicely into two categories:
I have never even thought of that, but use a "getter" method as it is "Established Practice".
They function the same (They both run with the same efficiency), but use a "getter" method as it is "Established Practice".
While both of these answers were reasonable, because they both failed to explain the "why" I was left unsatisfied and decided to investigate the issue further. I conducted several tests, such as average character usage (roughly the same) and average typing time (again roughly the same), but one test showed an extreme discrepancy between the two methods. This was a run-time test of calling the accessor and assigning the result to an integer. Without any -OX flags (in debug mode), MyClassReference performed roughly 15% faster. However, once a -OX flag was added, both methods ran much faster and with the same efficiency as each other.
My question thus has two parts.
How do these two methods differ, and what causes one to be faster/slower than the other only with certain optimization flags?
Why is it that established practice is to use a constant "getter" function, while using a constant reference is rarely known, let alone utilized?
As comments pointed out, my benchmark testing was flawed, and irrelevant to the matter at hand. However, for context it can be located in the revision history.
The answer to question #2 is that sometimes you might want to change the class internals. If you make all your attributes public, they become part of the interface, so even if you come up with a better implementation that doesn't need them (say, one that can quickly recompute the value on the fly and shave the size of each instance, so programs that make 100 million of them now use 400-800 MB less memory), you can't remove them without breaking dependent code.
With optimization turned on, the getter function should be indistinguishable from direct member access when the code for the getter is just a direct member access anyway. But if you ever want to change how the value is derived, for example to remove the member variable and compute the value on the fly, you can change the getter implementation without changing the public interface (a recompile fixes up existing code using the API without any code changes on their end), because a function isn't limited in the way a variable is.
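For instance, a hypothetical sketch of such a change, where the getter's signature stays the same but the stored member disappears and the value is computed on demand (Rectangle and getArea are illustrative names):

class Rectangle
{
private:
    int width, height;  // a cached 'area' member was removed in this version
public:
    Rectangle(int w, int h) : width(w), height(h) {}
    int getArea() const { return width * height; }  // same interface, new derivation
};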
There are semantic/behavioral differences that are far more significant than your (broken) benchmarks.
Copy semantics are broken
A live example:
#include <iostream>
class Broken {
public:
Broken(int i): read_only(read_write), read_write(i) {}
int const& read_only;
void set(int i) { read_write = i; }
private:
int read_write;
};
int main() {
Broken original(5);
Broken copy(original);
std::cout << copy.read_only << "\n";
original.set(42);
std::cout << copy.read_only << "\n";
return 0;
}
Yields:
5
42
The problem is that when doing a copy, copy.read_only points to original.read_write. This may lead to dangling references (and crashes).
This can be fixed by writing your own copy constructor, but it is painful.
Assignment is broken
A reference cannot be reseated (you can alter the content of its referee but not switch it to another referee), leading to:
int main() {
Broken original(5);
Broken copy(4);
copy = original;
std::cout << copy.read_only << "\n";
original.set(42);
std::cout << copy.read_only << "\n";
return 0;
}
generating an error:
prog.cpp: In function 'int main()':
prog.cpp:18:7: error: use of deleted function 'Broken& Broken::operator=(const Broken&)'
copy = original;
^
prog.cpp:3:7: note: 'Broken& Broken::operator=(const Broken&)' is implicitly deleted because the default definition would be ill-formed:
class Broken {
^
prog.cpp:3:7: error: non-static reference member 'const int& Broken::read_only', can't use default assignment operator
This can be fixed by writing your own assignment operator, but it is painful.
Unless you fix it, Broken can only be used in very restricted ways; you may never manage to put it inside a std::vector for example.
Increased coupling
Giving away a reference to your internals increases coupling. You leak an implementation detail (the fact that you are using an int and not a short, long or long long).
With a getter returning a value, you can switch the internal representation to another type, or even elide the member and compute it on the fly.
This is only significant if the interface is exposed to clients expecting binary/source-level compatibility; if the class is only used internally and you can afford to change all users if it changes, then this is not an issue.
Now that semantics are out of the way, we can speak about performance differences.
Increased object size
While references can sometimes be elided, it is unlikely to ever happen here. This means that each reference member will increase the size of an object by at least sizeof(void*), plus potentially some padding for alignment.
The original class MyClassGetter has a size of 4 on x86 or x86-64 platforms with mainstream compilers.
The Broken class has a size of 8 on x86 and 16 on x86-64 platforms (the latter because of padding, as pointers are aligned on 8-byte boundaries).
An increased size can bust CPU caches; with a large number of items you may quickly experience slowdowns due to it (not that it will be easy to have vectors of Broken anyway, given its broken assignment operator).
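If you want to check the numbers on your own platform, a minimal sketch using stand-in types (the exact figures depend on the ABI):

#include <iostream>

struct Getter    { int value; };                 // plain int member, like MyClassGetter
struct RefMember { const int& ref; int value; }; // extra reference member, like Broken

int main() {
    std::cout << sizeof(Getter) << " vs " << sizeof(RefMember) << "\n"; // e.g. 4 vs 16 on x86-64
}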
Better performance in debug
As long as the implementation of the getter is inline in the class definition, then the compiler will strip the getter whenever you compile with a sufficient level of optimizations (-O2 or -O3 generally, -O1 may not enable inlining to preserve stack traces).
Thus, the performance of access should only vary in debug code, where performance is least necessary (and otherwise so crippled by plenty of other factors that it matters little).
In the end, use a getter. It's established convention for a good number of reasons :)
When implementing a constant reference (or constant pointer), your object also stores a pointer, which makes it bigger. Accessor methods, on the other hand, exist only once in the program and are most likely optimized out (inlined), unless they are virtual or part of an exported interface.
By the way, getter method can also be virtual.
To answer question 2:
const_cast<int&>(b.myIntegerAccessor) = 4;
Is a pretty good reason to hide it behind a getter function. It is a clever way to do a getter-like operation, but it completely breaks abstraction in the class.

Casting to a base reference and copying is a dirty hack, but what exactly is dirty about it?

Problem: I have a pretty big structure with POD variables, and I need to copy around some fields, but not others. Too lazy to write down a member-by-member copy function.
Solution: move the copy-able fields to the base, assign the base. Like this:
struct A
{
int a, b, c;
};
struct B : public A
{
int d, e, f;
};
//And copy:
B x, y;
(A&)x = y; //copies the part of B that is A
Now, this is dirty, I know. I had a live, livid discussion with co-workers re: this code, my competence, and my moral character. Yet the hardest specific charge I heard was "d, e, f are not initialized in the copy". Yes I know; that was the intent. Of course I initialize them elsewhere.
Another charge was "unsafe typecast". But this is a guaranteed-safe typecast to the base class! It's almost like
((A*)&x)->operator=(b);
but less verbose. The derivation is public; so treating B as A is fair game. There's no undefined behavior, as far as I can see.
So, I'm appealing to the collective wisdom of SO. This is an invitation to criticism. Have a go at it, people.
EDIT: the final line of the snippet can be expanded into less offensive code in more than one way. For example:
void Copy(A& to, const A& from)
{
to = from;
}
B x, y;
Copy(x, y);
Functionally the same. Or like this:
x.A::operator=(y);
EDIT2: there's no maintenance programmer but me. It's from a hobby project. So stop pitying that poor soul.
It all depends on the context and on which part is supposedly "dirty" here.
Firstly, the "sliced copying" trick is technically legal. Formally, it is not really a "hack". You can also achieve the same result by using the qualified name of the assignment operator to refer to the operator from A
x.A::operator =(y); // same as `(A&) x = y` in your case
It is starting to look familiar, isn't it? Yes, that's exactly what you would do if you had to implement the assignment operator in the derived class, if you suddenly decided to do it manually
B& B::operator =(const B& rhs)
{
A::operator =(rhs); // or `this->A::operator =(rhs)`
// B-specific part goes here
return *this;
}
The A::operator =(rhs); part is exactly the same "sliced copying" trick as yours above; however, in this case it is used in a different context. Nobody would, of course, blame you for the latter use, since that's how it is normally done and how it should be done. So, again, the "dirtiness" of this specific application of the trick depends on the context. It is perfectly fine as an integral part of the implementation of the derived assignment operator, but it might look highly questionable when used "by itself" as in your case.
However, secondly and more importantly, what I would call "dirty" in your case is not the use of the trick with "sliced copying" itself. From what you described in your question, it looks like you actually split your data structure into two classes (A and B) specifically for the purpose of being able to use the aforementioned trick. This is what I would call "dirty" in this case. Not the "sliced copying" trick itself, but rather the use of inheritance for the sole purpose of enabling the "sliced copying". If you did it just to avoid writing the assignment operator manually, that's an instance of blatant laziness. That's what's "dirty" here. I wouldn't recommend using inheritance for such purely utilitarian purposes.
Yes; this is dirty - you're intentionally slicing because you're too lazy to write B::CopyVariablesThatIWant(const B&). You're abusing the type system in a way that works for you, but you will most likely confuse and/or enrage any future programmers who have to look at your code and figure out its intent.
Your coworkers are right, you should be ashamed of yourself.
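A minimal sketch of the explicit alternative suggested above, using the hypothetical name from this answer:

struct A { int a, b, c; };
struct B : public A
{
    int d, e, f;
    void CopyVariablesThatIWant(const B& other)
    {
        a = other.a;
        b = other.b;
        c = other.c;
        // d, e, f deliberately left untouched
    }
};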
Why not just do a member struct and do things explicitly:
struct A
{
int a, b, c;
};
struct B
{
A top_;
int d, e, f;
};
//And copy:
B x, y;
x.top_ = y.top_;
And in my opinion the dirtiest part is the unnecessary obfuscation of the code. Six months from now some poor soul will curse you while trying to understand why, in Batman's name, it's done this way.
Problem: I have a pretty big structure with POD variables, and I need to copy around some fields, but not others. Too lazy to write down a member-by-member copy function.
Then don't. (But then assignment will assign everything.)
Also, if you want slicing, you could do:
B b;
A a;
a = b;
Just having uninitialized members seems dirty indeed.
What are the semantics that make some fields needed for the copy but not others?
The semantics of operator=() are that afterwards the two objects will have equivalent observable state. (That is, after a = b;, a == b should return true.) Why you would want to violate those semantics and confuse your maintenance programmers is the real question. What possible long-term benefit do you see in not explicitly writing your MinimalClone() function, versus the long-term harm to the ease of understanding your code?
Edit: There's always a maintenance programmer, unless you delete the code just after compilation. I can't count the number of times I've returned to something I wrote months prior and said "what was I thinking?!?" Be kind to your maintenance programmer, even if it's you.
The problem is that you are initialising the B using only A's assignment. I'm surprised compilers allow this.
This means that when you go on to access the B, and use functions which expect it to have initialised d, e, f, it will not work as expected and may crash. Also, if it's actually a class and has virtual functions, it will have a different vtable from that expected by the compiler.
Watch this:
#include <stdio.h>
void f(A& y)
{
B x;
x.d = 5;
printf("%i\n",x.d);
}
void g(A& y)
{
B x;
(A&)x = y;
printf("%i\n",x.d);
}
int main()
{
B z;
z.d = 3;
f(z);
g(z);
}
and compile and run
g++ 1.cc
./a.out
5
5
explain that, please?
You could get in trouble in the following situation:
struct A {
char a;
int b;
};
struct B : public A {
char c;
};
a is allocated at offset 0 and b is allocated at offset 4 in the structure (padding to align b). c can be allocated at offset 1 because that space is not being used by A. If that happens, your assignment may clobber c (if A's assignment operator copies the unused padding, which it is allowed to do).
I'm not sure whether allocating c in A's padding space is allowed in C++. I wrote a Java compiler that does it, though, so it isn't unprecedented.
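One way to see what your own compiler actually does is simply to compare the sizes; a minimal sketch (on common ABIs this prints 8 and 12, meaning c is not folded into A's padding):

#include <iostream>

struct A { char a; int b; };     // typically 1 byte + 3 bytes padding + 4 bytes
struct B : public A { char c; }; // if c reused A's padding, sizeof(B) would equal sizeof(A)

int main() {
    std::cout << "sizeof(A) = " << sizeof(A)
              << ", sizeof(B) = " << sizeof(B) << "\n";
}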
How about:
struct A
{
int a;
int b;
int c;
};
struct B: public A
{
int d;
int e;
int f;
};
int main()
{
B x;
B y;
static_cast<A&>(x) = y;
}
The big issue I have against this is that it saves you only a little bit of work, and leaves a big "Huh?" for the next guy to come around. Commenting it is no better, as it still leaves something extra to be comprehended and is more typing than just doing it right would be.
After decades of poring over assorted code, I've come to really resent anything that gets in the way of just reading and understanding.

Lazy/multi-stage construction in C++

What's a good existing class/design pattern for multi-stage construction/initialization of an object in C++?
I have a class with some data members which should be initialized in different points in the program's flow, so their initialization has to be delayed. For example one argument can be read from a file and another from the network.
Currently I am using boost::optional for the delayed construction of the data members, but it's bothering me that optional is semantically different than delay-constructed.
What I need reminds features of boost::bind and lambda partial function application, and using these libraries I can probably design multi-stage construction - but I prefer using existing, tested classes. (Or maybe there's another multi-stage construction pattern which I am not familiar with).
The key issue is whether or not you should distinguish completely populated objects from incompletely populated objects at the type level. If you decide not to make a distinction, then just use boost::optional or similar as you are doing: this makes it easy to get coding quickly. OTOH you can't get the compiler to enforce the requirement that a particular function requires a completely populated object; you need to perform run-time checking of fields each time.
Parameter-group Types
If you do distinguish completely populated objects from incompletely populated objects at the type level, you can enforce the requirement that a function be passed a complete object. To do this I would suggest creating a corresponding type XParams for each relevant type X. XParams has boost::optional members and setter functions for each parameter that can be set after initial construction. Then you can force X to have only one (non-copy) constructor, that takes an XParams as its sole argument and checks that each necessary parameter has been set inside that XParams object. (Not sure if this pattern has a name -- anybody like to edit this to fill us in?)
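A rough sketch of what such an XParams/X pair might look like, using boost::optional as above (all names here are illustrative):

#include <boost/optional.hpp>
#include <stdexcept>
#include <string>

struct XParams {
    boost::optional<std::string> name; // set from the file, say
    boost::optional<int> id;           // set from the network, say

    void setName(const std::string& n) { name = n; }
    void setId(int i)                  { id = i; }
};

class X {
public:
    explicit X(const XParams& p)
    {
        if (!p.name || !p.id)                          // completeness enforced once, here
            throw std::domain_error("incomplete XParams");
        name_ = *p.name;
        id_   = *p.id;
    }
private:
    std::string name_;
    int id_;
};

// Usage: populate the params in stages, then build the real object once:
// XParams p; p.setName("from file"); p.setId(42); X x(p);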
"Partial Object" Types
This works wonderfully if you don't really have to do anything with the object before it is completely populated (perhaps other than trivial stuff like get the field values back). If you do have to sometimes treat an incompletely populated X like a "full" X, you can instead make X derive from a type XPartial, which contains all the logic, plus protected virtual methods for performing precondition tests that test whether all necessary fields are populated. Then if X ensures that it can only ever be constructed in a completely-populated state, it can override those protected methods with trivial checks that always return true:
#include <boost/optional.hpp>
#include <stdexcept>
#include <string>
using boost::optional;
using std::domain_error;
using std::string;

class XPartial {
optional<string> name_;
public:
void setName(string x) { name_.reset(x); } // Can add getters and/or ctors
string makeGreeting(string title) {
if (checkMakeGreeting_()) { // Is it safe?
return string("Hello, ") + title + " " + *name_;
} else {
throw domain_error("ZOINKS"); // Or similar
}
}
bool isComplete() const { return checkMakeGreeting_(); } // All tests here
protected:
virtual bool checkMakeGreeting_() const { return name_; } // Populated?
};
class X : public XPartial {
X(); // Forbid default-construction; or, you could supply a "full" ctor
public:
explicit X(XPartial const& x) : XPartial(x) { // Avoid implicit conversion
if (!x.isComplete()) throw domain_error("ZOINKS");
}
X& operator=(XPartial const& x) {
if (!x.isComplete()) throw domain_error("ZOINKS");
return static_cast<X&>(XPartial::operator=(x));
}
protected:
virtual bool checkMakeGreeting_() const { return true; } // No checking needed!
};
Although it might seem the inheritance here is "back to front", doing it this way means that an X can safely be supplied anywhere an XPartial& is asked for, so this approach obeys the Liskov Substitution Principle. This means that a function can use a parameter type of X& to indicate it needs a complete X object, or XPartial& to indicate it can handle partially populated objects -- in which case either an XPartial object or a full X can be passed.
Originally I had isComplete() as protected, but found this didn't work since X's copy ctor and assignment operator must call this function on their XPartial& argument, and they don't have sufficient access. On reflection, it makes more sense to publicly expose this functionality.
I must be missing something here - I do this kind of thing all the time. It's very common to have objects that are big and/or not needed by a class in all circumstances. So create them dynamically!
struct Big {
char a[1000000];
};
class A {
public:
A() : big(0) {}
~A() { delete big; }
void f() {
makebig();
big->a[42] = 66;
}
private:
Big * big;
void makebig() {
if ( ! big ) {
big = new Big;
}
}
};
I don't see the need for anything fancier than that, except that makebig() should probably be const (and maybe inline), and the Big pointer should probably be mutable. And of course A must be able to construct Big, which may in other cases mean caching the contained class's constructor parameters. You will also need to decide on a copying/assignment policy - I'd probably forbid both for this kind of class.
I don't know of any patterns to deal with this specific issue. It's a tricky design question, and one somewhat unique to languages like C++. Another issue is that the answer to this question is closely tied to your individual (or corporate) coding style.
I would use pointers for these members, and when they need to be constructed, allocate them at that point. You can use auto_ptr for these, and check against NULL to see if they are initialized. (I think of pointers as a built-in "optional" type in C/C++/Java; there are other languages where NULL is not a valid pointer.)
One issue as a matter of style is that you may be relying on your constructors to do too much work. When I'm coding OO, I have the constructors do just enough work to get the object in a consistent state. For example, if I have an Image class and I want to read from a file, I could do this:
image = new Image("unicorn.jpeg"); /* I'm not fond of this style */
or, I could do this:
image = new Image(); /* I like this better */
image->read("unicorn.jpeg");
It can get difficult to reason about how a C++ program works if the constructors have a lot of code in them, especially if you ask the question, "what happens if a constructor fails?" This is the main benefit of moving code out of the constructors.
I would have more to say, but I don't know what you're trying to do with delayed construction.
Edit: I remembered that there is a (somewhat perverse) way to call a constructor on an object at any arbitrary time. Here is an example:
#include <new> // for placement new

class Counter {
public:
Counter(int &cref) : c(cref) { }
void incr(int x) { c += x; }
private:
int &c;
};
void dontTryThisAtHome() {
int i = 0, j = 0;
Counter c(i); // Call constructor first time on c
c.incr(5); // now i = 5
new(&c) Counter(j); // Call the constructor AGAIN on c
c.incr(3); // now j = 3
}
Note that doing something as reckless as this might earn you the scorn of your fellow programmers, unless you've got solid reasons for using this technique. This also doesn't delay the constructor, just lets you call it again later.
Using boost.optional looks like a good solution for some use cases. I haven't played much with it so I can't comment much. One thing I keep in mind when dealing with such functionality is whether I can use overloaded constructors instead of default and copy constructors.
When I need such functionality I would just use a pointer to the type of the necessary field like this:
public:
MyClass() : field_(0) { } // constructor, additional initializers and code omitted
~MyClass() {
if (field_)
delete field_; // free the constructed object only if initialized
}
...
private:
...
field_type* field_;
next, instead of using the pointer I would access the field through the following method:
private:
...
field_type& field() {
if (!field_)
field_ = new field_type(...);
return *field_;
}
I have omitted const-access semantics.
The easiest way I know is similar to the technique suggested by Dietrich Epp, except it allows you to truly delay the construction of an object until a moment of your choosing.
Basically: reserve memory for the object using malloc instead of new (thereby bypassing the constructor), then construct the object in that memory later, when you truly want to, via placement new.
Example:
#include <cstdlib> // for malloc/free
#include <new>     // for placement new

Object *x = (Object *) malloc(sizeof(Object));
//Use the object member items here. Be careful: no constructors have been called!
//This means you can assign values to ints, structs, etc... but nested objects can wreak havoc!
//Now we want to call the constructor of the object
new(x) Object(params);
//However, you must remember to also manually call the destructor!
x->~Object();
free(x);
//Note: if the malloc and new calls in your development stack allocate from the
//same heap, you can just call delete x instead of the explicit destructor call
//followed by free, but the above is the correct way of doing it
Personally, the only time I've ever used this syntax was when I had to use a custom C-based allocator for C++ objects. As Dietrich suggests, you should question whether you really, truly must delay the constructor call. The base constructor should perform the bare minimum to get your object into a serviceable state, whilst other overloaded constructors may perform more work as needed.
I don't know if there's a formal pattern for this. In places where I've seen it, we called it "lazy", "demand" or "on demand".

Returning Large Objects in Functions

Compare the following two pieces of code: the first passes a reference to a large object into the function, and the second has the large object as the return value. The emphasis on a "large object" refers to the fact that repeated, unnecessary copies of the object waste cycles.
Using a reference to a large object:
void getObjData( LargeObj& a )
{
a.reset() ;
a.fillWithData() ;
}
int main()
{
LargeObj a ;
getObjData( a ) ;
}
Using the large object as a return value:
LargeObj getObjData()
{
LargeObj a ;
a.fillWithData() ;
return a ;
}
int main()
{
LargeObj a = getObjData() ;
}
The first snippet of code does not require copying the large object.
In the second snippet, the object is created inside the function, and so in general, a copy is needed when returning the object. In this case, however, in main() the object is being declared. Will the compiler first create a default-constructed object, then copy the object returned by getObjData(), or will it be as efficient as the first snippet?
I think the second snippet is easier to read but I am afraid it is less efficient.
Edit: Typically, I am thinking of cases where LargeObj is a generic container class that, for the sake of argument, contains thousands of objects. For example,
typedef std::vector<HugeObj> LargeObj ;
so directly modifying or adding methods to LargeObj isn't a readily available solution.
The second approach is more idiomatic, and expressive. It is clear when reading the code that the function has no preconditions on the argument (it does not have an argument) and that it will actually create an object inside. The first approach is not so clear for the casual reader. The call implies that the object will be changed (pass by reference) but it is not so clear if there are any preconditions on the passed object.
About the copies: the code you posted is not using the assignment operator, but rather copy construction. The C++ standard allows the return value optimization, which is implemented in all major compilers. If you are not sure, you can run the following snippet in your compiler:
#include <iostream>
class X
{
public:
X() { std::cout << "X::X()" << std::endl; }
X( X const & ) { std::cout << "X::X( X const & )" << std::endl; }
X& operator=( X const & ) { std::cout << "X::operator=(X const &)" << std::endl; return *this; }
};
X f() {
X tmp;
return tmp;
}
int main() {
X x = f();
}
With g++ you will get a single line of output: X::X(). The compiler reserves the space on the stack for the x object, then calls the function, which constructs tmp directly over x (in fact tmp is x). The operations inside f() are applied directly to x, making this equivalent to your first code snippet (pass by reference).
If you were not using copy construction (had you written X x; x = f();), then it would create both x and tmp and apply the assignment operator, yielding a three-line output: X::X() / X::X() / X::operator=. So it could be a little less efficient in some cases.
Use the second approach. It may seem to be less efficient, but the C++ standard allows the copies to be elided. This optimization is called Named Return Value Optimization and is implemented in most current compilers.
Yes, in the second case it will make a copy of the object, possibly twice - once to return the value from the function, and again to initialize the local copy in main. Some compilers will optimize out the second copy, but in general you can assume at least one copy will happen.
However, you could still use the second approach for clarity even if the data in the object is large without sacrificing performance with the proper use of smart pointers. Check out the suite of smart pointer classes in boost. This way the internal data is only allocated once and never copied, even when the outer object is.
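For instance, a rough sketch of that idea using boost::shared_ptr (LargeObj and fillWithData() as in the question):

#include <boost/shared_ptr.hpp>

boost::shared_ptr<LargeObj> getObjData()
{
    boost::shared_ptr<LargeObj> a(new LargeObj);
    a->fillWithData();
    return a; // only the (reference-counted) pointer is copied, never the data
}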
The way to avoid any copying is to provide a special constructor. If you can re-write your code so it looks like:
LargeObj getObjData()
{
return LargeObj( fillsomehow() );
}
If fillsomehow() returns the data (perhaps a "big string"), then have a constructor that takes a "big string". If you have such a constructor, then the compiler will very likely construct a single object and not make any copies at all to perform the return. Of course, whether this is useful in real life depends on your particular problem.
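A rough sketch of what that might look like (fillsomehow() and the string member are purely illustrative; fillsomehow() is assumed to be defined elsewhere):

#include <string>

std::string fillsomehow(); // produces the raw data

class LargeObj
{
public:
    explicit LargeObj(const std::string& data) : data_(data) {}
private:
    std::string data_;
};

LargeObj getObjData()
{
    return LargeObj(fillsomehow()); // constructed straight into the caller's object (RVO)
}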
A somewhat idiomatic solution would be:
std::auto_ptr<LargeObj> getObjData()
{
std::auto_ptr<LargeObj> a(new LargeObj);
a->fillWithData();
return a;
}
int main()
{
std::auto_ptr<LargeObj> a(getObjData());
}
Alternatively, you can avoid this issue all together by letting the object get its own data, i. e. by making getObjData() a member function of LargeObj. Depending on what you are actually doing, this may be a good way to go.
Depending on how large the object really is and how often the operation happens, don't get too bogged down in efficiency when it will have no discernible effect either way. Optimization at the expense of clean, readable code should only happen when it is determined to be necessary.
The chances are that some cycles will be wasted when you return by copy. Whether it's worth worrying about depends on how large the object really is, and how often you invoke this code.
But I'd like to point out that if LargeObj is a large and non-trivial class, then in any case its default constructor should be initializing it to a known state:
LargeObj::LargeObj() :
m_member1(),
m_member2(),
...
{}
That wastes a few cycles too. Re-writing the code as
LargeObj::LargeObj()
{
// (The body of fillWithData should ideally be re-written into
// the initializer list...)
fillWithData() ;
}
int main()
{
LargeObj a ;
}
would probably be a win-win for you: you'd have the LargeObj instances getting initialized into known and useful states, and you'd have fewer wasted cycles.
If you don't always want to use fillWithData() in the constructor, you could pass a flag into the constructor as an argument.
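For example, a minimal sketch of that flag-based variant (the parameter name is illustrative):

class LargeObj
{
public:
    explicit LargeObj(bool fill = true)
    {
        if (fill)
            fillWithData();
    }
    void fillWithData() { /* populate members */ }
    // ...
};

// LargeObj a;        // filled on construction
// LargeObj b(false); // left empty, to be filled later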
UPDATE (from your edit & comment) : Semantically, if it's worthwhile to create a typedef for LargeObj -- i.e., to give it a name, rather than referencing it simply as typedef std::vector<HugeObj> -- then you're already on the road to giving it its own behavioral semantics. You could, for example, define it as
class LargeObj : public std::vector<HugeObj> {
public:
// constructor that fills the object with data
LargeObj() ;
// ... other standard methods ...
};
Only you can determine if this is appropriate for your app. My point is that even though LargeObj is "mostly" a container, you can still give it class behavior if doing so works for your application.
Your first snippet is especially useful when you do things like have getObjData() implemented in one DLL and call it from another DLL, with the two DLLs built with different languages or different versions of the compiler for the same language. The reason is that when they are compiled with different compilers they often use different heaps. You must allocate and deallocate memory from within the same heap, or else you will corrupt memory.
But if you don't do something like that, I would normally simply return a pointer (or smart pointer) to memory your function allocates:
LargeObj* getObjData()
{
LargeObj* ret = new LargeObj;
ret->fillWithData() ;
return ret;
}
...unless I have a specific reason not to.