Having
struct Person {
string name;
};
Person* p = ...
Assume that no operators are overloaded.
Which is more efficient (if either)?
(*p).name vs. p->name
Somewhere in the back of my head I hear some bells ringing, that the * dereference operator may create a temporary copy of an object; is this true?
The background to this question is code like this:
Person& Person::someFunction(){
...
return *this;
}
and I began to wonder, if changing the result to Person* and the last line to simply return this would make any difference (in performance)?
There's no difference. Even the standard says the two are equivalent, and if there's any compiler out there that doesn't generate the same binary for both versions, it's a bad one.
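A minimal sketch to convince yourself, assuming the Person struct from the question (the helper names sameMember and derefIsOriginal are made up for illustration). Both spellings name the same member of the same object, and dereferencing the pointer does not create a temporary:

```cpp
#include <cassert>
#include <string>

struct Person {
    std::string name;
};

// Both expressions designate the same member of the same object.
bool sameMember(Person* p) {
    return &(*p).name == &p->name;
}

// Dereferencing yields the original object, not a copy of it.
bool derefIsOriginal(Person* p) {
    Person& r = *p;
    return &r == p;
}
```

If *p produced a copy, the addresses compared above would differ.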
When you return a reference, that's exactly the same as passing back a pointer, pointer semantics excluded.
You pass back a sizeof(void*) element, not a sizeof(yourClass).
So when you do that:
Person& Person::someFunction(){
...
return *this;
}
You return a reference, and that reference has the same intrinsic size as a pointer, so there's no runtime difference.
Same goes for your use of (*p).name, but in that case you create an lvalue, which then has the same semantics as a reference.
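As a rough illustration, here is a sketch with two hypothetical setters (setNameRef and setNamePtr are not from the question), one returning *this and one returning this. Both hand back the same object and neither copies a Person:

```cpp
#include <cassert>
#include <string>

struct Person {
    std::string name;

    // Returns a reference to the object itself; no Person is copied.
    Person& setNameRef(const std::string& n) { name = n; return *this; }

    // Pointer-returning variant of the same idea.
    Person* setNamePtr(const std::string& n) { name = n; return this; }
};

// True if both return styles refer back to the very same object.
bool returnsSameObject(Person& p) {
    Person& byRef = p.setNameRef("Ada");
    Person* byPtr = p.setNamePtr("Grace");
    return &byRef == &p && byPtr == &p;
}
```

The choice between the two is about call-site syntax (p.setNameRef("x").name vs. p.setNamePtr("x")->name), not about speed.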
Yes, it's much harder to read and type, so you are much better off using x->y than (*x).y. But other than typing efficiency, there is absolutely no difference. The compiler still needs to read the value of x and then add the offset of y, whether you use one form or the other (assuming there are no funny objects/classes involved that overload operator-> and operator* respectively, of course).
There is definitely no extra object created when (*x) is referenced. The value of the pointer is loaded into a register in the processor [1]. That's it.
Returning a reference is typically more efficient, as it returns a pointer (in disguise) to the object, rather than making a copy of the object. For objects that are bigger than the size of a pointer, this is typically a win.
[1] Yes, we can have a C++ compiler for a processor that doesn't have registers. I know of at least one processor from Rank-Xerox that I saw in about 1984 which doesn't have registers: it was a dedicated Lisp processor, and it just has a stack for Lisp objects. But such processors are far from common in today's world. If someone is working on a processor that doesn't have registers, please don't downvote my answer simply because I don't cover that option. I'm trying to keep the answer simple.
Any good compiler will produce the same result. You can answer this yourself: compile both versions to assembly and compare the generated code.
Related
In many examples I see code like this:
SomeObject* constructObject() {
SomeObject* obj = new SomeObject();
return obj;
}
But what speaks against doing it this way:
SomeObject constructObject() {
SomeObject obj = SomeObject();
return obj;
}
?
What is the general rule of thumb of when to return an object vs when to return a pointer?
Edit:
A little background:
I am rewriting a renderer that should be fast both in the rendering itself and in providing the data.
The previous programmer stored pointers in a vector, something like:
vector<MeshModel*>. MeshModel itself doesn't have any inheritance.
In my opinion it would be better to use vector<MeshModel> instead, since I wouldn't jump around randomly in the memory.
Is my POV wrong?
std::vector<MeshModel> is more straightforward than std::vector<MeshModel*>.
For use in a std::vector, one might be concerned about the cost of copy/move-construction during vector growth reallocations. If your SomeObject can be moved cheaply, then I would go for the by-value storage. Otherwise, there might be a performance tradeoff during creation of the vector. But that is most likely not worth caring about.
Whether it brings speed while accessing the objects depends on too many other things (everything that affects caching, such as object size, access frequency/stride/predictability, target hardware... too much to list here) - profile if you care about performance. But there's no need to use indirection when you gain nothing from it.
And as pointed out in the comments - stay away from owning raw pointers. std::unique_ptr<MeshModel> would work just fine in the code shown.
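A sketch of the two storage options, assuming a stand-in MeshModel with a single member (the real class is not shown in the question, and both function names are made up):

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for the MeshModel mentioned in the question.
struct MeshModel {
    std::vector<float> vertices;
};

// By-value storage: the elements live contiguously in the vector's buffer.
std::vector<MeshModel> makeModelsByValue(std::size_t count) {
    std::vector<MeshModel> models;
    models.reserve(count);  // avoids reallocations while filling
    for (std::size_t i = 0; i < count; ++i) {
        models.emplace_back();
    }
    return models;
}

// If the indirection really is needed, let unique_ptr own each element.
std::vector<std::unique_ptr<MeshModel>> makeModelsByPointer(std::size_t count) {
    std::vector<std::unique_ptr<MeshModel>> models;
    models.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        models.push_back(std::unique_ptr<MeshModel>(new MeshModel()));
    }
    return models;
}
```

Which one is faster to iterate still depends on the factors listed above, so treat this as a starting point for profiling, not as a verdict.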
Is my POV wrong?
No. Direct values are preferable to indirection whenever the indirection is unnecessary.
So, the question is: Is the indirection needed? We cannot tell that based on the limited context.
P.S. A function should pretty much never return a bare owning pointer such as in the example. Always use a smart pointer for ownership.
Usually the only reason to dynamically allocate an object and return it by-pointer is because you need to use polymorphism (i.e. you're returning an object that is a subclass of the return-type declared in your function's return-type) and you want to avoid object-slicing. But even then, you should always return using a smart-pointer class (e.g. std::unique_ptr<BaseClass> or std::shared_ptr<BaseClass>) instead of returning a raw/C-style pointer, since returning a raw pointer is a recipe for memory leaks.
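To make the slicing point concrete, here is a small sketch with a made-up Shape/Circle hierarchy (none of these names come from the question). Returning the base class by value loses the derived part, while returning std::unique_ptr<Shape> keeps it and still cleans up automatically:

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

// Returning the base type by value slices: only the Shape part survives.
Shape makeSliced() {
    return Circle();
}

// Returning a smart pointer to the base keeps the dynamic type intact.
std::unique_ptr<Shape> makeShape() {
    return std::unique_ptr<Shape>(new Circle());
}
```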
In older versions of C++ there was a second reason you might want to return an object by-pointer, and that was if the returned object was very large and/or expensive to copy, and your compiler wasn't smart enough to implement Return Value Optimization to avoid requiring an object-copy as part of the return. However, current versions of C++ support move-semantics so that is no longer a concern; returning a "large" object can now be done about as efficiently as returning an object by-pointer.
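A rough sketch of why this is cheap, using std::vector as a stand-in for a "large" object (makeLarge and moveKeepsBuffer are illustrative names). Moving a vector hands over its heap buffer instead of copying the elements:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical "large" result, returned by value without new/delete.
std::vector<int> makeLarge(std::size_t n) {
    std::vector<int> data(n, 42);
    return data;  // moved or elided, not deep-copied
}

// After a move, the target owns the buffer that the source allocated.
bool moveKeepsBuffer() {
    std::vector<int> source(1000, 1);
    const int* buffer = source.data();
    std::vector<int> target = std::move(source);
    return target.data() == buffer;
}
```

This only helps for types that have a cheap move; a struct holding a big plain array by value still has to copy it.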
In my judgment, the change you propose is a "nice to have" which isn't engineering-justified if the application works now. It could in fact be a very, very pervasive change touching most of the code. "Just because you think it stinks" is not a valid engineering reason to change it.
I suggest that you begin by profiling the existing code, after confirming that it does indeed work now, in order to determine conclusively where and why it is right-now "not fast enough" in doing each particular thing that is required of it. You should also profile each of your changed areas to confirm that you did, indeed, obtain the necessary increases in performance. Don't Assume.
Your project-plan should then be strictly driven by the specific areas that the profile results reveal ... and, nothing else.
Arrow dereferencing p->m is syntactic sugar for (*p).m, which appears like it might involve two separate memory lookup operations--one to find the object on the heap and the second to then locate the member field offset.
This made me question whether there is any performance difference between these two code snippets. Assume classA has 30+ disparate fields of various types which need to be accessed in various orders (not necessarily consecutively or contiguously):
Version 1:
void func(classA* ptr)
{
std::string s = ptr->field1;
int i = ptr->field2;
float f = ptr->field3;
// etc...
}
Version 2:
void func(classA* ptr)
{
classA &a = *ptr;
std::string s = a.field1;
int i = a.field2;
float f = a.field3;
// etc...
}
So my question is whether or not there is a difference in performance (even if very slight) between these two versions, or if the compiler is smart enough to make them equivalent (even if the different field accesses are interrupted by many lines of other code in between them, which I did not show here).
Arrow dereferencing p->m is syntactic sugar for (*p).m
That isn't generally true, but is true in the limited context in which you are asking.
which appears like it might involve two separate memory lookup
operations--one to find the object on the heap and the second to then
locate the member field offset.
Not at all. It is one to read the parameter or local variable holding the pointer and the second to access the member. But any reasonable optimizer would keep the pointer in a register in the code you showed, so no extra access.
But your alternate version also has a local pointer, so no difference anyway (at least in the direction you're asking about):
classA &a = *ptr;
Assuming the whole function is not inlined, or the compiler for some other reason doesn't know exactly where ptr points, the reference must be implemented as a pointer. Then either the compiler can deduce that it is safe for a to be an alias of *ptr, in which case there is NO difference, or the compiler must make a an alias of *copy_of_ptr, in which case the version using the reference is slower (not faster, as you seem to have expected) by the cost of copying ptr.
even if the different field accesses are interrupted by many lines of
other code in between them, which I did not show here
That moves you toward the interesting case. If that intervening code could change ptr then obviously the two versions behave differently. But what if a human can see that the intervening code can't change ptr while a compiler can't see that: Then the two versions are semantically equal, but the compiler doesn't know that and the compiler may generate slower code for the version you tried to hand optimize by creation of a reference.
Most (all?) compilers implement references as pointers under the hood, so I would expect no difference in the generated assembly (apart from a possible copy to initialize the reference, but I would expect the optimizer to eliminate even that).
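For reference, here is a compilable sketch of the two versions, assuming a cut-down classA with only three fields (the function names are made up). Both read the same object and give the same result; whether the generated code is identical is something to check on your own compiler:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for classA with just three of its fields.
struct classA {
    std::string field1;
    int field2;
    float field3;
};

// Version 1: every access goes through the pointer.
int sumViaPointer(classA* ptr) {
    return static_cast<int>(ptr->field1.size()) + ptr->field2 +
           static_cast<int>(ptr->field3);
}

// Version 2: bind a reference once, then use it.
int sumViaReference(classA* ptr) {
    classA& a = *ptr;  // no copy: a is another name for *ptr
    return static_cast<int>(a.field1.size()) + a.field2 +
           static_cast<int>(a.field3);
}
```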
In general, this sort of micro-optimization is not worth it. It is always preferable to concentrate on clear and correct code. It is certainly not worth this sort of optimization until you have measured where the bottleneck is.
I am working on a project with a friend that does a lot of computation, and we are using C++ for it. I haven't used C++ in a while and he is suggesting some fixes. I hoped I could come here for a more in-depth explanation, and maybe be pointed to some articles.
He says it's more efficient, instead of having this
Hand::Hand(Card one, Card two)
To have this
Hand::Hand(const Card &one, const Card &two)
Is this correct? What is it about passing a constant address rather than the object itself that makes it more efficient? He mentioned passing a reference instead of making a copy. If I don't pass by address, will it construct a new Card object as a copy of the one I've passed?
Also
Instead of
bool Hand::hasFourKind(Card board[])
Have this
bool Hand::hasFourKind(const Card *board)
This passes a pointer to the start of the array instead of making an array copy?
In most cases, if you don't need a local variable to be modified, the latter method of the first example is faster, because the entire object will not have to be copied on the stack for the function to use it. Instead, a pointer will be pushed onto the stack, which takes less time. (Although some calling conventions allow passing small arguments in registers, which is even faster.)
For small objects, the time spent in copying may not be an issue, but may be more evident for large classes/structs.
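One way to see the copy happen is to count copy-constructor calls. This is only a sketch: the Card class below, with its static copies counter, is invented for the demonstration and is not the asker's class.

```cpp
#include <cassert>

// Hypothetical Card that counts how often it is copy-constructed.
struct Card {
    int rank;
    static int copies;

    explicit Card(int r) : rank(r) {}
    Card(const Card& other) : rank(other.rank) { ++copies; }
};

int Card::copies = 0;

// Pass by value: the argument is copied into the parameter.
int rankByValue(Card c) { return c.rank; }

// Pass by const reference: the function works on the caller's object.
int rankByRef(const Card& c) { return c.rank; }
```

Calling rankByValue with an existing Card bumps the counter by one; calling rankByRef leaves it unchanged.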
The second examples have identical operation (disregarding the constness), since the array would not be passed by value in any case. It would be passed as a simple pointer, or passed "by reference."
From an optimisation point of view, the compiler can not rely on const actually meaning "won't change". const_cast allows a function to alter the const-ness of something, such that it can be altered. It is useful for the programmer to know that "I get an error if I accidentally modify this" (in particular mistyping if (a = b) instead of if (a == b)).
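A small sketch of why const on a reference parameter is not a promise the optimizer can rely on (sneaky and observe are made-up names). This is well-defined only because the object being referred to is not itself const:

```cpp
#include <cassert>

// Takes a const reference but writes through it anyway.
// Legal only when the referenced object was not declared const.
void sneaky(const int& value) {
    const_cast<int&>(value) = 99;
}

// The caller's non-const variable is changed behind the const reference.
int observe() {
    int x = 1;
    sneaky(x);
    return x;
}
```

Doing the same to an object that was declared const is undefined behaviour, so this is an illustration of the language rule, not a technique to use.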
If the source code of a function is available to the compiler, it can itself prove that a value isn't being changed, or is being changed, regardless of whether you mark it const or not.
In a video with Chandler Carruth (one of the currently most active developers of Clang and LLVM), he actually promotes using non-reference calls for any type that is reasonably small. Often, the compiler will optimise away the copy of the argument anyways, because the reference MAY be modified [so if the compiler doesn't have the source code available, it won't know if the value is being changed or not]
Whether you use Card board[] or Card *board will not change things; the compiler will generate exactly the same code either way. If you want others reading the code to understand whether the function is expected to modify the value or not, add const for values that aren't being changed.
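The compiler can confirm this for you. In the sketch below, the static_assert checks that the two parameter spellings give the same function type. The Card struct and this free-function hasFourKind, with its extra count parameter, are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct Card {
    int rank;
};

// An array parameter is adjusted to a pointer, so these are the same type.
static_assert(std::is_same<bool(Card[]), bool(Card*)>::value,
              "Card board[] and Card* board declare the same parameter");

// Hypothetical helper: true if some rank appears at least four times.
// The size is passed separately because the pointer carries no length.
bool hasFourKind(const Card* board, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        int matches = 0;
        for (std::size_t j = 0; j < count; ++j) {
            if (board[j].rank == board[i].rank) {
                ++matches;
            }
        }
        if (matches >= 4) {
            return true;
        }
    }
    return false;
}
```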
This
Hand::Hand(const Card &one, const Card &two)
is more efficient than this
Hand::Hand(Card one, Card two)
For two reasons:
It avoids making a copy of the Cards.
If you wanted to maintain a Deck of cards, that is possible with the first method but not with the second, for the reason given in my first point.
This
bool Hand::hasFourKind(Card board[])
is equivalent to this
bool Hand::hasFourKind(const Card *board)
because they both represent an array of cards, but the first one is just syntactic sugar for a pointer to cards, so they both actually represent a pointer to cards.
The second is better because the const signals that the function will not modify the board. This is not really much of a guard, because const_cast can be used to remove the constness again.
If you really wanted to ensure that the cards cannot be modified by anybody in that method, then I would suggest you change design a bit to enable this:
bool Hand::hasFourKind() const;
Where the cards are part of the Hand class and cannot be modified by anyone
Which is better for performance when calling a function that provides a simple datatype -- having it fill in a memory location (passed by pointer) or having it return the simple data?
I've oversimplified the example returning a static value of 5 here, but assume the lookup/functionality that determines the return value would be dynamic in real life...
Conventional logic would tell me the first approach is quicker since we are operating by reference instead of having to return a copy as in the 2nd approach... But, I'd like others' opinions.
Thanks
void func(int *a) {
*a = 5;
}
or...
int func() {
return 5;
}
In general, if your function acts like a function (that is, returning a single logical value), then it's probably best to use int func(). Even if the return value is a complex C++ object, there's a common optimisation called Return Value Optimisation that avoids unnecessary object copying and makes the two forms roughly equivalent in runtime performance.
Most compilers will return a value in a register as long as what you're returning is small enough to fit in a register. It's pretty unusual (and often nearly impossible) for anything else to be more efficient than that.
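For comparison, here are the two forms from the question side by side, renamed fillValue and getValue (names chosen for this sketch) so that both can live in one file:

```cpp
#include <cassert>

// Out-parameter form: the caller supplies the memory to fill in.
void fillValue(int* a) {
    *a = 5;
}

// Return form: a small result like int typically comes back in a register.
int getValue() {
    return 5;
}
```

The return form also lets the caller write const int x = getValue();, which the out-parameter form cannot do.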
For PODs, there is no or almost no difference and I'd always go with a return value as I find those cleaner and easier to read.
For non-PODs the answer is "it depends" - a lot of compilers use Return Value Optimisation in this sort of scenario which tends to create an implicit reference parameter.
However unless you have measured - not "know", but actually measured with a profiler - that returning the results of the function using a return value is actually a bottleneck in your software, go for the more readable version of the code.
In my opinion, always go with return unless you know of a reason not to, or you have to return more than one value from the function. Returning a built-in type is very efficient, and whatever the difference vs. returning via pointer, it must be negligible. But the real benefit here is using return is clearer and simpler for those who read the code later.
Returning a simple value is just something like one instruction in assembly (i.e. MOV eax, xxxx), whereas passing a parameter introduces a little more overhead. In any case you should not worry about that; the difference is hard to notice.
Another important point is that a function that returns its result is generally cleaner in terms of design, and preferred when possible.
This is a low level thing, where it would be hard to see any difference.
Easy answer: it depends.
It depends on the types being used, whether they can be copied cheaply or not (or at all), whether the compiler can use RVO in some circumstances or not, inline things better with one form or another...
Use what makes sense in the context.
I've seen numerous arguments that using a return value is preferable to out parameters. I am convinced of the reasons why to avoid them, but I find myself unsure if I'm running into cases where it is unavoidable.
Part One of my question is: What are some of your favorite/common ways of getting around using an out parameter? Stuff along the lines: Man, in peer reviews I always see other programmers do this when they could have easily done it this way.
Part Two of my question deals with some specific cases I've encountered where I would like to avoid an out parameter but cannot think of a clean way to do so.
Example 1:
I have a class with an expensive copy that I would like to avoid. Work can be done on the object and this builds up the object to be expensive to copy. The work to build up the data is not exactly trivial either. Currently, I will pass this object into a function that will modify the state of the object. This to me is preferable to new'ing the object internal to the worker function and returning it back, as it allows me to keep things on the stack.
class ExpensiveCopy //Defines some interface I can't change.
{
public:
ExpensiveCopy(const ExpensiveCopy& toCopy){ /*Ouch! This hurts.*/ }
ExpensiveCopy& operator=(const ExpensiveCopy& toCopy){ /*Ouch! This hurts.*/ return *this; }
void addToData(SomeData);
SomeData getData();
};
class B
{
public:
static void doWork(ExpensiveCopy& ec_out, int someParam);
//or
// Your Function Here.
};
Using my function, I get calling code like this:
const int SOME_PARAM = 5;
ExpensiveCopy toModify;
B::doWork(toModify, SOME_PARAM);
I'd like to have something like this:
ExpensiveCopy theResult = B::doWork(SOME_PARAM);
But I don't know if this is possible.
Second Example:
I have an array of objects. The objects in the array are a complex type, and I need to do work on each element, work that I'd like to keep separated from the main loop that accesses each element. The code currently looks like this:
std::vector<ComplexType> theCollection;
for(std::size_t index = 0; index < theCollection.size(); ++index)
{
doWork(theCollection[index]);
}
void doWork(ComplexType& ct_out)
{
//Do work on the individual element.
}
Any suggestions on how to deal with some of these situations? I work primarily in C++, but I'm interested to see if other languages facilitate an easier setup. I have encountered RVO as a possible solution, but I need to read up more on it and it sounds like a compiler specific feature.
I'm not sure why you're trying to avoid passing references here. It's pretty much for these situations that pass-by-reference semantics exist.
The code
static void doWork(ExpensiveCopy& ec_out, int someParam);
looks perfectly fine to me.
If you really want to modify it then you've got a couple of options:
Move doWork so that it's a member of ExpensiveCopy (which you say you can't do, so that's out)
Return a (smart) pointer from doWork instead of copying it (which you don't want to do, as you want to keep things on the stack)
Rely on RVO (which others have pointed out is supported by pretty much all modern compilers)
Every useful compiler does RVO (return value optimization) if optimizations are enabled, thus the following effectively doesn't result in copying:
Expensive work() {
// ... no branched returns here
return Expensive(foo);
}
Expensive e = work();
In some cases compilers can apply NRVO, named return value optimization, as well:
Expensive work() {
Expensive e; // named object
// ... no branched returns here
return e; // return named object
}
This however isn't exactly reliable, only works in more trivial cases and would have to be tested. If you're not up to testing every case, just use out-parameters with references in the second case.
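If you do want to test it, one approach is to instrument the class. The sketch below uses an invented Expensive class with copy and move counters, which assumes C++11 or later. With a move constructor available, a return of a local object that is not elided falls back to a move, not a copy, so the copy counter should stay at zero either way:

```cpp
#include <cassert>

// Hypothetical class that records how it gets copied or moved.
struct Expensive {
    static int copies;
    static int moves;
    int payload;

    explicit Expensive(int p) : payload(p) {}
    Expensive(const Expensive& other) : payload(other.payload) { ++copies; }
    Expensive(Expensive&& other) noexcept : payload(other.payload) { ++moves; }
};

int Expensive::copies = 0;
int Expensive::moves = 0;

// RVO candidate: the return expression constructs the result directly.
Expensive workRvo() {
    return Expensive(1);
}

// NRVO candidate: a named local is built up, then returned.
Expensive workNrvo() {
    Expensive e(2);
    e.payload += 1;
    return e;
}
```

Printing Expensive::moves after calling both functions shows whether your compiler and flags actually elided the moves as well.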
IMO the first thing you should ask yourself is whether copying ExpensiveCopy really is so prohibitively expensive. And to answer that, you will usually need a profiler. Unless a profiler tells you that the copying really is a bottleneck, simply write the code that's easier to read: ExpensiveCopy obj = doWork(param);.
Of course, there are indeed cases where objects cannot be copied for performance or other reasons. Then Neil's answer applies.
In addition to all the comments here, I'd mention that in C++0x (C++11) you'd rarely use an output parameter for optimization purposes, because of move constructors.
Unless you are going down the "everything is immutable" route, which doesn't sit too well with C++, you cannot easily avoid out parameters. The C++ Standard Library uses them, and what's good enough for it is good enough for me.
As to your first example: return value optimization will often allow the returned object to be created directly in-place, instead of having to copy the object around. All modern compilers do this.
What platform are you working on?
The reason I ask is that many people have suggested Return Value Optimization, which is a very handy compiler optimization present in almost every compiler. Additionally Microsoft and Intel implement what they call Named Return Value Optimization which is even more handy.
In standard Return Value Optimization your return statement is a call to an object's constructor, which tells the compiler to eliminate the temporary values (not necessarily the copy operation).
In Named Return Value Optimization you can return a value by its name and the compiler will do the same thing. The advantage to NRVO is that you can do more complex operations on the created value (like calling functions on it) before returning it.
While neither of these really eliminates an expensive copy if your returned data is very large, they do help.
In terms of avoiding the copy the only real way to do that is with pointers or references because your function needs to be modifying the data in the place you want it to end up in. That means you probably want to have a pass-by-reference parameter.
Also I figure I should point out that pass-by-reference is very common in high-performance code for specifically this reason. Copying data can be incredibly expensive, and it is often something people overlook when optimizing their code.
As far as I can see, the reasons to prefer return values to out parameters are that it's clearer, and it works with pure functional programming (you can get some nice guarantees if a function depends only on input parameters, returns a value, and has no side effects). The first reason is stylistic, and in my opinion not all that important. The second isn't a good fit with C++. Therefore, I wouldn't try to distort anything to avoid out parameters.
The simple fact is that some functions have to return multiple things, and in most languages this suggests out parameters. Common Lisp has values and multiple-value-bind: the function returns several values with values, and the caller binds them to a list of symbols with multiple-value-bind. In some cases, a function can return a composite value, such as a list of values which will then get deconstructed, and it isn't a big deal for a C++ function to return a std::pair. Returning more than two values this way in C++ gets awkward. It's always possible to define a struct, but defining and creating it will often be messier than out parameters.
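To compare the options mentioned here, the sketch below returns a quotient and a remainder in three ways. The function names and the DivResult struct are invented for the example.

```cpp
#include <cassert>
#include <utility>

// Out-parameter style: two results written through references.
void divideOut(int dividend, int divisor, int& quotient, int& remainder) {
    quotient = dividend / divisor;
    remainder = dividend % divisor;
}

// std::pair style: compact, but first/second say little at the call site.
std::pair<int, int> dividePair(int dividend, int divisor) {
    return std::make_pair(dividend / divisor, dividend % divisor);
}

// Struct style: more to declare, but the members are named.
struct DivResult {
    int quotient;
    int remainder;
};

DivResult divideStruct(int dividend, int divisor) {
    DivResult result = {dividend / divisor, dividend % divisor};
    return result;
}
```

Which of these reads best is a matter of taste. The point is only that all three carry the same information.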
In some cases, the return value gets overloaded. In C, getchar() returns an int, with the idea being that there are more int values than char (true in all implementations I know of, false in some I can easily imagine), so one of the values can be used to denote end-of-file. atoi() returns an integer, either the integer represented by the string it's passed or zero if there is none, so it returns the same thing for "0" and "frog". (If you want to know whether there was an int value or not, use strtol(), which does have an out parameter.)
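A sketch of the strtol() approach, wrapped in a hypothetical helper called parseLong. It uses the end pointer that strtol() fills in to tell "0" apart from "frog", which atoi() cannot do:

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Hypothetical wrapper: returns true and sets out only if the whole
// string is a valid base-10 integer that fits in a long.
bool parseLong(const char* text, long& out) {
    char* end = nullptr;
    errno = 0;
    const long value = std::strtol(text, &end, 10);
    if (end == text || *end != '\0' || errno == ERANGE) {
        return false;  // no digits, trailing junk, or out of range
    }
    out = value;
    return true;
}
```

Note that the wrapper itself ends up with an out parameter plus a bool return, which is the pattern this answer is defending.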
There's always the technique of throwing an exception in case of an error, but not all multiple return values are errors, and not all errors are exceptional.
So, overloaded return values cause problems, multiple value returns aren't easy to use in all languages, and single returns don't always exist. Throwing an exception is often inappropriate. Using out parameters is very often the cleanest solution.
Ask yourself why you have some method that performs work on this expensive-to-copy object in the first place. Say you have a tree: would you send the tree off into some building method, or else give the tree its own building method? Situations like this come up constantly when your design is a little bit off, but they tend to resolve themselves once you have it down pat.
I know in practicality we don't always get to change every object at all, but passing in out parameters is a side effect operation, and it makes it much harder to figure out what's going on, and you never really have to do it (except as forced by working within others' code frameworks).
Sometimes it is easier, but it's definitely not desirable to use it for no reason (if you've suffered through a few large projects where there's always half a dozen out parameters you'll know what I mean).