Specifically, I'm wondering which of these I should write:
{
shared_ptr<GuiContextMenu> subMenu = items[j].subMenu.lock();
if (subMenu)
subMenu->setVisible(false);
}
or:
{
if (items[j].subMenu.lock())
items[j].subMenu.lock()->setVisible(false);
}
I am not required to follow any style guidelines. After optimization, I don't think either choice makes a difference in performance. What is generally the preferred style and why?
EDIT: the type of items[j].subMenu is boost::weak_ptr. lock() creates a shared_ptr out of it. There is actually an ambiguous difference in the two versions above, regarding how long the temporary shared_ptr lasts, so I wrapped my two examples in { braces } to resolve the ambiguity there.
An alternative method:
if(shared_ptr<GuiContextMenu> subMenu = items[j].subMenu.lock()) {
subMenu->setVisible(false);
}
//subMenu is no longer in scope
I'm assuming subMenu is a weak_ptr, in which case your second method creates two temporaries, which might or might not be an issue. And your first method adds a variable to a wider scope than it needs to. Personally, I try to avoid assignments within if statements, but this is one of the few cases where I feel it's more useful than the alternatives.
In this particular case, you really should use the version with the temporary variable. The reason is not performance but correctness: you are not guaranteed that the two x.lock() calls return the same value (e.g. if another thread releases the last strong reference to the object between the two calls). By holding the strong reference in the temporary variable, you ensure the object won't go away.
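A minimal sketch of the single-lock pattern, assuming std::weak_ptr behaves like the boost::weak_ptr in the question. Menu and hideIfAlive are hypothetical names used only for illustration, not part of the original code.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for GuiContextMenu.
struct Menu {
    bool visible = true;
    void setVisible(bool v) { visible = v; }
};

// Locks once; the shared_ptr keeps the object alive for the whole block.
bool hideIfAlive(const std::weak_ptr<Menu>& weak) {
    if (std::shared_ptr<Menu> menu = weak.lock()) {
        menu->setVisible(false);
        return true;
    }
    return false;  // the object is already gone
}

// Usage: the call succeeds while an owner exists and fails afterwards.
void usageHideIfAlive() {
    std::shared_ptr<Menu> owner = std::make_shared<Menu>();
    std::weak_ptr<Menu> weak = owner;
    assert(hideIfAlive(weak));
    assert(!owner->visible);
    owner.reset();               // last strong reference released
    assert(!hideIfAlive(weak));  // lock() now yields an empty shared_ptr
}
```

Because the strong reference lives inside the if statement, nothing can destroy the object between the check and the call.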
Other than that:
Compilers usually can't optimise out function calls unless they are provably side-effect free (this is hard to do, but attributes may help) or inlined. In this case, the call has side effects.
Using temporaries can lead to shorter, more readable and more maintainable programs (e.g. in case of an error, you fix it in one place).
I think you're correct about either choice being no different after optimisation.
Personally, I would declare a new variable if it makes the code more readable, such as when you're chaining calls, or putting function calls inside function calls. As long as it's maintainable and the code achieves the same effect at no speed difference, it all boils down to readable code.
Edit:
mmyers brought up a good point. Yes, be careful about calling lock() twice, as opposed to just once. The two can have different effects depending on your implementation.
The choice is essentially up to you, but the basic thing you should look out for is maintainability.
When the return value is anything other than a boolean, assigning it to an intermediate variable can often simplify debugging. For example, if you step over the following:
if( fn() > 0 ) ...
all you will know after the fact is that the function returned a value either greater than zero, or zero or less. Even if the return value were incorrect, the code may still appear to work. Assigning it to a variable that can be inspected in your debugger will allow you to determine whether the return value was expected.
When the return is boolean, the actual value is entirely implied by the code flow, so it is less critical; however, under code maintenance you may find later that you need that result, so you may decide to make it a habit in any case.
Even where the return value is boolean, another issue to consider is whether the function has required side-effects, and whether this may be affected by short-circuit evaluation. For example in the statement:
if( isValid && fn() ) ...
the function will never be called if isValid is false.
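To make the short-circuit point concrete, here is a small sketch. The counter fnCalls and the helpers shortCircuit and alwaysCall are hypothetical names added for illustration; fn() stands in for any function with a required side effect.

```cpp
#include <cassert>

int fnCalls = 0;  // counts how often fn() actually runs

bool fn() {
    ++fnCalls;    // the "required side effect"
    return true;
}

// fn() is skipped entirely when isValid is false.
bool shortCircuit(bool isValid) {
    return isValid && fn();
}

// fn() always runs because its result is stored first.
bool alwaysCall(bool isValid) {
    const bool result = fn();
    return isValid && result;
}

void usageShortCircuit() {
    fnCalls = 0;
    assert(!shortCircuit(false));
    assert(fnCalls == 0);   // fn() never ran
    assert(!alwaysCall(false));
    assert(fnCalls == 1);   // fn() ran even though isValid was false
}
```

Storing the result in a variable first makes the call unconditional, which is the safer form when the side effect must always happen.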
The circumstances under which the code could be broken under maintenance by the unwary programmer (and it is often the less experienced programmers that get the maintenance tasks) are many, and probably best avoided.
In this specific example, I think it depends on what lock() does. Is the function expensive? Could it return different things each time the function is called (could it return a pointer the first time and NULL the second time)? Is there another thread running that could interleave between the two calls to lock()?
For this example, you need to understand the behavior of lock() and the rest of your code to make an intelligent decision.
I prefer the first one most of the time because it makes the code more clear and easy to read, therefore less error prone. For example, you forgot a parenthesis on that second example :)
In this case, actually, I'd probably do what you did in the second example; however, if I needed to use that submenu more than a few times I'd go with the first one to make the code easier to read. As for performance, I think any sane compiler would be able to optimize that (which is probably why you saw no difference in performance).
Also, as mmyers pointed out, that also depends on what lock() does. In general, if it's a simple getter method or something like that, you'll be fine.
Whatever YOU prefer. For me, it depends on how much I'll use it; for two lines, I might just write it out both times, whereas I create a variable if I use it more. However, YOU are the one who will most likely have to maintain this code and continue looking at it, so use whatever works for you. Of course, if you're at a company with a coding guideline, follow it.
I think the preferred style is whatever style you think makes your code more readable and maintainable. If you're a team of more than one, the only other consideration is that it's generally a good idea for everyone to adopt the same style, again for readability and ease of maintenance.
In this case I think you should use the temporary. Even if you know the implementation to .lock() is inexpensive, that can change. If you don't need to call lock() twice, don't. The value here is that it decouples your code from the implementation of lock(). And that's a good thing generally.
For example I have a code like this:
void func(const QString& str)
{
    // str is const and QString::replace() modifies in place, so work on a copy
    QString s = str;
    s.replace(QRegExp("[abc]+"), " ");
    ......
}
Will the compiler optimize the temporary QRegExp("[abc]+") so that it is constructed just once, instead of being constructed each time func is invoked? Or, in other words, do I need to rewrite the code for performance like this:
void func(const QString& str)
{
    static const QRegExp sc_re("[abc]+"); // constructed once, on the first call
    QString s = str;                      // copy, because replace() modifies in place
    s.replace(sc_re, " ");
    ......
}
That is, make the QRegExp a static const variable.
Will the compiler optimize the temporary QRegExp("[abc]+") so that it is constructed just once, instead of being constructed each time func is invoked?
You are assuming that each invocation of func will construct an identical QRegExp object, but how do you know that? How do you know, for example, that these objects do not contain a serial number, an integer member that is set to the number of QRegExp objects previously constructed? If such a serial number was being used, it would be wrong for the compiler to construct your temporary variable just once.
OK, we can reasonably guess that nothing like that is going on. The point, though, is that we are guessing, and the compiler is not allowed to guess. So a prerequisite for the compiler considering such an optimization would be that the definition of the constructor is available (which is an implementation detail of that class, something you should not make your code dependent on).
If the constructor's definition is available, and if that definition provably produces the same results given the same input (and probably some other technical restrictions that slip my mind at the moment), then a compiler would be allowed to make this optimization.
I do not know if any compilers choose to provide this sort of optimization when it would be both allowed and beneficial (another assumption you've made). Performance testing of the two candidates with and without optimizations enabled should reveal if your particular compiler is likely taking advantage of this.
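The serial-number argument can be sketched without Qt. Pattern below is a hypothetical stand-in for the regular-expression class, with a constructor whose side effect (a global counter) is observable; the names perCall and onceOnly are also made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

int patternsConstructed = 0;  // observable side effect of the constructor

// Hypothetical stand-in for the regex class; this is not Qt code.
struct Pattern {
    std::string text;
    explicit Pattern(std::string t) : text(std::move(t)) { ++patternsConstructed; }
};

// Builds a fresh Pattern on every call.
std::size_t perCall() {
    Pattern p("[abc]+");
    return p.text.size();
}

// Builds the Pattern once, on the first call, and reuses it afterwards.
std::size_t onceOnly() {
    static const Pattern p("[abc]+");
    return p.text.size();
}

void usagePattern() {
    patternsConstructed = 0;
    for (int i = 0; i < 3; ++i) perCall();
    assert(patternsConstructed == 3);
    for (int i = 0; i < 3; ++i) onceOnly();
    assert(patternsConstructed == 4);  // only one more construction
}
```

Since the counter is visible to the rest of the program, the compiler has to construct the temporary on every call to perCall; only the explicit static makes the construction happen once.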
Or, in other words, do I need to rewrite the code for performance like this:
You almost never need to re-implement for performance. (One exception would be if your code is so inefficient it would take centuries to finish. I'm pretty sure we're not in that ballpark.) A better question is "should". I'll go with that.
In this specific case I would guess "no, that looks like premature optimization". However, that is just a guess, so I'll proceed to general guidelines that you can apply.
You should re-implement for performance only if:
1) the performance gain is noticeable to an end user, or
2) the new code is easier for a programmer to read and understand.
In other cases, rely on the compiler to make appropriate optimizations.
In your case, I see the variable name sc_re and think "what is that?" So point 2 is out. That leaves the question of a noticeable performance gain. This usually is not something one can determine by simply asking around. Typically, it involves performance testing, probably of at least two types. One test would time the two candidates in an artificial heavy loop to see how large the performance gain is (if there is one at all). The other test would profile your actual program to see if this code is called often enough for the gain to be noticed by an end user. A good third test would be to give the actual program to an end user and see if they notice the difference.
Of these tests, profiling might be the most productive use of your time. (Programmers are notoriously bad at identifying true performance roadblocks without the aid of a profiler.) If you spend 2 milliseconds in this function every 5 minutes, why spend time trying to improve that? On the other hand, if you spend 1 second in this function each time it is called, the profiler might tell you whether or not this constructor is the main culprit.
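For the first kind of test (timing candidates in an artificial loop), a rough harness could look like the following. This is only a sketch using std::chrono; timeLoop is a hypothetical helper, and a real measurement would need warm-up runs, repetition and an optimized build.

```cpp
#include <cassert>
#include <chrono>

// Runs work() the given number of times and returns the elapsed wall time.
template <typename Work>
std::chrono::nanoseconds timeLoop(Work work, int iterations) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        work();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
}

void usageTimeLoop() {
    volatile long sink = 0;  // volatile so the loop body is not optimized away
    const auto elapsed = timeLoop([&sink] { sink = sink + 1; }, 1000);
    assert(sink == 1000);
    assert(elapsed.count() >= 0);
}
```

Timing each candidate with the same harness gives the size of the gain; a profiler run on the real program is still needed to see whether that gain matters.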
Let's say I have a very costly function that checks if an object has a certain property. Another function would then, depending on whether the object has the property, do different things.
If I have previously checked for the property, would it be recomputed by the second function, or is it known?
I'm thinking of something like:
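// Object is a placeholder for whatever type is being checked
bool check_property(const Object& object) {
    // very costly operations...
    return true; // placeholder result
}
void do_something(const Object& object) {
    if (check_property(object)) { /* do thing */ }
    else { /* do different thing */ }
}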
Would the if in do_something recompute check_property?
There are several factors that have to come together for the compiler to avoid recomputing the function's result:
The compiler has to know which input values the function's result depends on. This knowledge is very difficult to extract from the code in the general case. In some implementations you can help the compiler by using compiler-specific means to declare your function as "pure" or "const" (GCC function attributes).
The compiler has to make sure that the above input values did not change since the previous call to the same function. This might be very easy in some specific cases, but is also very difficult in the general case.
The compiler has to have the result of previous computation readily available. Normally, compilers do not deliberately "cache" such results in some dedicated storage for future reuse. The optimization in question is typically applied only when you make multiple calls to the same function in "close proximity" to each other, meaning that the previous result is easy to keep till the moment of the next call.
So, the optimization in question is certainly possible. But it is something you should expect to see in simple and very localized cases, like calling sqrt(x) several times in a row for the same value of x (in the same expression, in the same cycle and such). But for more complicated functions it is typically going to be your responsibility to either somehow avoid making multiple calls to the same expensive function, or maybe memoize the results if you believe it can benefit your code.
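As an illustration of the "pure"/"const" hint mentioned above, here is a sketch using GCC's const function attribute (also accepted by Clang). The macro ASSUMED_CONST_FN and the functions square and sumOfSquaresTwice are hypothetical names; on other compilers the macro expands to nothing and the hint is simply absent.

```cpp
#include <cassert>

#if defined(__GNUC__)
#define ASSUMED_CONST_FN __attribute__((const))
#else
#define ASSUMED_CONST_FN
#endif

// Promise to the compiler: the result depends only on the argument
// and the function has no side effects.
ASSUMED_CONST_FN int square(int x);

int square(int x) { return x * x; }

int sumOfSquaresTwice(int x) {
    // With the attribute, the compiler is permitted to evaluate
    // square(x) once and reuse the result for the second call.
    return square(x) + square(x);
}

void usageSquare() {
    assert(sumOfSquaresTwice(3) == 18);
}
```

The attribute is a promise, not a check: if the function does read mutable global state or has side effects, declaring it this way makes the program incorrect.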
Unless the compiler can prove that check_property has no side effects and that all the data it depends on is unchanged, it is not allowed to remove the call. For all practical purposes, unless the function body is visible in the current translation unit, is fairly trivial, and the multiple calls happen in the same function, calling it again will execute its code again. I don't know of any compiler that automatically establishes a cross-call cache, because doing so is not trivial at all.
If you need to cache the computed values, in general you will have to do it yourself; keep in mind that it's not always trivial - generally the ugly beasts to tackle are cache invalidation (how do I know that the data used to calculate the value didn't change from the last time I calculated it? how do I avoid the cache size getting out of hand?) and multithreading concerns (is this code going to be called from multiple threads? if so, I have to synchronize the access to the cache, possibly adding coupling between unrelated threads and, in extreme cases, killing the efficiency of the cache itself).
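A minimal sketch of doing the caching yourself, under simplifying assumptions: the object is identified by a hypothetical integer id, the property never changes (so no invalidation is needed), and the code is only called from one thread. check_property_cached and costlyCalls are names invented for this example.

```cpp
#include <cassert>
#include <unordered_map>

int costlyCalls = 0;  // counts real evaluations of the expensive check

// Hypothetical expensive check, keyed by an integer id for simplicity.
bool check_property(int id) {
    ++costlyCalls;
    return id % 2 == 0;
}

// Memoizing wrapper: not thread-safe, and the cache is never invalidated.
bool check_property_cached(int id) {
    static std::unordered_map<int, bool> cache;
    const auto it = cache.find(id);
    if (it != cache.end()) return it->second;  // reuse the earlier result
    const bool result = check_property(id);
    cache.emplace(id, result);
    return result;
}

void usageCache() {
    costlyCalls = 0;
    assert(check_property_cached(4));
    assert(check_property_cached(4));
    assert(costlyCalls == 1);  // the second call was served from the cache
}
```

The invalidation and threading issues described above are exactly what this sketch leaves out, which is why it stays so short.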
To answer your question, yes. It will rerun it. If you want to make sure that the code doesn't run it again every time you call do_something, try adding a variable in your class that will tell you if you already ran it:
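// Object is a placeholder for whatever type is being checked
bool check_property(const Object& object) {
    // very costly operations...
    return true;
}
void do_something(const Object& object, bool has_run) {
    if (has_run) { /* do thing */ }
    else { /* do different thing */ }
}
int main() {
    Object object;
    // compute the costly result once and pass it along
    bool has_run = check_property(object);
    do_something(object, has_run);
}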
There are of course multiple ways of doing this, and this might not fit your criteria, but it is a possible way of doing it!
I just realized that this isn't really how C++ works since everything is not in classes unlike Java. Instead you can just pass the value as an argument to the function itself. So, I have edited my code.
I often see functions where other functions are called multiple times instead of storing the result of the function once.
For example (1):
void ExampleFunction()
{
if (TestFunction() > x || TestFunction() < y || TestFunction() == z)
{
a = TestFunction();
return;
}
b = TestFunction();
}
Instead I would write it that way, (2):
void ExampleFunction()
{
int test = TestFunction();
if (test > x || test < y || test == z)
{
a = test;
return;
}
b = test;
}
I think version 2 is much better to read and better to debug.
But I'm wondering why people do it like in number 1?
Is there anything I don't see? Performance Issue?
When I look at it, I see in the worst case 4 function calls in number (1) instead of 1 function call in number (2), so performance should be worse in number (1), shouldn't it?
I'd use (2) if I wanted to emphasize that the same value is used throughout the code, or if I wanted to emphasize that the type of that value is int. Emphasizing things that are true but not obvious can assist readers to understand the code quickly.
I'd use (1) if I didn't want to emphasize either of those things, especially if they weren't true, or if the number of times that TestFunction() is called is important due to side-effects.
Obviously if you emphasize something that's currently true, but then in future TestFunction() changes and it becomes false, then you have a bug. So I'd also want either to have control of TestFunction() myself, or to have some confidence in the author's plans for future compatibility. Often that confidence is easy: if TestFunction() returns the number of CPUs then you're happy to take a snapshot of the value, and you're also reasonably happy to store it in an int regardless of what type it actually returns. You have to have minimal confidence in future compatibility to use a function at all, e.g. be confident that it won't in future return the number of keyboards. But different people sometimes have different ideas what's a "breaking change", especially when the interface isn't documented precisely. So the repeated calls to TestFunction() might sometimes be a kind of defensive programming.
When a temporary is used to store the result of a very simple expression like this one, it can be argued that the temporary introduces unnecessary noise that should be eliminated.
In his book "Refactoring: Improving the Design of Existing Code", Martin Fowler lists this elimination of temporaries as a possibly beneficial refactoring (Inline temp).
Whether or not this is a good idea depends on many aspects:
Does the temporary provide more information than the original expression, for example through a meaningful name?
Is performance important? As you noted, the second version, which uses the temporary, might be more efficient (although a compiler may be able to optimize the first version so that the function is called only once, provided it can see that the function is free of side effects).
Is the temporary modified later in the function? (If not, it should probably be const)
etc.
In the end, the choice to introduce or remove such temporary is a decision that should be made on a case by case basis. If it makes the code more readable, leave it. If it is just noise, remove it. In your particular example, I would say that the temporary does not add much, but this is hard to tell without knowing the real names used in your actual code, and you may feel otherwise.
The second option is clearly superior.
You want to emphasize and ensure that you have three times the same value in the if-statement.
Performance should not be a bottleneck in this example. In conclusion, minimizing the chance of errors and emphasizing that the values are the same are much more important than a potential small performance gain.
The two are not equivalent. Take for example:
int TestFunction()
{
static int x;
return x++;
}
In a sane world though, this wouldn't be the case, and I agree that the second version is better. :)
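To see the difference with such a function, compare calling it twice against calling it once and reusing the value. Results, callTwice and callOnce are hypothetical helpers added for this sketch.

```cpp
#include <cassert>

int TestFunction() {
    static int x = 0;
    return x++;   // each call returns a different value
}

struct Results { int first; int second; };

// Style (1): every use is a fresh call.
Results callTwice() {
    const int first = TestFunction();
    const int second = TestFunction();
    return {first, second};
}

// Style (2): one call, the stored value is reused.
Results callOnce() {
    const int test = TestFunction();
    return {test, test};
}

void usageTestFunction() {
    const Results a = callTwice();
    assert(a.second == a.first + 1);  // the two calls disagree
    const Results b = callOnce();
    assert(b.first == b.second);      // the snapshot is consistent
}
```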
If the function, for some reason, can't be inlined, the second will even be more efficient.
I think version 2 is much better to read and better to debug.
Agreed.
so performance should be worse in number (1), shouldn't it?
Not necessarily. If TestFunction is small enough, then the compiler may decide to optimize the multiple calls away. In other cases, whether performance matters depends on how often ExampleFunction is called. If not often, then optimize for maintainability.
Also, TestFunction may have side-effects, but in that case, the code or comments should make that clear in some way.
Say you see a loop like this one:
for(int i=0;
    i<thing.getParent().getObjectModel().getElements(SOME_TYPE).count();
    ++i)
{
    thing.getData().insert(
        thing.getData().count(),
        thing.getParent().getObjectModel().getElements(SOME_TYPE)[i].getName()
    );
}
If this were Java I'd probably not think twice. But in performance-critical sections of C++, it makes me want to tinker with it... however I don't know if the compiler is smart enough to make it futile.
This is a made up example but all it's doing is inserting strings into a container. Please don't assume any of these are STL types, think in general terms about the following:
Is having a messy condition in the for loop going to get evaluated each time, or only once?
If those get methods are simply returning references to member variables on the objects, will they be inlined away?
Would you expect custom [] operators to get optimized at all?
In other words is it worth the time (in performance only, not readability) to convert it to something like:
ElementContainer &source =
    thing.getParent().getObjectModel().getElements(SOME_TYPE);
int num = source.count();
Store &destination = thing.getData();
for(int i=0;i<num;++i)
{
    destination.insert(destination.count(), source[i].getName());
}
Remember, this is a tight loop, called millions of times a second. What I wonder is if all this will shave a couple of cycles per loop or something more substantial?
Yes I know the quote about "premature optimisation". And I know that profiling is important. But this is a more general question about modern compilers, Visual Studio in particular.
The general way to answer such questions is to look at the produced assembly. With gcc, this involves replacing the -c flag with -S.
My own rule is not to fight the compiler. If something is to be inlined, then I make sure that the compiler has all the information needed to perform such an inline, and (possibly) I try to urge it to do so with an explicit inline keyword.
Also, inlining saves a few opcodes but makes the code grow, which, as far as L1 cache is concerned, can be very bad for performance.
All the questions you are asking are compiler-specific, so the only sensible answer is "it depends". If it is important to you, you should (as always) look at the code the compiler is emitting and do some timing experiments. Make sure your code is compiled with all optimisations turned on - this can make a big difference for things like operator[](), which is often implemented as an inline function, but which won't be inlined (in GCC at least) unless you turn on optimisation.
If the loop is that critical, I can only suggest that you look at the code generated. If the compiler is allowed to aggressively optimise the calls away then perhaps it will not be an issue. Sorry to say this, but modern compilers can optimise incredibly well, and I really would suggest profiling to find the best solution in your particular case.
If the methods are small and can and will be inlined, then the compiler may do the same optimizations that you have done. So, look at the generated code and compare.
Edit: It is also important to mark const methods as const, e.g. in your example count() and getName() should be const to let the compiler know that these methods do not alter the contents of the given object.
As a rule, you should not have all that garbage in your "for condition" unless the result is going to be changing during your loop execution.
Use another variable set outside the loop. This will eliminate the WTF when reading the code, it will not negatively impact performance, and it will sidestep the question of how well the functions get optimized. If those calls are not optimized this will also result in performance increase.
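A small sketch of what hoisting the condition buys when the call is not optimized away. Elements is a hypothetical container whose count() records how often it is called; the compiler cannot remove those calls because the counter is observable.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical container whose count() records how often it is called.
struct Elements {
    std::vector<std::string> names;
    mutable int countCalls = 0;
    int count() const { ++countCalls; return static_cast<int>(names.size()); }
};

std::vector<std::string> copyNamesNaive(const Elements& source) {
    std::vector<std::string> out;
    for (int i = 0; i < source.count(); ++i) {  // count() runs on every check
        out.push_back(source.names[i]);
    }
    return out;
}

std::vector<std::string> copyNamesHoisted(const Elements& source) {
    std::vector<std::string> out;
    const int num = source.count();             // count() runs once
    for (int i = 0; i < num; ++i) {
        out.push_back(source.names[i]);
    }
    return out;
}

void usageHoisting() {
    Elements e;
    e.names = {"a", "b", "c"};
    assert(copyNamesNaive(e).size() == 3);
    assert(e.countCalls == 4);   // three iterations plus the final failed check
    e.countCalls = 0;
    assert(copyNamesHoisted(e).size() == 3);
    assert(e.countCalls == 1);
}
```

Hoisting is only valid when the count cannot change during the loop, which is the same condition stated above for keeping the expression out of the for condition.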
I think in this case you are asking the compiler to do more than it legitimately can given the scope of compile-time information it has access to. So, in particular cases the messy condition may be optimized away, but really, the compiler has no particularly good way to know what kind of side effects you might have from that long chain of function calls. I would assume that breaking out the test would be faster unless I have benchmarking (or disassembly) that shows otherwise.
This is one of the cases where the JIT compiler has a big advantage over a C++ compiler. It can in principle optimize for the most common case seen at runtime and provide optimized bytecode for that (plus checks to make sure that one falls into that case). This sort of thing is used all the time in polymorphic method calls that turn out not to actually be used polymorphically; whether it could catch something as complex as your example, though, I'm not certain.
For what it's worth, if speed really mattered, I'd split it up in Java too.
I've seen numerous arguments that using a return value is preferable to out parameters. I am convinced of the reasons why to avoid them, but I find myself unsure if I'm running into cases where it is unavoidable.
Part One of my question is: What are some of your favorite/common ways of getting around using an out parameter? Stuff along the lines: Man, in peer reviews I always see other programmers do this when they could have easily done it this way.
Part Two of my question deals with some specific cases I've encountered where I would like to avoid an out parameter but cannot think of a clean way to do so.
Example 1:
I have a class with an expensive copy that I would like to avoid. Work can be done on the object and this builds up the object to be expensive to copy. The work to build up the data is not exactly trivial either. Currently, I will pass this object into a function that will modify the state of the object. This to me is preferable to new'ing the object internal to the worker function and returning it back, as it allows me to keep things on the stack.
class ExpensiveCopy //Defines some interface I can't change.
{
public:
    // a copy constructor must take its argument by reference
    ExpensiveCopy(const ExpensiveCopy& toCopy){ /*Ouch! This hurts.*/ }
    ExpensiveCopy& operator=(const ExpensiveCopy& toCopy){ /*Ouch! This hurts.*/ return *this; }
    void addToData(SomeData);
    SomeData getData();
};
class B
{
public:
    static void doWork(ExpensiveCopy& ec_out, int someParam);
    //or
    // Your Function Here.
};
Using my function, I get calling code like this:
const int SOME_PARAM = 5;
ExpensiveCopy toModify;
B::doWork(toModify, SOME_PARAM);
I'd like to have something like this:
ExpensiveCopy theResult = B::doWork(SOME_PARAM);
But I don't know if this is possible.
Second Example:
I have an array of objects. The objects in the array are a complex type, and I need to do work on each element, work that I'd like to keep separated from the main loop that accesses each element. The code currently looks like this:
std::vector<ComplexType> theCollection;
for(int index = 0; index < theCollection.size(); ++index)
{
doWork(theCollection[index]);
}
void doWork(ComplexType& ct_out)
{
//Do work on the individual element.
}
Any suggestions on how to deal with some of these situations? I work primarily in C++, but I'm interested to see if other languages facilitate an easier setup. I have encountered RVO as a possible solution, but I need to read up more on it and it sounds like a compiler specific feature.
I'm not sure why you're trying to avoid passing references here. It's pretty much these situations that pass-by-reference semantics exist.
The code
static void doWork(ExpensiveCopy& ec_out, int someParam);
looks perfectly fine to me.
If you really want to modify it then you've got a couple of options
Move doWork so that it's a member of ExpensiveCopy (which you say you can't do, so that's out)
return a (smart) pointer from doWork instead of copying it. (which you don't want to do as you want to keep things on the stack)
Rely on RVO (which others have pointed out is supported by pretty much all modern compilers)
Every useful compiler does RVO (return value optimization) if optimizations are enabled, thus the following effectively doesn't result in copying:
Expensive work() {
// ... no branched returns here
return Expensive(foo);
}
Expensive e = work();
In some cases compilers can apply NRVO, named return value optimization, as well:
Expensive work() {
Expensive e; // named object
// ... no branched returns here
return e; // return named object
}
This however isn't exactly reliable, only works in more trivial cases and would have to be tested. If you're not up to testing every case, just use out-parameters with references in the second case.
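One way to test this on a given compiler is to count copies. The sketch below assumes C++11 or later; Expensive, makeTemporary and makeNamed are hypothetical names. When the named return is not elided, a C++11 compiler falls back to the move constructor, so the copy counter stays at zero either way.

```cpp
#include <cassert>
#include <utility>
#include <vector>

int copies = 0;  // incremented by the copy constructor only

// Hypothetical expensive-to-copy type.
struct Expensive {
    std::vector<int> data;
    Expensive() : data(1000, 0) {}
    Expensive(const Expensive& other) : data(other.data) { ++copies; }
    Expensive(Expensive&& other) noexcept : data(std::move(other.data)) {}
};

// Returns a temporary (candidate for RVO).
Expensive makeTemporary() {
    return Expensive();
}

// Returns a named local (candidate for NRVO, otherwise moved).
Expensive makeNamed() {
    Expensive e;
    e.data.push_back(42);
    return e;
}

void usageRvo() {
    copies = 0;
    Expensive a = makeTemporary();
    Expensive b = makeNamed();
    assert(copies == 0);           // neither return copied the vector
    assert(a.data.size() == 1000);
    assert(b.data.back() == 42);
}
```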
IMO the first thing you should ask yourself is whether copying ExpensiveCopy really is so prohibitively expensive. And to answer that, you will usually need a profiler. Unless a profiler tells you that the copying really is a bottleneck, simply write the code that's easier to read: ExpensiveCopy obj = doWork(param);.
Of course, there are indeed cases where objects cannot be copied for performance or other reasons. Then Neil's answer applies.
In addition to all the comments here, I'd mention that in C++0x you'd rarely use an output parameter for optimization purposes, because of move constructors.
Unless you are going down the "everything is immutable" route, which doesn't sit too well with C++, you cannot easily avoid out parameters. The C++ Standard Library uses them, and what's good enough for it is good enough for me.
As to your first example: return value optimization will often allow the returned object to be created directly in-place, instead of having to copy the object around. All modern compilers do this.
What platform are you working on?
The reason I ask is that many people have suggested Return Value Optimization, which is a very handy compiler optimization present in almost every compiler. Additionally Microsoft and Intel implement what they call Named Return Value Optimization which is even more handy.
In standard Return Value Optimization your return statement is a call to an object's constructor, which tells the compiler to eliminate the temporary values (not necessarily the copy operation).
In Named Return Value Optimization you can return a value by its name and the compiler will do the same thing. The advantage to NRVO is that you can do more complex operations on the created value (like calling functions on it) before returning it.
While neither of these really eliminates an expensive copy if your returned data is very large, they do help.
In terms of avoiding the copy the only real way to do that is with pointers or references because your function needs to be modifying the data in the place you want it to end up in. That means you probably want to have a pass-by-reference parameter.
Also I figure I should point out that pass-by-reference is very common in high-performance code for specifically this reason. Copying data can be incredibly expensive, and it is often something people overlook when optimizing their code.
As far as I can see, the reasons to prefer return values to out parameters are that it's clearer, and it works with pure functional programming (you can get some nice guarantees if a function depends only on input parameters, returns a value, and has no side effects). The first reason is stylistic, and in my opinion not all that important. The second isn't a good fit with C++. Therefore, I wouldn't try to distort anything to avoid out parameters.
The simple fact is that some functions have to return multiple things, and in most languages this suggests out parameters. Common Lisp has values and multiple-value-bind: the function returns several values with values, and the caller binds them to a list of symbols with multiple-value-bind. In some cases, a function can return a composite value, such as a list of values which will then get deconstructed, and it isn't a big deal for a C++ function to return a std::pair. Returning more than two values this way in C++ gets awkward. It's always possible to define a struct, but defining and creating it will often be messier than out parameters.
In some cases, the return value gets overloaded. In C, getchar() returns an int, with the idea being that there are more int values than char (true in all implementations I know of, false in some I can easily imagine), so one of the values can be used to denote end-of-file. atoi() returns an integer, either the integer represented by the string it's passed or zero if there is none, so it returns the same thing for "0" and "frog". (If you want to know whether there was an int value or not, use strtol(), which does have an out parameter.)
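The atoi()/strtol() contrast can be sketched as follows. parseLong is a hypothetical wrapper that uses strtol()'s out parameter internally and hands the caller a std::pair instead of an overloaded return value.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <utility>

// Returns {true, value} if text starts with an integer, {false, 0} otherwise.
std::pair<bool, long> parseLong(const std::string& text) {
    char* end = nullptr;
    const long value = std::strtol(text.c_str(), &end, 10);
    if (end == text.c_str()) {
        return std::make_pair(false, 0L);  // no digits were consumed
    }
    return std::make_pair(true, value);
}

void usageParseLong() {
    // atoi() cannot tell these two inputs apart.
    assert(std::atoi("0") == 0);
    assert(std::atoi("frog") == 0);
    // The end pointer reported by strtol() can.
    assert(parseLong("0").first);
    assert(parseLong("0").second == 0);
    assert(!parseLong("frog").first);
}
```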
There's always the technique of throwing an exception in case of an error, but not all multiple return values are errors, and not all errors are exceptional.
So, overloaded return values cause problems, multiple value returns aren't easy to use in all languages, and single returns don't always exist. Throwing an exception is often inappropriate. Using out parameters is very often the cleanest solution.
Ask yourself why you have some method that performs work on this expensive-to-copy object in the first place. Say you have a tree: would you send the tree off into some building method, or else give the tree its own building method? Situations like this come up constantly when your design is a little bit off, but they tend to fold into themselves once you have it down pat.
I know in practicality we don't always get to change every object at all, but passing in out parameters is a side effect operation, and it makes it much harder to figure out what's going on, and you never really have to do it (except as forced by working within others' code frameworks).
Sometimes it is easier, but it's definitely not desirable to use it for no reason (if you've suffered through a few large projects where there's always half a dozen out parameters you'll know what I mean).