I am writing a simulation program and am wondering whether declaring intermediate results as const double is of any use. Consider this snippet:
double DoSomeCalculation(const AcModel &model) {
    (...)
    const double V = model.GetVelocity();
    const double m = model.GetMass();
    const double cos_gamma = cos(model.GetFlightPathAngleRad());
    (...)
    return m*V*cos_gamma*Chi_dot;
}
Note that the sample is there only to illustrate -- it might not make too much sense from the engineering side of things. The motivation for storing, for example, cos_gamma in a variable is that this cosine is used many times in other expressions covered by (...), and I feel that the code gets more readable when using
cos_gamma
rather than
cos(model.GetFlightPathAngleRad())
in various expressions. Now the actual question is this: since I expect the cosine to be the same throughout the code section, and since I created the variable only as a placeholder and for convenience, I tend to declare it const. Is there an established opinion on whether this is good or bad practice, or whether it might bite me in the end? Does a compiler make any use of this additional information, or am I actually hindering the compiler from performing useful optimizations?
Arne
I am not sure about the optimization part, but I think it is good to declare it as const. If the code is large and somebody incorrectly writes cos_gamma = 1 somewhere in between, you will get a compiler error instead of a run-time surprise.
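For instance, a minimal sketch of the kind of mistake const catches at compile time (hypothetical names, echoing the question's snippet):

#include <cmath>

double DoSomeCalculation(double flight_path_angle_rad) {
    const double cos_gamma = std::cos(flight_path_angle_rad);
    // ... many expressions using cos_gamma ...
    // cos_gamma = 1.0;  // error: assignment of read-only variable 'cos_gamma'
    return cos_gamma;
}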
You can certainly help things by only computing the cosine once and using it everywhere. Making that result const is a great way to ensure you (or someone else) don't try to change it somewhere down the road.
A good rule of thumb here is to make it correct and readable first. Don't worry about any optimizations the compiler might or might not make. Only after profiling and discovering that a piece of code is indeed too slow should you worry about helping the compiler optimize things.
Given your code:
const double V = model.GetVelocity();
const double m = model.GetMass();
const double cos_gamma = cos(model.GetFlightPathAngleRad());
I would probably leave cos_gamma as it is. I'd consider changing V and m to references though:
const double &V = model.GetVelocity();
const double &m = model.GetMass();
This way you're making it clear that these are strictly placeholders. It does, however, raise the possibility of lifetime issues -- if you use a reference, you clearly have to ensure that what it refers to has sufficient lifetime. At least from the looks of things, this probably won't be a problem though. First of all, GetVelocity() and GetMass() probably return values, not references (in which case you're initializing the references with temporaries, and the lifetime of the temporary is extended to the lifetime of the reference it initializes). Second, even if you return an actual reference, it's apparently to a member of the model, which (at a guess) will exist throughout the entire calculation in question anyway.
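To illustrate the lifetime-extension point, here is a minimal sketch (with a hypothetical Model type, not the OP's actual class):

struct Model {
    double velocity = 42.0;
    double GetVelocity() const { return velocity; }  // returns by value
};

double Use(const Model &model) {
    // GetVelocity() returns a temporary; binding it to a const reference
    // extends the temporary's lifetime to that of V.
    const double &V = model.GetVelocity();
    return V * 2.0;  // safe: the temporary is still alive here
}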
I wish I worked with more code written like this!
Anything you can do to make your code more readable can only be a good thing. Optimize only when you need to optimize.
My biggest beef with C++ is that values are not const by default. Use const liberally and keep your feet bullet-hole free!
Whether the compiler makes use of this is an interesting question - once you're at the stage of optimizing a fully working program. Until then, you write the program mostly for the humans coming later and having to look at the code. The compiler will happily swallow (objectively) very unreadable code, lacking spaces, newlines, and sporting hilarious indentation. Humans won't.
The most important question is, how readable is the code, and how easy it is to make a mistake changing it. Here, const helps tremendously, because it makes the compiler bark at everyone who mistakenly changes anything that shouldn't change. I always make everything const unless I really, really want it to be changeable.
Related
2013 Keynote: Chandler Carruth: Optimizing the Emergent Structures of C++
42:45
You don't need output parameters, we have value semantics in C++. ... Anytime you see someone arguing that nonono I'm not going to return by value because copy would cost too much, someone working on an optimizer says they're wrong. All right? I have never yet seen a piece of code where that argument was correct. ... People don't realize how important value semantics are to the optimizer because it completely clarifies the aliasing scenarios.
Can anyone put this in the context of this answer: https://stackoverflow.com/a/14229152
I hear that being repeated over and over but, well, for me a function returning something is a source. Output parameters passed by reference take that characteristic away from a function, and removing such a hard-coded characteristic from a function allows the caller to manage how the output is stored or reused.
My question is, even in the context of that SO answer, is there a way to show, by restructuring the code in some equivalent way, that the value-semantics version doesn't lose to the output-parameter version, or were Chandler's comments specific to some contrived situations? I have even seen Andrei Alexandrescu arguing this point in a talk, saying that you can't escape by-reference output parameters if you want the best performance.
For another take on Andrei's comments see Eric Niebler: Out Parameters, Move Semantics, and Stateful Algorithms.
It's either an overstatement, generalization, a joke, or Chandler's idea of "Perfectly Reasonable Performance" (using modern C++ toolchains/libs) is unacceptable to my programs.
I find it a fairly narrow scope of optimizations. Penalties exist beyond that scope which cannot be ignored due to the actual complexities and designs found in programs - heap allocations were an example for the getline example. The specific optimizations may or may not always be applicable to the program in question, despite your attempts to reduce them. Real world structures will reference memory which may alias. You can reduce that, but it's not practical to believe that you can eliminate aliasing (from an optimizer's perspective).
Of course, RBV can be a great thing - it's just not appropriate for all cases. Even the link you referenced pointed out how one can avoid a ton of allocations/frees. Real programs and the data structures found in them are far more complex.
Later in the talk, he goes on to criticize the use of member functions (ref: S::compute()). Sure, there is a point to take away, but is it really reasonable to avoid using these language features entirely because it makes the optimizer's job easier? No. Will it always result in more readable programs? No. Will these code transformations always result in measurably faster programs? No. Are the changes required to transform your codebase worth the time you invest? Sometimes. Can you take away some points and make better informed decisions which impact some of your existing or future codebase? Yes.
Sometimes it helps to break down how exactly your program would execute, or what it would look like in C.
The optimizer will not solve all performance problems, and you should not rewrite programs with the assumption that the programs you are dealing with are "completely brain dead and broken designs", nor should you believe that using RBV will always result in "Perfectly Reasonable Performance". You can utilize new language features and make the optimizer's job easier, but though there is much to gain, there are often more important optimizations to invest your time in.
It's fine to consider the proposed changes; ideally you would measure the impact of such changes in real world execution times and impact on your source code before adopting these suggestions.
For your example: Even copying+assigning large structures by value can have significant costs. Beyond the cost of running constructors and destructors (along with their associated creation/cleanup of the resources they acquire and own, as pointed out in the link you reference), even things as simple as avoiding unnecessary structure copies can save you a ton of CPU time if you use references (where appropriate). The structure copy may be as simple as a memcpy. These aren't contrived problems; they appear in actual programs and the complexity can increase greatly with your program's complexity. Is reducing aliasing of some memory and other optimizations worth the costs, and does it result in "Perfectly Reasonable Performance"? Not always.
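As a minimal sketch of that copy-cost point (hypothetical types, not from the talk):

#include <array>
#include <numeric>

struct BigBlob {
    std::array<double, 4096> samples{};  // ~32 KB, trivially copyable
};

// Pass by value: every call copies ~32 KB (essentially a memcpy).
double SumByValue(BigBlob blob) {
    return std::accumulate(blob.samples.begin(), blob.samples.end(), 0.0);
}

// Pass by const reference: no copy, just a pointer under the hood.
double SumByRef(const BigBlob &blob) {
    return std::accumulate(blob.samples.begin(), blob.samples.end(), 0.0);
}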
The problem with output parameters as described in the linked question is that they generally make the common calling case (i.e., you don't have a vector storage to use already) much more verbose than normal. For example, if you used return by value:
auto a = split(s, r);
if you used output parameters:
std::vector<std::string> a;
split(s,r,a);
The second one looks much less pretty to my eyes. Also, as Chandler mentioned, the optimizer can do much more with the first one than the second, depending on the rest of your code.
Is there a way we can get the best of both worlds? Emphatically yes, using move semantics:
#include <regex>
#include <string>
#include <vector>

std::vector<std::string> split(const std::string &s, const std::regex &r,
                               std::vector<std::string> v = {})
{
    auto rit = std::sregex_token_iterator(s.begin(), s.end(), r, -1);
    auto rend = std::sregex_token_iterator();
    v.clear();
    while (rit != rend)
    {
        v.push_back(*rit);
        ++rit;
    }
    return v;
}
Now, in the common case, we can call split as normal (i.e., the first example) and it will allocate a new vector storage for us. In the important but rare case when we have to split repeatedly and we want to re-use the same storage, we can just move in a storage that persists between calls:
int main()
{
    const std::regex r(" +");
    std::vector<std::string> a;
    for (auto i = 0; i < 1000000; ++i)
        a = split("a b c", r, std::move(a));
    return 0;
}
This runs just as fast as the output-argument method and what's going on is quite clear. You don't have to make your function hard to use all the time just to get good performance some of the time.
I was about to implement a solution using string_view and ranges and then found this:
std::split(): An algorithm for splitting strings
This endorses value-semantics output by return, and I accept it as prettier than the current choice in the referred SO answer. This design would also take the characteristic of being a source out of a function, even though it is returning. Simple illustration: one could have an outer, reserved vector being repeatedly filled by a returned range.
In any case, I'm not sure whether such version of split would help the optimizer in any sense (I'm talking in context to Chandler's talk here).
Notice
An output-param version enforces the existence of a named variable at the call site, which may be ugly to the eyes, but it can make debugging the call site easier.
Sample solution
While std::split hasn't arrived yet, I've exercised the value-semantics output-by-return version this way:
#include <string>
#include <string_view>
#include <boost/regex.hpp>
#include <boost/range/iterator_range.hpp>
#include <boost/iterator/transform_iterator.hpp>

using namespace std;
using namespace std::experimental;
using namespace boost;

string_view stringfier(const cregex_token_iterator::value_type &match) {
    return {match.first, static_cast<size_t>(match.length())};
}

using string_view_iterator =
    transform_iterator<decltype(&stringfier), cregex_token_iterator>;

iterator_range<string_view_iterator> split(string_view s, const regex &r) {
    return {
        string_view_iterator(
            cregex_token_iterator(s.begin(), s.end(), r, -1),
            stringfier
        ),
        string_view_iterator()
    };
}

int main() {
    const regex r(" +");
    for (size_t i = 0; i < 1000000; ++i) {
        split("a b c", r);
    }
}
I have used Marshall Clow's string_view libc++ implementation found at https://github.com/mclow/string_view.
I have posted the timings at the bottom of the referred answer.
I often use const for local variables that are not being modified, like this:
const float height = person.getHeight();
I think it can make the compiled code potentially faster, allowing the compiler to do some more optimization. Or am I wrong, and compilers can figure out by themselves that the local variable is never modified?
Or am I wrong, and compilers can figure out by themselves that the local variable is never modified?
Most compilers are smart enough to figure this out themselves.
You should use const for ensuring const-correctness rather than for micro-optimization.
Const-correctness lets the compiler help guard you against making honest mistakes, so use const wherever possible -- but for maintainability reasons and for preventing yourself from making stupid mistakes, not for speed.
It is good to understand the performance implications of the code we write, but excessive micro-optimization should be avoided. With regard to performance, one should follow the
80-20 Rule:
Identify the 20% of your code which uses 80% of your resources, through profiling on representative data sets and only then attempt to optimize those bottlenecks.
This performance difference will almost certainly be negligible; however, you should be using const whenever possible for code-documentation reasons. Oftentimes, compilers can figure this out for you anyway and make the optimizations automatically. const is really more about code readability and clarity than performance.
If there is a value type on the left-hand side, you may safely assume that const will have a negligible effect, or none at all. It's not going to influence overload resolution, and what is actually const can easily be deduced from the scope.
It's an entirely different matter with reference types:
std::vector<int> v(1);
const auto& a = v[0];
auto& b = v[0];
These two declarations can resolve to two entirely different operators, and similar overload pairs are found in many libraries besides the STL too. Even in this simple example, optimizations which depend on v having been immutable for the scope of b are no longer trivial and are less likely to be found.
The STL is still quite tame in these terms though, in such that at least the behavior doesn't change based on choosing the const_reference overload or not. For most of STL, the const_reference overload is only tied to the object being const itself.
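A minimal sketch of what "tied to the object being const itself" means for std::vector:

#include <vector>

void Demo() {
    std::vector<int> v(1);
    const std::vector<int> &cv = v;

    auto &a = v[0];   // non-const object: non-const operator[], yields int&
    auto &b = cv[0];  // const object: const operator[], yields const int&
    a = 1;            // fine
    // b = 1;         // error: b is a reference to const
}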
Some other libraries (e.g. Qt) make heavy use of copy-on-write semantics. In these const-correctness with references is no longer optional, but necessary:
QVector<int> v1(1);
auto v2 = v1; // Backing storage of v2 and v1 is still linked
const auto& a = v1[0]; // Still linked
const auto& b = v2[0]; // Still linked
auto& c = v2[0]; // Deep copy from v1 to v2 is happening now :(
// Even worse, &b != &c
Copy-on-write semantics are something commonly found in large-matrix or image manipulation libraries, and something to watch out for.
It's also something where the compiler is no longer able to save you: the overload resolution is mandated by the C++ standard, and there is no leeway for eliminating the costly side effects.
I don't think it is a good practice to make local variables, including function parameters, constant by default.
The main reason is brevity. Good coding practices allow you to make your code short; this one doesn't.
Quite similarly, you can write void foo(void) in your function declarations, and you can justify it by increased clarity, being explicit about not intending to pass a parameter to the function, etc, but it is essentially a waste of space, and eventually almost fell out of use. I think the same thing will happen to the trend of using const everywhere.
Marking local variables with a const qualifier is not very useful for most of the collaborators working with the code you create. Unlike class members, global variables, or the data pointed to by a pointer, a local variable doesn't have any external effects, and no one would ever be restricted by the qualifier of a local variable or learn anything useful from it (unless they are going to change the particular function where the local variable lives).
If he needs to change your function, the task should not normally require him to try to deduce valuable information from the constant qualifier of a variable there. The function should not be so large or difficult to understand; if it is, probably you have more serious problems with your design, consider refactoring. One exception is functions that implement some hardcore math calculations, but for those you would need to put some details or a link to the paper in your comments.
You might ask why not still put the const qualifier in if it doesn't cost you much effort. Unfortunately, it does. If I had to put in all the const qualifiers, I would most likely have to go through my function after I am done and add the qualifiers in the right places -- a waste of time. The reason is that you don't have to plan the use of local variables carefully, unlike members or the data pointed to by pointers.
Local variables are mostly a convenience tool; technically, most of them can be avoided by either factoring them into expressions or reusing variables. So, since they are a convenience tool, the very existence of a particular local variable is merely a matter of taste.
Particularly, I can write:
int d = foo(b) + c;
const int a = foo(b); int d = a + c;
int a = foo(b); a += c;
Each of the variants is identical in every respect, except that the variable a is either constant, non-constant, or doesn't exist at all. It is hard to commit to one of these choices early.
There is one major problem with local constant values - as shown in the code below:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

// const uint32_t const_dummy = 0;

void func1(const uint32_t *ptr);

int main(void) {
    const uint32_t const_dummy = 0;
    func1(&const_dummy);
    printf("0x%x\n", const_dummy);
    return EXIT_SUCCESS;
}

void func1(const uint32_t *ptr) {
    uint32_t *tmp = (uint32_t *)ptr;
    *tmp = 1;
}
This code was compiled on Ubuntu 18.04.
As you can see, const_dummy's value can be modified in this case! (Formally, modifying an object that was declared const is undefined behavior; here the write simply happens to go through.)
But if you modify the code and give const_dummy global scope -- by commenting out the local definition and uncommenting the global one -- you will get an exception and your program will crash. That is good, because you can debug it and find the problem.
What is the reason? Global const values are located in the read-only (ro) section of the program, and the OS protects this area using the MMU.
It is not possible to do this with constants defined on the stack.
On systems that don't use an MMU, you will not even "feel" that there is a problem.
Possible Duplicate:
Constants and compiler optimization in C++
Let the holy wars begin:
I've heard a number of differing opinions on the usefulness of const in C++. Of course it has uses in member function declarations, etc. But how useful is it as a modifier on variables (or rather, constants)? Does it indeed help the optimizer, if the rest of the code is left the same?
There are a lot of cases where the const modifier won't help the optimizer, for the simple fact that the compiler can already tell if you modified a variable or not. The biggest benefit of const, in my opinion, is that it tells the compiler whether the programmer intended to modify that variable, which is useful in finding certain types of semantic errors at compile time instead of run time. Any error you can move to compile time is a huge boost in programmer productivity.
const does not help the optimizer.
Since const can be cast away with const_cast, it's possible to write programs that use const in a number of places, then cast it away and modify variables anyway, with defined behavior according to the standard. The compiler therefore must look at the actual code of the program to determine which variables are modified when, and it's probably pretty good at this anyway (for example it might determine a non-const variable is invariant over a certain block of code and optimize accordingly).
If the compiler blindly treated const as a guarantee that something won't change, the optimizer would break some well-formed programs.
const is a compile-time feature to help programmers write correct code, by adding some compile-time constraints, and indicating a code contract (eg. 'I promise not to change this parameter'). It has nothing to do with optimization. While invariants are important to optimizers, this has nothing to do with the const keyword.
There is one exception: objects declared with const. These cannot be modified; even if they are via casting, the behavior is undefined. There's some subtlety here:
const int ci = 5;
const_cast<int&>(ci) = 5; // undefined behavior, original object declared const
int i = 5;
const int& ci2 = i; // cannot modify i through ci2, const reference
const_cast<int&>(ci2) = 5; // OK, original object not declared const
So when the compiler sees const int ci, it probably does assume it will never, ever change, because modifying it is undefined behavior. However, chances are this isn't the bottleneck in your program; it's really just a more sophisticated #define. Apart from that, const is weak -- just a keyword for the type system.
In general, no, it will not help the compiler. Since const-ness can be cast away in a second in both C and C++, it would be hard for the compiler to make the necessary assumptions about fulfilled code requirements for optimizations.
That said, const-correctness should always be used for its other benefits.
It can't hurt, and in theory it could allow for some optimizations, so you might as well use it -- I don't know whether any production compilers actually do.
I've seen statements like this
if(SomeBoolReturningFunc())
{
    //do some stuff
    //do some more stuff
}
and am wondering whether putting a function call in an if statement is efficient, or whether there are cases when it would be better to keep them separate, like this:
bool AwesomeResult = SomeBoolReturningFunc();
if(AwesomeResult)
{
    //do some other, more important stuff
}
...?
I'm not sure what makes you think that assigning the result of the expression to a variable first would be more efficient than evaluating the expression itself, but it's never going to matter, so choose the option that enhances the readability of your code. If you really want to know, look at the output of your compiler and see if there is any difference. On the vast majority of systems out there this will likely result in identical machine code.
Either way, it should not matter. The basic idea is that the result will be stored in a temporary no matter what, whether you name it or not. Readability is more important nowadays, because computers are normally so fast that small tweaks like this don't matter as much.
I've certainly seen if (f()) {blah;} produce more efficient code than bool r = f(); if (r) {blah;}, but that was many years ago on a 68000.
These days, I'd definitely pick the code that's simpler to debug. Your inefficiencies are far more likely to be your algorithm vs. the code your compiler generates.
As others have said, this basically makes no real difference in performance. In C++, most performance gains at a pure code level (as opposed to the algorithm) are made by things like loop unrolling.
Also, avoiding branching altogether can give way more performance if the condition is in a loop:
for instance, by having separate loops for each conditional case, or by having statements that inherently take the condition into account (perhaps by multiplying a term by 0 if it's not wanted) -- see the sketch below.
And then you can get more by unrolling that loop.
Templating your code can help a lot with this in a "semi" clean way.
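A minimal sketch of the multiply-by-0 idea for hoisting a condition out of a loop (hypothetical functions, just to illustrate):

#include <vector>

// Branchy version: the condition is tested on every iteration.
double SumIfEnabled(const std::vector<double> &xs, bool enabled) {
    double sum = 0.0;
    for (double x : xs)
        if (enabled) sum += x;
    return sum;
}

// Branchless version: fold the condition into a multiplier, hoisted once.
double SumIfEnabledBranchless(const std::vector<double> &xs, bool enabled) {
    const double factor = enabled ? 1.0 : 0.0;
    double sum = 0.0;
    for (double x : xs)
        sum += factor * x;  // no branch inside the loop
    return sum;
}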
There is also the possibility of
if (bool AwesomeResult = SomeBoolReturningFunc()) { ... }
:)
IMO, the kind of statements you have seen are clearer to read, and less error prone:
bool result = Foo();
if (result) {
    //some stuff
}
bool other_result = Bar();
if (result) { //should be other_result? Hopefully caught by an "unused variable" warning...
    //more stuff
}
Both variants usually produce the same machine code and run exactly the same. Very rarely will there be a performance difference, and even in those cases it is unlikely to be the bottleneck (which translates into: don't bother prior to profiling).
The significant difference is in debugging and readability. With a temporary variable it's easier to debug. Without the variable the code is shorter and perhaps more readable.
If you want code that is both easy to debug and easy to read, declare the variable as const:
const bool AwesomeResult = SomeBoolReturningFunc();
if(AwesomeResult)
{
    //do some other, more important stuff
}
This way it's clear that the variable is never assigned to again and that there's no other logic behind its declaration.
Putting aside any debugging ease or readability problems, and as long as the function's return value is not used again in the if block, it seems to me that assigning the return value to a variable only causes an extra use of the = operator and an extra bool stored on the stack -- I might speculate further that an extra variable on the stack will cause latency in later stack accesses (not sure though).
The thing is, these are really minor concerns, and as long as the compiler has optimizations turned on, they should cause no inefficiency. A different case I'd consider would be an embedded system -- then again, how much damage could a single 8-bit variable cause? (I have absolutely no knowledge concerning embedded systems, so maybe someone else could elaborate on this?)
The AwesomeResult version can be faster if SomeBoolReturningFunc() is fairly slow and you are able to use AwesomeResult more than once rather than calling SomeBoolReturningFunc() again.
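For example (a sketch; SomeBoolReturningFunc stands in for any expensive call):

bool SomeBoolReturningFunc();  // assume this is slow to evaluate

void Work() {
    const bool AwesomeResult = SomeBoolReturningFunc();  // evaluated once
    if (AwesomeResult) { /* do some stuff */ }
    // ... later ...
    if (AwesomeResult) { /* reuse the cached result; no second slow call */ }
}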
Placing the function call inside or outside the if statement doesn't matter; there is no performance gain or loss. This is because the compiler will automatically create a place for the return value (a register or a stack slot) whether or not you've explicitly defined a variable.
In the performance tuning I've done, this would only be thought about in the final stage of cycle-shaving after cleaning out a series of significant performance issues.
I seem to recall that one statement per line was the recommendation of the book Code Complete, where it was argued that such code is easier to understand. Make each statement do one and only one thing so that it is very easy to see very quickly at a glance what is happening in the code.
I personally like having the return values in variables to make them easier to inspect (or even change) in the debugger.
One answer stated that a difference was observed in generated code. I sincerely doubt that with an optimizing compiler.
A disadvantage, prior to C++11, for multiple lines is that you have to know the type of the return for the variable declaration. For example, if the return is changed from bool to int then depending on the types involved you could have a truncated value in the local variable (which could make the if malfunction). If compiling with C++11 enabled, this can be dealt with by using the auto keyword, as in:
auto AwesomeResult = SomeBoolReturningFunc();
if ( AwesomeResult )
{
    //do some other, more important stuff
}
Stroustrup's The C++ Programming Language, 4th edition, recommends putting the variable declaration in the if statement itself, as in:
if ( auto AwesomeResult = SomeBoolReturningFunc() )
{
    //do some other, more important stuff
}
His argument is that this limits the scope of the variable to the greatest extent possible. Whether or not that is more readable (or debuggable) is a judgement call.
This is a C++ disaster, check out this code sample:
#include <iostream>

void func(const int* shouldnotChange)
{
    int* canChange = (int*) shouldnotChange;
    *canChange += 2;
    return;
}

int main() {
    int i = 5;
    func(&i);
    std::cout << i;
    return 0;
}
The output was 7!
So, how can we make sure of the behavior of C++ functions, if it was able to change a supposed-to-be-constant parameter!?
EDIT: I am not asking how I can make sure that my own code works as expected; rather, I am wondering how to trust that someone else's function (for instance, some function in some DLL library) isn't going to change a parameter or possess some unexpected behavior...
Based on your edit, your question is "how can I trust 3rd party code not to be stupid?"
The short answer is "you can't." If you don't have access to the source, or don't have time to inspect it, you can only trust the author to have written sane code. In your example, the author of the function declaration specifically claims that the code will not change the contents of the pointer by using the const keyword. You can either trust that claim, or not. There are ways of testing this, as suggested by others, but if you need to test large amounts of code, it will be very labour intensive. Perhaps moreso than reading the code.
If you are working on a team and you have a team member writing stuff like this, then you can talk to them about it and explain why it is bad.
By writing sane code.
If you write code you can't trust, then obviously your code won't be trustworthy.
Similar stupid tricks are possible in pretty much any language. In C#, you can modify the code at runtime through reflection. You can inspect and change private class members. How do you protect against that? You don't, you just have to write code that behaves as you expect.
Apart from that, write a unit test verifying that the function does not change its parameter.
The general rule in C++ is that the language is designed to protect you from Murphy, not Machiavelli. In other words, it's meant to keep a maintenance programmer from accidentally changing a variable marked as const, not to keep someone from deliberately changing it, which can be done in many ways.
A C-style cast means all bets are off. It's sort of like telling the compiler "Trust me, I know this looks bad, but I need to do this, so don't tell me I'm wrong." Also, what you've done is actually undefined. Casting off const-ness and then modifying the value means the compiler/runtime can do anything, including e.g. crash your program.
The only thing I can suggest is to allocate the variable shouldnotChange from a memory page that is marked as read-only. This will force the OS/CPU to raise an error if the application attempts to write to that memory. I don't really recommend this as a general method of validating functions, just as an idea you may find useful.
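A sketch of that idea on POSIX systems (mmap and mprotect are an assumption about your platform; this is a debugging aid, not production code):

#include <sys/mman.h>
#include <cstdio>

int main() {
    // Allocate one page (assuming 4 KB pages) of writable memory.
    void *page = mmap(nullptr, 4096, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) return 1;

    int *value = static_cast<int *>(page);
    *value = 42;                   // initialize while still writable

    // Revoke write access; any later write raises SIGSEGV.
    mprotect(page, 4096, PROT_READ);

    std::printf("%d\n", *value);   // reading is fine
    // *value = 7;                 // would crash: the page is now read-only

    munmap(page, 4096);
    return 0;
}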
The simplest way to enforce this would be to just not pass a pointer:
void func(int shouldnotChange);
Now a copy will be made of the argument. The function can change the value all it likes, but the original value will not be modified.
If you can't change the function's interface then you could make a copy of the value before calling the function:
int i = 5;
int copy = i;
func(&copy);
Don't use C style casts in C++.
We have four cast operators in C++, listed here in order of danger (static_cast appears twice because it has two very different uses):
static_cast<> Safe (when used to convert between numeric data types).
dynamic_cast<> Safe (but throws exceptions / returns NULL for failed pointer casts).
const_cast<> Dangerous (when removing const).
static_cast<> Very dangerous (when used to cast between pointer types -- not a good idea!).
reinterpret_cast<> Very dangerous. Use this only if you understand the consequences.
You can always tell the compiler that you know better than it does, and the compiler will take you at face value (the reason being that you don't want the compiler getting in the way when you actually do know better).
Power over the compiler is a two-edged sword. If you know what you are doing, it is a powerful tool that will help you; but if you get things wrong, it will blow up in your face.
Unfortunately, the compiler has reasons for most things, so if you override its default behavior you had better know what you are doing. Casting is one of those things. A lot of the time it is fine. But if you start casting away const(ness), then you'd better know what you are doing.
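A few concrete uses, to make the danger ordering above tangible (a sketch; the comments mark what is safe and what isn't):

#include <cstdint>

struct Base { virtual ~Base() = default; };
struct Derived : Base {};

void CastDemo(Base *b) {
    double d = 3.14;
    int n = static_cast<int>(d);                    // safe: numeric conversion

    // dynamic_cast: checked at runtime; yields nullptr for a failed pointer cast.
    if (Derived *dp = dynamic_cast<Derived *>(b)) { (void)dp; }

    const int k = 10;
    int *p = const_cast<int *>(&k);                 // compiles, but...
    // *p = 11;                                     // undefined behavior: k really is const
    (void)p;

    // reinterpret_cast: just reinterprets the bits; use with great care.
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(&n);
    (void)bits;
}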
(int*) is the casting syntax from C. C++ supports it fully, but it is not recommended.
In C++ the equivalent cast should've been written like this:
int* canChange = static_cast<int*>(shouldnotChange);
And indeed, if you wrote that, the compiler would NOT have allowed such a cast.
What you're doing is writing C code and expecting the C++ compiler to catch your mistake, which is sort of unfair if you think about it.
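For completeness, the C++ cast that does compile here is const_cast, which at least makes the const-removal glaring in a code review (a sketch; note that actually modifying the value through it is undefined behavior whenever the original object was declared const -- in the question's main, i happens to be a plain int, so the write is merely dishonest, not undefined):

void func(const int *shouldnotChange)
{
    int *canChange = const_cast<int *>(shouldnotChange);  // intent is now explicit
    *canChange += 2;  // UB if the pointee was declared const
}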