Is const a lie? (since const can be cast away) [duplicate] - c++

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Sell me on const correctness
What is the usefulness of keyword const in C or C++ since it's allowed such a thing?
void const_is_a_lie(const int* n)
{
*((int*) n) = 0;
}
int main()
{
int n = 1;
const_is_a_lie(&n);
printf("%d", n);
return 0;
}
Output: 0
It is clear that const cannot guarante the non-modifiability of the argument.

const is a promise you make to the compiler, not something it guarantees you.
For example,
void const_is_a_lie(const int* n)
{
*((int*) n) = 0;
}
#include <stdio.h>
int main()
{
const int n = 1;
const_is_a_lie(&n);
printf("%d", n);
return 0;
}
Output shown at http://ideone.com/Ejogb is
1
Because of the const, the compiler is allowed to assume that the value won't change, and therefore it can skip rereading it, if that would make the program faster.
In this case, since const_is_a_lie() violates its contract, weird things happen. Don't violate the contract. And be glad that the compiler gives you help keeping the contract. Casts are evil.

In this case, n is a pointer to a constant int. When you cast it to int* you remove the const qualifier, and so the operation is allowed.
If you tell the compiler to remove the const qualifier, it will happily do so. The compiler will help ensure that your code is correct, if you let it do its job. By casting the const-ness away, you are telling the compiler that you know that the target of n is non-constant and you really do want to change it.
If the thing that your pointer points to was in fact declared const in the first place, then you are invoking undefined behavior by attempting to change it, and anything could happen. It might work. The write operation might not be visible. The program could crash. Your monitor could punch you. (Ok, probably not that last one.)
void const_is_a_lie(const char * c) {
*((char *)c) = '5';
}
int main() {
const char * text = "12345";
const_is_a_lie(text);
printf("%s\n", text);
return 0;
}
Depending on your specific environment, there may be a segfault (aka access violation) in const_is_a_lie since the compiler/runtime may store string literal values in memory pages that are not writable.
The Standard has this to say about modifying const objects.
7.1.6.1/4 The cv-qualifiers [dcl.type.cv]
Except that any class member declared mutable (7.1.1) can be modified, any attempt to modify a const object during its lifetime (3.8) results in undefined behavior
"Doctor, it hurts when I do this!" "So don't do that."

Your...
int n = 1;
...ensures n exists in read/write memory; it's a non-const variable, so a later attempt to modify it will have defined behaviour. Given such a variable, you can have a mix of const and/or non-const pointers and references to it - the constness of each is simply a way for the programmer to guard against accidental change in that "branch" of code. I say "branch" because you can visualise the access given to n as being a tree in which - once a branch is marked const, all the sub-branches (further pointers/references to n whether additional local variables, function parameters etc. initialised therefrom) will need to remain const, unless of course you explicitly cast that notion of constness away. Casting away const is safe (if potentially confusing) for variables that are mutable like your n, because they're ultimately still writing back into a memory address that is modifiable/mutable/non-const. All the bizarre optimisations and caching you could imagine causing trouble in these scenarios aren't allowed as the Standard requires and guarantees sane behaviour in the case I've just described.
Sadly it's also possible to cast away constness of genuinely inherently const variables like say const int o = 1;, and any attempt to modify them will have undefined behaviour. There are many practical reasons for this, including the compiler's right to place them in memory it then marks read only (e.g. see UNIX mprotect(2)) such that an attempted write will cause a CPU trap/interrupt, or read from the variable whenever the originally-set value is needed (even if the variable's identifier was never mentioned in the code using the value), or use an inlined-at-compile-time copy of the original value - ignoring any runtime change to the variable itself. So, the Standard leaves the behaviour undefined. Even if they happen to be modified as you might have intended, the rest of the program will have undefined behaviour thereafter.
But, that shouldn't be surprising. It's the same situation with types - if you have...
double d = 1;
*(int*)&d = my_int;
d += 1;
...have you have lied to the compiler about the type of d? Ultimately d occupies memory that's probably untyped at a hardware level, so all the compiler ever has is a perspective on it, shuffling bit patterns in and out. But, depending on the value of my_int and the double representation on your hardware, you may have created an invalid combination of bits in d that don't represent any valid double value, such that subsequent attempts to read the memory back into a CPU register and/or do something with d such as += 1 have undefined behaviour and might, for example, generate a CPU trap / interrupt.
This is not a bug in C or C++... they're designed to let you make dubious requests of your hardware so that if you know what you're doing you can do some weird but useful things and rarely need to fall back on assembly language to write low level code, even for device drivers and Operating Systems.
Still, it's precisely because casts can be unsafe that a more explicit and targeted casting notation has been introduced in C++. There's no denying the risk - you just need to understand what you're asking for, why it's ok sometimes and not others, and live with it.

The type system is there to help, not to babysit you. You can circumvent the type system in many ways, not only regarding const, and each time that you do that what you are doing is taking one safety out of your program. You can ignore const-correctness or even the basic type system by passing void* around and casting as needed. That does not mean that const or types are a lie, only that you can force your way over the compiler's.
const is there as a way of making the compiler aware of the contract of your function, and let it help you not violate it. In the same way that a variable being typed is there so that you don't need to guess how to interpret the data as the compiler will help you. But it won't baby sit, and if you force your way and tell it to remove const-ness, or how the data is to be retrieved the compiler will just let you, after all you did design the application, who is it to second guess your judgement...
Additionally, in some cases, you might actually cause undefined behavior and your application might even crash (for example if you cast away const from an object that is really const and you modify the object you might find out that the side effects are not seen in some places (the compiler assumed that the value would not change and thus performed constant folding) or your application might crash if the constant was loaded into a read-only memory page.

Never did const guarantee immutability: the standard defines a const_cast that allows modifying const data.
const is useful for you to declare more intent and avoid changing data that is you meant to be read only. You'll get a compilation error asking you to think twice if you do otherwise. You can change your mind, but that's not recommended.
As mentionned by other answers the compiler may optimize a bit more if you use const-ness but the benefits are not always significant.

Related

Undefined behaviour with const_cast

I was hoping that someone could clarify exactly what is meant by undefined behaviour in C++. Given the following class definition:
class Foo
{
public:
explicit Foo(int Value): m_Int(Value) { }
void SetValue(int Value) { m_Int = Value; }
private:
Foo(const Foo& rhs);
const Foo& operator=(const Foo& rhs);
private:
int m_Int;
};
If I've understood correctly the two const_casts to both a reference and a pointer in the following code will remove the const-ness of the original object of type Foo, but any attempts made to modify this object through either the pointer or the reference will result in undefined behaviour.
int main()
{
const Foo MyConstFoo(0);
Foo& rFoo = const_cast<Foo&>(MyConstFoo);
Foo* pFoo = const_cast<Foo*>(&MyConstFoo);
//MyConstFoo.SetValue(1); //Error as MyConstFoo is const
rFoo.SetValue(2); //Undefined behaviour
pFoo->SetValue(3); //Undefined behaviour
return 0;
}
What is puzzling me is why this appears to work and will modify the original const object but doesn't even prompt me with a warning to notify me that this behaviour is undefined. I know that const_casts are, broadly speaking, frowned upon, but I can imagine a case where lack of awareness that C-style cast can result in a const_cast being made could occur without being noticed, for example:
Foo& rAnotherFoo = (Foo&)MyConstFoo;
Foo* pAnotherFoo = (Foo*)&MyConstFoo;
rAnotherFoo->SetValue(4);
pAnotherFoo->SetValue(5);
In what circumstances might this behaviour cause a fatal runtime error? Is there some compiler setting that I can set to warn me of this (potentially) dangerous behaviour?
NB: I use MSVC2008.
I was hoping that someone could clarify exactly what is meant by undefined behaviour in C++.
Technically, "Undefined Behaviour" means that the language defines no semantics for doing such a thing.
In practice, this usually means "don't do it; it can break when your compiler performs optimisations, or for other reasons".
What is puzzling me is why this appears to work and will modify the original const object but doesn't even prompt me with a warning to notify me that this behaviour is undefined.
In this specific example, attempting to modify any non-mutable object may "appear to work", or it may overwrite memory that doesn't belong to the program or that belongs to [part of] some other object, because the non-mutable object might have been optimised away at compile-time, or it may exist in some read-only data segment in memory.
The factors that may lead to these things happening are simply too complex to list. Consider the case of dereferencing an uninitialised pointer (also UB): the "object" you're then working with will have some arbitrary memory address that depends on whatever value happened to be in memory at the pointer's location; that "value" is potentially dependent on previous program invocations, previous work in the same program, storage of user-provided input etc. It's simply not feasible to try to rationalise the possible outcomes of invoking Undefined Behaviour so, again, we usually don't bother and instead just say "don't do it".
What is puzzling me is why this appears to work and will modify the original const object but doesn't even prompt me with a warning to notify me that this behaviour is undefined.
As a further complication, compilers are not required to diagnose (emit warnings/errors) for Undefined Behaviour, because code that invokes Undefined Behaviour is not the same as code that is ill-formed (i.e. explicitly illegal). In many cases, it's not tractible for the compiler to even detect UB, so this is an area where it is the programmer's responsibility to write the code properly.
The type system — including the existence and semantics of the const keyword — presents basic protection against writing code that will break; a C++ programmer should always remain aware that subverting this system — e.g. by hacking away constness — is done at your own risk, and is generally A Bad Idea.™
I can imagine a case where lack of awareness that C-style cast can result in a const_cast being made could occur without being noticed.
Absolutely. With warning levels set high enough, a sane compiler may choose to warn you about this, but it doesn't have to and it may not. In general, this is a good reason why C-style casts are frowned upon, but they are still supported for backwards compatibility with C. It's just one of those unfortunate things.
Undefined behaviour depends on the way the object was born, you can see Stephan explaining it at around 00:10:00 but essentially, follow the code below:
void f(int const &arg)
{
int &danger( const_cast<int&>(arg);
danger = 23; // When is this UB?
}
Now there are two cases for calling f
int K(1);
f(k); // OK
const int AK(1);
f(AK); // triggers undefined behaviour
To sum up, K was born a non const, so the cast is ok when calling f, whereas AK was born a const so ... UB it is.
Undefined behaviour literally means just that: behaviour which is not defined by the language standard. It typically occurs in situations where the code is doing something wrong, but the error can't be detected by the compiler. The only way to catch the error would be to introduce a run-time test - which would hurt performance. So instead, the language specification tells you that you mustn't do certain things and, if you do, then anything could happen.
In the case of writing to a constant object, using const_cast to subvert the compile-time checks, there are three likely scenarios:
it is treated just like a non-constant object, and writing to it modifies it;
it is placed in write-protected memory, and writing to it causes a protection fault;
it is replaced (during optimisation) by constant values embedded in the compiled code, so after writing to it, it will still have its initial value.
In your test, you ended up in the first scenario - the object was (almost certainly) created on the stack, which is not write protected. You may find that you get the second scenario if the object is static, and the third if you enable more optimisation.
In general, the compiler can't diagnose this error - there is no way to tell (except in very simple examples like yours) whether the target of a reference or pointer is constant or not. It's up to you to make sure that you only use const_cast when you can guarantee that it's safe - either when the object isn't constant, or when you're not actually going to modify it anyway.
What is puzzling me is why this appears to work
That is what undefined behavior means.
It can do anything including appear to work.
If you increase your optimization level to its top value it will probably stop working.
but doesn't even prompt me with a warning to notify me that this behaviour is undefined.
At the point it were it does the modification the object is not const. In the general case it can not tell that the object was originally a const, therefore it is not possible to warn you. Even if it was each statement is evaluated on its own without reference to the others (when looking at that kind of warning generation).
Secondly by using cast you are telling the compiler "I know what I am doing override all your safety features and just do it".
For example the following works just fine: (or will seem too (in the nasal deamon type of way))
float aFloat;
int& anIntRef = (int&)aFloat; // I know what I am doing ignore the fact that this is sensable
int* anIntPtr = (int*)&aFloat;
anIntRef = 12;
*anIntPtr = 13;
I know that const_casts are, broadly speaking, frowned upon
That is the wrong way to look at them. They are a way of documenting in the code that you are doing something strange that needs to be validated by smart people (as the compiler will obey the cast without question). The reason you need a smart person to validate is that it can lead to undefined behavior, but the good thing you have now explicitly documented this in your code (and people will definitely look closely at what you have done).
but I can imagine a case where lack of awareness that C-style cast can result in a const_cast being made could occur without being noticed, for example:
In C++ there is no need to use a C style cast.
In the worst case the C-Style cast can be replaced by reinterpret_cast<> but when porting code you want to see if you could have used static_cast<>. The point of the C++ casts is to make them stand out so you can see them and at a glance spot the difference between the dangerous casts the benign casts.
A classic example would be trying to modify a const string literal, which may exist in a protected data segment.
Compilers may place const data in read only parts of memory for optimization reasons and attempt to modify this data will result in UB.
Static and const data are often stored in another part of you program than local variables. For const variables, these areas are often in read-only mode to enforce the constness of the variables. Attempting to write in a read-only memory results in an "undefined behavior" because the reaction depends on your operating system. "Undefined beheavior" means that the language doesn't specify how this case is to be handled.
If you want a more detailed explanation about memory, I suggest you read this. It's an explanation based on UNIX but similar mecanism are used on all OS.

Does const-correctness give the compiler more room for optimization?

I know that it improves readability and makes the program less error-prone, but how much does it improve the performance?
And on a side note, what's the major difference between a reference and a const pointer? I would assume they're stored in the memory differently, but how so?
[Edit: OK so this question is more subtle than I thought at first.]
Declaring a pointer-to-const or reference-of-const never helps any compiler to optimize anything. (Although see the Update at the bottom of this answer.)
The const declaration only indicates how an identifier will be used within the scope of its declaration; it does not say that the underlying object can not change.
Example:
int foo(const int *p) {
int x = *p;
bar(x);
x = *p;
return x;
}
The compiler cannot assume that *p is not modified by the call to bar(), because p could be (e.g.) a pointer to a global int and bar() might modify it.
If the compiler knows enough about the caller of foo() and the contents of bar() that it can prove bar() does not modify *p, then it can also perform that proof without the const declaration.
But this is true in general. Because const only has an effect within the scope of the declaration, the compiler can already see how you are treating the pointer or reference within that scope; it already knows that you are not modifying the underlying object.
So in short, all const does in this context is prevent you from making mistakes. It does not tell the compiler anything it does not already know, and therefore it is irrelevant for optimization.
What about functions that call foo()? Like:
int x = 37;
foo(&x);
printf("%d\n", x);
Can the compiler prove that this prints 37, since foo() takes a const int *?
No. Even though foo() takes a pointer-to-const, it might cast the const-ness away and modify the int. (This is not undefined behavior.) Here again, the compiler cannot make any assumptions in general; and if it knows enough about foo() to make such an optimization, it will know that even without the const.
The only time const might allow optimizations is cases like this:
const int x = 37;
foo(&x);
printf("%d\n", x);
Here, to modify x through any mechanism whatsoever (e.g., by taking a pointer to it and casting away the const) is to invoke Undefined Behavior. So the compiler is free to assume you do not do that, and it can propagate the constant 37 into the printf(). This sort of optimization is legal for any object you declare const. (In practice, a local variable to which you never take a reference will not benefit, because the compiler can already see whether you modify it within its scope.)
To answer your "side note" question, (a) a const pointer is a pointer; and (b) a const pointer can equal NULL. You are correct that the internal representation (i.e. an address) is most likely the same.
[update]
As Christoph points out in the comments, my answer is incomplete because it does not mention restrict.
Section 6.7.3.1 (4) of the C99 standard says:
During each execution of B, let L be any lvalue that has &L based on P. If L is used to
access the value of the object X that it designates, and X is also modified (by any means),
then the following requirements apply: T shall not be const-qualified. ...
(Here B is a basic block over which P, a restrict-pointer-to-T, is in scope.)
So if a C function foo() is declared like this:
foo(const int * restrict p)
...then the compiler may assume that no modifications to *p occur during the lifetime of p -- i.e., during the execution of foo() -- because otherwise the Behavior would be Undefined.
So in principle, combining restrict with a pointer-to-const could enable both of the optimizations that are dismissed above. Do any compilers actually implement such an optimization, I wonder? (GCC 4.5.2, at least, does not.)
Note that restrict only exists in C, not C++ (not even C++0x), except as a compiler-specific extension.
Off the top of my head, I can think of two cases where proper const-qualification allows additional optimizations (in cases where whole-program analysis is unavailable):
const int foo = 42;
bar(&foo);
printf("%i", foo);
Here, the compiler knows to print 42 without having to examine the body of bar() (which might not be visible in the curent translation unit) because all modifications to foo are illegal (this is the same as Nemo's example).
However, this is also possible without marking foo as const by declaring bar() as
extern void bar(const int *restrict p);
In many cases, the programmer actually wants restrict-qualified pointers-to-const and not plain pointers-to-const as function parameters, as only the former make any guarantees about the mutability of the pointed-to objects.
As to the second part of your question: For all practical purposes, a C++ reference can be thought of as a constant pointer (not a pointer to a constant value!) with automatic indirection - it is not any 'safer' or 'faster' than a pointer, just more convenient.
There are two issues with const in C++ (as far as optimization is concerned):
const_cast
mutable
const_cast mean that even though you pass an object by const reference or const pointer, the function might cast the const-ness away and modify the object (allowed if the object is not const to begin with).
mutable mean that even though an object is const, some of its parts may be modified (caching behavior). Also, objects pointed to (instead of being owned) can be modified in const methods, even when they logically are part of the object state. And finally global variables can be modified too...
const is here to help the developer catch logical mistakes early.
const-correctness generally doesn't help performance; most compilers don't even bother to track constness beyond the frontend. Marking variables as const can help, depending on the situation.
References and pointers are stored exactly the same way in memory.
This really depends on the compiler/platform (it may help optimisation on some compiler that has not yet been written, or on some platform that you never use). The C and C++ standards say nothing about performance other than giving complexity requirements for some functions.
Pointers and references to const usually do not help optimisation, as the const qualification can legally be cast away in some situations, and it is possible that the object can be modified by a different non-const reference. On the other hand, declaring an object to be const can be helpful, as it guarantees that the object cannot be modified (even when passed to functions that the compiler does not know the definition of). This allows the compiler to store the const object in read-only memory, or cache its value in a centralised place, reducing the need for copies and checks for modifications.
Pointers and references are usually implemented in the exact same way, but again this is totally platform dependant. If you are really interested then you should look at the generated machine code for your platform and compiler in your program (if indeed you are using a compiler that generates machine code).
One thing is, if you declare a global variable const, it may be possible to put it in the read-only portion of a library or executable and thus share it among multiple processes with a read-only mmap. This can be a big memory win on Linux at least if you have a lot of data declared in global variables.
Another situation, if you declare a constant global integer, or float or enum, the compiler may be able to just put the constant inline rather than using a variable reference. That's a bit faster I believe though I'm not a compiler expert.
References are just pointers underneath, implementation-wise.
It can help performance a little bit, but only if you are accessing the object directly through its declaration. Reference parameters and such cannot be optimized, since there might be other paths to an object not originally declared const, and the compiler generally can't tell if the object you are referencing was actually declared const or not unless that's the declaration you are using.
If you are using a const declaration, the compiler will know that externally-compiled function bodies, etc. cannot modify it, so you get a benefit there. And of course things like const int's are propagated at compile time, so that's a huge win (compared to just an int).
References and pointers are stored exactly the same, they just behave differently syntactically. References are basically renamings, and so are relatively safe, whereas pointers can point to lots of different things, and are thus more powerful and error-prone.
I guess the const pointer would be architecturally identical to the reference, so the machine code and efficiency would be the same. the real difference is syntax -- references are a cleaner, easier to read syntax, and since you don't need the extra machinery provided by a pointer, a reference would be stylistically preferred.

Const variables in c++

AFAIK we cannot change a value of a constant variable in C.
But i faced this interview question as below:
In C++, we have procedure to change the value of a constant variable.
Could anybody tell me how could we do it?
You can modify mutable data members of a const-qualified class-type object:
struct awesome_struct {
awesome_struct() : x(0) { }
mutable int x;
};
int main() {
const awesome_struct a;
a.x = 42;
}
The behavior here is well-defined.
Under the circumstances, I think I'd have explained the situation: attempting to change the value of a variable that's const gives undefined behavior. What he's probably asking about is how to change a variable that isn't itself const, but to which you've received a pointer or reference to const. In this case, when you're sure the variable itself is not const-qualified, you can cast away the const-ness with const_cast, and proceed to modify.
If you do the same when the variable itself is const-qualified the compiler will allow the code to compile, but the result will be undefined behavior. The attempt at modifying the variable might succeed -- or it might throw an exception, abort the program, re-format your NAS appliance's hard drives, or pretty much anything else.
It's probably also worth mentioning that when/if a variable is likely to need to be used this way, you can specify that the variable itself is mutable. This basically means that the variable in question is never const qualified, even if the object of which it's a part is const qualified.
There is no perfect way to cast away const-ness of a variable(which is const by definition) without invoking UB
Most probably the interviewers have no idea what they are talking about. ;-)
You cannot change the value of a constant variable. Trying to do so by cheating the compiler, invokes undefined behavior. The compiler may not be smart enough to tell you what you're doing is illegal, though!
Or maybe, you mean this:
const char *ptr= "Nawaz";
ptr = "Sarfaraz";
??
If so, then it's changing the value of the pointer itself, which is not constant, rather the data ptr points to is constant, and you cannot change that for example by writing ptr[2]='W';.
The interviewer was looking for const_cast. As an added bonus, you can explain scenarios where const_cast make sense, such as const_cast<MyClass *>(this)->doSomething() (which can be perfectly valid, although IMO not very clean), and scenarios where it can be dangerous , such as casting away constness of a passed const reference and therefore breaking the contract your functions signature provides.
If a const variable requires a change in value, then you have discovered a flaw in the design (under most, if not all, situations). If the design is yours, great! - you have an opportunity to improve your code.
If the design is not yours, good luck because the UB comment above is spot-on. In embedded systems, const variables may be placed in ROM. Performing tricks in an attempt to make such a variable writable lead to some spectacular failures. Furthermore, the problem extends beyond simple member variables. Methods and/or functions that are marked const may be optimized in such away that "lack of const-ness" becomes nonsensical.
(Take this "comment-as-answer" as a statement on the poor interview question.)

C++ optimization of reference-to-pointer argument

I'm wondering with functions like the following, whether to use a temporary variable (p):
void parse_foo(const char*& p_in_out,
foo& out) {
const char* p = p_in_out;
/* Parse, p gets incremented etc. */
p_in_out = p;
}
or can I just use the original argument and expect it to be optimized similarly to the above anyway? It seems like there should be such an optimization, but I've seen the above done in a few places such as Mozilla's code, with vague comments about "avoiding aliasing".
All good answers, but if you're worried about performance optimization, the actual parsing is going to take nearly all of the time, so pointer aliasing will probably be "in the noise".
The variant with a temporary variable could be faster since it doesn't imply that every change to the pointer is reflected back to the argument and the compiler has better chances on generating faster code. However the right way to test this is to compile and look at the disassembly.
Meanwhile this has noting to do with avoiding aliasing. In fact, the variant with a temporary variant does employ aliasing - now you have two pointers into the same array and that's exactly what aliasing is.
I would use a temporary if there is a possibility that the function is transactional.
i.e. the function succeeds or fails completely (no middle ground).
In this case I would use a temp to maintain state while the function executes and only assign back to the in_out parameter when the function completes successfully.
If the function exits prematurely (ie via exception) then we have two situations:
With a temporary (the external pointer is unchanged)
Using the parameter directly the external state is modified to reflect position.
I don't see any optimization advantages to either method.
Yes, you should assign it to a local that you mark restrict (__restrict in MSVC).
The reason for this is that if the compiler cannot be absolutely sure that nothing else in the scope points at p_in_out, it cannot store the contents under the pointer in a local register. It must read the data back every time you write to any other char * in the same scope. This is not an issue of whether it is a "smart" compiler or not; it is a consequence of correctness requirements.
By writing char* __restrict p you promise the compiler that no other pointer in the same scope points to the same address as p. Without this guarantee, the value of *p can change any time any other pointer is written to, or it may change the contents of some other pointer every time *p is written to. Thus, the compiler has to write out every assignment to *p back to memory immediately, and it has to read them back after every time another pointer is written through.
So, guaranteeing the compiler that this cannot happen — that it can load *p exactly once and assume no other pointer affects it — can be an improvement in performance. Exactly how much depends on the particular compiler and situation: on processors subject to a load-hit-store penalty, it's massive; on most x86 CPUs, it's modest.
The reason to prefer a pointer to a reference here is simply that a pointer can be marked restrict and a reference cannot. That's just the way C++ is.
You can try it both ways and measure the results to see which is really faster. And if you're curious, I've written in depth on restrict and the load-hit-store elsewhere.
addendum: after writing the above I realize that the people at Moz were more worried about the reference itself being aliased -- that is, that something else might point at the same address where const char *p is stored, rather than the char to which p points. But my answer is the same: under the hood, const char *&p means const char **p, and that's subject to the same aliasing issues as any other pointer.
How does the compiler know that p_in_out isn't aliased somehow? It really can't optimize away writing the data back through the reference.
struct foo {
setX(int); setY(int);
const char* current_pos;
} x;
parse_foo(x.current_pos, x);
I look at this and ask why you didn't just return the pointer Then you don't have a reference to a pointer and you don't have to worry about modify the original.
const char* parse_foo(const char* p, foo& out) {
//use p;
return p;
}
It also means you can call the function with an rvalue:
p = parse_foo(p+2, out);
One thought that comes immediately in mind: exception safety. If you throw an exception during parsing, the use of a temporary variable is what you should do to provide strong exception safety: Either the function call succeeded completely or it didn't do anything (from a user's perspective).

Constants and compiler optimization in C++

I've read all the advice on const-correctness in C++ and that it is important (in part) because it helps the compiler to optimize your code. What I've never seen is a good explanation on how the compiler uses this information to optimize the code, not even the good books go on explaining what happens behind the curtains.
For example, how does the compiler optimize a method that is declared const vs one that isn't but should be. What happens when you introduce mutable variables? Do they affect these optimizations of const methods?
I think that the const keyword was primarily introduced for compilation checking of the program semantic, not for optimization.
Herb Sutter, in the GotW #81 article, explains very well why the compiler can't optimize anything when passing parameters by const reference, or when declaring const return value. The reason is that the compiler has no way to be sure that the object referenced won't be changed, even if declared const : one could use a const_cast, or some other code can have a non-const reference on the same object.
However, quoting Herb Sutter's article :
There is [only] one case where saying
"const" can really mean something, and
that is when objects are made const at
the point they are defined. In that
case, the compiler can often
successfully put such "really const"
objects into read-only memory[...].
There is a lot more in this article, so I encourage you reading it: you'll have a better understanding of constant optimization after that.
Let's disregard methods and look only at const objects; the compiler has much more opportunity for optimization here. If an object is declared const, then (ISO/IEC 14882:2003 7.1.5.1(4)):
Except that any class member declared
mutable (7.1.1) can be modified, any
attempt to modify a const object
during its lifetime (3.8) results in
undefined behavior.
Lets disregard objects that may have mutable members - the compiler is free to assume that the object will not be modified, therefore it can produce significant optimizations. These optimizations can include things like:
incorporating the object's value directly into the machines instruction opcodes
complete elimination of code that can never be reached because the const object is used in a conditional expression that is known at compile time
loop unrolling if the const object is controlling the number of iterations of a loop
Note that this stuff applies only if the actual object is const - it does not apply to objects that are accessed through const pointers or references because those access paths can lead to objects that are not const (it's even well-defined to change objects though const pointers/references as long as the actual object is non-const and you cast away the constness of the access path to the object).
In practice, I don't think there are compilers out there that perform any significant optimizations for all kinds of const objects. but for objects that are primitive types (ints, chars, etc.) I think that compilers can be quite aggressive in optimizing
the use of those items.
handwaving begins
Essentially, the earlier the data is fixed, the more the compiler can move around the actual assignment of the data, ensuring that the pipeline doesn't stall out
end handwaving
Meh. Const-correctness is more of a style / error-checking thing than an optimisation. A full-on optimising compiler will follow variable usage and can detect when a variable is effectively const or not.
Added to that, the compiler cannot rely on you telling it the truth - you could be casting away the const inside a library function it doesn't know about.
So yes, const-correctness is a worthy thing to aim for, but it doesn't tell the compiler anything it won't figure out for itself, assuming a good optimising compiler.
It does not optimize the function that is declared const.
It can optimize functions that call the function that is declared const.
void someType::somefunc();
void MyFunc()
{
someType A(4); //
Fling(A.m_val);
A.someFunc();
Flong(A.m_val);
}
Here to call Fling, the valud A.m_val had to be loaded into a CPU register. If someFunc() is not const, the value would have to be reloaded before calling Flong(). If someFunc is const, then we can call Flong with the value that's still in the register.
The main reason for having methods as const is for const correctness, not for possible compilation optimization of the method itself.
If variables are const they can (in theory) be optimized away. But only is the scope can be seen by the compiler. After all the compiler must allow for them to be modified with a const_cast elsewhere.
Those are all true answers, but the answers and the question seem to presume one thing: that compiler optimization actually matters.
There is only one kind of code where compiler optimization matters, that is in code that is
a tight inner loop,
in code that you compile, as opposed to a 3rd-party library,
not containing function or method calls (even hidden ones),
where the program counter spends a noticeable fraction of its time
If the other 99% of the code is optimized to the Nth degree, it won't make a hoot of difference, because it only matters in code where the program counter actually spends time (which you can find by sampling).
I would be surprised if the optimizer actually puts much stock into a const declaration. There is a lot of code that will end up casting const-ness away, it would be a very reckless optimizer that relied on the programmer declaration to assume when the state may change.
const-correctness is also useful as documentation. If a function or parameter is listed as const, I don't need to worry about the value changing out from under my code (unless somebody else on the team is being very naughty). I'm not sure it would be actually worth it if it wasn't built into the library, though.
The most obvious point where const is a direct optimization is in passing arguments to a function. It's often important to ensure that the function doesn't modify the data so the only real choices for the function signature are these:
void f(Type dont_modify); // or
void f(Type const& dont_modify);
Of course, the real magic here is passing a reference rather than creating an (expensive) copy of the object. But if the reference weren't marked as const, this would weaken the semantics of this function and have negative effects (such as making error-tracking harder). Therefore, const enables an optimization here.
/EDIT: actually, a good compiler can analyze the control flow of the function, determine that it doesn't modify the argument and make the optimization (passing a reference rather than a copy) itself. const here is merely a help for the compiler. However, since C++ has some pretty complicated semantics and such control flow analysis can be very expensive for big functions, we probably shouldn't rely on compilers for this. Does anybody have any data to back me up / prove me wrong?
/EDIT2: and yes, as soon as custom copy constructors come into play, it gets even trickier because compilers unfortunately aren't allowed to omit calling them in this situation.
This code,
class Test
{
public:
Test (int value) : m_value (value)
{
}
void SetValue (int value) const
{
const_cast <Test&>(*this).MySetValue (value);
}
int Value () const
{
return m_value;
}
private:
void MySetValue (int value)
{
m_value = value;
}
int
m_value;
};
void modify (const Test &test, int value)
{
test.SetValue (value);
}
void main ()
{
const Test
test (100);
cout << test.Value () << endl;
modify (test, 50);
cout << test.Value () << endl;
}
outputs:
100
50
which means the const declared object has been altered in a const member function. The presence of const_cast (and the mutable keyword) in the C++ language means that the const keyword can't aide the compiler in generating optimised code. And as I pointed out in my previous posts, it can even produce unexpected results.
As a general rule:
const != optimisation
In fact, this is a legal C++ modifier:
volatile const
const helps compilers optimize mainly because it makes you write optimizable code. Unless you throw in const_cast.