Consider this snippet:
void foo(const int&);
int bar();
int test1()
{
int x = bar();
int y = x;
foo(x);
return x - y;
}
int test2()
{
const int x = bar();
const int y = x;
foo(x);
return x - y;
}
In my understanding of the standard, neither x nor y are allowed to be changed by foo in test2, whereas they could be changed by foo in test1 (with e.g. a const_cast to remove const from the const int& because the referenced objects aren't actually const in test1).
Now, neither gcc nor clang nor MSVC seem to optimize test2 to foo(bar()); return 0;, and I can understand that they do not want to waste optimization passes on an optimization that only rarely applies in practice.
But am I at least correct in my understanding of this situation, or am I missing some legal way for x to be modified in test2?
The standard says in [dcl.type.cv]:
Except that any class member declared mutable […] can be modified, any attempt to modify […] a const object […] during its lifetime […] results in undefined behavior.
It is also not possible to make this defined by ending the lifetime of the object prematurely, according to [basic.life]:
Creating a new object within the storage that a const complete object with […] automatic storage duration occupies, or within the storage that such a const object used to occupy before its lifetime ended, results in undefined behavior.
This means that the optimization of x - y to zero is valid because any attempt to modify x in foo would result in undefined behavior.
The interesting question is if there is a reason for not performing this optimization in existing compilers. Considering that the const object definition is local to test2 and the fact is used within the same function, usual exceptions such as support for symbol interposition do not apply here.
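To see why test1 cannot be folded in the same way, here is a minimal sketch. The bodies of foo and bar are hypothetical (the question leaves them undefined); foo casts away const and writes through the reference, which is well-defined only because x in test1 is not a const object.

```cpp
#include <cassert>

// Hypothetical definitions for illustration; the question only declares these.
int bar() { return 5; }

// Casts away const and writes through the reference. This is well-defined
// only when the referenced object was not declared const.
void foo(const int& r) { const_cast<int&>(r) = 42; }

int test1() {
    int x = bar();  // x is a non-const object
    int y = x;
    foo(x);         // legally changes x to 42
    return x - y;   // 42 - 5, so this cannot be folded to 0
}

// Small usage check; call it from main if you want to run the sketch.
void check_test1() { assert(test1() == 37); }
```

Calling the same foo from test2 would be undefined behaviour, which is exactly what would license the compiler to assume x is unchanged there.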
Related
dcl.type.cv provides an interesting example:
For another example,
struct X {
mutable int i;
int j;
};
struct Y {
X x;
Y();
};
const Y y;
y.x.i++; // well-formed: mutable member can be modified
y.x.j++; // ill-formed: const-qualified member modified
Y* p = const_cast<Y*>(&y); // cast away const-ness of y
p->x.i = 99; // well-formed: mutable member can be modified
p->x.j = 99; // undefined: modifies a const member
which indicates that, via const_cast, one may modify mutable members of a const qualified object, while you can't do that with non-mutable members.
To my understanding, this is because of the original constness of y itself. What would happen if we got rid of the mutable keyword and the const qualifier for y, but modified the fields in a const method?
Example below:
#include <vector>
struct foo {
std::vector<int> vec{};
void bar() const {
auto& raw_ref = const_cast<std::vector<int>&>(vec);
raw_ref.push_back(0); // ok?
auto* raw_this = const_cast<foo*>(this);
raw_this->vec.push_back(0); // ok?
}
};
int main() {
foo f{};
f.bar();
}
Does it exhibit undefined behaviour? I would think that it does not, since we're modifying an object that was originally non-const, just from within a const context.
Additionally, notice that I provided two ways of modifying vec: one through a non-const reference, and one through a non-const pointer to this (which is const in this context because foo::bar is a const method). Do they differ in any particular way, given the question's context? I would assume that both are okay here.
Disclaimer: I am aware of the mutable keyword, but this design (besides being flawed) is simply an example. One could assume that the author of the code wanted to prohibit every way of modifying vec except for push_back.
Your quoted paragraph actually spelled out exactly what is undefined [dcl.type.cv]
Except that any class member declared mutable can be modified, any attempt to modify a const object during its lifetime results in undefined behavior.
A const reference/pointer to a non-const object doesn't make that object a const object, so all of your accesses are well-defined.
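A minimal sketch of that distinction, reusing the foo type from the question. The helper size_after_bar is hypothetical and exists only to make the result observable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct foo {
    std::vector<int> vec{};
    void bar() const {
        // Well-defined only if the object *this refers to was not declared const.
        const_cast<foo*>(this)->vec.push_back(0);
    }
};

// Hypothetical helper for illustration.
std::size_t size_after_bar() {
    foo f{};   // non-const object: modifying it through const_cast is fine
    f.bar();
    // By contrast, `const foo cf{}; cf.bar();` would be undefined behaviour,
    // because cf is a const object and vec is not mutable.
    return f.vec.size();
}

void check_size_after_bar() { assert(size_after_bar() == 1); }
```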
I understand that returning a reference to a function argument could invoke undefined behaviour as in the below example. The first 'MyType' created goes out of scope after the function call and is destroyed, leading to a dangling reference.
#include <iostream>
#include <string>
struct MyType {
std::string data;
inline ~MyType() {
data = "Destroyed!!";
}
};
const MyType& getref(const MyType& x) {
return x;
}
int main(int argc, char *argv[]) {
const MyType& test = getref(MyType {"test"});
std::cout << test.data << std::endl;
return 0;
}
My questions are:
Why isn't there a clang warning for this? There is a warning for returning a ref to a local variable.
No matter how hard I tried allocating other things on the stack before the print, I couldn't get it to print the wrong thing without making the destructor explicitly change the data. Why is this?
Are there any valid (safe) use cases of returning a reference to an argument?
No matter how hard I tried allocating other things on the stack before the print, I couldn't get it to print the wrong thing without making the destructor explicitly change the data. Why is this?
You're invoking undefined behaviour; anything could happen at this point!
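For contrast, a sketch of two ways to call the same getref without a dangling reference. The temporary lives until the end of the full-expression that created it, so copying the result inside that expression is fine, and so is passing a named object that outlives the reference. The helper names read_copy and read_from_lvalue are hypothetical.

```cpp
#include <cassert>
#include <string>

struct MyType {
    std::string data;
};

const MyType& getref(const MyType& x) { return x; }

// Copy while the temporary is still alive (same full-expression).
std::string read_copy() {
    MyType copy = getref(MyType{"test"});
    return copy.data;
}

// Refer to a named object that outlives the reference.
std::string read_from_lvalue() {
    MyType named{"named"};
    const MyType& ref = getref(named);
    return ref.data;
}

void check_reads() {
    assert(read_copy() == "test");
    assert(read_from_lvalue() == "named");
}
```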
Why isn't there a clang warning for this? There is a warning for returning a ref to a local variable.
Are there any valid (safe) use cases of returning a reference to an argument?
std::max
Suppose you want the max of two values:
auto foo = std::max(x, y); // foo is a copy of the result
You might not want a copy to be returned (for performance or semantic reasons). Luckily, std::max returns a reference, allowing for both use cases:
auto foo = std::max(x, y); // foo is a copy of the result
auto& bar = std::max(x, y); // bar is a reference to the result
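An example of where it matters for semantic reasons is when the result is passed on to something like std::swap. (std::max itself returns a reference to const, so the line below would not compile as written; read it as a max-like function that returns a non-const reference.)
std::swap(x, std::max(y, z));
Imagine that std::max returned a copy of y. Rather than swapping x with y, x would be swapped with a copy of y. In other words, y would remain unchanged.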
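A sketch of both points. std::max returns a reference to const that aliases one of its arguments, so it cannot be handed to std::swap directly; max_ref below is a hypothetical helper (not part of the standard library) that returns a non-const reference so the swap idea can be shown.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

// Hypothetical helper: like std::max, but takes and returns
// non-const lvalue references.
template <class T>
T& max_ref(T& a, T& b) { return (a < b) ? b : a; }

bool max_aliases_argument() {
    int x = 1, y = 2;
    const int& m = std::max(x, y);  // m refers to y itself, no copy is made
    return &m == &y;
}

int y_after_swap() {
    int x = 1, y = 5, z = 3;
    std::swap(x, max_ref(y, z));    // swaps x with y, the larger of y and z
    return y;                       // y now holds the old value of x
}

void check_max() {
    assert(max_aliases_argument());
    assert(y_after_swap() == 1);
}
```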
Assignment
A common use case of this is the assignment operator. Take this for example:
#include <iostream>
class T {
public:
T(int _x) : x(_x) { }
T& operator=(const T& rhs) { x = rhs.x; return *this; }
int getX() const { return x; }
private:
int x = 0;
};
int main() {
T instanceA(42);
T instanceB(180);
std::cout << (instanceA = instanceB).getX() << std::endl;
}
You can think of the hidden this parameter as a non-null pointer (so close enough to a reference for the purposes of this question).
Defining a copy-assignment operator this way is generally considered idiomatic. One of the reasons is that the automatically generated copy-assignment operator has that signature:
N4618 12.8.2.2 Copy/move assignment operator [class.copy.assign]
The implicitly-declared copy assignment operator for a class will have the form X& X::operator=(const X&)
Making this a warning would penalize canonical code! As for why one might want to do such a thing (aside from it being canonical), that's another topic...
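One practical consequence of returning *this by reference is that assignments can be chained. A minimal sketch with the same shape of class as above; chained_sum is a hypothetical helper.

```cpp
#include <cassert>

class T {
public:
    explicit T(int v) : x(v) {}
    T& operator=(const T& rhs) { x = rhs.x; return *this; }
    int getX() const { return x; }
private:
    int x = 0;
};

// Hypothetical helper for illustration.
int chained_sum() {
    T a(1), b(2), c(3);
    a = b = c;                   // parsed as a = (b = c); both end up holding 3
    return a.getX() + b.getX();
}

void check_chained_sum() { assert(chained_sum() == 6); }
```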
There's no warning because a const T& is just as likely to be a reference to an lvalue as it is to be a reference to an rvalue.
void func() {
MyType mt;
const MyType& l = getref(mt);
const MyType& r = getref(MyType{});
}
Considering that getref(mt) is entirely valid, this leaves us with three options:
Emit a warning for any function that both takes and returns a (cv) T&, regardless of whether it was called with an lvalue or an rvalue. This penalises legal code, so it's not a good option. [While this would be better served by only emitting a warning if the function specifically returns its parameter, this is infeasible for the reason mentioned below. Thus, the compiler would emit a warning whenever the parameter and return value are the same type.]
[Note that this could cause problems with perfect forwarding. For example, std::move() uses perfect forwarding to take a parameter by reference or rvalue reference, then return an rvalue reference to that parameter. Any algorithm that doesn't account for this would see it as both taking and returning a T&&.]
Emit a warning when getref() is passed an rvalue. This requires that the compiler keep the information "getref() is a potential source of UB if passed an rvalue" in memory at all times, and that it examine every call. Not only does this cause additional overhead, it effectively requires getref() to be inline (due to compilers typically being designed to only operate on a single translation unit at a time, getref() must be defined in every module where it's used for the compiler to retain this information). This is infeasible at the moment, but may become more practical with future compilers (such as if cross-module optimisation becomes standard).
Never emit a warning, on the assumption that the programmer either is wise enough to never pass getref() an rvalue, or specifically intends for getref() to be a potential source of UB. This is the most viable option, as it neither penalises legal code, nor requires a compiler capable of cross-module optimisation.
Thus, most compilers will typically choose not to emit a warning, on the assumption that the programmer knows what they're doing. Even with -Wall specified, Clang, GCC, ICC, and MSVC won't emit any warnings for this.
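Independently of what compilers warn about, the author of such a function can opt in to rejecting temporaries by deleting an rvalue overload. The following is only a sketch of that technique applied to getref; the deleted overload and the accepts detection helper are assumptions added for illustration, not part of the original code.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

struct MyType {
    std::string data;
};

const MyType& getref(const MyType& x) { return x; }
// Assumption for illustration: temporaries are rejected at compile time.
const MyType& getref(const MyType&&) = delete;

// Detects whether getref(Arg) is a valid call expression.
template <class Arg>
auto accepts(int) -> decltype(getref(std::declval<Arg>()), std::true_type{});
template <class Arg>
std::false_type accepts(...);

static_assert(decltype(accepts<MyType&>(0))::value, "lvalues are accepted");
static_assert(!decltype(accepts<MyType>(0))::value, "rvalues are rejected");

// Hypothetical helper: the lvalue path still works as before.
std::string read_named() {
    MyType m{"named"};
    return getref(m).data;
}

void check_read_named() { assert(read_named() == "named"); }
```

The trade-off is that this also forbids harmless uses such as copying from getref(MyType{}) within a single expression.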
The sole purpose of references is aliasing. Binding a reference (to a constant int) to an integer literal seems absurd, since it is not an alias for anything (and it doesn't give an error!). I suppose it is similar to defining a constant int itself. Is there any difference?
Within a function body or at file scope, the only difference is decltype(x). In one case it is int const, and in the other int const&.
The declaration const int & x=7; creates a temporary anonymous int with value 7 and then binds the reference x to it. The lifetime of the temporary is then extended to that of the reference. This is basically indistinguishable from x being the name of a const int with value 7.
An exception to it being nigh identical is when the binding occurs within an object's constructor as part of member initialization. In that case, the lifetime is not extended.
I suspect you can induce this with:
struct Foo{
int const& x=7;
Foo(){};
};
Either the above syntax is illegal or it dangles (I do not recall if there is a corner case in the standard for references), while:
struct Foo{
int const x=7;
Foo(){};
};
is both legal and does not dangle. So there is a difference.
There would also be a difference as a parameter to a function, where =7 simply provides a default.
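A short sketch of the two observable differences mentioned above: the declared type as seen by decltype, and the meaning of = 7 on a parameter. The function names sum_of_both and twice are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper: both declarations behave like a constant 7.
int sum_of_both() {
    const int& r = 7;  // temporary int; its lifetime is extended to that of r
    const int c = 7;   // a const int object
    static_assert(std::is_same<decltype(r), const int&>::value, "r is a reference");
    static_assert(std::is_same<decltype(c), const int>::value, "c is an object");
    return r + c;
}

// As a parameter, "= 7" is only a default argument.
int twice(const int& x = 7) { return 2 * x; }

void check_reference_vs_value() {
    assert(sum_of_both() == 14);
    assert(twice() == 14);
    assert(twice(3) == 6);
}
```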
Is it possible to have a member variable only be considered mutable for a given function/code block?
e.g.
class Foo {
int blah;
void bar() const {
blah = 5; // compiler error
}
void mutable_bar() const {
blah = 5; // no compiler error (desired)
}
};
note: in this case I do NOT want to get rid of the const at mutable_bar since logical const will be preserved.
Same question, but different perspective: Can I somehow apply the mutable keyword to a method instead of a variable?
No, it is not possible, at least in C++. You need either mutable or non const function.
There is also const_cast, but don't use it to modify things. If you modify a const object through a const_cast, you get undefined behaviour.
5.2.11 Const cast
7 [ Note: Depending on the type of the object, a write operation through the pointer, lvalue or pointer to data member resulting from a const_cast that casts away a const-qualifier may produce undefined behavior (7.1.6.1). —end note ]
7.1.6.1 The cv-qualifiers
4 Except that any class member declared mutable (7.1.1) can be modified, any attempt to modify a const
object during its lifetime (3.8) results in undefined behavior.
....
5 For another example
struct X {
mutable int i;
int j;
};
struct Y {
X x;
Y();
};
const Y y;
y.x.i++; // well-formed: mutable member can be modified
y.x.j++; // ill-formed: const-qualified member modified
Y* p = const_cast<Y*>(&y); // cast away const-ness of y
p->x.i = 99; // well-formed: mutable member can be modified
p->x.j = 99; // undefined: modifies a const member
—end example ]
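For the mutable route, a minimal sketch: marking the member itself mutable lets any const member function assign to it, even on a const object. Note that this applies to every const member function of Foo, not to a selected one. The accessor value and the helper value_after_mutable_bar are hypothetical additions.

```cpp
#include <cassert>

class Foo {
public:
    void mutable_bar() const { blah = 5; }  // compiles because blah is mutable
    int value() const { return blah; }
private:
    mutable int blah = 0;
};

// Hypothetical helper for illustration.
int value_after_mutable_bar() {
    const Foo f{};     // mutable members may be modified even in a const object
    f.mutable_bar();
    return f.value();
}

void check_mutable_bar() { assert(value_after_mutable_bar() == 5); }
```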
It is technically possible to bypass the const in this case, by for example:
void mutable_bar() const {
int& p_blah = const_cast<int&>(blah);
p_blah = 5; // no compiler error
}
Or some similar construct. But you are really jumping through hoops to do something that you shouldn't be able to do. And as a comment on another post says, this is "undefined behavior" if the object itself was declared const, which means that in some cases it may not even work (or do what you expect it to do).
You could use a const_cast to make the member non-const in selected cases. This, along with an accompanying comment, might even be a relatively clean solution. At least it's explicit that you break the const in a restricted scope, instead of making the member mutable everywhere.
The following code yields a segmentation fault on the y = anotherFunctor() line. As far as I understand, this happens because the globalFunctor variable does not exist yet when anotherFunctor is created. But why does it work if I replace std::function<int()> with GlobalFunctor? How would I fix it?
#include <functional>
struct GlobalFunctor
{
int operator()() const { return 42; }
};
extern GlobalFunctor globalFunctor;
struct AnotherFunctor
{
AnotherFunctor() : g_(globalFunctor) {}
int operator()() const { return g_(); }
const std::function<int()>& g_;
} anotherFunctor;
GlobalFunctor globalFunctor;
int main()
{
AnotherFunctor af;
int x = af();
int y = anotherFunctor();
int z = x + y;
return 0;
}
Edit: I tried compiling this with clang instead of gcc and it warns me about binding reference member 'g_' to a temporary value -- but it crashes when compiling this. Would the cast to std::function create a temporary reference?
At g_(globalFunctor), globalFunctor has to be converted to an std::function because it is of type GlobalFunctor. So a temporary is produced and this is bound to the constant reference. You could think of the code as doing g_(std::function<int()>(globalFunctor)). However, this temporary only lives until the end of the constructor, as there is a special rule in C++ saying that temporaries in member initializer lists only live until the end of the constructor. This leaves you with a dangling reference.
The code works when you replace std::function<int()> with GlobalFunctor because no conversion is involved. Therefore, no temporaries are produced and the reference directly refers to the global object.
You either need to not use references and store a std::function internally or make a global std::function and have a reference to that.
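A sketch of the first option, storing the std::function by value so that nothing dangles. To keep the example independent of global initialization order, it takes the functor as a constructor argument and uses local objects; that constructor signature and the helper call_stored are assumptions, not the original design.

```cpp
#include <cassert>
#include <functional>

struct GlobalFunctor {
    int operator()() const { return 42; }
};

struct AnotherFunctor {
    // Assumption for illustration: the functor is passed in explicitly.
    explicit AnotherFunctor(const GlobalFunctor& g) : g_(g) {}
    int operator()() const { return g_(); }
    std::function<int()> g_;  // owned copy, no reference to a temporary
};

// Hypothetical helper for illustration.
int call_stored() {
    GlobalFunctor gf;
    AnotherFunctor af(gf);
    return af();
}

void check_call_stored() { assert(call_stored() == 42); }
```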